University of Oslo University of Heidelberg

salCORPORA


© 2005, Database Consult GmbH, Version 3.4.2

 Welcome

 Topics

salCORPORA
The text database salCORPORA contains written corpora of South Asian languages. At present only Hindi is processed, but more languages are expected to be added. The database is located at the Department of Modern South Asian Studies, South Asia Institute, University of Heidelberg. An office in New Delhi handles the acquisition and input of the texts.

salCORPORA main features
  • extension and maintenance of corpora of written Hindi
  • texts from the beginnings of Hindi to the present day
  • continuous updating
  • works of fiction, scientific and popular-science texts, newspaper texts and many other text types
  • empirical basis for language-related research

    Financed by: Deutsche Forschungsgemeinschaft (German Research Foundation)
    Head: Monika Boehm-Tettelbach
    Organisation: Claus Peter Zoller
    Engineering: Günter Unbescheid

    Please note: This site is still under construction, so not all of the links below are active at the moment.

  • application areas
  • corpus acquisition
  • the salCORPORA text model
  • current corpus archive
  • morphological tagging
  • operational availability