Corpora & Assistive Technology
The Language Technology group has expertise in handling various types of corpora. We build tailor-made applications for exploring large and structurally complex collections of language data. In particular, we have experience in:
- Designing databases to hold application-relevant data
- Generating interactive visualizations
- Efficiently querying large data collections (in particular corpora)
- Anonymising large data sets
- Crawling and scraping web sources, processing their content, and batch-downloading documents
- Extracting and converting data
langtech/corpora.1675169495.txt.gz · Last modified: by Gerold Schneider
