langtech:lcp:corpora:start
| langtech:lcp:corpora:start [2024/04/22 07:05] – Igor Mustac | langtech:lcp:corpora:start [Unknown date] (current) – removed - external edit (Unknown date) 127.0.0.1 | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ====== Corpora in LCP ====== | ||
| - | |||
| - | In LCP, corpora are modeled as connected layers: at least three layers must represent (i) ordered units, (ii) ordered collections of those units, and (iii) unordered collections of the latter. | ||
| - | |||
| - | Layers can have any number of attributes for annotation purposes, and corpus authors can define additional layers to model further embedding or dependency relations. | ||
| - | |||
| - | The diagram in the figure below shows the structure of a corpus created from the Open Subtitles database, which annotates tokens (layer i) with a form, a lemma and a part of speech, | ||
| - | |||
| - | {{: | ||
| - | |||
| - | A [[langtech: | ||
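
The three-layer model described in the removed text can be illustrated with a minimal Python sketch. This is not LCP's own schema or API: the class names `Segment` and `Document`, the `build_example` helper, and the sample tokens are all hypothetical, chosen only to show how ordered units, ordered collections of units, and unordered collections of those collections could nest. Only the token attributes (form, lemma, part of speech) come from the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Token:
    """Layer (i): an ordered unit with the three annotations named in the text."""
    form: str
    lemma: str
    pos: str


@dataclass
class Segment:
    """Layer (ii): an ordered collection of tokens (hypothetical layer name)."""
    tokens: list[Token] = field(default_factory=list)

    def text(self) -> str:
        # Token order matters here: joining forms in sequence rebuilds the text.
        return " ".join(token.form for token in self.tokens)


@dataclass
class Document:
    """Layer (iii): an unordered collection of segments (hypothetical layer name)."""
    # Segments are keyed by an identifier; no ordering between them is implied.
    segments: dict[str, Segment] = field(default_factory=dict)
    # Layers can carry any number of attributes for annotation purposes.
    attributes: dict[str, str] = field(default_factory=dict)


def build_example() -> Document:
    """Build a tiny made-up corpus fragment for illustration only."""
    segment = Segment([Token("Cats", "cat", "NOUN"), Token("sleep", "sleep", "VERB")])
    return Document(segments={"s1": segment}, attributes={"title": "example"})
```

Additional layers for further embedding or dependency relations would, under the same assumptions, be extra classes that reference these ones.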
langtech/lcp/corpora/start.1713769522.txt.gz · Last modified: by Igor Mustac
