Tagger
Revision as of 12:55, 7 July 2007
Definition
A tagger is a device which assigns symbolic labels (tags) to linguistic units. The labels are taken from a predefined set of symbols (the so-called tag set).
Comments
In most cases, a tagger assigns tags representing morpho-syntactic information to single word forms or tokens. But there are also taggers designed to identify the semantic roles of noun phrases or prepositional phrases (sense tagging), and identifying the structure of a text is sometimes considered a kind of tagging (discourse structure tagging).
Conceptually, tagging can be considered a three-step process: (i) identification of the relevant units; (ii) assignment of all possible labels (e.g. by lexical look-up or by applying heuristics); (iii) disambiguation.
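The three steps can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy lexicon, the Penn-Treebank-like tag names, the suffix heuristic and the two contextual rules are hypothetical and are not taken from any particular tagger.

```python
import re

# Hypothetical toy lexicon: lower-cased word form -> all possible tags.
LEXICON = {
    "the": ["DET"],
    "can": ["MD", "NN", "VB"],
    "rusts": ["VBZ", "NNS"],
}


def tokenize(text):
    """Step (i): identify the relevant units (word forms and punctuation)."""
    return re.findall(r"\w+|[^\w\s]", text)


def candidate_tags(token):
    """Step (ii): assign all possible labels by lexical look-up,
    falling back to simple heuristics for unknown tokens."""
    tags = LEXICON.get(token.lower())
    if tags:
        return list(tags)
    if not token[0].isalnum():
        return ["PUNCT"]
    if token.endswith("s"):
        return ["NNS", "VBZ"]
    return ["NN"]


def disambiguate(tokens, candidates):
    """Step (iii): choose one tag per token with two illustrative
    contextual rules; otherwise keep the first candidate."""
    result = []
    previous = None
    for token, tags in zip(tokens, candidates):
        choice = tags[0]
        if previous == "DET":
            # After a determiner, prefer a noun reading if there is one.
            nouns = [t for t in tags if t.startswith("NN")]
            if nouns:
                choice = nouns[0]
        elif previous in ("NN", "NNS"):
            # After a noun, prefer a verb reading if there is one.
            verbs = [t for t in tags if t.startswith("VB")]
            if verbs:
                choice = verbs[0]
        result.append((token, choice))
        previous = choice
    return result


def tag(text):
    tokens = tokenize(text)
    candidates = [candidate_tags(t) for t in tokens]
    return disambiguate(tokens, candidates)
```

Keeping the steps in separate functions means that the disambiguation strategy can be exchanged without touching tokenization or look-up.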
It is common practice to distinguish between rule-based and stochastic taggers, though some taggers combine rules and stochastic information.
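As a rough sketch of how rules and stochastic information might be combined, the following Python example uses tag frequencies from a small hand-tagged corpus for known words and falls back to hand-written suffix rules for unknown words. The function names, the tag names and the rules are hypothetical; real hybrid taggers are considerably more elaborate.

```python
from collections import Counter, defaultdict


def train_unigram(tagged_corpus):
    """Stochastic part: remember the most frequent tag of each word form."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word.lower()][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def rule_guess(word):
    """Rule-based part: illustrative suffix rules for unknown words."""
    if word[:1].isdigit():
        return "CD"
    if word.endswith("ly"):
        return "RB"
    if word.endswith("ing"):
        return "VBG"
    return "NN"


def hybrid_tag(words, model):
    """Use the learned tag where available, otherwise apply the rules."""
    return [(w, model.get(w.lower()) or rule_guess(w)) for w in words]


# Tiny hypothetical training corpus of (word form, tag) pairs.
corpus = [("the", "DET"), ("run", "NN"), ("run", "VB"), ("run", "VB"), ("dog", "NN")]
model = train_unigram(corpus)
tagged = hybrid_tag(["The", "dog", "run", "quickly", "42"], model)
```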
In general, state-of-the-art taggers achieve a precision of at least 95% for morpho-syntactic tagging.
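A figure of this kind is usually obtained by comparing the tagger's output with a manually tagged reference. The sketch below assumes that "precision" here means the share of tokens that receive the correct tag (per-token accuracy); the function name and the example tags are hypothetical.

```python
def tagging_accuracy(predicted, gold):
    """Return the share of tokens whose predicted tag equals the gold tag."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot evaluate an empty sequence")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical example: 20 tokens, one of which is tagged incorrectly.
gold_tags = ["NN"] * 20
predicted_tags = ["NN"] * 19 + ["VB"]
score = tagging_accuracy(predicted_tags, gold_tags)  # 19 / 20 = 0.95
```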
Subtypes
Other Languages
- German Tagger (de)