In the BNC World Edition each word or "multiword unit" (such as of course) is "tagged" according to its word class or part of speech (PoS) with one of the codes reproduced below. "Fused" forms – contractions and possessives written with an apostrophe, as well as the form cannot and a few others – are "tokenized" as separate units and receive tags as well. PoS tags are assigned automatically by computer and are thus subject to ambiguity (almost 4%), errors (about 1%) and inconsistencies. Consequently, some occurrences of the same word form representing the same word class may appear under different PoS codes.
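To make the idea of tokenizing fused forms concrete, here is a minimal sketch. It is not the CLAWS tokenizer: the function name `split_fused` and the suffix list are assumptions for illustration, built only from the forms mentioned on this page (cannot, n't, 's, 'm, 're, 've, 'd, 'll and the bare apostrophe). The real BNC list of fused forms is longer.

```python
# Hypothetical suffix list, taken from the clitic forms named in the tag table.
CLITICS = ("n't", "'ll", "'re", "'ve", "'m", "'d", "'s")


def split_fused(form):
    """Split one fused word form into separately taggable units (toy version)."""
    lower = form.lower()
    # "cannot" is the one unapostrophized fused form named in the text.
    if lower == "cannot":
        return [form[:3], form[3:]]
    # Contractions and singular possessives: cut off the clitic.
    for clitic in CLITICS:
        if lower.endswith(clitic) and len(form) > len(clitic):
            return [form[:-len(clitic)], form[-len(clitic):]]
    # Plural possessive written with a bare trailing apostrophe.
    if form.endswith("'") and len(form) > 1:
        return [form[:-1], "'"]
    return [form]
```

Under these assumptions, `split_fused("won't")` yields the units wo and n't, which is why wo appears under VM0 and n't under XX0 in the table below.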
Normalization conventions were adopted for this database to limit its overall size and to allow important patterns to emerge more clearly. For more detailed study of forms and tags which are not distinguished in this database, please consult the British National Corpus directly. In cases of "portmanteau" or ambiguous word tagging, the BNC shows two possible tags; here the first, more likely tag has been chosen. All capital letters are converted to lower case; proper nouns are recognized by the NP0 tag. Numerals are mapped onto a single "#" regardless of their magnitude or precision. Multiword units identified by the parser are joined with an underscore into a single "word". Refer to the normalization conventions for further details on how the database was compiled. These links lead to the BNC's lists of multiword units and fused forms. Geoffrey Leech and Nicholas Smith's Manual to accompany The British National Corpus (Version 2) with Improved Word-class Tagging describes the CLAWS parser and explains the PoS codes in greater detail.
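The four conventions above can be sketched as a single function. This is an illustration, not the code used to compile the database. It assumes that portmanteau tags are written as two three-character codes joined by a hyphen (for example AJC-AV0), that a multiword unit arrives as one token with spaces between its parts, and that only numerals written in digits are collapsed to "#". The names `normalize_token`, `PORTMANTEAU` and `NUMERAL` are hypothetical.

```python
import re

# Assumed shape of an ambiguous tag: two three-character codes, e.g. "AJC-AV0".
PORTMANTEAU = re.compile(r"^([A-Z0-9]{3})-[A-Z0-9]{3}$")
# Assumed shape of a digit numeral: digits with optional sign and separators.
NUMERAL = re.compile(r"^[+-]?\d[\d.,/]*$")


def normalize_token(form, tag):
    """Return the normalized (form, tag) pair for one corpus token."""
    # Ambiguous tagging: keep only the first, more likely tag.
    ambiguous = PORTMANTEAU.match(tag)
    if ambiguous:
        tag = ambiguous.group(1)
    # Capital letters are converted to lower case.
    form = form.lower()
    # Numerals collapse to "#" regardless of magnitude or precision.
    if NUMERAL.match(form):
        form = "#"
    # Multiword units are joined with underscores into a single "word".
    form = "_".join(form.split())
    return form, tag
```

For instance, under these assumptions the token pair ("of course", "AV0") becomes ("of_course", "AV0"), and ("London", "NP0") becomes ("london", "NP0"), so the NP0 tag is the only remaining sign that the word is a proper noun.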
No. | PoS Tag | Description |
1 | AJ0 | adjective (general or positive), e.g. good, old |
2 | AJC | comparative adjective, e.g. better, older |
3 | AJS | superlative adjective, e.g. best, oldest |
4 | AV0 | adverb (general, not sub-classified as AVP or AVQ), e.g. often, well, longer, furthest. |
5 | AVP | adverb particle, e.g. up, off, out. |
6 | AVQ | wh-adverb, e.g. when, how, why, whether the word is used interrogatively or to introduce a relative clause. |
7 | CJC | coordinating conjunction, e.g. and, or, but. |
8 | CJS | subordinating conjunction, e.g. although, when. |
9 | CJT | the subordinating conjunction that, when introducing a relative clause, as in the day that follows Christmas. |
10 | CRD | cardinal numeral, e.g. one, 3, fifty-five, 6609. |
11 | ORD | ordinal numeral, e.g. first, sixth, 77th, next, last. |
12 | AT0 | article, e.g. the, a, an, no. |
13 | DPS | possessive determiner form, e.g. your, their, his. |
14 | DT0 | general determiner: a determiner which is not a DTQ, e.g. this, both in This is my house and This house is mine. |
15 | DTQ | wh-determiner, e.g. which, what, whose, whether used interrogatively or to introduce a relative clause. |
16 | NN0 | common noun, neutral for number, e.g. aircraft, data, committee. |
17 | NN1 | singular common noun, e.g. pencil, goose, time, revelation. |
18 | NN2 | plural common noun, e.g. pencils, geese, times, revelations. |
19 | NP0 | proper noun, e.g. London, Michael, Mars, IBM. |
20 | PNI | indefinite pronoun, e.g. none, everything, one (pronoun), nobody. |
21 | PNP | personal pronoun, e.g. I, you, them, ours. Possessive pronouns such as ours and theirs are included in this category. |
22 | PNQ | wh-pronoun, e.g. who, whoever, whom. |
23 | PNX | reflexive pronoun, e.g. myself, yourself, itself, ourselves. |
24 | POS | the possessive or genitive marker 's or ', tagged as a distinct word. |
25 | PRF | the preposition of. |
26 | PRP | preposition, other than of, e.g. about, at, in, on behalf of, with. Prepositional phrases like on behalf of or in spite of are treated as single words. |
27 | VBB | the present tense forms of the verb be, except for is or 's: am, are, 'm, 're, be (subjunctive or imperative), ai (as in ain't). |
28 | VBD | the past tense forms of the verb be: was, were. |
29 | VBG | -ing form of the verb be: being. |
30 | VBI | the infinitive form of the verb be: be. |
31 | VBN | the past participle form of the verb be: been. |
32 | VBZ | the -s form of the verb be: is, 's. |
33 | VDB | the finite base form of the verb do: do. |
34 | VDD | the past tense form of the verb do: did. |
35 | VDG | the -ing form of the verb do: doing. |
36 | VDI | the infinitive form of the verb do: do. |
37 | VDN | the past participle form of the verb do: done. |
38 | VDZ | the -s form of the verb do: does. |
39 | VHB | the finite base form of the verb have: have, 've. |
40 | VHD | the past tense form of the verb have: had, 'd. |
41 | VHG | the -ing form of the verb have: having. |
42 | VHI | the infinitive form of the verb have: have. |
43 | VHN | the past participle form of the verb have: had. |
44 | VHZ | the -s form of the verb have: has, 's. |
45 | VM0 | modal auxiliary verb, e.g. can, could, will, 'll, 'd, wo (as in won't). |
46 | VVB | the finite base form of lexical verbs, e.g. forget, send, live, return. This tag is used for imperatives and the present subjunctive forms, but not for the infinitive (VVI). |
47 | VVD | the past tense form of lexical verbs, e.g. forgot, sent, lived, returned. |
48 | VVG | the -ing form of lexical verbs, e.g. forgetting, sending, living, returning. |
49 | VVI | the infinitive form of lexical verbs, e.g. forget, send, live, return. |
50 | VVN | the past participle form of lexical verbs, e.g. forgotten, sent, lived, returned. |
51 | VVZ | the -s form of lexical verbs, e.g. forgets, sends, lives, returns. |
52 | EX0 | existential there, the word there appearing in the constructions there is..., there are .... |
53 | ITJ | interjection or other isolate, e.g. oh, yes, mhm, wow. |
54 | TO0 | the infinitive marker to. |
55 | UNC | unclassified items which are not appropriately classified as items of the English lexicon. |
56 | XX0 | the negative particle not or n't. |
57 | ZZ0 | alphabetical symbols, e.g. A, a, B, b, c, d. |
58 | -*- | "wildword" matching any PoS tag (non-standard extension for phrase-frame queries and result sets). |
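The "wildword" tag in the last row can be read as a placeholder that accepts any PoS code in a tag sequence. How the database evaluates phrase-frame queries is not described on this page, so the following is only a sketch of that reading; `tags_match` and `WILDCARD` are hypothetical names.

```python
WILDCARD = "-*-"  # the non-standard "wildword" tag from the table


def tags_match(pattern, tags):
    """Check a tag sequence against a pattern that may contain wildcards."""
    # A pattern only matches a sequence of the same length.
    if len(pattern) != len(tags):
        return False
    # Each position must be equal, unless the pattern holds the wildcard there.
    return all(p == WILDCARD or p == t for p, t in zip(pattern, tags))
```

With this reading, the pattern AT0, -*-, NN1 matches the tag sequence AT0, AJ0, NN1 (as in the old house, after normalization) but not AT0, AJ0, NN2.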