COMP348 Document Processing and the Semantic Web
Tutorial Week 8
Part of Speech Tagging
- Suppose that the following probabilities have been estimated from a corpus, where {} marks the start of a sentence:
| P(time|V) = 0.2 | P(flies|N) = 0.3 | P(D|P) = 0.6 |
| P(flies|V) = 0.4 | P(arrow|N) = 0.3 | P(N|P) = 0.4 |
| P(like|V) = 0.4 | P(V|{}) = 0.2 | P(V|N) = 0.7 |
| P(an|D) = 0.5 | P(N|{}) = 0.8 | P(P|N) = 0.2 |
| P(the|D) = 0.5 | P(N|V) = 0.3 | P(N|N) = 0.1 |
| P(like|P) = 1 | P(D|V) = 0.4 | P(N|D) = 1 |
| P(time|N) = 0.4 | P(P|V) = 0.3 |
- With these probabilities, draw the associated Hidden Markov Model.
- Compute the probability of the following PoS tagging:
time/N flies/V like/P an/D arrow/N
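One way to check a hand calculation is to encode the table as dictionaries and multiply one transition and one emission probability per word. The sketch below assumes a first-order HMM in which {} is the start state; the names `transition`, `emission` and `tagging_probability` are illustrative and do not come from the course materials.

```python
from math import isclose

START = "{}"  # the sheet's symbol for the start of a sentence (assumed)

# Transition probabilities P(tag | previous tag), copied from the table.
transition = {
    START: {"V": 0.2, "N": 0.8},
    "N": {"V": 0.7, "P": 0.2, "N": 0.1},
    "V": {"N": 0.3, "D": 0.4, "P": 0.3},
    "P": {"D": 0.6, "N": 0.4},
    "D": {"N": 1.0},
}

# Emission probabilities P(word | tag), copied from the table.
emission = {
    "V": {"time": 0.2, "flies": 0.4, "like": 0.4},
    "N": {"flies": 0.3, "arrow": 0.3, "time": 0.4},
    "D": {"an": 0.5, "the": 0.5},
    "P": {"like": 1.0},
}

# Sanity check: every distribution in the table should sum to 1.
for dist in list(transition.values()) + list(emission.values()):
    assert isclose(sum(dist.values()), 1.0)


def tagging_probability(tagged):
    """Joint probability of a (word, tag) sequence under the HMM above."""
    prob = 1.0
    prev = START
    for word, tag in tagged:
        # Probabilities missing from the table are treated as zero.
        prob *= transition[prev].get(tag, 0.0) * emission[tag].get(word, 0.0)
        prev = tag
    return prob


tagging = [("time", "N"), ("flies", "V"), ("like", "P"), ("an", "D"), ("arrow", "N")]
p = tagging_probability(tagging)
print(p)
```

The keys of `transition` are the states of the HMM and each inner entry is one labelled arc, so the same dictionaries can also be used to check a drawing of the model.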
- You have the following two sentences. Consider the first to be
from a tagged corpus and the second to be the output of the
initialisation step of a tagger.
Fruit/NN flies/NNS like/VBP the/DT oranges/NNS like/IN giraffes/NNS like/VBP the/DT leaves/NNS.
Fruit/NN flies/VBZ like/IN the/DT oranges/NNS like/VBP giraffes/NNS like/IN the/DT leaves/NNS.
Trace through the operation of the Brill tagger if the following rules
were available:
- VBP IN PREVTAG NNS (interpretation: Change VBP to IN when previous tag is NNS)
- NN VP PREVTAG TO
- VBZ NNS NEXTTAG VB
- VBZ NNS NEXTTAG VBP
- VBZ NNS PREV1OR2TAG IN
- VBZ NNS NEXTWD of
- IN VBP NEXTTAG DT
- IN VBP PREVTAG NNS
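A hand trace can be compared against a small simulation of the rule-application phase. The sketch below makes several assumptions that the sheet does not state: the rules are applied once each, in the order listed; within one rule, the conditions are checked against the tags as they stood before that rule started; and the final full stop is ignored. The function names `condition_holds` and `apply_rules` are illustrative only.

```python
words = "Fruit flies like the oranges like giraffes like the leaves".split()
gold = "NN NNS VBP DT NNS IN NNS VBP DT NNS".split()     # first sentence
initial = "NN VBZ IN DT NNS VBP NNS IN DT NNS".split()   # second sentence

# Rules as printed on the sheet: FROM TO TEMPLATE VALUE.
rules = [
    "VBP IN PREVTAG NNS",
    "NN VP PREVTAG TO",
    "VBZ NNS NEXTTAG VB",
    "VBZ NNS NEXTTAG VBP",
    "VBZ NNS PREV1OR2TAG IN",
    "VBZ NNS NEXTWD of",
    "IN VBP NEXTTAG DT",
    "IN VBP PREVTAG NNS",
]


def condition_holds(kind, value, words, tags, i):
    """Check one rule template at position i."""
    if kind == "PREVTAG":
        return i >= 1 and tags[i - 1] == value
    if kind == "NEXTTAG":
        return i + 1 < len(tags) and tags[i + 1] == value
    if kind == "PREV1OR2TAG":
        return value in tags[max(0, i - 2):i]
    if kind == "NEXTWD":
        return i + 1 < len(words) and words[i + 1] == value
    raise ValueError(f"unknown template: {kind}")


def apply_rules(words, tags, rules):
    """Apply each rule once, in order, and record every change made."""
    tags = list(tags)
    trace = []
    for rule in rules:
        old, new, kind, value = rule.split()
        snapshot = list(tags)  # conditions see the tags from before this rule
        for i, tag in enumerate(snapshot):
            if tag == old and condition_holds(kind, value, words, snapshot, i):
                tags[i] = new
                trace.append((rule, i, words[i], old, new))
    return tags, trace


def errors(tags):
    """Number of positions where the tags disagree with the tagged corpus."""
    return sum(t != g for t, g in zip(tags, gold))


final_tags, trace = apply_rules(words, initial, rules)
for rule, i, word, old, new in trace:
    print(f"{rule}: position {i} {word}/{old} becomes {word}/{new}")
print(" ".join(f"{w}/{t}" for w, t in zip(words, final_tags)))
print("errors before:", errors(initial), "errors after:", errors(final_tags))
```

Comparing the error counts before and after shows whether this particular rule list moves the initial tagging closer to the tagged corpus.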
Parsing
In the following exercises consider this grammar:
S -> NP VP
NP -> Pron | Det N | Det N PP
VP -> V NP | V NP PP
PP -> P NP
Pron -> "I"
V -> "read"
P -> "in"
Det -> "the"
N -> "book" | "park"
- Write the trace of a top-down, left-right, depth-first parser for
the sentence "I read the book in the park".
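To check a hand-written trace, the search can be simulated with a small backtracking recogniser. The sketch below assumes that the parser keeps a list of goals, always expands the leftmost goal, and tries the alternatives of each rule in the order they appear in the grammar. It only recognises the sentence and records the states visited; it does not build a tree. The names `GRAMMAR`, `parse` and `recognise` are illustrative.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Pron"], ["Det", "N"], ["Det", "N", "PP"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "PP": [["P", "NP"]],
    "Pron": [["I"]],
    "V": [["read"]],
    "P": [["in"]],
    "Det": [["the"]],
    "N": [["book"], ["park"]],
}


def parse(goals, words, pos, trace):
    """Depth-first, left-to-right search; returns True on the first success."""
    trace.append((list(goals), pos))
    if not goals:
        # Success only if every word has been consumed.
        return pos == len(words)
    first, rest = goals[0], goals[1:]
    if first in GRAMMAR:
        # Non-terminal: try each alternative in order, backtracking on failure.
        return any(parse(rhs + rest, words, pos, trace) for rhs in GRAMMAR[first])
    # Terminal: it must match the next word of the input.
    return pos < len(words) and words[pos] == first and parse(rest, words, pos + 1, trace)


def recognise(sentence):
    words = sentence.split()
    trace = []
    ok = parse(["S"], words, 0, trace)
    return ok, trace


ok, trace = recognise("I read the book in the park")
for goals, pos in trace:
    print(pos, " ".join(goals) if goals else "(no goals left)")
print("accepted:", ok)
```

Each printed line is one state of the search: the number of words consumed so far, followed by the remaining goals. A line that is followed by a state with fewer words consumed marks a point where the parser backtracked.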
- Write the trace of a shift-reduce parser that performs reduction
whenever it is possible for the same sentence "I read the book in the
park".
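The "reduce whenever possible" strategy can be sketched as a loop that tries every rule against the top of the stack before it shifts another word. The sketch below assumes that, when more than one rule could apply, the first match in the listed order wins, and that the parser never backtracks. The names `RULES` and `shift_reduce` are illustrative.

```python
RULES = [
    ("S", ["NP", "VP"]),
    ("NP", ["Pron"]),
    ("NP", ["Det", "N"]),
    ("NP", ["Det", "N", "PP"]),
    ("VP", ["V", "NP"]),
    ("VP", ["V", "NP", "PP"]),
    ("PP", ["P", "NP"]),
    ("Pron", ["I"]),
    ("V", ["read"]),
    ("P", ["in"]),
    ("Det", ["the"]),
    ("N", ["book"]),
    ("N", ["park"]),
]


def shift_reduce(sentence):
    """Greedy shift-reduce: reduce if any rule matches the stack top, else shift."""
    stack, queue, trace = [], sentence.split(), []
    while True:
        for lhs, rhs in RULES:
            if stack[-len(rhs):] == rhs:
                del stack[-len(rhs):]
                stack.append(lhs)
                trace.append((f"reduce {lhs} -> {' '.join(rhs)}", list(stack)))
                break
        else:
            # No reduction applies: shift the next word, or stop if none is left.
            if not queue:
                break
            stack.append(queue.pop(0))
            trace.append(("shift", list(stack)))
    # The parse succeeds only if the stack holds exactly the start symbol.
    return stack == ["S"], stack, trace


accepted, final_stack, trace = shift_reduce("I read the book in the park")
for action, stack in trace:
    print(f"{action:28} {' '.join(stack)}")
print("accepted:", accepted, "final stack:", final_stack)
```

Running the same function on the shorter sentence "I read the book" and comparing the two final stacks shows the effect of always preferring a reduction over a shift.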
- Populate the chart table of a chart parser that uses the bottom-up
strategy for the same sentence "I read the book in the park".
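A completed chart can be checked against a simplified bottom-up procedure that records, for every span of the sentence, which categories can cover it. This is a sketch under a strong simplification: it stores only completed constituents (passive edges), whereas a chart parser as usually presented also keeps partially matched rules (active edges). Positions are counted between words, so the span (0, 1) covers "I". The names `LEXICON`, `RULES`, `covers` and `build_chart` are illustrative.

```python
LEXICON = {
    "I": {"Pron"}, "read": {"V"}, "in": {"P"},
    "the": {"Det"}, "book": {"N"}, "park": {"N"},
}

RULES = [
    ("S", ["NP", "VP"]),
    ("NP", ["Pron"]),
    ("NP", ["Det", "N"]),
    ("NP", ["Det", "N", "PP"]),
    ("VP", ["V", "NP"]),
    ("VP", ["V", "NP", "PP"]),
    ("PP", ["P", "NP"]),
]


def covers(rhs, i, j, chart):
    """True if the symbols of rhs can be laid end to end over the span (i, j)."""
    if len(rhs) == 1:
        return rhs[0] in chart.get((i, j), set())
    return any(
        rhs[0] in chart.get((i, k), set()) and covers(rhs[1:], k, j, chart)
        for k in range(i + 1, j)
    )


def build_chart(sentence):
    words = sentence.split()
    n = len(words)
    chart = {}
    # Shorter spans are filled first, so longer spans can build on them.
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            cell = chart.setdefault((i, j), set())
            if length == 1:
                cell.update(LEXICON.get(words[i], set()))
            changed = True
            while changed:  # repeat so that unary rules such as NP -> Pron apply
                changed = False
                for lhs, rhs in RULES:
                    if lhs not in cell and covers(rhs, i, j, chart):
                        cell.add(lhs)
                        changed = True
    return chart


chart = build_chart("I read the book in the park")
for (i, j), cats in sorted(chart.items()):
    if cats:
        print((i, j), sorted(cats))
```

The sentence is accepted if S appears in the cell that spans the whole input, here (0, 7).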
Mark Dras or