COMP348 Document Processing and the Semantic Web
Assignment 1, Part 3: FAQ
-
Again, does this mean the hardcopy is due some time on Monday
morning or somesuch?
Monday by noon for the hardcopy is fine.
-
I was wondering if you could clarify what the files we are meant to
submit actually do, and how they are meant to interact with each
other. I found the assignment specs rather confusing in regards to
this. To me, it seems as if the first 2 files are meant to do the same
thing, but with different output? What are we meant to do with
features.dat? It seems a bit excessive...etc.
The three scripts process.py, learn.py and classify.py each have a distinct purpose.
I'll use as an example the Dutch-English classification problem from the
week 7 practical.
There, the training and test data are in single files: train.txt
and test.txt.
-
process.py, when run on train.txt, should extract the appropriate
features (the 10 most common letter triples), turn them into
SVM-light's input format, and save them in a file (say train.dat);
similarly, when run on test.txt, it should produce test.dat. (The
output files train.dat and test.dat are two specific instances,
produced from different inputs, of what I generically called
features.dat in the assignment specs.)
-
learn.py, which calls process.py to produce train.dat,
runs SVM-light's svm_learn using this train.dat
to produce the model file model.dat.
-
classify.py, which calls process.py to produce test.dat,
runs SVM-light's svm_classify using this test.dat
and the model file model.dat. It produces two files: predictions.dat,
the predictions file written by svm_classify, and results.dat,
which holds the output printed by svm_classify.
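The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the required solution: the helper names (letter_triples, top_triples, svm_line, learn_command, classify_command, run_and_save) are hypothetical, triples are counted within words, labels are assumed to be +1 or -1, and the binaries are assumed to be reachable as ./svm_learn and ./svm_classify. The layout of train.txt and test.txt is not shown here, so the sketch works on strings already read into memory.

```python
import subprocess
from collections import Counter


def letter_triples(text):
    """Overlapping three-letter sequences inside each word (an assumption)."""
    triples = []
    for word in text.lower().split():
        letters = "".join(c for c in word if c.isalpha())
        triples.extend(letters[i:i + 3] for i in range(len(letters) - 2))
    return triples


def top_triples(texts, n=10):
    """The n most common triples over all the given texts."""
    counts = Counter()
    for text in texts:
        counts.update(letter_triples(text))
    return [triple for triple, _ in counts.most_common(n)]


def svm_line(label, text, vocabulary):
    """One SVM-light example line: '<label> <index>:<value> ...'.

    Feature indices start at 1 and appear in increasing order;
    features with a zero count are left out.
    """
    counts = Counter(letter_triples(text))
    parts = [f"{label:+d}"]
    for index, triple in enumerate(vocabulary, start=1):
        if counts[triple] > 0:
            parts.append(f"{index}:{counts[triple]}")
    return " ".join(parts)


def learn_command(train_dat="train.dat", model="model.dat"):
    """Argument list for: svm_learn example_file model_file."""
    return ["./svm_learn", train_dat, model]


def classify_command(test_dat="test.dat", model="model.dat",
                     predictions="predictions.dat"):
    """Argument list for: svm_classify example_file model_file output_file."""
    return ["./svm_classify", test_dat, model, predictions]


def run_and_save(command, results_path):
    """Run a command and save whatever it prints (e.g. to results.dat)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    with open(results_path, "w") as handle:
        handle.write(completed.stdout)
```

In this sketch, learn.py would amount to running learn_command() once train.dat exists, and classify.py would call run_and_save(classify_command(), "results.dat") once test.dat exists. Note that the same vocabulary has to be used for both files, so that a given feature index means the same triple in train.dat and test.dat.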
-
Where are svm_learn and svm_classify meant to reside, when we call
them? Are we meant to include them in our submission, or do you
provide them somewhere?
Assume they'll be in the same directory as your Python scripts. I'll provide them.
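Since the binaries sit next to the scripts, one way to avoid depending on the current working directory is to build their paths from the script's own location. A minimal sketch, where binary_path is a hypothetical helper name:

```python
import os


def binary_path(name, script_file):
    """Path to a binary located in the same directory as script_file."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, name)


# Inside learn.py one would typically call:
#     binary_path("svm_learn", __file__)
```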
Mark Dras or