
Department of Computing


COMP348 Document Processing and the Semantic Web

Assignment 1, Part 3: FAQ

  • Again, does this mean the hardcopy is due some time on Monday morning or somesuch?

    Monday by noon for the hardcopy is fine.

  • I was wondering if you could clarify what the files we are meant to submit actually do, and how they are meant to interact with each other. I found the assignment specs rather confusing in this regard. To me, it seems as if the first two files are meant to do the same thing, but with different output. What are we meant to do with features.dat? It seems a bit excessive... etc.

    The three scripts process.py, learn.py, and classify.py all have distinct purposes. I'll use as an example the Dutch-English classification problem from the week 7 practical. There, the training and test data are in single files: train.txt and test.txt.

    • process.py, when run on train.txt, should extract the appropriate features (the 10 most common letter triples), turn them into SVM format, and save them in a file (say train.dat); similarly, when run on test.txt, it should produce test.dat. (These output files train.dat and test.dat are two specific instances, when run on different inputs, of what I generically called features.dat in the assignment specs.)

    • learn.py, which calls process.py to produce train.dat, runs SVM-light's svm_learn using this train.dat to produce the model file model.dat.

    • classify.py, which calls process.py to produce test.dat, runs SVM-light's svm_classify using this test.dat and the model file model.dat. It returns two files: predictions.dat, the predictions file produced by svm_classify, and results.dat, which contains the output of svm_classify.
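
    As a rough illustration of how the three steps above could fit together, here is a minimal sketch in a single Python file. It is not the required solution. All function names (letter_triples, top_triples, to_svm_line, learn, classify) are hypothetical, and it assumes that "letter triples" means overlapping three-letter substrings, that labels are +1 and -1, and that "the output of svm_classify" means what the program prints to standard output.

```python
import subprocess
from collections import Counter


def letter_triples(text):
    """Return every overlapping 3-character substring made only of letters."""
    text = text.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)
            if text[i:i + 3].isalpha()]


def top_triples(texts, n=10):
    """Pick the n most common letter triples across a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(letter_triples(text))
    return [triple for triple, _ in counts.most_common(n)]


def to_svm_line(label, text, features):
    """Format one example as an SVM-light line: '<label> <index>:<value> ...'.

    Feature numbers start at 1 and appear in increasing order;
    features with a zero count are simply left out.
    """
    counts = Counter(letter_triples(text))
    pairs = [f"{index}:{counts[triple]}"
             for index, triple in enumerate(features, start=1)
             if counts[triple] > 0]
    return " ".join([f"{label:+d}"] + pairs)


def learn(train_dat="train.dat", model="model.dat"):
    """Run svm_learn (assumed to be in the current directory) on train.dat."""
    subprocess.run(["./svm_learn", train_dat, model], check=True)


def classify(test_dat="test.dat", model="model.dat",
             predictions="predictions.dat", results="results.dat"):
    """Run svm_classify and save what it prints (an assumption) in results.dat."""
    done = subprocess.run(["./svm_classify", test_dat, model, predictions],
                          check=True, capture_output=True, text=True)
    with open(results, "w") as out:
        out.write(done.stdout)
```

    For example, to_svm_line(-1, "hetthe", ["the", "het", "xyz"]) gives the line "-1 1:1 2:1": the triples "the" and "het" each occur once, and "xyz" is omitted because its count is zero.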

  • Where are svm_learn and svm_classify meant to reside, when we call them? Are we meant to include them in our submission, or do you provide them somewhere?

    Assume they'll be in the same directory as your Python scripts. I'll provide them.
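
    One way to find the executables regardless of where the script is started from is to build their paths relative to the script file. This is only a sketch; the helper name tool_path is hypothetical.

```python
from pathlib import Path


def tool_path(name, script_file):
    """Return the path of an executable that sits next to the given script."""
    return Path(script_file).resolve().parent / name


# Inside learn.py one might write:
#   svm_learn = tool_path("svm_learn", __file__)
```

    The resulting path can then be passed as the first element of the command list given to subprocess.run.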


Comments to: Mark Dras or Diego Molla

Copyright Macquarie University