
Department of Computing


COMP348 Document Processing and the Semantic Web

Practical, Week 7

The focus this week is on making sure you understand how to use SVM-light before the break. After the break you will have only one more prac with Aung, where you can ask him questions, before Part 3 of Assignment 1 is due.

For the rest of the prac, you can work on completing Part 2 of Assignment 1.

More on Using SVM-Light

This follows on from last week's practical, combining its two parts into a single program.

You are working with the same Dutch and English data, broken into train.txt (training) and test.txt (test) sets.

You should construct a Python program that:

  1. reads in the training data;

  2. determines the features, i.e. the 10 most common letter triples from each language;

  3. calculates the feature counts for each sentence in the training corpus;

  4. translates these feature counts into SVM-light format;

  5. runs svm_learn;

  6. reads in the test data;

  7. calculates the feature counts for each sentence in the test corpus;

  8. translates these feature counts into SVM-light format;

  9. runs svm_classify, saving its output to a file; and

  10. returns the accuracy rate of the classification.
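
The steps above could be sketched roughly as follows. This is only a minimal outline under several assumptions, not a model solution: a "letter triple" is taken to be any three consecutive characters of the lower-cased sentence (spaces included), the two languages are encoded as the targets "+1" and "-1", svm_learn and svm_classify are assumed to be on your PATH, and all function and file names are made up for illustration. Reading train.txt and test.txt into (target, sentence) pairs is left out, since it depends on the data format from last week's practical.

```python
import subprocess
from collections import Counter

def trigrams(sentence):
    """Return every overlapping three-character substring (assumed feature unit)."""
    text = sentence.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]

def build_feature_index(labelled, n=10):
    """Number the n most common trigrams of each language from 1 upwards.

    `labelled` is a list of (target, sentence) pairs, target "+1" or "-1".
    """
    per_language = {}
    for target, sentence in labelled:
        per_language.setdefault(target, Counter()).update(trigrams(sentence))
    index = {}
    for target in sorted(per_language):
        for trigram, _ in per_language[target].most_common(n):
            # A trigram common to both languages keeps its first number.
            index.setdefault(trigram, len(index) + 1)
    return index

def svm_light_line(target, sentence, index):
    """Format one sentence as '<target> <feature>:<count> ...' in ascending feature order."""
    counts = Counter(index[t] for t in trigrams(sentence) if t in index)
    return " ".join([target] + [f"{k}:{v}" for k, v in sorted(counts.items())])

def write_examples(labelled, index, path):
    """Write one SVM-light example line per sentence."""
    with open(path, "w") as out:
        for target, sentence in labelled:
            out.write(svm_light_line(target, sentence, index) + "\n")

def run_svm_light(train_path, test_path, model_path="model", output_path="predictions"):
    """Train, then classify; assumes both SVM-light programs are on the PATH."""
    subprocess.run(["svm_learn", train_path, model_path], check=True)
    subprocess.run(["svm_classify", test_path, model_path, output_path], check=True)

def accuracy(example_lines, prediction_lines):
    """Fraction of examples whose predicted sign matches the sign of the target."""
    gold = [float(line.split()[0]) for line in example_lines if line.strip()]
    scores = [float(line) for line in prediction_lines if line.strip()]
    correct = sum((g > 0) == (s > 0) for g, s in zip(gold, scores))
    return correct / len(gold)
```

With the training and test sentences loaded, one way to use this would be to call build_feature_index on the training pairs only, call write_examples once for the training pairs and once for the test pairs (reusing the same index), call run_svm_light on the two resulting files, and finally pass the opened test example file and the opened svm_classify output file to accuracy. The features are chosen from the training data alone so that the test set does not influence the model.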

 


Comments to: Mark Dras or Diego Molla

Copyright Macquarie University
CRICOS provider no. 00002J