COMP348
COMP348 explores the issues involved in building natural language processing (NLP) applications that operate on large bodies of real text, such as those found on the World Wide Web (WWW).
Because the Web is full of unstructured, largely text-based data, the applications that handle this data have their own particular characteristics. In this unit we discuss some core applications for dealing with data on the Web, such as spam filtering and search engines. The unit also explores some developments of the Web, such as emerging Semantic Web technologies, which support the exchange of XML metadata on the Web, and Web 2.0 technologies (e.g., social networking, folksonomies, wikis and blogs). Application areas covered include information retrieval, web search, document summarisation, machine translation and information extraction.
The unit focuses on the concepts and techniques required to process real natural language text. Students gain practical experience in using the Python programming language to develop language processing systems.
What's New
- [29/5/2008]: Assignment 1 Part 3 to be handed back tonight. I've posted some general feedback already.
- [15/5/2008]: Assignment 1 Part 2 to be handed back tonight. I've posted some general feedback already.
- [14/5/2008]: Assignment 2 problem posted.
- [7/4/2008]: Assignment 1 Part 3 problem posted.
- [3/4/2008]: Assignment 1 Part 1 to be handed back tonight. I've posted some general feedback already.
- [27/3/2008]: Some small fixes to Assignment 1 Part 2; see version history.
- [20/3/2008]: Assignment 1 Part 2 problem posted.
- [6/3/2008]: Assignment 1 Part 1 problem posted; you'll be getting your individual data to do the assignment on Monday.