

Context Based Semantic Disambiguation - Workpackage 2

Objectives

A single word (the same string of characters) can have several different meanings. The meaning intended in a particular description is selected by the co-text (the set of surrounding terms) and the context (the domain) in which the word occurs.

For example, in a description such as Nail clippers, the word Nail is disambiguated by its co-occurrence with the word clippers. In another description, say hammers and nails, disambiguation can be based on the fact that hammers and nails share a common domain property (they both refer to tools). Considering the co-text in which a word occurs is, therefore, an effective way of differentiating between meanings. The main objective of this workpackage is a global solution to the disambiguation problem.
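The co-text idea can be illustrated with a minimal sketch. It is not the project's procedure: the sense inventory `SENSES`, its cue words, and the function names are all hypothetical, and the scoring (counting cue words that appear next to the ambiguous word) is only one simple way to use co-text.

```python
import re

# Hypothetical sense inventory (assumption): each sense of an ambiguous word
# lists cue words that are expected to appear in its co-text.
SENSES = {
    "nail": {
        "body part": {"clippers", "polish", "varnish", "file"},
        "fastener": {"hammers", "hammer", "screws", "tacks"},
    }
}


def tokenize(description):
    """Lower-case a description and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", description.lower())


def disambiguate_by_cotext(word, description, senses=SENSES):
    """Return the sense whose cue words best match the co-text, or None."""
    tokens = tokenize(description)
    # The co-text is every token except the ambiguous word (or its plural).
    cotext = {t for t in tokens if t not in (word, word + "s")}
    scores = {sense: len(cues & cotext) for sense, cues in senses[word].items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best, top), rest = ranked[0], ranked[1:]
    # No cue matched, or two senses tie: the co-text does not decide.
    if top == 0 or (rest and rest[0][1] == top):
        return None
    return best


print(disambiguate_by_cotext("nail", "Nail clippers"))
print(disambiguate_by_cotext("nail", "hammers and nails"))
```

When no cue word is present, or when two senses score equally, the function returns `None` rather than guessing, which leaves room for contextual (domain) information to decide.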

Description of work

The correct meaning of a term can be selected using the semantic properties of the classification to which it refers. Classifications are usually organised in chapters, each chapter corresponding to a specific set of activities or products. Hence, each chapter can be associated with a set of domains, and these domains can assist with the disambiguation of terms in descriptions. For example, a classification chapter which describes products related to "industrial artefacts" could be associated with such domains as "tools, machines, vehicles…". The word Nail appearing within this context would therefore be disambiguated correctly.
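As a rough illustration of chapter-based disambiguation, the sketch below associates chapters with domains and senses with a single domain each. The "industrial artefacts" chapter and its three domains come from the example above; the "personal care" chapter, the sense labels, and all identifiers are assumptions made for the sake of the example.

```python
# Mapping from classification chapters to domains. Only the first entry
# reflects the example in the text; the second is hypothetical.
CHAPTER_DOMAINS = {
    "industrial artefacts": {"tools", "machines", "vehicles"},
    "personal care": {"cosmetics", "hygiene"},
}

# Hypothetical sense inventory (assumption): each sense names its domain.
SENSE_DOMAINS = {
    "nail": {"fastener": "tools", "body part": "hygiene"},
}


def disambiguate_by_chapter(word, chapter):
    """Return the single sense of `word` whose domain belongs to `chapter`.

    Returns None when the chapter is unknown, or when zero or several
    senses are compatible with the chapter's domains.
    """
    domains = CHAPTER_DOMAINS.get(chapter, set())
    matches = [
        sense
        for sense, domain in SENSE_DOMAINS.get(word, {}).items()
        if domain in domains
    ]
    return matches[0] if len(matches) == 1 else None


print(disambiguate_by_chapter("nail", "industrial artefacts"))
```

Here the chapter alone is enough to decide, because only one sense of the word falls within the chapter's domains.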

Once this disambiguation has been performed, descriptions can be labelled to reflect the correct meaning for the purpose of full text retrieval.

This task will be achieved in two steps: first, by describing the major difficulties that classification systems currently encounter in this field; second, by listing ambiguous entities and inferring how such ambiguities can be eliminated on the basis of co-textual and contextual information.

Deliverables

The deliverable for this part is the specification of a procedure which takes as input a set of ambiguous descriptions and returns the corresponding descriptions annotated with domain information. The domain information will be computed on the basis of co-textual and contextual information.
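The shape of such a procedure might look like the following sketch. It is an assumption-laden illustration, not the specification itself: the `LEXICON`, the cue words, the scoring rule (one point per co-text cue, plus one point when the sense's domain belongs to the chapter), and the output format are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    domain: str       # domain label attached to this sense
    cues: frozenset   # co-text words that suggest this sense


# Hypothetical lexicon and chapter-to-domain mapping (assumptions).
LEXICON = {
    "nail": [
        Sense("tools", frozenset({"hammers", "hammer", "screws"})),
        Sense("hygiene", frozenset({"clippers", "polish", "file"})),
    ],
}
CHAPTER_DOMAINS = {"industrial artefacts": {"tools", "machines", "vehicles"}}


def annotate(descriptions):
    """Annotate (text, chapter) pairs with a domain per ambiguous word."""
    results = []
    for text, chapter in descriptions:
        tokens = re.findall(r"[a-z]+", text.lower())
        chapter_domains = CHAPTER_DOMAINS.get(chapter, set())
        domains = {}
        for token in tokens:
            # Crude plural handling (assumption): try the token without "s".
            word = token if token in LEXICON else token[:-1] if token.endswith("s") else token
            if word not in LEXICON:
                continue
            cotext = set(tokens) - {token}
            # Co-textual evidence plus one point of contextual evidence.
            scores = [
                (len(s.cues & cotext) + (s.domain in chapter_domains), s.domain)
                for s in LEXICON[word]
            ]
            top = max(score for score, _ in scores)
            winners = [d for score, d in scores if score == top]
            # Leave the word unresolved if nothing matched or senses tie.
            domains[word] = winners[0] if top > 0 and len(winners) == 1 else None
        results.append({"text": text, "domains": domains})
    return results


for row in annotate([("Nail clippers", None),
                     ("hammers and nails", "industrial artefacts")]):
    print(row)
```

Each output record keeps the original description and adds a mapping from each ambiguous word to the domain chosen for it, or `None` when neither the co-text nor the chapter decides.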

Milestones and expected result

Lists of ambiguous entities. Inference work to determine how such ambiguities can be eliminated, using co-textual and contextual information. Specification of disambiguation procedure.

  • Report describing disambiguation in local and global contexts - Due August 2001


This page last revised: Wednesday, 28 June 2000
