Context Based Semantic Disambiguation - Workpackage 2
Objectives
A single word (the same string of characters) can have several meanings. The meaning intended in a particular description is selected according to the co-text (the set of surrounding terms) and the context (the domain) in which the word occurs.
For example, in a description such as Nail clippers, the word Nail is disambiguated by its co-occurrence with the word clippers. In another description, say hammers and nails, disambiguation can be based on the fact that hammers and nails share a common domain property (they both refer to tools). Considering the co-text in which a word occurs is, therefore, an effective way of differentiating between its meanings. The main objective of this workpackage is a global solution to the disambiguation problem.
Description of work
The correct meaning of a term can be selected using the semantic properties of the classification to which it refers. Classifications are usually organised in chapters, each chapter corresponding to a specific set of activities or products. Hence, each chapter can be associated with a set of domains, and this can assist with the disambiguation of terms in descriptions. For example, a classification chapter that describes products related to "industrial artefacts" could be associated with such domains as "tools, machines, vehicles…". The word Nail appearing within this context would be correctly disambiguated.
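The chapter-based filtering described above can be illustrated with a minimal sketch. It is not the project's procedure: the sense inventory, the sense labels ("fastener", "fingernail"), the "anatomy" domain and the function name are all hypothetical, chosen only to show how the domains associated with a chapter can narrow the candidate meanings of a word.

```python
# Hypothetical sense inventory: each sense of a word is tagged with one domain.
SENSES = {
    "nail": {"fastener": "tools", "fingernail": "anatomy"},
}

# Hypothetical mapping from classification chapters to their associated domains.
CHAPTER_DOMAINS = {
    "industrial artefacts": {"tools", "machines", "vehicles"},
}


def senses_in_chapter(word, chapter):
    """Return the senses of `word` whose domain is associated with `chapter`."""
    domains = CHAPTER_DOMAINS.get(chapter, set())
    candidates = SENSES.get(word.lower(), {})
    return [sense for sense, domain in candidates.items() if domain in domains]


if __name__ == "__main__":
    print(senses_in_chapter("Nail", "industrial artefacts"))
```

In this toy setting only one sense of Nail survives inside the "industrial artefacts" chapter. A chapter that is not in the mapping yields an empty list, which signals that the context alone does not resolve the ambiguity.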
Once this disambiguation has been performed, descriptions can be labelled to reflect the correct meaning for the purpose of full text retrieval.
This task will be carried out in three steps: first, describing the major difficulties currently encountered by classification systems in this field; second, listing ambiguous entities; and third, inferring how such ambiguities can be eliminated on the basis of co-textual and contextual information.
Deliverables
The deliverable for this part is the specification of a procedure that takes as input a set of ambiguous descriptions and produces as output the corresponding descriptions, annotated with domain information. The domain information will be computed on the basis of co-textual and contextual information.
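To make the intended input and output concrete, here is a rough sketch of such a procedure under strong simplifying assumptions. The lexicon, the "body care" domain, the voting scheme and the function name `annotate` are all hypothetical and are not taken from the specification. Each word votes for the domains it may belong to (co-text), and an optional set of chapter domains restricts which domains are eligible (context).

```python
from collections import Counter

# Hypothetical lexicon: word -> domains the word may belong to.
WORD_DOMAINS = {
    "nail": {"tools", "body care"},
    "nails": {"tools", "body care"},
    "clippers": {"body care"},
    "hammers": {"tools"},
}


def annotate(descriptions, chapter_domains=None):
    """Pair each description with the domain best supported by its words.

    Co-text: every known word votes for each domain it may belong to.
    Context: if chapter_domains is given, only those domains can receive votes.
    A description with no votes or a tied top score is annotated with None.
    """
    annotated = []
    for text in descriptions:
        votes = Counter()
        for token in text.lower().split():
            for domain in WORD_DOMAINS.get(token, ()):
                if chapter_domains is None or domain in chapter_domains:
                    votes[domain] += 1
        ranked = votes.most_common(2)
        if not ranked or (len(ranked) == 2 and ranked[0][1] == ranked[1][1]):
            domain = None  # still ambiguous
        else:
            domain = ranked[0][0]
        annotated.append((text, domain))
    return annotated


if __name__ == "__main__":
    for text, domain in annotate(["Nail clippers", "hammers and nails"]):
        print(text, "->", domain)
```

A word on its own, such as nail, stays ambiguous (annotated with None) unless a chapter context is supplied, for example `annotate(["nail"], chapter_domains={"tools"})`.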
Milestones and expected result
Lists of ambiguous entities. Inference work to determine how such ambiguities can be eliminated, using co-textual and contextual information. Specification of disambiguation procedure.
- Report describing disambiguation in local and global contexts - Due August 2001
This page last revised: Wednesday, 28 June 2000