Bearing in mind both the important role that semantic information plays in natural language processing today and the work involved in creating lexical resources from scratch, this research aims at the semi-automatic creation of a lexical ontology for Portuguese.

While, for English, WordNet [1] has established itself as the standard model of a lexical ontology, the few similar resources that exist for Portuguese, all created manually, are either at earlier stages of development or not publicly available for download and full use. Therefore, as an alternative to the manual creation and maintenance of such resources, the proposed work is concerned with developing computational tools capable of extracting lexico-semantic knowledge from Portuguese textual resources. The knowledge acquired will then be structured into a public domain lexical ontology.

The extraction procedures will be based on the detection of textual patterns that indicate lexico-semantic relations between lexical items. Machine-readable dictionaries (MRDs) will be used as the primary source of knowledge, since they are already structured around words and their meanings, typically use simple vocabulary, were created by experts, and are the main source of general knowledge. The PAPEL project [2, 3] took the first steps towards the automatic extraction of semantic information from a general Portuguese MRD, using handcrafted semantic grammars. The results and conclusions obtained in PAPEL will therefore be used as a starting point. However, this research is also concerned with exploring other available Portuguese MRDs.
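As a rough illustration of pattern-based extraction from dictionary definitions, the following minimal Python sketch matches a few surface patterns at the start of a definition and emits relation triples. The patterns, the relation names (HYPERNYM_OF, HAS_PART, SYNONYM_OF) and the function extract_relations are hypothetical and chosen for illustration only; they are not the semantic grammars used in PAPEL, and a realistic system would need a far richer grammar that handles, among other things, determiners and multi-word terms.

```python
import re

# Hypothetical surface patterns, for illustration only (not PAPEL's grammars).
# Each pattern is anchored at the start of the definition and captures the
# single word that follows the cue phrase.
PATTERNS = [
    ("HYPERNYM_OF", re.compile(r"^(?:tipo|espécie|género) de (\w+)")),
    ("HAS_PART", re.compile(r"^parte d[oae]s? (\w+)")),
    ("SYNONYM_OF", re.compile(r"^o mesmo que (\w+)")),
]


def extract_relations(headword, definition):
    """Return (arg1, relation, arg2) triples found in one dictionary definition."""
    text = definition.strip().lower()
    triples = []
    for relation, pattern in PATTERNS:
        match = pattern.search(text)
        if match is None:
            continue
        other = match.group(1)
        if relation == "SYNONYM_OF":
            # "X: o mesmo que Y" is read as X SYNONYM_OF Y.
            triples.append((headword, relation, other))
        else:
            # "X: tipo de Y" is read as Y HYPERNYM_OF X;
            # "X: parte de Y" is read as Y HAS_PART X.
            triples.append((other, relation, headword))
    return triples


if __name__ == "__main__":
    # Toy entries invented for this example, not taken from a real dictionary.
    entries = {
        "gato": "Espécie de felino doméstico",
        "roda": "Parte do automóvel que gira",
        "automóvel": "O mesmo que carro",
    }
    for word, definition in entries.items():
        print(word, extract_relations(word, definition))
```

Anchoring the patterns at the start of the definition reflects the assumption that dictionary definitions tend to open with the most informative cue phrase; this is a simplification made for the sketch.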

Moreover, this work will not be limited to processing dictionaries: textual corpora will be used as a second source of knowledge, in order to enrich the ontology in several more specific domains. Furthermore, the quality and utility of the resources developed will be assessed. Besides manual evaluation, and given the time the latter requires, automatic evaluation methodologies will be devised. By the end of this research, important contributions to Portuguese NLP are expected, such as a new public domain lexical resource and computational tools capable of learning lexico-semantic information from text.

