Projects
I-CAB
DESCRIPTION:
I-CAB is an annotated corpus consisting of 525 news stories taken from the local newspaper L'Adige, for a total of around 180,000 words.
It is annotated with semantic information at different levels:
- temporal expressions
- entities (i.e. persons, organizations, geo-physical locations, and geo-political entities)
- relations between entities (e.g. the affiliation relation connecting a person to an organization)
I-CAB version 4.1 was used as the data set for training and evaluating participating systems in the following tasks of the Evalita 2007 evaluation campaign:
- Temporal Expression Recognition and Normalization
- Named Entity Recognition
I-CAB version 4.1 is freely available for research purposes: Obtain I-CAB.
I-CAB was also used in the Entity Recognition task at Evalita 2009.
CELCT’S ROLE:
CELCT developed I-CAB in conjunction with FBK-irst, carrying out the manual annotation of the corpus and editing the Italian extension of the English annotation guidelines used in the Automatic Content Extraction Program (ACE), promoted by NIST.
This adaptation to Italian was presented in an invited talk at the ACE 2007 workshop at the University of Maryland (USA).
LINK:
http://ontotext.fbk.eu/icab.html