ACEtoWiki is the result of a joint effort between FBK and CELCT. The resource was created by adding a manual annotation layer that connects the English ACE 2005 corpus to Wikipedia.
ACEtoWiki has been produced by manually annotating the non-pronominal mentions, i.e. the named (NAM) and nominal (NOM) mentions, contained in the English ACE 2005 corpus with links to the appropriate Wikipedia articles.
Each mention of type NAM is annotated with a link to the Wikipedia page describing the entity it refers to. For instance, “George Bush” is annotated with a link to the Wikipedia page George_W._Bush.
Each NOM mention is annotated with a link to the Wikipedia page that describes the appropriate sense of the mention. Note that the object of linking is the textual description of an entity, and not the entity itself.
Moreover, mentions of type NOM can often be linked to more than one Wikipedia page. In such cases, links are sorted in order of relevance, where the first link corresponds to the most specific sense of the term in its context. For instance, for the NOM mention “President”, which in its context identifies the United States President George Bush, the following links are selected as appropriate: President_of_the_United_States and President.
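The linking scheme described above can be pictured as a mention carrying an ordered list of Wikipedia page titles. The following is only a minimal sketch in Python: the class name LinkedMention, its fields, and the URL prefix are hypothetical and chosen for illustration; they do not reflect the actual file format of the ACEtoWiki distribution.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: page titles are resolved against the English Wikipedia.
WIKI_PREFIX = "https://en.wikipedia.org/wiki/"


@dataclass
class LinkedMention:
    """Hypothetical container for one annotated mention (not the real format)."""
    text: str                 # surface form of the mention in the corpus
    mention_type: str         # "NAM" or "NOM"
    # Wikipedia page titles, sorted by relevance: most specific sense first.
    links: List[str] = field(default_factory=list)

    def best_link(self) -> Optional[str]:
        """Return the most specific page title, or None if there are no links."""
        return self.links[0] if self.links else None

    def urls(self) -> List[str]:
        """Expand the page titles into full URLs, keeping the relevance order."""
        return [WIKI_PREFIX + title for title in self.links]


# The two examples from the text above.
bush = LinkedMention("George Bush", "NAM", ["George_W._Bush"])
president = LinkedMention(
    "President", "NOM", ["President_of_the_United_States", "President"]
)
```

With this representation, president.best_link() gives the most specific sense for the mention in its context, while the remaining entries in links hold the more general senses.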
For more details concerning the annotation, please read the paper:
Luisa Bentivogli, Pamela Forner, Claudio Giuliano, Alessandro Marchetti, Emanuele Pianta, Kateryna Tymoshenko
“Extending English ACE 2005 Corpus Annotation with Ground-truth Links to Wikipedia”
Proceedings of COLING 2010 Workshop on “The People's Web Meets NLP: Collaboratively Constructed Semantic Resources”, Beijing, China, August 28, 2010.
The ACE 2005 corpus is distributed by LDC (http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T06), and the annotation is available by filling in the license agreement at Obtain ACEtoWiki.