Partnership of Paisà
The project is a joint effort of:
- University of Bologna - Sergio Scalise with colleagues Claudia Borghetti and Francesca Masini (in charge of the research unit during 2012/2013)
- CNR Pisa - Vito Pirrelli with colleagues Alessandro Lenci and Felice Dell'Orletta
- European Academy of Bozen/Bolzano - Andrea Abel with colleagues Chris Culy, Henrik Dittmann, and Verena Lyding
- University of Trento - Marco Baroni with colleagues Marco Brunello, Sara Castagnoli, and Egon Stemle
Project Lead
- Sergio Scalise (Università di Bologna): during 2009-2011
- Vito Pirrelli (CNR Pisa): during 2012/2013
Responsibilities are divided among the partners as follows:
- [corpus creation]
- The University of Trento collects the corpus. Copyright-free text materials are bootstrapped from the web. The harvested texts are then cleaned automatically by stripping HTML tags and other formatting and navigation data (for more information see construction steps).
- [corpus annotation]
- The corpus is linguistically annotated by combining manual and automatic annotation procedures. Manually annotated data are used to refine the computational linguistic methods and tools used for corpus annotation (for more information see construction steps). The manual annotation of corpus texts and the evaluation of analysis tools are carried out by researchers of the University of Bologna, the University of Trento, and CNR Pisa. Tools are developed, adjusted and applied by CNR Pisa.
- [corpus interface]
- The corpus is made available to the public via a free online interface. The European Academy of Bozen/Bolzano builds the multifaceted user interface for language learners and researchers.
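As an illustration of the cleaning step mentioned under corpus creation, the following is a minimal sketch of stripping HTML tags and navigation data from a harvested page. It is not the project's actual pipeline: the class `TextExtractor`, the function `strip_html`, and the choice of which tags to skip or treat as block-level are assumptions made for this example, using only Python's standard library.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, dropping markup, scripts, styles and navigation.

    Hypothetical helper for illustration; the tag sets below are assumptions.
    """

    SKIP = {"script", "style", "nav"}            # content discarded entirely
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3"}  # word boundaries

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities such as &amp;
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            # Keep text from adjacent blocks from running together.
            self._chunks.append(" ")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the pieces, then collapse runs of whitespace.
        return " ".join("".join(self._chunks).split())


def strip_html(page):
    """Return the plain text of an HTML page (sketch, not production code)."""
    parser = TextExtractor()
    parser.feed(page)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = "<nav><a href='/'>Home</a></nav><p>Ciao <b>mondo</b>!</p>"
    print(strip_html(sample))
```

A real web corpus pipeline would typically also remove boilerplate text and duplicate pages; this sketch covers only the tag-stripping part.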