Coling 2000

Tutorial : Trends in Robust Parsing

Jacques Vergne

GREYC
Université de Caen
FRANCE

https://lucasn01.users.greyc.fr/JacquesVergne/




Aim of this tutorial :

  The aim of this tutorial is to outline the fundamental trends in the evolution of robust parsing across the variety of concepts, techniques, and parsing processes, and to give a synthetic view of the topic, with emphasis on how concepts and methods have evolved.

  Today, robust parsing is moving rapidly from tagging to chunking and clause bracketing. Partial parsing is becoming less and less partial, with computational properties that allow good integration into industrial contexts, where linear complexity is a prerequisite : robust parsers are able to process raw linguistic material at a constant and predictable rate with predictable results.

  This tutorial is designed for PhD students and researchers in NLP. The expected prerequisite is a basic knowledge of parsing and tagging.


Downloading the documents :


Tutorial outline :

Session 1 : Course (3h)        (slides of the course)

Session 2 : Practical (3h)            (practical guidelines)

  The aim of the practical was to illustrate the course and to give participants the opportunity to practice on the "GREYC parser", a general platform for designing and building parsers.
  The "GREYC parser" is described (in French) at : https://lucasn01.users.greyc.fr/JacquesVergne/analyseur_GREYC/analyseur_du_GREYC.html
  The practical consisted of :
  For more details, contact me by e-mail : Jacques.Vergne@unicaen.fr


Tutorial speakers :

  Jacques Vergne (Jacques.Vergne@unicaen.fr) is a lecturer and researcher in computer science and NLP at the GREYC, the computer science laboratory of the University of Caen (France). His research domain is robust and accurate parsing. He built the 1998 parser that obtained the best results in the GRACE contest (http://limsi.fr/TLP/grace/), an international evaluation that aimed to compare taggers for French under a single protocol.
  Emmanuel Giguet acted as project manager of the team that built the "GREYC parser". His PhD thesis gave the 1998 parser a more general design, which is now implemented in the "GREYC parser".


Main References and Links


Some references for a preliminary insight into the topic :

Abney S. (1991). "Parsing By Chunks". In: Robert Berwick, Steven Abney and Carol Tenny (eds.), Principle-Based Parsing. Kluwer Academic Publishers, Dordrecht. http://www.sfs.nphil.uni-tuebingen.de/~abney/Abney_90e.ps.gz

Abney S. (1995). "Chunks and Dependencies: Bringing Processing Evidence to Bear on Syntax". In: Computational Linguistics and the Foundations of Linguistic Theory. CSLI. pp. 145-164.
http://www.sfs.nphil.uni-tuebingen.de/~abney/Abney_91i.ps.gz

Abney S. (1996b). "Partial Parsing via Finite-State Cascades". In Proceedings of the ESSLLI '96 Robust Parsing Workshop.
http://www.sfs.nphil.uni-tuebingen.de/~abney/96h.ps.gz

Aït-Mokhtar S. and Chanod J.-P. (1997). "Incremental Finite-State Parsing". In Proceedings of ANLP'97, Washington, pp. 72-79.
http://www.xrce.xerox.com/publis/mltt/mltt-97-01.ps

Brill E. (1992). "A simple rule-based part-of-speech tagger". In Proceedings of the Third Conference on Applied Natural Language Processing, Trento. ACL.

Church K. and Mercer R. (1993). "Introduction to the special issue on Computational Linguistics Using Large Corpora". Computational Linguistics, volume 19, number 1, pp. 1-24.

Computational Linguistics (1993). "Special issue on Using large corpora". Volume 19, numbers 1 and 2.

Ejerhed E. (1996). "Finite state segmentation of discourse into clauses". In Proceedings of ECAI'96 Workshop Extended finite state models of language, A. Kornai (Ed.), pp. 24-33. http://www.kornai.com/ECAI/ejerhed.html

Giguet E., Vergne J. (1997). "From Part-of-Speech Tagging to Memory-based Deep Syntactic Analysis". In Proceedings of the International Workshop on Parsing Technologies (IWPT'97), MIT, Boston, Massachusetts.
https://giguete.users.greyc.fr/iwpt97/GiguetIwpt97.pdf

Giguet E. (1998). "Méthode pour l'analyse automatique de structures formelles sur documents multilingues". PhD thesis, Université de Caen.
https://giguete.users.greyc.fr/these/

Grefenstette G. (1996). "Light Parsing as Finite-State Filtering". ECAI'96 workshop on "Extended finite state models of language". Aug. 11-12, Budapest. http://www.xrce.xerox.com/publis/mltt/mltt-96-12.ps

Vergne J. and Giguet E. (1998).  "Regards Théoriques sur le Tagging". Cinquième conférence annuelle : Le Traitement Automatique des Langues Naturelles, TALN'98, Paris, pp. 22-31.
https://lucasn01.users.greyc.fr/JacquesVergne/VergneGiguetTaln98.pdf