The Computer Science Colloquium

Thursday, October 1, 4:15pm, room 9204/9205



Heng Ji
(Queens College/Graduate Center)

"Cross-document Cross-lingual Information Extraction and Tracking"

Most current information extraction systems analyze documents in isolation. The net result is a set of disconnected and often redundant annotations, because events are repeated in many news stories. In this talk we will present a new task, cross-document cross-lingual information extraction and tracking, along with its evaluation metrics. From a very large collection of multilingual documents, we identify the important person entities that are frequently involved in events as ‘centroid entities’. We then link the events involving the same centroid entity along a timeline. We will also present a system that performs this task and our current approaches to the main research challenges. We will discuss how we can take advantage of the redundancy to improve the accuracy of relation and event annotation, by means of:

  • Cross-document event coreference resolution
  • Event ranking by salience and novelty
  • Event organization by participant, time, and place
  • Name translation
  • Knowledge discovery from Google Ngrams

Bio:

Heng Ji is an Assistant Professor and member of the Doctoral Faculty in Computer Science at Queens College and the Graduate Center of the City University of New York. She received her B.A. and M.A. in Computational Linguistics from Tsinghua University, and her M.S. and Ph.D. in Computer Science from New York University in 2005 and 2007 respectively. Her research interests focus on Natural Language Processing (NLP), especially Cross-document Cross-lingual Information Extraction, Event Tracking, Information-aware Machine Translation and Spoken Language Understanding. She has published several book chapters and many papers in the most prestigious NLP conferences and journals. She received a Google Faculty Research Award in 2009 and was awarded the Sandra Bleistein Prize by the Courant Institute of Mathematical Sciences at NYU. She is a member of the Association for Computational Linguistics, IEEE and Sigma Xi. She has served as a committee member and reviewer for the National Science Foundation and for many conferences and journals in the fields of NLP, Information Processing, Digital Libraries and Artificial Intelligence.




The Colloquium is supported by generous contributions from Bloomberg, Information Builders, Inc., and Netlogic, Inc.

       


365 Fifth Ave, New York City 10016 | Room 4319 | Phone: 212.817.8190 | Fax: 212.817.1510 | compsci@gc.cuny.edu