Guest Speaker: Ron Zacharski

Presentation by Ron Zacharski, entitled "Finding relevant information: From statistical methods to advanced question answering."


This talk will describe methods for assisting a user in finding relevant information in a collection of documents. First, I will describe very preliminary work on improving a user's interaction with a traditional web-based search engine. For this task we extract a cluster of related documents from the Science Citation Index (SCI) and attempt to summarize the cluster by identifying key words. We are also examining methods that allow the user to refine his search by presenting collocations of the initial search terms. For example, the search string 'carbon fix' returns approximately 800,000 abstracts from SCI. We identify collocates of the search terms in this abstract collection (for example, for carbon: carbon dioxide, organic carbon, carbon sequestration, and carbon fixation) and display these collocations to the user to enable him to refine his search. However, naive statistical methods will only get you so far, and I will illustrate this by looking at advanced question answering systems. Simple techniques combined with lots of data can be used to answer factoid questions (Who killed Abraham Lincoln?). However, these techniques are less effective when used in systems designed for information professionals. I will briefly describe a Wizard of Oz study used to determine how a user interacts with an advanced question answering system. The results of this study suggest (unsurprisingly) that entity tracking and reference resolution techniques are of major importance in such systems.
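
The collocation step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the speaker's method: it assumes a collocate is simply a frequent two-word phrase containing the search term, and it ranks by raw count, where a real system would more likely use an association measure. The function name `collocates`, the stopword list, the thresholds, and the four sample abstracts are all invented for illustration and do not come from SCI.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a much fuller one.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "to"}


def collocates(abstracts, term, top_n=5, min_count=2):
    """Return (phrase, count) pairs for frequent bigrams that contain `term`."""
    counts = Counter()
    for text in abstracts:
        tokens = re.findall(r"[a-z]+", text.lower())
        # Walk over adjacent word pairs and keep those involving the term.
        for left, right in zip(tokens, tokens[1:]):
            if term not in (left, right):
                continue
            if left in STOPWORDS or right in STOPWORDS:
                continue
            counts[f"{left} {right}"] += 1
    # Drop rare phrases, then keep the most frequent ones.
    ranked = [(p, n) for p, n in counts.most_common() if n >= min_count]
    return ranked[:top_n]


# Toy abstracts written for this example only.
sample_abstracts = [
    "Carbon dioxide uptake and carbon fixation in marine algae.",
    "Organic carbon and carbon dioxide fluxes in forest soil.",
    "Carbon sequestration offsets carbon dioxide emissions.",
    "Rates of carbon fixation and carbon sequestration in wetlands.",
]

for phrase, n in collocates(sample_abstracts, "carbon"):
    print(phrase, n)
```

The resulting phrases could then be offered to the user as suggested refinements of the original query.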

Dr. Zacharski is a graduate of the University of Minnesota Graduate Program in Computer Science and is currently a Computational Linguist at the New Mexico State University Computing Research Laboratory. More information about his publications, essays, and personal interests can be found on his web site:

Copyright: © 2004 by the Regents of the University of Minnesota, Department of Computer Science and Engineering. All rights reserved.
Comments to: Maria Gini