Accession Number : ADA462538


Title : Instance-Based Question Answering


Descriptive Note : Doctoral thesis


Corporate Author : CARNEGIE-MELLON UNIV PITTSBURGH PA DEPT OF COMPUTER SCIENCE


Personal Author(s) : Lita, Lucian V


Full Text : https://apps.dtic.mil/dtic/tr/fulltext/u2/a462538.pdf


Report Date : Dec 2006


Pagination or Media Count : 232


Abstract : During recent years, question answering (QA) has grown from simple passage retrieval and information extraction to very complex approaches that incorporate deep question and document analysis, reasoning, planning, and sophisticated uses of knowledge resources. Most existing QA systems combine rule-based, knowledge-based and statistical components, and are highly optimized for a particular style of questions in a given language. Typical question answering approaches depend on specific ontologies, resources, processing tools, document sources, and very often rely on expert knowledge and rule-based components. Furthermore, such systems are very difficult to re-train and optimize for different domains and languages, requiring considerable time and human effort. We present a fully statistical, data-driven, instance-based approach to question answering (IBQA) that learns how to answer new questions from similar training questions and their known correct answers. We represent training questions as points in a multi-dimensional space and cluster them according to different granularity, scatter, and similarity metrics. From each individual cluster we automatically learn an answering strategy for finding answers to questions. When answering a new question that is covered by several clusters, multiple answering strategies are simultaneously employed. The resulting answer confidence combines elements such as each strategy's estimated probability of success, cluster similarity to the new question, cluster size, and cluster granularity. The IBQA approach obtains good performance on factoid and definitional questions, comparable to the performance of top systems participating in official question answering evaluations.
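
Illustrative sketch (not taken from the thesis) : The answering step described in the abstract can be pictured roughly as follows. Everything here is an assumption made for illustration: the names `Cluster`, `answer`, and `min_similarity` are hypothetical, token-overlap (Jaccard) similarity stands in for whatever similarity metrics the thesis actually uses, and each cluster's "strategy" is a placeholder function rather than a learned one. Only two of the confidence elements the abstract lists (estimated probability of success and cluster similarity) are combined; cluster size and granularity are left out for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase bag of words; a stand-in for a real question representation."""
    return set(text.lower().replace("?", "").split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-overlap similarity in [0, 1] (assumed metric, not the thesis's)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class Cluster:
    questions: List[str]                   # training questions in this cluster
    strategy: Callable[[str], List[str]]   # hypothetical answering strategy
    success_prob: float                    # estimated probability of success

    def similarity(self, question: str) -> float:
        """Average similarity between the new question and the cluster members."""
        q = tokens(question)
        return sum(jaccard(q, tokens(t)) for t in self.questions) / len(self.questions)


def answer(question: str, clusters: List[Cluster],
           min_similarity: float = 0.2) -> Dict[str, float]:
    """Run the strategy of every cluster that covers the question and
    accumulate a confidence score for each candidate answer."""
    scores: Dict[str, float] = defaultdict(float)
    for cluster in clusters:
        sim = cluster.similarity(question)
        if sim < min_similarity:
            continue  # this cluster does not cover the new question
        weight = cluster.success_prob * sim
        for candidate in cluster.strategy(question):
            scores[candidate] += weight
    return dict(scores)


# Toy usage: two hand-made clusters with fixed placeholder strategies.
birth_dates = Cluster(
    questions=["when was mozart born", "when was einstein born"],
    strategy=lambda q: ["1756"],
    success_prob=0.8,
)
locations = Cluster(
    questions=["where is paris located"],
    strategy=lambda q: ["Salzburg"],
    success_prob=0.5,
)
result = answer("When was Bach born?", [birth_dates, locations])
```

In this sketch a candidate proposed by several covering clusters accumulates weight from each of them, which is one simple way to reflect that multiple strategies are employed for a single question.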


Descriptors : *REASONING , *RULE BASED SYSTEMS , *INFORMATION RETRIEVAL , *KNOWLEDGE BASED SYSTEMS , STRATEGY , PROBABILITY , CLUSTERING , DOCUMENTS , THESES , STATISTICAL DATA , TRAINING


Subject Categories : Information Science , Cybernetics


Distribution Statement : APPROVED FOR PUBLIC RELEASE