Text Summarization Evaluation: Correlating Human Performance on an Extrinsic Task with Automatic Intrinsic Metrics
University of Maryland, College Park, Institute for Advanced Computer Studies
This research describes two types of summarization evaluation methods, intrinsic and extrinsic, and concentrates on determining how well automatic intrinsic methods correlate with human task-based extrinsic evaluation performance. Suggested experiments and preliminary findings are detailed for exploring these correlations and the factors that affect them: the summarization method, the quality of the summary, the type of intrinsic method used, and the genre of the source documents. A new measurement technique for task-based evaluations, Relevance Prediction, is introduced and contrasted with the gold-standard-based measurements currently used in the summarization evaluation community. Preliminary experimental findings suggest that the Relevance Prediction method yields better performance measurements with human summaries than the LDC-Agreement method does, and that small correlations exist between one of the automatic intrinsic evaluation metrics and human task-based performance results.
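The correlation analysis the abstract describes can be sketched with a small example. The sketch below is illustrative only: the per-summary scores are hypothetical, and a Pearson coefficient stands in for whichever correlation statistic the study actually used; the function and variable names are our own.

```python
from statistics import mean

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length score lists.
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    varx = sum((x - mx) ** 2 for x in xs)
    vary = sum((y - my) ** 2 for y in ys)
    return cov / (varx * vary) ** 0.5

# Hypothetical per-summary scores: an automatic intrinsic metric
# (e.g. a ROUGE-style score) paired with human task accuracy under
# the Relevance Prediction measurement.
intrinsic_scores = [0.42, 0.31, 0.55, 0.48, 0.37]
task_accuracy = [0.80, 0.65, 0.85, 0.75, 0.70]

r = pearson(intrinsic_scores, task_accuracy)
print(f"correlation between intrinsic metric and task performance: {r:.2f}")
```

A coefficient near zero would indicate that the intrinsic metric tells us little about how useful a summary is in the extrinsic task, which is exactly the question the suggested experiments probe.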
Subject Categories: Information Science