Distributed Multisearch and Resource Selection for the TREC Million Query Track
ALASKA UNIV ANCHORAGE ARCTIC REGION SUPERCOMPUTING CENTER
A distributed information retrieval system with resource-selection and result-set merging capability was used to search subsets of the GOV2 document corpus for the 2008 TREC Million Query Track. The GOV2 collection was partitioned into host-name subcollections and distributed to multiple remote machines. The Multisearch demonstration application restricted each search to a fraction of the available subcollections that was predetermined by a resource-selection algorithm. Experimental results from topic-by-topic resource selection and aggregate topic resource selection are compared. The sensitivity of Multisearch retrieval performance to variations in the resource-selection algorithm is discussed.
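The pipeline summarized in the abstract (partition by host name, select a fraction of subcollections, search only those, merge the result sets) can be illustrated with a minimal Python sketch. It is not the Multisearch implementation: the function names, the document-frequency selection score, the term-count ranking, and the raw-score merge are all assumptions chosen to keep the example small.

```python
import math
from collections import defaultdict
from urllib.parse import urlparse


def partition_by_host(docs):
    """Group (url, text) pairs into subcollections keyed by host name."""
    subcollections = defaultdict(list)
    for url, text in docs:
        subcollections[urlparse(url).hostname].append((url, text))
    return dict(subcollections)


def select_resources(query, subcollections, fraction):
    """Keep the top `fraction` of subcollections for this query.

    Assumed scoring (hypothetical): the number of documents in a
    subcollection that contain at least one query term.
    """
    terms = set(query.lower().split())

    def score(host):
        return sum(
            1
            for _, text in subcollections[host]
            if terms & set(text.lower().split())
        )

    # Sort by descending score, breaking ties by host name for determinism.
    ranked = sorted(subcollections, key=lambda host: (-score(host), host))
    keep = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:keep]


def search(query, docs):
    """Toy local search: score each document by total query-term count."""
    terms = query.lower().split()
    results = []
    for url, text in docs:
        words = text.lower().split()
        doc_score = sum(words.count(term) for term in terms)
        if doc_score > 0:
            results.append((url, doc_score))
    return results


def multisearch(query, subcollections, fraction):
    """Search only the selected subcollections and merge their result sets.

    Assumption: scores are comparable across subcollections, so merging
    is a plain sort. A real merger would normally normalize scores first.
    """
    merged = []
    for host in select_resources(query, subcollections, fraction):
        merged.extend(search(query, subcollections[host]))
    return sorted(merged, key=lambda result: (-result[1], result[0]))
```

Calling `select_resources` once per query corresponds loosely to topic-by-topic selection. An aggregate variant would compute one selection from the combined terms of all topics and reuse it for every query.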
- Information Science