Accession Number:

AD1061874

Title:

Towards Locating and Exploring Hard-to-Find Information on the Web

Descriptive Note:

Technical Report, 01 Sep 2014, 31 Mar 2018

Corporate Author:

New York University New York United States

Report Date:

2018-09-01

Pagination or Media Count:

47

Abstract:

This work developed new methods and tools to empower subject matter experts to effectively discover and track information on the Web that is relevant to a given task or domain. Our approach consists of two main components that address these challenges: (1) domain discovery and (2) crawling and information gathering. For each of these components, we have designed new methods and developed open-source tools that implement them. Notably, we have designed a new framework that facilitates domain discovery, organization, and presentation. We have also developed a general and extensible crawling infrastructure that substantially extends the ACHE open-source focused crawler to support complex crawling tasks and multiple crawling strategies, so that new content is discovered in a timely manner.

Subject Categories:

  • Information Science

Distribution Statement:

APPROVED FOR PUBLIC RELEASE