Domain-Specific Insight Graphs
Technical Report, 01 Sep 2014, 30 Sep 2018
University of Southern California Marina del Rey United States
Developing scalable, semi-automatic approaches to derive insights from a domain-specific Web corpus is a longstanding research problem in the knowledge discovery and Web communities. The problem is particularly challenging in illicit fields, such as human trafficking, where traditional assumptions concerning information representation are frequently violated. In the Domain-Specific Insight Graphs (DIG) project, we developed technology to build end-to-end investigative knowledge discovery and search systems, focused primarily on illicit Web domains. The technologies include components for information extraction, semantic modeling, and query execution, and were tested on a variety of real-world domains, including a human trafficking Web corpus containing over 100 million pages. The prototype includes a GUI that was used by US law enforcement agencies to combat illicit activity. The research results were widely disseminated in multiple journal and conference publications, and the software produced is publicly available on GitHub under the MIT license.
- Statistics and Probability
- Computer Programming and Software