Large-Scale Paraphrasing for Natural Language Understanding

Technical Report | Accession Number: AD1050977

Abstract:

In this project, we researched and developed technologies to automatically extract large volumes of paraphrases to aid in natural language understanding (NLU) tasks. We developed three core algorithms to (1) generate extremely large paraphrase databases, (2) adapt paraphrase databases to new domains, and (3) augment paraphrase rules with fine-grained semantic entailment relations. Our work introduced the Paraphrase Database (PPDB), the largest paraphrase resource developed to date. The resource contains over 100 million paraphrases for English. We also generated paraphrase databases for 23 foreign languages.

Security Markings

DOCUMENT & CONTEXTUAL SUMMARY

Distribution:
Approved For Public Release

RECORD

Collection: TR