Coping with Syntactic Ambiguity or How to Put the Block in the Box on the Table.

Church, Kenneth; Patil, Ramesh

Active / Technical Report | Accession Number: ADA114500

Abstract:

Sentences are far more ambiguous than one might have thought. There may be hundreds, perhaps thousands, of syntactic parse trees for certain very natural sentences of English. This fact has been a major problem confronting natural language processing because it indicates that it may require a long time to construct a list of all the parse trees, and furthermore, it isn't clear what to do with the list once it has been constructed. The list may be so long that it is probably not the most convenient representation for communication with the semantic and pragmatic processing modules. In this paper we propose some methods for dealing with syntactic ambiguity in ways that take advantage of certain regularities among the alternative parse trees. These regularities will be expressed as linear combinations of ATN networks, and also as sums and products of formal power series. We will suggest some ways that a practical processor can take advantage of this modularity in order to deal more efficiently with combinatoric ambiguity. In particular, we will show how a processor can efficiently compute the ambiguity of an input sentence or any portion thereof. Furthermore, we will show how to compile certain grammars into a form that can be processed more efficiently. In some cases, including the "every way ambiguous" grammar (e.g., conjunction, prepositional phrases, noun-noun modification), processing time will be reduced from O(n^3) to O(n). Finally, we will show how to uncompile certain highly optimized grammars into a form suitable for linguistic analysis. (Author)
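
To illustrate the idea of computing the ambiguity of an input without listing its parse trees, the following is a minimal sketch, not the report's own algorithm. It assumes a hypothetical toy grammar S -> S S | a, in which every binary bracketing of the input is a valid parse, and it relies on the standard fact that such bracketings are counted by the Catalan numbers (one of the keywords on this record). The function names are illustrative only.

```python
from math import comb


def catalan(n: int) -> int:
    """Closed form for the n-th Catalan number: C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def count_bracketings(num_items: int) -> int:
    """Count the binary trees over `num_items` leaves without building them.

    Assumed toy grammar: S -> S S | a. A span of a given length is split
    at every possible point, and the counts for the two halves are
    multiplied, then summed over all split points.
    """
    if num_items < 1:
        raise ValueError("need at least one item")
    counts = [0] * (num_items + 1)
    counts[1] = 1  # a single item has exactly one (trivial) tree
    for length in range(2, num_items + 1):
        counts[length] = sum(
            counts[left] * counts[length - left] for left in range(1, length)
        )
    return counts[num_items]


if __name__ == "__main__":
    for n in range(1, 11):
        # The table lookup and the closed form agree: trees(n) = Catalan(n - 1).
        print(n, count_bracketings(n), catalan(n - 1))
```

The count grows quickly (ten items already admit 4862 bracketings), which is why a number or a compact formula can be a more convenient summary than an explicit list of trees.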

Author(s):

Church, Kenneth ; Patil, Ramesh

Author Organization(s):

MASSACHUSETTS INST OF TECH CAMBRIDGE LAB FOR COMPUTER SCIENCE

Descriptive Note:

Memorandum rept.,

Pagination:

0040

Security Markings

DOCUMENT & CONTEXTUAL SUMMARY

Distribution:

Approved For Public Release

RECORD

Collection: TR

Identifying Numbers

Report Number(s):

MIT/LCS/TM-216

Contract/Grant Number(s):

N00014-75-C-0661, PHS-1P01-LM-03374-02

Subject Terms

Joint Capability Areas:

JCA_5_Command and Control; JCA_5.3_Planning; JCA_6.1_Information Transport; JCA_6_Net Centric; JCA_1.2.5_Lessons Learned

Modernization Areas:

AI and Machine Learning

Communities of Interest:

Energy and Power Technologies

Descriptor(s):

*Natural language, Linguistics, Syntax, Probability, Power series

Field(s)/Group(s):

Linguistics

Keyword(s):

Catalan numbers, Parsing, Ambiguity

Report Date:

1982 Apr 01