TOWARD EXPLOITATION OF A FILE OF RUSSIAN TEXT WITH SYNTACTIC ANNOTATIONS
RAND CORP SANTA MONICA CALIF
The final report of RAND's work in linguistics for Rome Air Development Center, 1965-66. The work included the compilation of a million-word File of Russian Text with Syntactic Annotations, stored on magnetic tape, which can be duplicated for qualified requesters, and the design of a computer program called COLLECT for retrieving data from the File. The annotations, which are based on dependency theory, include not only systematic connections between dependent and governing words but also grammatical functions, indications of negation, pointers to antecedents of pronouns, and special features. Methods of automatic statistical classification usable in reducing File data are discussed, and steps toward ambiguity reduction and automatic parsing are described. Tentative research designs for investigating modal constructions and sentential apposition in Russian with computer assistance are outlined.
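As a rough illustration of the kind of record the abstract describes, the sketch below models one annotated word (governor link, grammatical function, negation flag, antecedent pointer) and a simple retrieval over a sentence. It is not the File's actual tape format and not the COLLECT program: the class `Word`, the function `collect_pairs`, the field names, the function labels, and the transliterated example sentence are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Word:
    """One annotated word occurrence (hypothetical layout, not the File's format)."""
    index: int                        # position of the word in its sentence
    form: str                         # the word as it occurs in the text
    governor: Optional[int]           # index of the governing word; None for the root
    function: str                     # grammatical function label (labels are assumed)
    negated: bool = False             # indication of negation
    antecedent: Optional[int] = None  # index of a pronoun's antecedent, if any


def collect_pairs(sentence: List[Word], function: str) -> List[Tuple[str, str]]:
    """Return (dependent, governor) word forms for every word with the given function."""
    by_index = {w.index: w for w in sentence}
    pairs = []
    for w in sentence:
        if w.function == function and w.governor is not None:
            pairs.append((w.form, by_index[w.governor].form))
    return pairs


# Transliterated example: "On ne chital knigu" ("He did not read the book").
SENTENCE = [
    Word(1, "On", governor=3, function="subject"),
    Word(2, "ne", governor=3, function="negation"),
    Word(3, "chital", governor=None, function="predicate", negated=True),
    Word(4, "knigu", governor=3, function="object"),
]

object_pairs = collect_pairs(SENTENCE, "object")
```

Here `object_pairs` holds the dependent and governing word forms for each object relation in the sentence. Storing the governor as an index rather than a nested structure keeps each word a flat record, which suits sequential storage, although the abstract does not say how the File itself is laid out.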