Report Number: CS-TR-72-264
Institution: Stanford University, Department of Computer Science
Title: An artificial intelligence approach to machine translation.
Author: Wilks, Yorick A.
Date: February 1972
Abstract: The paper describes a system of semantic analysis and
generation, programmed in LISP 1.5 and designed to pass from
paragraph length input in English to French via an
interlingual representation. A wide class of English input
forms will be covered, but the vocabulary will initially be
restricted to one of a few hundred words. With this subset
working, and during the current year (71-72), it is also
hoped to map the interlingual representation onto some
predicate calculus notation so as to make possible the
answering of very simple questions about the translated
matter. The specification of the translation system itself is
complete, and its main points of interest that distinguish it
from other systems are:
i) It translates phrase by phrase -- with facilities for
reordering phrases and establishing essential semantic
connectivities between them -- by mapping complex semantic
structures of "message" onto each phrase. These constitute
the interlingual representation to be translated. This
matching is done without the explicit use of a conventional
syntax analysis, by taking as the appropriate matched
structure the "most dense" of the alternative structures
derived. This method has been found highly successful in
earlier versions of this analysis system.
ii) The French output strings are generated without the
explicit use of a generative grammar. This is done by means
of STEREOTYPES: strings of French words, and functions
evaluating to French words, which are attached to English
word senses in the dictionary and built into the interlingual
representation by the analysis routines. The generation
program thus receives an interlingual representation that
already contains both French output and implicit procedures
for assembling the output, since the stereotypes are in
effect recursive procedures specifying the content and
production of the output word strings. Thus the generation
program at no time consults a word dictionary or inventory of
grammar rules.
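The stereotype idea in point ii) can be illustrated with a minimal sketch.
It is written in Python rather than the report's LISP 1.5, and it is not
the report's implementation: the names Stereotype, realize, GOES_TO and
preposition_for_destination are hypothetical, and the preposition rule is
a toy assumption. The sketch is also flat, whereas the abstract describes
stereotypes as in effect recursive procedures. It shows only the general
shape: a stereotype mixes literal French words with functions that
evaluate to French words, so the output can be assembled without
consulting a dictionary or grammar rules.

```python
from typing import Callable, Dict, List, Union

# A stereotype item is either a literal French word or a function that
# evaluates to a French word given some context (hypothetical shape).
Context = Dict[str, str]
Item = Union[str, Callable[[Context], str]]
Stereotype = List[Item]


def preposition_for_destination(ctx: Context) -> str:
    """Hypothetical function: choose a French preposition from context."""
    # Assumed toy rule: cities take "à", anything else takes "en".
    return "à" if ctx.get("destination_kind") == "city" else "en"


def realize(stereotype: Stereotype, ctx: Context) -> str:
    """Assemble output by evaluating each item; no dictionary lookup."""
    words = [item(ctx) if callable(item) else item for item in stereotype]
    return " ".join(words)


# Toy stereotype attached to one sense of English "goes to" (assumed).
GOES_TO: Stereotype = ["va", preposition_for_destination]

print(realize(GOES_TO, {"destination_kind": "city"}))     # va à
print(realize(GOES_TO, {"destination_kind": "country"}))  # va en
```

Because the preposition is produced by a function carried inside the
stereotype, the choice is deferred until the surrounding context is
known, which is the kind of decision the abstract singles out as vital.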
It is claimed that the system of notation and translation
described is a convenient one for expressing and handling the
items of semantic information that are ESSENTIAL to any
effective MT system. I discuss in some detail the semantic
information needed to ensure the correct choice of output
prepositions in French, a vital matter inadequately treated
by virtually all previous formalisms and projects.
http://i.stanford.edu/pub/cstr/reports/cs/tr/72/264/CS-TR-72-264.pdf