LISTSERV - LLTI Archives - LISTSERV.DARTMOUTH.EDU

--- Forwarded Message from "Philip A. Bralich, Ph.D." <[log in to unmask]> ---

>Date: Mon, 4 Jan 1999 15:38:02 -1000 (HST)
>To: [log in to unmask], [log in to unmask], [log in to unmask], [log in to unmask], [log in to unmask], [log in to unmask]
>From: "Philip A. Bralich, Ph.D." <[log in to unmask]>
>Subject: Ergo's 1st ANNUAL PARSING CONTEST

------------------
Ergo Linguistic Technologies would like to announce its first
annual parsing contest based on a fixed set of sentences and a
fixed set of tasks to be performed on that set of sentences.
The area of NLP to be explored is that of increased syntactic
analysis to provide: 1) improvements in navigation and control
technology through more complex grammar, 2) improvements in the
implementation of question/answer, statement/response dialogs
with computers and computer characters, and 3) improvements in
web and database searching using natural language queries.

The contest will be based on a comparison of results for parses of
a fixed set of sentences (included at end of this message) and
various tasks that can be performed as a result of those parses.
That is, the comparison will be based on the actual parse tree and
the ability to use that parsed output to generate theory independent
parse trees and output and to perform various NLP tasks.  The
judging will be based on the standards for evaluating NLP that have
been proposed previously on this list by myself and Derek Bickerton
and which are currently being developed into an ISO standard for the
Virtual Reality Modeling Language (VRML) as part of the VRML
Consortium's development efforts
(http://www.vrml.org/WorkingGroups/NLP-ANIM).  The standards proposed are
theory and field independent
standards which allow both linguists and non-linguists to evaluate NLP
systems in the areas of navigation and control, question/answer
dialogues, and database and web searching.  I will also be at the
annual meeting of the Linguistic Society of America this week in Los
Angeles for those who would like to discuss this in more detail.  

The sentences chosen for this contest are rather simple, but as we find
more and more parsers that can accomplish the tasks on this list, we
will add more complex sentences and tasks to the list.  Please be aware
that systems that may be designed for large corpora of unrestricted text
actually cannot work in this domain.  Thus, while such systems may be
useful for certain searching tasks, they are not useful in the domain
explored in this contest, and this is evidenced by their inability to
perform on tests such as the one provided here.

The full contest instructions and an HTML document of Ergo's results in
this area can be found at http://www.ergo-ling.com.  The standards were
designed to allow the developers of a parsing system (statistical or
syntactic) to demonstrate the thoroughness and accuracy of the parses they
produce by using the parsed output to perform a number of straightforward,
traditional syntactic tasks such as changing a statement to a question or
an active to a passive as well as demonstrating an ability to create
standard trees (using the Penn Treebank II guidelines) and standard
grammatical analyses.  All the standards chosen were chosen to be theory
independent measures of the accuracy of a parse through the use of standard
and ordinary grammatical and syntactic output.  
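
As a rough illustration of the kind of task described above, the following
Python sketch reads a bracketed tree and turns a statement into a yes/no
question by moving the auxiliary to the front.  The bracketing is a
hand-written, simplified Penn Treebank-style tree made up for this example
(it is not output from Ergo's parser or any other), and the function names
read_tree, leaves, and yes_no_question are hypothetical.

```python
# Hand-written, simplified Penn Treebank-style bracketing of a contest
# sentence. This is an assumption for illustration, not real parser output.
PARSE = (
    "(S (NP (DT the) (JJ tall) (JJ thin) (NN man) "
    "(PP (IN in) (NP (DT the) (NN office)))) "
    "(VP (VBZ is) (VP (VBG reading) "
    "(NP (DT a) (JJ technical) (NN report)))))"
)


def read_tree(text):
    """Turn a bracketed string into nested lists: [label, child, ...]."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        node = [tokens[pos]]  # the constituent or part-of-speech label
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(parse())      # nested constituent
            else:
                node.append(tokens[pos])  # a word
                pos += 1
        pos += 1  # consume the closing ")"
        return node

    return parse()


def leaves(node):
    """Collect the words of a subtree in left-to-right order."""
    words = []
    for child in node[1:]:
        if isinstance(child, list):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


def yes_no_question(tree):
    """Move the first child of the top-level VP (assumed to be an
    auxiliary) in front of the subject."""
    subject, vp = tree[1], tree[2]
    aux, rest = vp[1], vp[2:]
    words = leaves(aux) + leaves(subject)
    for part in rest:
        words.extend(leaves(part))
    return " ".join(words)


tree = read_tree(PARSE)
print(" ".join(leaves(tree)))
print(yes_no_question(tree))
```

The sketch only handles sentences whose verb phrase begins with an auxiliary
such as "is".  Sentences with a plain main verb would need do-support, which
is not attempted here.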

The contest officially begins on January 15th and will be closed on March
31st.  This will allow developers 2.5 months to develop tools and to work
with trouble spots that they may have with the set of sentences offered in
this contest.  The contest will be offered in subsequent years from January
to March.  Over time, we hope the parsers, the contest rules, and the
test sentences will all grow in sophistication and scope.  However, as most
parsers have existed many more years than ours, it is reasonable to think
these tools exist already.  

THE CONTEST RULES:
Anyone who joins must submit an HTML document and the parser (source code
only) that created it.  The parser can be in any format but it must require
a minimum of effort for the contest judges to set up and run. For example,
a WIN95 Interface that takes input files and produces the html output file
would be considered a minimum effort parser.  There will be tests to ensure
that the output is genuine parsed output rather than a synthesis such as a
series of print calls that merely present the correct output for a
particular string rather than generating it.  

The HTML files of all contestants will be made available at the Ergo web
site (http://www.ergo-ling.com).  Those who wish to join even though their
parsing system is not robust or complete enough for all the tasks or all
the sentences in the contest are also welcome to join.  Reviewers will then
look at these documents as promising parsers for future contests.  Their
results will be posted on our web site as well.  

Judging will be based on the percentage of sentences that parsed, the
percentage of the tasks that are completed and on the accuracy of the parses
that result and the success on the parsing tasks.  Currently, the judges will
be Derek Bickerton and myself, but we will welcome others to join in the task.
Because of the home court advantage of the judges, there will be printed
reports of the judging available on the Ergo web site for review by the
overall community of professionals in this area.  Complaints or criticisms
will also be posted.  
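
The judging criteria above could be tallied along the following lines.  This
is only a sketch: the announcement does not say how the three measures are
weighted or combined, so they are reported separately, and the record layout
(SentenceResult) and the function name summarize are assumptions made for
this example.

```python
from dataclasses import dataclass


@dataclass
class SentenceResult:
    parsed: bool          # did the parser produce any parse at all?
    parse_correct: bool   # did a judge accept the parse as accurate?
    tasks_total: int      # tasks set for this sentence
    tasks_completed: int  # tasks a judge accepted as completed


def percent(part, whole):
    """Percentage of part in whole, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


def summarize(results):
    """Report the three measures separately (no weighting is assumed)."""
    total = len(results)
    parsed = sum(r.parsed for r in results)
    correct = sum(r.parsed and r.parse_correct for r in results)
    tasks_total = sum(r.tasks_total for r in results)
    tasks_done = sum(r.tasks_completed for r in results)
    return {
        "pct_parsed": percent(parsed, total),
        "pct_parses_correct": percent(correct, total),
        "pct_tasks_completed": percent(tasks_done, tasks_total),
    }


# Invented numbers, purely to show the arithmetic.
example = [
    SentenceResult(True, True, 2, 2),
    SentenceResult(True, True, 2, 2),
    SentenceResult(True, False, 2, 1),
    SentenceResult(False, False, 2, 1),
]
print(summarize(example))
```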

Anyone who would like to review the judging and the comments on the judging
is welcome to do so.  Anyone who wishes to be a volunteer judge may also
contact us.  However, the criteria for all judging will be the accuracy of
the parser in creating a correct parse of all the sentences and completing
all the tasks set forth in the test materials.  

We would like this contest to remain open not only to challengers but also
to those who would like to design and improve the contest itself through the
addition of more sentences or more tasks added to the parsing task.  There is
one condition, however, on being able to do this: we will hold rigidly to the
rule that those who would improve on or add to the contest must first meet
the original challenge at a minimum level of 75% accuracy before being
allowed to contribute.   We are starting with a small set of relatively simple
sentences to make this as available as possible to as many people as possible.
In this manner researchers in industry, academia, and government will be able
to compare their results without exposing any proprietary or confidential
information.  We also do not want the contest to be unduly influenced by those
who would like to target some ideal of parsing that is not thoroughly grounded
in what is currently possible in these domains.  

At a Virtual Reality and Multi-Media Conference in Japan (VSMM '98), Ergo was
awarded the "Best Technical Award" for its NLP technology.  I believe the
main reason that judges and others were able to notice this is because I was
able to point out that "THE ENTIRE FIELD OF VIRTUAL REALITY AND MULTI-MEDIA
IS BEING HELD HOSTAGE BY GRAMMAR."  And then I went on to explain that the
main reason many VR and Multi-Media sites and programs are not catching on
is because their users cannot ask even a simple question of the characters or
about the objects they encounter.  Thus, a UNESCO virtual world such as a
reconstructed cathedral will receive many visitors but they will not stay
and explore because they cannot ask even the simplest questions like "How
many stairs in this Cathedral?"  "When was the Nave built?"   and so on. I
then pointed out that while speech and graphics were actually ready to work
with such projects, their grammatical abilities are so limited that
no one is using them with these products. The missing link between speech, VR
and multi-media and users actually talking to avatars and sites is GRAMMAR.
When I then demonstrated that this was so with the use of the Ergo tools, we
won the award.   The main reason I am sponsoring this contest is so that all
linguists and NLP researchers who would like to participate in this very large
future source of jobs can do so as soon as possible.  So in order to
stimulate research and interest this contest is proposed.  

WE WOULD ESPECIALLY LIKE TO INVITE PROFESSORS, STUDENTS, AND STAFF AT CARNEGIE
MELLON, STANFORD, XEROX PARC, MICROSOFT, IBM, DRAGON, LERNOUT AND HAUSPIE,
PHILIPS, MIT, SUN MICROSYSTEMS (JAVASPEECH GROUP), NEW YORK UNIVERSITY, AND
GEORGETOWN UNIVERSITY TO SUBMIT ENTRIES TO THIS CONTEST.  WE WILL BE HAPPY TO
POST THEIR RESULTS AND WOULD ALSO BE HAPPY TO TELL THE WORLD IF THEY CAN
GENERATE A PARSE THAT IS BETTER THAN OURS ON THE STANDARDS PROVIDED HERE.
THIS IS A GREAT OPPORTUNITY FOR STUDENTS AND JUNIOR STAFF TO WORK WITH EXTANT
PARSERS TO COMBINE AND EXTEND TOOLS INTO THESE VERY USEFUL AND PRACTICAL
AREAS.  

        THE SENTENCES
The full set of sentences for the contest is available at the 
http://www.ergo-ling.com web site.  This list contains five from each of the
three sections: 1) theory independent parsing, 2) navigation and control, and
3) Question/answer, statement/response repartee.  The full list contains 105
sentences and will grow and be modified over the years as this annual contest
takes root.  

Section 1:      Theory independent parsing.  
        1.      there is a dog on the porch
        2.      John's house is bigger than mary's house
        3.      the tall thin man in the office is reading a technical report
        4.      the man who mary likes is reading the book that john gave her
        5.      learning how to cope with stress is of primary importance in the work world
Section 2:      Navigation and Control.
        1.      Erase all files that end in .doc
        2.      print the file called teach.doc
        3.      send an email to bob that says "meeting at eight"
        4.      send a fax to bob that says "there is a meeting at eight tonight"
        5.      go to yahoo and find information about golf courses in Georgia
Section 3:      Question and Answer/Statement Response Repartee.  
        1.      bill's email is [log in to unmask]
                what is bill's email address
                what is bill's email
        2.      john has romantic books
                what kind of books does john have
        3.      My appointment with bob is at six o'clock
                what time is my appointment
                what time are my appointments
        4.      the tall thin man in the office is reading a technical report book
                what is the man reading
                what is the man doing
                is the man reading a report
                who is reading a report
        5.      John gave mary a book because it was her birthday
                who gave mary a book
                who did john give a book
                what did john give mary
                why did john give mary a book
                did john give mary a book
                did john give mary a book because it was her birthday
                did john give mary a pencil
                did john give mary a book because it was bob's birthday
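
For the navigation and control sentences, the end result of a parse might be
a structured command that a program can act on.  The Python sketch below
shows one possible target representation for three of the Section 2
sentences.  It uses regular-expression patterns as a stand-in and is
therefore not a parser; a contest entry would be expected to derive such
commands from a real parse.  The dictionary keys and the function name
interpret are assumptions made for this example.

```python
import re

# Each entry pairs a surface pattern with a builder for a command record.
# Pattern matching is a stand-in here; it is NOT a syntactic parse.
PATTERNS = [
    (
        re.compile(r"^erase all files that end in (?P<suffix>\.\w+)$", re.I),
        lambda m: {"action": "delete", "match_suffix": m["suffix"]},
    ),
    (
        re.compile(r"^print the file called (?P<name>\S+)$", re.I),
        lambda m: {"action": "print", "file": m["name"]},
    ),
    (
        re.compile(
            r'^send an? (?P<medium>email|fax) to (?P<to>\w+) '
            r'that says "(?P<body>.*)"$',
            re.I,
        ),
        lambda m: {
            "action": "send",
            "medium": m["medium"].lower(),
            "to": m["to"],
            "body": m["body"],
        },
    ),
]


def interpret(sentence):
    """Return a command record for a known sentence shape, else None."""
    text = sentence.strip()
    for pattern, build in PATTERNS:
        match = pattern.match(text)
        if match:
            return build(match)
    return None


print(interpret("Erase all files that end in .doc"))
print(interpret("print the file called teach.doc"))
print(interpret('send an email to bob that says "meeting at eight"'))
```

Any rewording outside these three shapes returns None, which is exactly the
brittleness that a grammar-based approach is meant to avoid.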



Philip A. Bralich, President
Ergo Linguistic Technologies
2800 Woodlawn Drive, Suite 175
Honolulu, HI 96822
tel:(808)539-3920
fax:(808)539-3924