Recent bugfixes
Version 2.1 (6 Aug 2004)
- includes the new MontyNLGenerator component, which generates sentences and summaries
Version 2.0.1
- fixes an API bug in version 2.0 that prevented the Java API from being callable
What is MontyLingua?
MontyLingua is a free*, commonsense-enriched, end-to-end natural language understander
for English. Feed raw English text into MontyLingua, and the output
will be a semantic interpretation of that text. Perfect for information
retrieval and extraction, request processing, and question answering.
From English sentences, it extracts subject/verb/object tuples,
extracts adjectives, noun phrases and verb phrases, and extracts
people's names, places, events, dates and times, and other semantic
information. MontyLingua makes traditionally
difficult language processing tasks trivial!
Version 2.0 is substantially FASTER, MORE ACCURATE, and MORE RELIABLE than
version 1.3.1. It has now been tested on Windows, many flavors of UNIX,
and Mac OS X, with several flavors of Java, and is in use by several
university research projects and in several commercial settings.
MontyLingua differs from other natural language processing tools because:
- it is complete end-to-end: input raw text; output semantic interpretation
- it is not many dated tools and implementations sewn together; it is one well-integrated implementation
- it does not require "training" and other fidgeting, and will work right out of the box
- it is enriched with "common sense" knowledge about the everyday world, allowing it to escape many stupid interpretive mistakes, e.g.:
  - "(NX the/DT mosquito/NN bit/NN NX) (NX the/DT boy/NN NX)"
    ==corrected==>
  - "(NX the/DT mosquito/NN NX) (VX bit/VBD VX) (NX the/DT boy/NN NX)"
- it is lightweight and portable across platforms, written in portable Python and also available as a compiled Java library
- it is easy to customize, since it allows for a user lexicon
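The bracketed chunk notation in the mosquito example is easy to read back into a data structure. The following is a minimal sketch, assuming only the "(LABEL word/TAG ... LABEL)" layout shown above; the function name parse_chunks is hypothetical and is not part of the MontyLingua API.

```python
import re

# Matches one chunk such as "(NX the/DT boy/NN NX)".
# The label that opens the chunk must also close it (backreference \1).
CHUNK_RE = re.compile(r"\((\w+)\s+(.*?)\s+\1\)")

def parse_chunks(chunked):
    """Turn a chunked string into a list of (label, [(word, tag), ...])."""
    result = []
    for label, body in CHUNK_RE.findall(chunked):
        # Split each "word/TAG" token on its last slash.
        pairs = [tuple(token.rsplit("/", 1)) for token in body.split()]
        result.append((label, pairs))
    return result

before = "(NX the/DT mosquito/NN bit/NN NX) (NX the/DT boy/NN NX)"
after = "(NX the/DT mosquito/NN NX) (VX bit/VBD VX) (NX the/DT boy/NN NX)"

for label, pairs in parse_chunks(after):
    print(label, pairs)
```

Comparing parse_chunks(before) with parse_chunks(after) makes the correction visible: the uncorrected reading has two noun chunks and no verb chunk, while the corrected one has a verb chunk between two noun chunks.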
MontyLingua performs the following tasks over text:
- MontyTokenizer: Tokenizes raw English text (sensitive to abbreviations) and resolves contractions, e.g. "you're" ==> "you are"
- MontyTagger: Part-of-speech tagging based on Brill94, enriched with common sense
- MontyChunker: Lightning fast regular expression chunker
- MontyExtractor: Extracts phrases and subject/verb/object triplets from sentences
- MontyLemmatiser: Strips inflectional morphology, i.e. changes verbs to infinitive form and nouns to singular form
- MontyNLGenerator: Uses MontyLingua's concise predicate-arg representation to generate naturalistic English sentences and text summaries
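To make two of these stages concrete, here is a toy sketch of contraction resolution and regular expression chunking. It is not MontyLingua's code: the contraction table, the tag patterns, and the function names resolve_contractions and chunk are assumptions made for illustration, and the part-of-speech tags are supplied by hand rather than produced by a tagger.

```python
import re

# Hypothetical contraction table; a real tokenizer would need a much larger one.
CONTRACTIONS = {"you're": "you are", "isn't": "is not", "don't": "do not"}

def resolve_contractions(text):
    """Expand known contractions word by word; other words pass through."""
    return " ".join(CONTRACTIONS.get(word.lower(), word) for word in text.split())

# Assumed chunk patterns over "word/TAG" tokens (Penn Treebank style tags):
#   NX = optional determiner, any adjectives, one or more nouns
#   VX = one or more verbs
# The lookbehind keeps a chunk from starting in the middle of a token.
CHUNK_RE = re.compile(
    r"(?<!\S)(?:"
    r"(?P<NX>(?:\S+/DT\s)?(?:\S+/JJ\s)*(?:\S+/NNS?\s)+)"
    r"|(?P<VX>(?:\S+/VB[DGNPZ]?\s)+)"
    r")"
)

def chunk(tagged):
    """Wrap noun and verb groups of a tagged sentence in (NX ... NX) / (VX ... VX)."""
    text = " ".join(tagged.split()) + " "  # normalise spacing, add a trailing space
    pieces, pos = [], 0
    for match in CHUNK_RE.finditer(text):
        pieces.extend(text[pos:match.start()].split())  # tokens outside any chunk
        label = match.lastgroup  # name of the alternative that matched
        pieces.append(f"({label} {match.group().strip()} {label})")
        pos = match.end()
    pieces.extend(text[pos:].split())
    return " ".join(pieces)

print(resolve_contractions("you're late"))
print(chunk("the/DT mosquito/NN bit/VBD the/DT boy/NN"))
```

Because the chunker only sees tags, a tagging mistake propagates directly: if "bit" is tagged NN instead of VBD, this sketch folds it into the first noun chunk, which is the kind of error the common sense enrichment of the tagger is meant to prevent.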
* Free for non-commercial use. Please see the MontyLingua Version 2.0 License.
Terms of Use
Author: Hugo Liu <hugo@media.mit.edu>
Project Page: <http://web.media.mit.edu/~hugo/montylingua/>
Copyright (c) 2002-2004 by Hugo Liu, MIT Media Lab
All rights reserved.
Non-commercial use is free, as provided in
the MontyLingua version 2.0 License.
By downloading and using MontyLingua, you agree to abide by
the additional copyright and licensing information in "license.txt",
included in this distribution.
If you use this software in your research, please acknowledge
MontyLingua and its author, and link back to the project
page http://web.media.mit.edu/~hugo/montylingua.
Please cite MontyLingua in academic publications as:
Liu, Hugo (2004). MontyLingua: An end-to-end natural language
processor with common sense. Available
at: web.media.mit.edu/~hugo/montylingua.
Documentation
Documentation and License
- Python documentation and API (html) [.html]
- Java documentation and API [.html]
- MontyLingua license [.txt] (by downloading and using MontyLingua you agree to these terms)
New in version 2.0 (29 Jul 2004)
- 2.5X speed enhancement for whole system, 2X speed enhancement for tagger component
- rule-based chunker replaced with much faster and more accurate regular expression chunker
- common sense added to MontyTagger component improves word-level tagger accuracy to 97%
- updated and expanded lexicon for English
- added a user-customizable lexicon CUSTOMLEXICON.MDF
- improvements to MontyLemmatiser incorporating exception cases
- html documentation added
- speed optimizations to all code
- improvements made to semantic extraction
- expanded Java API
Download MontyLingua
Please fill out the following information to proceed to the download of
Version 2.1 for Java and Python.
This information will not be shared beyond the author.
READ THIS if you are running MontyLingua on Mac OS X or Unix
- The distribution ZIP includes datafiles designed for Windows. If
you are running MontyLingua on Unix or Mac OS X, and the phrase
"I love you" is tagged incorrectly, then the datafiles
need to be rebuilt. This is simple:
  - delete all files of the form FASTLEXICON_n.MDF, where n is a number.
  - re-run the MontyLingua program, either from Python or Java, and the
correct datafiles will be rebuilt. If running Java and you run
out of memory during the rebuild process, use the -MX or -Xmx
option in Java to increase the memory size. You will only need
to rebuild these datafiles once.
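The delete step can be scripted. The sketch below assumes only the file naming described above (FASTLEXICON_n.MDF with n a number); the function names and the directory path in the usage note are hypothetical, and the default is a dry run that lists the files without deleting anything.

```python
import re
from pathlib import Path

# Matches FASTLEXICON_<number>.MDF, the naming pattern described above.
FASTLEXICON_RE = re.compile(r"^FASTLEXICON_\d+\.MDF$")

def find_fastlexicon_files(directory):
    """Return the FASTLEXICON_n.MDF files in `directory`, sorted by name."""
    return sorted(
        path for path in Path(directory).iterdir()
        if path.is_file() and FASTLEXICON_RE.match(path.name)
    )

def delete_fastlexicon_files(directory, dry_run=True):
    """List the files to remove; delete them only when dry_run is False."""
    targets = find_fastlexicon_files(directory)
    for path in targets:
        if not dry_run:
            path.unlink()
    return targets
```

For example, delete_fastlexicon_files("/path/to/montylingua") lists the matching files (the path is a placeholder for wherever the distribution was unpacked), and passing dry_run=False removes them. Other datafiles, such as CUSTOMLEXICON.MDF, do not match the pattern and are left alone.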
Research and Industry Applications which use MontyLingua
These are some of the research and industry projects which use MontyLingua
and MontyTagger. To submit your project, email a web url and short
description to the author.
- William W. Cohen (2004). Minorthird: Methods for Identifying Names and Ontological Relations in Text using Heuristics for Inducing Regularities from Data. http://minorthird.sourceforge.net
- Jacob Eisenstein and Randall Davis. Visual and Linguistic Information in Gesture Classification. Accepted to International Conference on Multimodal Interfaces (ICMI'04).
- L. Xie, L. Kennedy, S.-F. Chang, A. Divakaran, H. Sun, C.-Y. Lin (2004). Discovering Meaningful Multimedia Patterns with Audio-visual Concepts and Associated Text. IEEE International Conference on Image Processing (ICIP 2004), Singapore, October 2004.
- Ashwani Kumar, Sharad C. Sundararajan, Henry Lieberman (2004). Common Sense Investing: Bridging the Gap Between Expert and Novice. Conference on Human Factors in Computing Systems (CHI 04), Vienna, Austria.
- Hugo Liu and Push Singh (2004). ConceptNet: A Practical Commonsense Reasoning Toolkit. BT Technology Journal, upcoming. Kluwer Academic Publishers.
Search Google for MontyLingua and MontyTagger to see who else has been using this software.