DARPA's Machine Reading program pumps some more money into what should be recognized as a long-term basic research initiative. Cynics could easily argue that BBN -- the recipient of this latest infusion, as well as many a contract award in natural language processing going back to the 70's -- should have more to show for its work, considering the amount of investment.
BBN will develop a universal text engine that captures knowledge from text and transforms it into the formal representations used by AI systems. A central goal of the research effort is to develop techniques that can generalize across the linguistic structure and content of documents to extract relations and axioms directly from text, rather than relying on a person to encode such information. A related goal is to develop techniques capable of performing automatic extraction of text on the Web. Over the course of the 5-year program, BBN’s system will be tested against increasingly complex targets, including its ability to learn axioms from text and to read and digest vast quantities of Web text.
The program's goals are not significantly different from those of a similar SPAWAR-spearheaded (it was NOSC then) program that began in the 80's and lasted into the 90's, or from those of Japan's ambitious machine translation project that began around the same time. Cynics would be correct in claiming that more narrowly framed objectives with explicit use cases -- and greater reuse of existing technology -- would move the ball further down the field. As with the commercial Nuance/Dragon speech recognition product, progress is impressive but incremental.
Others beyond BBN have contributions to make. AFRL's $30M award to BBN puts a lot of eggs in a well-worn basket. Institutional memory may be lapsing as the stories from the heyday of a certain high-profile Texas-based consortium fade.



