Two things can make a difference in products that deal with text: semantics and multilinguality. Comprehendo enables both by providing a system that understands text by associating explicit meanings with words, multiword expressions and phrases. These meanings come from WordAtlas and are multilingual by design.
Comprehendo is based on state-of-the-art Word Sense Disambiguation and Entity Linking and can be applied to any language and text genre on a large scale. As a result, users can process large amounts of text (articles, blogs, posts, etc.) in multiple languages and aggregate this information in any way they like (for instance, using Extraggo for text analytics).
Comprehendo works both with standard text (sentences, paragraphs, documents, etc.) and with text snippets (such as tags, word clouds, user queries, etc.), and brings two main advantages:
Word Sense Disambiguation means identifying the proper meaning of a dictionary word in a given context (a sentence or a bag of words). It is tightly related to Entity Linking, which instead aims at resolving the ambiguity of a proper name. For instance, Piano might denote the musical instrument or the well-known star architect. Things easily get more complex when a word has several dictionary meanings and several entity meanings at once. For instance, spring can refer to the season, the elastic device or a water source, as well as a programming framework, a place in Texas and a song. Comprehendo surpasses the performance of Babelfy, a state-of-the-art disambiguation and entity linking system developed in Prof. Navigli’s NLP lab at the Sapienza University of Rome, and provides high accuracy thanks to innovative algorithms and its linkage to WordAtlas’s multilingual concepts and entities.
Thanks to its disambiguation API, Comprehendo understands text in hundreds of languages and tags ambiguous words explicitly with concepts and entities in WordAtlas with high performance and speed.
Comprehendo works with full text, but also snippets, posts, query logs or term banks, among others.
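To make the idea of Word Sense Disambiguation concrete, here is a minimal sketch in Python. It is not Comprehendo's algorithm and does not call its API: it uses a simple gloss-overlap heuristic (a simplified Lesk approach), and the sense labels and glosses for spring are hypothetical stand-ins rather than WordAtlas data.

```python
# Toy sense inventory. The labels and glosses are hypothetical examples,
# not entries taken from WordAtlas.
SENSES = {
    "spring": {
        "season": "season of the year between winter and summer when flowers bloom",
        "elastic_device": "metal coil that returns to its shape after being compressed",
        "water_source": "natural place where water flows out of the ground",
    }
}

# Very small stopword list, enough for this illustration only.
STOPWORDS = {
    "the", "of", "a", "to", "in", "and", "its", "is",
    "that", "when", "where", "out", "after", "being",
}


def tokens(text):
    """Lowercase, split on whitespace and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def disambiguate(word, context):
    """Pick the sense whose gloss shares the most words with the context."""
    context_tokens = tokens(context) - {word}
    scores = {
        sense: len(tokens(gloss) & context_tokens)
        for sense, gloss in SENSES[word].items()
    }
    # On a tie, max() keeps the first sense in dictionary order.
    return max(scores, key=scores.get)


print(disambiguate("spring", "the flowers bloom in spring after a long winter"))
print(disambiguate("spring", "the metal coil in the mattress is a compressed spring"))
```

With these toy glosses, the first sentence resolves to the season and the second to the elastic device. A production system would rely on a much richer knowledge graph and on context modelling that goes well beyond word overlap.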
Comprehendo is not a translator: it understands text and associates concepts and named entities with words and phrases. Such concepts and entities are provided by WordAtlas, our multilingual knowledge graph, which makes it possible to scale to arbitrary languages at any time. For example, given the following sentence:
Comprehendo produces the following output:
Note that by just selecting a different language, the concepts and entities involved are lexicalized in the target language. For instance, when reading them in Italian we would get:
Thanks to the tight integration of Comprehendo and WordAtlas, you can unify textual content expressed in different languages. For instance, consider the following queries in your search engine:
All the above queries convey the same semantics, which is identified by Comprehendo and given below:
Comprehendo is not a machine translation system: note that, even though the two concepts are expressed and explained in English, they are actually multilingual and aggregate all the above lexical realizations (and many more). As a result, web search is made semantic and sparsity is greatly reduced.
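The effect on search can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the queries, the `LEXICON` table and the concept IDs are hypothetical, and the dictionary lookup merely stands in for the disambiguation step that Comprehendo would perform. The point is only that once queries are mapped to language-independent concept IDs, queries in different languages fall into the same group.

```python
from collections import defaultdict

# Hypothetical lexicon: (language, term) -> language-independent concept ID.
# Real concept identifiers would come from the knowledge graph.
LEXICON = {
    ("en", "cheap"): "c:inexpensive",
    ("en", "flight"): "c:flight",
    ("en", "hotel"): "c:hotel",
    ("it", "volo"): "c:flight",
    ("it", "economico"): "c:inexpensive",
    ("es", "vuelo"): "c:flight",
    ("es", "barato"): "c:inexpensive",
}


def concepts(lang, query):
    """Map a query to the set of concept IDs found for its words."""
    ids = (LEXICON.get((lang, word)) for word in query.lower().split())
    return frozenset(i for i in ids if i is not None)


def group_by_meaning(queries):
    """Group (language, query) pairs that share the same set of concepts."""
    groups = defaultdict(list)
    for lang, query in queries:
        groups[concepts(lang, query)].append(query)
    return dict(groups)


queries = [
    ("en", "cheap flight"),
    ("it", "volo economico"),
    ("es", "vuelo barato"),
    ("en", "cheap hotel"),
]

for concept_set, members in group_by_meaning(queries).items():
    print(sorted(concept_set), members)
```

In this toy data, the three flight queries collapse into a single group keyed by two concepts, while the hotel query stays separate. Counting or ranking by concept set rather than by surface string is what reduces sparsity.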