Data Ecosystem
for your Organization

The Reveal framework consists of modular services that allow cost-effective customization of the final solution. Each module is devoted to a specific task, from processing the input documents, to the semantic analysis of texts, to the implementation of retrieval functions. The final application is typically delivered as a Service Oriented Architecture that can be deployed on-premise or in Reveal’s cloud.

The Automatic Natural Language Processor

The Reveal Natural Language Toolkit (RevNLT) is an accurate system for automatic natural language processing.

RevNLT is a cascade of high-performance language processors that works on Italian and English text. It is easy to customize, to adapt to specific application domains, and to integrate into complex or cloud-based application environments. Built on a service-based architecture, RevNLT is highly efficient (on a single thread of a standard laptop CPU it can process roughly 300 tweets per second) and robust in performing the morphosyntactic text analysis that advanced semantic text processing relies on.

EXAMPLES

Linguistic information extraction

Given a sentence:
“Il presidente Gianluca Gialli ha comunicato le sue dimissioni durante il congresso in Via Manzoni”
RevNLT extracts linguistic information as depicted in the following picture.

The input sentence is tokenized and the main syntactic structures are automatically derived, such as the main verb “ha comunicato” and its logical subject “Il presidente Gianluca Gialli”, which is connected to the verb through the link “V_Sog”. In this syntactic analysis, all phrases expressing some logical function in the input sentence are grouped into so-called “syntactic chunks”, such as “le sue dimissioni”, “durante il congresso” or “in Via Manzoni”. Each chunk can be “exploded” to derive its lower-level grammatical and morphological information as well as the extracted Named Entities. Each input token is in fact associated with a grammatical function (here the Part-of-Speech, such as Noun or Verb), its basic form (the lemma, e.g., comunicato vs. comunicare) and morphological information (e.g., “ha” is the third person singular of the present tense). For example, “Gianluca Gialli” and “Via Manzoni” are Named Entities (whose category NeCat is Person and Location respectively), while the chunk “ha comunicato” is made of an auxiliary verb (ha) in the third person singular and the past participle of comunicare. All this information is packed in a dedicated data structure and provided by RevNLT to each of the remaining modules in the Reveal ecosystem that require linguistic processing.
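To make the shape of this information more concrete, the following is a minimal Java sketch of how tokens, chunks and links could be modeled. It is not RevNLT’s actual data structure: every type and field name (Token, Chunk, Link, Sentence) is hypothetical, as is the choice to treat a Named Entity as a single token. Only the “V_Sog” label and the Person category come from the example above.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative model only: all types and fields are hypothetical,
// not taken from RevNLT's real API.
final class AnnotationSketch {

    // One token: surface form, lemma, Part-of-Speech, morphology and an
    // optional Named Entity category (null means "not shown in this sketch").
    record Token(String surface, String lemma, String pos, String morphology, String neCat) {}

    // A syntactic chunk groups the tokens that share one logical function.
    record Chunk(List<Token> tokens) {
        String text() {
            return tokens.stream().map(Token::surface).collect(Collectors.joining(" "));
        }
    }

    // A labelled link between two chunks; the head/dependent orientation is an assumption.
    record Link(String label, Chunk head, Chunk dependent) {}

    record Sentence(List<Chunk> chunks, List<Link> links) {}

    // Hand-built annotation for part of the example sentence (not real RevNLT output).
    static Sentence example() {
        Chunk subject = new Chunk(List.of(
                new Token("Il", "il", "Article", null, null),
                new Token("presidente", "presidente", "Noun", null, null),
                new Token("Gianluca Gialli", "Gianluca Gialli", "Noun", null, "Person")));
        Chunk verb = new Chunk(List.of(
                new Token("ha", "avere", "Verb", "third person singular, present", null),
                new Token("comunicato", "comunicare", "Verb", "past participle", null)));
        return new Sentence(List.of(subject, verb), List.of(new Link("V_Sog", verb, subject)));
    }

    public static void main(String[] args) {
        // Print each link together with the text of the two chunks it connects.
        for (Link link : example().links()) {
            System.out.println(link.label() + ": " + link.head().text()
                    + " <- " + link.dependent().text());
        }
    }
}
```

Keeping chunks as lists of tokens mirrors the idea that a chunk can be “exploded” into its tokens, each carrying its own lemma, Part-of-Speech and morphology.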

State-of-the-art NLP Techniques

RevNLT is designed to be a plug-and-play component for any other Reveal or customer engine.

It is implemented entirely in Java and can be included in an existing application as a standard library or invoked as a service in any Service Oriented Architecture. RevNLT implements state-of-the-art techniques for natural language processing workflows, enabling fine-grained morphosyntactic analysis of the text contained in documents.
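The two integration modes can be illustrated with a small Java sketch in which the host application codes against one interface and swaps the implementation behind it. This is only an assumption about how such an integration might look: TextAnalyzer, InProcessAnalyzer and RemoteAnalyzer are hypothetical names, whitespace splitting stands in for real linguistic processing, and the remote transport is injected as a plain function rather than a real service call.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: none of these names belong to RevNLT's real API.
final class IntegrationSketch {

    // Contract the host application codes against, whatever the deployment mode.
    interface TextAnalyzer {
        List<String> tokenize(String text);
    }

    // "Library" mode: the analysis runs inside the caller's JVM.
    static final class InProcessAnalyzer implements TextAnalyzer {
        @Override
        public List<String> tokenize(String text) {
            if (text.isBlank()) {
                return List.of();
            }
            // Whitespace splitting is a placeholder for real tokenization.
            return Arrays.asList(text.trim().split("\\s+"));
        }
    }

    // "Service" mode: the call is delegated to a transport function
    // (HTTP, messaging, ...) that is kept outside this sketch.
    static final class RemoteAnalyzer implements TextAnalyzer {
        private final Function<String, List<String>> transport;

        RemoteAnalyzer(Function<String, List<String>> transport) {
            this.transport = transport;
        }

        @Override
        public List<String> tokenize(String text) {
            return transport.apply(text);
        }
    }

    public static void main(String[] args) {
        TextAnalyzer local = new InProcessAnalyzer();
        // The local analyzer doubles as a fake remote endpoint for the demo.
        TextAnalyzer remote = new RemoteAnalyzer(local::tokenize);
        System.out.println(local.tokenize("durante il congresso"));
        System.out.println(remote.tokenize("durante il congresso"));
    }
}
```

With this arrangement the calling code stays the same whether the processor is embedded as a library or reached as a service.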

Some examples of typically supported analysis workflows are:

  • the segmentation of documents into sentences.
  • the detection of the grammatical classes of words in sentences (e.g. Nouns, Verbs or Adjectives).
  • the syntactic analysis of texts for the extraction of linguistic patterns such as Subject-Verb-Object triples in sentences.
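Two of these workflows can be sketched in a few lines of Java. The sketch below is not RevNLT code: sentence segmentation uses the JDK’s java.text.BreakIterator as a stand-in, and triple extraction works on hand-written labelled links. The Link and Triple types are hypothetical, and so is the “V_Obj” label, which is assumed here by analogy with the “V_Sog” link.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only; it does not use or reproduce RevNLT's API.
final class WorkflowSketch {

    // Splits a document into sentences with the JDK's BreakIterator.
    static List<String> splitSentences(String document, Locale locale) {
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(document);
        List<String> sentences = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = document.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    // Hypothetical labelled link from a verb chunk to a dependent chunk.
    record Link(String label, String verb, String dependent) {}

    record Triple(String subject, String verb, String object) {}

    // Pairs the subject and object links that share the same verb.
    // Verbs are keyed by their text, so repeated verbs would collide:
    // acceptable for a sketch, not for production use.
    static List<Triple> extractTriples(List<Link> links) {
        Map<String, String> subjects = new LinkedHashMap<>();
        Map<String, String> objects = new LinkedHashMap<>();
        for (Link link : links) {
            if (link.label().equals("V_Sog")) {
                subjects.put(link.verb(), link.dependent());
            } else if (link.label().equals("V_Obj")) { // assumed label
                objects.put(link.verb(), link.dependent());
            }
        }
        List<Triple> triples = new ArrayList<>();
        for (Map.Entry<String, String> entry : subjects.entrySet()) {
            String object = objects.get(entry.getKey());
            if (object != null) {
                triples.add(new Triple(entry.getValue(), entry.getKey(), object));
            }
        }
        return triples;
    }

    public static void main(String[] args) {
        String doc = "Il presidente ha comunicato le sue dimissioni. Il congresso era in Via Manzoni.";
        System.out.println(splitSentences(doc, Locale.ITALIAN));
        System.out.println(extractTriples(List.of(
                new Link("V_Sog", "ha comunicato", "Il presidente Gianluca Gialli"),
                new Link("V_Obj", "ha comunicato", "le sue dimissioni"))));
    }
}
```

A verb that has a subject link but no object link yields no triple in this sketch; a real pipeline would need to decide how to handle such partial patterns.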

In Reveal’s service ecosystem, RevNLT implements all functions devoted to Linguistic Processing: it is the Language Processing Chain shown in the green box in the following figure.

Simple Customization and Robustness

RevNLT has already been successfully applied in banking, the media industry, systems engineering and tourism.

While many off-the-shelf (and free) solutions exist for language processing, one of RevNLT’s greatest advantages is how easily it can be customized to the “sublanguage” used in the customer’s domain. Reveal owns the customization infrastructure, which applies statistical and neural approaches to guarantee robustness across heterogeneous domains.
