ID10M: Idiom Identification in 10 Languages

Simone Tedeschi, Federico Martelli, Roberto Navigli

Abstract

Idioms are phrases which present a figurative meaning that cannot be (completely) derived by looking at the meaning of their individual components. Identifying and understanding idioms in context is a crucial goal and a key challenge in a wide range of Natural Language Understanding tasks. Although efforts have been undertaken in this direction, the automatic identification and understanding of idioms is still a largely under-investigated area, especially when operating in a multilingual scenario. In this paper, we address such limitations and put forward several new contributions: we propose a novel multilingual Transformer-based system for the identification of idioms; we produce a high-quality automatically-created training dataset in 10 languages, along with a novel manually-curated evaluation benchmark; finally, we carry out a thorough performance analysis and release our evaluation suite at https://github.com/Babelscape/ID10M.
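Idiom identification of this kind is commonly framed as token-level sequence labeling. The sketch below is illustrative only and assumes a BIO scheme with the tag names `B-IDIOM`, `I-IDIOM` and `O`; the abstract does not state the tag set. The helper name `extract_idiom_spans` and the example sentence are hypothetical and are not taken from the released dataset. It shows how per-token tags, such as those a Transformer-based tagger might predict, could be decoded into idiom spans.

```python
from typing import List, Tuple


def extract_idiom_spans(tokens: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Decode BIO tags into (start, end_exclusive, text) idiom spans.

    Assumed tag set (not confirmed by the abstract): B-IDIOM, I-IDIOM, O.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[int, int, str]] = []
    start = None  # index of the first token of the span currently open, if any

    for i, tag in enumerate(tags):
        if tag == "B-IDIOM":
            # A new span begins; close any span that is still open.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif tag == "I-IDIOM":
            # Continue the open span; tolerate an I- tag with no preceding B-.
            if start is None:
                start = i
        else:
            # Any other tag (e.g. "O") closes the open span.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
                start = None

    # Close a span that runs to the end of the sentence.
    if start is not None:
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Hypothetical example sentence with hand-written tags.
tokens = ["He", "finally", "kicked", "the", "bucket", "last", "year"]
tags = ["O", "O", "B-IDIOM", "I-IDIOM", "I-IDIOM", "O", "O"]
print(extract_idiom_spans(tokens, tags))  # [(2, 5, 'kicked the bucket')]
```

Decoding to spans rather than comparing raw tags makes it straightforward to score predictions at the level of whole idioms, although the evaluation protocol actually used is described in the paper, not here.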

https://aclanthology.org/2022.findings-naacl.208.pdf
Simone Tedeschi, Federico Martelli, Roberto Navigli. 2022. ID10M: Idiom Identification in 10 Languages. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 2715-2726, Seattle, United States. Association for Computational Linguistics.