pybabelnet package

Submodules

pybabelnet.about module

Information about PyBabelNet.

pybabelnet.about.TITLE = 'PyBabelNet'
pybabelnet.about.VERSION = '4.0.1'
pybabelnet.about.AUTHOR = 'Francesco Cecconi, Roberto Navigli, Emilian Postolache and Daniele Vannella'
pybabelnet.about.AUTHOR_EMAIL = 'info@babelscape.com'
pybabelnet.about.DOCUMENTATION_URL = 'http://babelnet.org/guide'
pybabelnet.about.DESCRIPTION = 'Python APIs for BabelNet 4.0.1'
pybabelnet.about.header(api_type=None)

Return an initialization string for the BabelNet API.

Keyword Arguments:
 api_type (BabelAPIType) – The API type (default None).
Returns:The initialization string.
Return type:str
class pybabelnet.about.BabelAPIType(*args, **kwds)

Bases: aenum.Enum

Type of BabelNet API.

get_type(self)

Return the corresponding type of API.

Return type:str
OFFLINE = 'offline'

Fully offline, with all indices on the machine.

ONLINE = 'online RESTful'

Fully online, with no indices on the machine.

pybabelnet.api module

This module contains functions that implement the BabelNet API.

pybabelnet.api.get_senses_containing(word, language: pybabelnet.language.Language = None, pos: pybabelnet.pos.POS = None)

Get the senses of synsets containing the word with the given constraints.

Parameters:

word (str) – The word whose senses are to be retrieved.

Keyword Arguments:
 
  • language (Optional[Language]) – The language of the input word.
  • pos (Optional[POS]) – The Part of Speech of the word.
Returns:

The senses of the word.

Return type:

List[BabelSense]

pybabelnet.api.get_senses_from(word, language: pybabelnet.language.Language = None, pos: pybabelnet.pos.POS = None)

Get the senses of synsets from the word with the given constraints.

Parameters:

word (str) – The word whose senses are to be retrieved.

Keyword Arguments:
 
  • language (Optional[Language]) – The language of the input word.
  • pos (Optional[POS]) – The Part of Speech of the word.
Returns:

The senses of the word.

Return type:

List[BabelSense]

pybabelnet.api.get_senses(*args, **kwargs)

Get the senses of synsets searched by words or by ResourceIDs, satisfying the optional constraints.

Parameters:

args (Union[Tuple[str, ...], Tuple[ResourceID, ...]]) – A homogeneous collection of words (str) or ResourceIDs used to search for senses in synsets.

Keyword Arguments:
 
  • containing (bool) – Used if the senses are searched by words: if it is True, the words have to be contained in the sense (default False).
  • from_langs (Iterable[Language]) – An iterable collection of Languages used for searching the senses.
  • to_langs (Iterable[Language]) – An iterable collection of Languages in which the senses are to be retrieved.
  • poses (Iterable[POS]) – An iterable collection of Parts of Speech of the senses.
  • normalized (bool) – True if the search is insensitive to accents, etc. (default True).
  • sources (Iterable[BabelSenseSource]) – An iterable collection of BabelSenseSources used to restrict the search.
  • synset_filters (Iterable[Callable[[BabelSynset], bool]]) – An iterable collection of filters (functions accepting a BabelSynset and returning bool) to be applied to each synset retrieved.
  • sense_filters (Iterable[Callable[[BabelSense], bool]]) – An iterable collection of filters (functions accepting a BabelSense and returning bool) to be applied to each sense retrieved.
Returns:

The resulting senses.

Return type:

List[BabelSense]

pybabelnet.api.get_synsets(*args, **kwargs)

Get synsets by words or by ResourceIDs, satisfying the optional constraints.

Parameters:

args (Union[Tuple[str, ...], Tuple[ResourceID, ...]]) – A homogeneous collection of words (str) or ResourceIDs used to search for synsets.

Keyword Arguments:
 
  • from_langs (Iterable[Language]) – An iterable collection of Languages used for searching the synsets.
  • to_langs (Iterable[Language]) – An iterable collection of Languages used for populating results.
  • poses (Iterable[POS]) – An iterable collection of Parts of Speech of the synsets.
  • normalized (bool) – True if the search is insensitive to accents, etc. (default True).
  • sources (Iterable[BabelSenseSource]) – An iterable collection of BabelSenseSources of the senses of the synsets.
  • synset_filters (Iterable[Callable[[BabelSynset], bool]]) – An iterable collection of filters (functions accepting a BabelSynset and returning bool) to be applied to each synset retrieved.
  • sense_filters (Iterable[Callable[[BabelSense], bool]]) – An iterable collection of filters (functions accepting a BabelSense and returning bool) to be applied to each sense retrieved.
Returns:

The resulting synsets.

Return type:

List[BabelSynset]
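The synset_filters and sense_filters arguments are described as iterables of predicates. The following self-contained sketch shows one way such predicates could be combined. FakeSynset, apply_filters, the two predicates and the sample IDs are hypothetical stand-ins, not part of PyBabelNet, and the assumption that all filters must accept an item (logical AND) is not confirmed by the text above.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class FakeSynset:
    """Hypothetical stand-in for BabelSynset, for illustration only."""
    synset_id: str
    pos: str
    is_key_concept: bool


def apply_filters(items: Iterable[FakeSynset],
                  filters: Iterable[Callable[[FakeSynset], bool]]) -> List[FakeSynset]:
    """Keep only the items for which every filter returns True (assumed AND)."""
    predicates = list(filters)  # materialize so it can be reused for each item
    return [item for item in items if all(p(item) for p in predicates)]


def only_nouns(synset: FakeSynset) -> bool:
    """Predicate with the Callable[[synset], bool] shape described above."""
    return synset.pos == 'n'


def only_key_concepts(synset: FakeSynset) -> bool:
    return synset.is_key_concept


# Made-up sample data; the IDs do not refer to real BabelNet synsets.
synsets = [
    FakeSynset('bn:00000001n', 'n', True),
    FakeSynset('bn:00000002v', 'v', True),
    FakeSynset('bn:00000003n', 'n', False),
]
kept = apply_filters(synsets, [only_nouns, only_key_concepts])
```

With the real API, functions of the same shape would presumably be passed as synset_filters=[...] or sense_filters=[...].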

pybabelnet.api.get_synset(resource_id) → Union[pybabelnet.synset.BabelSynset, NoneType]

Return the synset identified by the ResourceID in input.

Parameters:resource_id (ResourceID) – The resource identifier.
Returns:The synset identified by the ResourceID.
Return type:Optional[BabelSynset]

Examples: Some usage examples follow:

import pybabelnet as pb
from pybabelnet.language import Language
from pybabelnet.pos import POS
from pybabelnet.resources import (OmegaWikiID, WikidataID, WikipediaID,
                                  WikiquoteID, WordNetSynsetID)

# Retrieving a BabelSynset from a Wikipedia page title:

synset = pb.get_synset(WikipediaID('BabelNet', Language.EN, POS.NOUN))

# Retrieving a BabelSynset from a Wikiquote page title:

synset = pb.get_synset(WikiquoteID('Home', Language.EN, POS.NOUN))

# Retrieving a BabelSynset from a WordNet ID:

synset = pb.get_synset(WordNetSynsetID('wn:03544360n'))

# Retrieving a BabelSynset from a Wikidata page ID:

synset = pb.get_synset(WikidataID('Q4837690'))

# Retrieving a BabelSynset from an OmegaWiki page ID:

synset = pb.get_synset(OmegaWikiID('1499705'))
pybabelnet.api.version()

Get the version of loaded BabelNet indices.

Returns:The BabelVersion of BabelNet indices.
Return type:BabelVersion
pybabelnet.api.to_synsets(resource_id, languages: Iterable[pybabelnet.language.Language] = None)

Convert from ResourceID to the corresponding BabelSynset.

Parameters:resource_id (ResourceID) – The input ID.
Keyword Arguments:
 languages (Optional[Iterable[Language]]) – The target languages to populate synsets with (default None).
Returns:The list of corresponding synsets.
Return type:List[BabelSynset]
pybabelnet.api.iterator()

Create a new instance of BabelSynset iterator.

Returns:An instance of a BabelSynset iterator.
Return type:BabelSynsetIterator
Raises:NotImplementedError – Raised if the function is called using the online API.
pybabelnet.api.offset_iterator()

Create a new instance of an offset iterator.

Returns:An instance of an offset iterator.
Return type:BabelOffsetIterator
Raises:NotImplementedError – Raised if the function is called using the online API.
pybabelnet.api.lexicon_iterator()

Create a new instance of a lexicon iterator.

Returns:An instance of a lexicon iterator.
Return type:BabelLexiconIterator
Raises:NotImplementedError – Raised if the function is called using the online API.
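Since the three iterator factories raise NotImplementedError under the online API, calling code may want to guard against that. Below is a minimal sketch of such a guard. iterate_or_empty, offline_factory and online_factory are hypothetical names, and the two factories are stand-ins rather than the real pybabelnet.api functions.

```python
from typing import Callable, Iterable, Iterator, List


def iterate_or_empty(factory: Callable[[], Iterable]) -> Iterator:
    """Yield items from factory(), or nothing if iteration is unsupported."""
    try:
        iterable = factory()
    except NotImplementedError:
        # The documented failure mode when only the online API is available.
        return
    yield from iterable


def offline_factory() -> List[str]:
    """Hypothetical stand-in for an iterator factory backed by local indices."""
    return ['bn:00000001n', 'bn:00000002n']  # made-up IDs


def online_factory() -> List[str]:
    """Hypothetical stand-in for a factory called under the online API."""
    raise NotImplementedError('iterators need local indices')


offline_items = list(iterate_or_empty(offline_factory))
online_items = list(iterate_or_empty(online_factory))
```

With the real library one would presumably pass pybabelnet.api.iterator, offset_iterator or lexicon_iterator as the factory. Whether the returned objects support Python's for-loop protocol in addition to has_next() is not stated in this reference.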

pybabelnet.conf module

The configuration hub for the BabelNet API.

class pybabelnet.conf.Option(default, interpolate=None, on_change=None, doc=None)

Bases: object

A descriptor implementing PyBabelNet configuration options.

Parameters:
  • default (Any) – The default value of this Option.
  • interpolate (Optional[Option]) – If interpolate is set to an Option having a str value and the current Option has a str value then the result is the concatenated strings. This is useful for joining a base path with a file name.
  • on_change (Optional[Callable[[Config], None]]) – Optional callback function that is triggered when the Option is modified. The parameter of the function is the Config instance where the descriptor is used.
  • doc (Optional[str]) – Docstring used to document the Option.
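To make the interpolate and on_change behaviour more concrete, here is a rough sketch of a descriptor with the same parameters. SketchOption, SketchConfig, the changes list and the '/data/babelnet' path are hypothetical. The real Option may be implemented quite differently, and the sketch assumes that the interpolated value is placed before the option's own value.

```python
from typing import Any, Callable, Optional


class SketchOption:
    """Hypothetical descriptor loosely modelled on the Option description."""

    def __init__(self, default: Any, interpolate: Optional['SketchOption'] = None,
                 on_change: Optional[Callable[[Any], None]] = None,
                 doc: Optional[str] = None):
        self.default = default
        self.interpolate = interpolate
        self.on_change = on_change
        self.__doc__ = doc

    def __set_name__(self, owner, name):
        # Remember under which attribute name the value is stored.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # class-level access returns the descriptor itself
        value = instance.__dict__.get(self.name, self.default)
        if self.interpolate is not None and isinstance(value, str):
            base = self.interpolate.__get__(instance, owner)
            if isinstance(base, str):
                return base + value  # assumed order: base path, then own value
        return value

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value
        if self.on_change is not None:
            self.on_change(instance)  # callback receives the config instance


changes = []  # records which options triggered their callback


class SketchConfig:
    BASE_DIR = SketchOption('', doc='Base directory.')
    LEXICON_INDEX_DIR = SketchOption('/lexicon', interpolate=BASE_DIR,
                                     on_change=lambda cfg: changes.append('lexicon'))


config = SketchConfig()
config.BASE_DIR = '/data/babelnet'        # hypothetical path
config.LEXICON_INDEX_DIR = '/lexicon_v2'  # triggers the on_change callback
```

Under these assumptions, reading config.LEXICON_INDEX_DIR joins the base directory with the option's own value, which matches the "base path with a file name" use case mentioned above.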
class pybabelnet.conf.Config

Bases: object

Base configuration object.

classmethod options()

Return a dictionary of the Option objects for this config.

defaults()

Return the default values of this configuration.

load_dict(dct)

Load a dictionary of configuration values.

load_file(filename)

Load config from a YAML file.

snapshot()

Return a snapshot of the current values of this configuration.

class pybabelnet.conf.PyBabelNetConfig

Bases: pybabelnet.conf.Config

BASE_DIR

default='' The BabelNet base directory.

LEXICON_INDEX_DIR

default='/lexicon' The BabelNet lexicon index directory.

DICT_INDEX_DIR

default='/dict' BabelNet dictionary index directory.

INFO_INDEX_DIR

default='/info_CC_BY_NC_SA_30' The BabelNet info index directory.

WN_INFO_INDEX_DIR

default='/info_wordnet' The WordNet dictionary index directory.

GLOSS_INDEX_DIR

default='/gloss' The BabelNet gloss index directory.

GRAPH_INDEX_DIR

default='/graph_CC_BY_NC_SA_30' The BabelNet graph index directory.

IMAGE_INDEX_DIR

default='/image' The BabelNet image index directory.

MAPPING_INDEX_DIR

default='/core_CC_BY_NC_SA_30' The BabelNet mapping index directory.

LANGUAGES

default='languages.yml', on_change=_load_languages The BabelNet languages.

IS_BAD_IMAGE_FILTER_ACTIVE

default=True Whether the bad image filter is active

POINTER_LIST_PATH

default='/home/emilian/PycharmProjects/pybabelnet/pybabelnet/res/pointer.txt' The path of the pointer list.

IS_MUL_CONVERSION_FILTER_ACTIVE

default=True Whether the MUL conversion is active.

USE_REDIRECTION_SENSES

default=True Whether redirections also count as appropriate senses.

CATEGORY_PREFIXES

default='category_prefix.yml', on_change=_load_categories The prefixes for the categories in all languages.

LOG_STDOUT_LEVEL

default='INFO', on_change=_config_logger The logging level of the application.

log()

Log current settings.

pybabelnet.iter module

This module contains all the iterators over BabelNet.

class pybabelnet.iter.BabelIterator(searcher)

Bases: abc.ABC

Abstract iterator over BabelNet’s content.

Parameters:searcher (IndexSearcher) –
has_next()
class pybabelnet.iter.BabelSynsetIterator(searcher)

Bases: pybabelnet.iter.BabelIterator

Iterator over BabelSynsets

class pybabelnet.iter.BabelLexiconIterator(searcher)

Bases: pybabelnet.iter.BabelIterator

Iterator over BabelNet’s lexicon

has_next()
class pybabelnet.iter.BabelOffsetIterator(searcher)

Bases: pybabelnet.iter.BabelIterator

Iterator over BabelNet’s synset offsets

pybabelnet.language module

This module contains the Language enum.

class pybabelnet.language.Language(language_name, right_to_left=False)

Bases: aenum.Enum

A language enumeration.

Parameters:language_name (str) – Name of the language.
Keyword Arguments:
 right_to_left (bool) – Does the language read right to left? (default False).
is_right_to_left(self)

Does the language read right to left?

Return type:bool
static from_iso(iso)

Return the Language with the given ISO code.

Parameters:iso (str) – The iso code.
Returns:The Language object.
Return type:Language
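The shape of this enum (a name plus an optional right-to-left flag per member) can be sketched with the standard library alone. SketchLanguage is a hypothetical three-member stand-in. The real class derives from aenum.Enum, and the assumption that from_iso matches the member name case-insensitively is not confirmed here.

```python
from enum import Enum


class SketchLanguage(Enum):
    """Hypothetical three-member stand-in for the Language enum."""
    EN = 'English'
    AR = ('Arabic', True)
    HE = ('Hebrew', True)

    def __init__(self, language_name: str, right_to_left: bool = False):
        # Enum passes the member value (unpacked if it is a tuple) to __init__.
        self.language_name = language_name
        self.right_to_left = right_to_left

    def is_right_to_left(self) -> bool:
        return self.right_to_left

    @staticmethod
    def from_iso(iso: str) -> 'SketchLanguage':
        # Assumption: the ISO code equals the member name, ignoring case.
        return SketchLanguage[iso.upper()]


rtl_names = [lang.language_name for lang in SketchLanguage
             if lang.is_right_to_left()]
```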
EN = 'English'

English

AF = 'Afrikaans'

Afrikaans

SQ = 'Albanian'

Albanian

AR = ('Arabic', True)

Arabic

HY = 'Armenian'

Armenian

AZ = 'Azerbaijani'

Azerbaijani

EU = 'Basque'

Basque

BN = 'Bengali'

Bengali

BG = 'Bulgarian'

Bulgarian

CA = 'Catalan'

Catalan

ZH = 'Chinese'

Chinese

HR = 'Croatian'

Croatian

CS = 'Czech'

Czech

DA = 'Danish'

Danish

NL = 'Dutch'

Dutch

EO = 'Esperanto'

Esperanto

ET = 'Estonian'

Estonian

FI = 'Finnish'

Finnish

FR = 'French'

French

GL = 'Galician'

Galician

KA = 'Georgian'

Georgian

DE = 'German'

German

EL = 'Greek'

Greek

HE = ('Hebrew', True)

Hebrew

HI = 'Hindi'

Hindi

HU = 'Hungarian'

Hungarian

IS = 'Icelandic'

Icelandic

ID = 'Indonesian'

Indonesian

GA = 'Irish'

Irish

IT = 'Italian'

Italian

JA = 'Japanese'

Japanese

KK = 'Kazakh'

Kazakh

KO = 'Korean'

Korean

LA = 'Latin'

Latin

LV = 'Latvian'

Latvian

LT = 'Lithuanian'

Lithuanian

MS = 'Malay'

Malay

MT = 'Maltese'

Maltese

NO = 'Norwegian (Bokmål)'

Norwegian (Bokmål)

FA = ('Persian', True)

Persian

PL = 'Polish'

Polish

PT = 'Portuguese'

Portuguese

RO = 'Romanian'

Romanian

RU = 'Russian'

Russian

SR = 'Serbian'

Serbian

SIMPLE = 'Simple English'

Simple English

SK = 'Slovak'

Slovak

SL = 'Slovenian'

Slovenian

ES = 'Spanish'

Spanish

SW = 'Swahili'

Swahili

SV = 'Swedish'

Swedish

TL = 'Tagalog'

Tagalog

TA = 'Tamil'

Tamil

TH = 'Thai'

Thai

BO = 'Tibetan'

Tibetan

TR = 'Turkish'

Turkish

UK = 'Ukranian'

Ukranian

UR = 'Urdu'

Urdu

VI = 'Vietnamese'

Vietnamese

CY = 'Welsh'

Welsh

WAR = 'Waray-Waray'

Waray-Waray

CEB = 'Cebuano'

Cebuano

MIN = 'Minangkabau'

Minangkabau

UZ = 'Uzbek'

Uzbek

VO = 'Volapük'

Volapük

NN = 'Norwegian (Nynorsk)'

Norwegian (Nynorsk)

OC = 'Occitan'

Occitan

MK = 'Macedonian'

Macedonian

BE = 'Belarusian'

Belarusian

NEW = 'Newar / Nepal Bhasa'

Newar / Nepal Bhasa

TT = 'Tatar'

Tatar

PMS = 'Piedmontese'

Piedmontese

TE = 'Telugu'

Telugu

BE_X_OLD = 'Belarusian (Taraškievica)'

Belarusian (Taraškievica)

HT = 'Haitian'

Haitian

BS = 'Bosnian'

Bosnian

BR = 'Breton'

Breton

JV = 'Javanese'

Javanese

MG = 'Malagasy'

Malagasy

CE = 'Chechen'

Chechen

LB = 'Luxembourgish'

Luxembourgish

MR = 'Marathi'

Marathi

ML = 'Malayalam'

Malayalam

PNB = ('Western Panjabi', True)

Western Panjabi

BA = 'Bashkir'

Bashkir

MY = 'Burmese'

Burmese

ZH_YUE = 'Cantonese'

Cantonese

LMO = 'Lombard'

Lombard

YO = 'Yoruba'

Yoruba

FY = 'West Frisian'

West Frisian

AN = 'Aragonese'

Aragonese

CV = 'Chuvash'

Chuvash

TG = 'Tajik'

Tajik

KY = 'Kirghiz'

Kirghiz

NE = 'Nepali'

Nepali

IO = 'Ido'

Ido

GU = 'Gujarati'

Gujarati

BPY = 'Bishnupriya Manipuri'

Bishnupriya Manipuri

SCO = 'Scots'

Scots

SCN = 'Sicilian'

Sicilian

NDS = 'Low Saxon'

Low Saxon

KU = 'Kurdish'

Kurdish

AST = 'Asturian'

Asturian

QU = 'Quechua'

Quechua

SU = 'Sundanese'

Sundanese

ALS = 'Alemannic'

Alemannic

GD = 'Scottish Gaelic'

Scottish Gaelic

KN = 'Kannada'

Kannada

AM = 'Amharic'

Amharic

IA = 'Interlingua'

Interlingua

NAP = 'Neapolitan'

Neapolitan

CKB = ('Sorani', True)

Sorani

BUG = 'Buginese'

Buginese

BAT_SMG = 'Samogitian'

Samogitian

WA = 'Walloon'

Walloon

MAP_BMS = 'Banyumasan'

Banyumasan

MN = 'Mongolian'

Mongolian

ARZ = ('Egyptian Arabic', True)

Egyptian Arabic

MZN = ('Mazandarani', True)

Mazandarani

SI = 'Sinhalese'

Sinhalese

PA = 'Punjabi'

Punjabi

ZH_MIN_NAN = 'Min Nan'

Min Nan

YI = ('Yiddish', True)

Yiddish

SAH = 'Sakha'

Sakha

VEC = 'Venetian'

Venetian

FO = 'Faroese'

Faroese

SA = 'Sanskrit'

Sanskrit

BAR = 'Bavarian'

Bavarian

NAH = 'Nahuatl'

Nahuatl

OS = 'Ossetian'

Ossetian

ROA_TARA = 'Tarantino'

Tarantino

PAM = 'Kapampangan'

Kapampangan

OR = 'Oriya'

Oriya

HSB = 'Upper Sorbian'

Upper Sorbian

SE = 'Northern Sami'

Northern Sami

LI = 'Limburgish'

Limburgish

MRJ = 'Hill Mari'

Hill Mari

MI = 'Maori'

Maori

ILO = 'Ilokano'

Ilokano

CO = 'Corsican'

Corsican

HIF = 'Fiji Hindi'

Fiji Hindi

BCL = 'Central Bicolano'

Central Bicolano

GAN = 'Gan'

Gan

FRR = 'North Frisian'

North Frisian

RUE = 'Rusyn'

Rusyn

GLK = ('Gilaki', True)

Gilaki

MHR = 'Meadow Mari'

Meadow Mari

NDS_NL = 'Dutch Low Saxon'

Dutch Low Saxon

FIU_VRO = 'Võro'

Võro

PS = ('Pashto', True)

Pashto

TK = 'Turkmen'

Turkmen

PAG = 'Pangasinan'

Pangasinan

VLS = 'West Flemish'

West Flemish

GV = 'Manx'

Manx

XMF = 'Mingrelian'

Mingrelian

DIQ = 'Zazaki'

Zazaki

KM = 'Khmer'

Khmer

KV = 'Komi'

Komi

ZEA = 'Zeelandic'

Zeelandic

CSB = 'Kashubian'

Kashubian

CRH = 'Crimean Tatar'

Crimean Tatar

HAK = 'Hakka'

Hakka

VEP = 'Vepsian'

Vepsian

AY = 'Aymara'

Aymara

DV = ('Divehi', True)

Divehi

SO = 'Somali'

Somali

SC = 'Sardinian'

Sardinian

ZH_CLASSICAL = 'Classical Chinese'

Classical Chinese

NRM = 'Norman'

Norman

RM = 'Romansh'

Romansh

UDM = 'Udmurt'

Udmurt

KOI = 'Komi-Permyak'

Komi-Permyak

KW = 'Cornish'

Cornish

UG = ('Uyghur', True)

Uyghur

STQ = 'Saterland Frisian'

Saterland Frisian

LAD = 'Ladino'

Ladino

WUU = 'Wu'

Wu

LIJ = 'Ligurian'

Ligurian

FUR = 'Friulian'

Friulian

EML = 'Emilian-Romagnol'

Emilian-Romagnol

AS = 'Assamese'

Assamese

BH = 'Bihari'

Bihari

CBK_ZAM = 'Zamboanga Chavacano'

Zamboanga Chavacano

GN = 'Guarani'

Guarani

PI = 'Pali'

Pali

GAG = 'Gagauz'

Gagauz

PCD = 'Picard'

Picard

KSH = 'Ripuarian'

Ripuarian

NOV = 'Novial'

Novial

SZL = 'Silesian'

Silesian

ANG = 'Anglo-Saxon'

Anglo-Saxon

NV = 'Navajo'

Navajo

IE = 'Interlingue'

Interlingue

ACE = 'Acehnese'

Acehnese

EXT = 'Extremaduran'

Extremaduran

FRP = 'Franco-Provençal/Arpitan'

Franco-Provençal/Arpitan

MWL = 'Mirandese'

Mirandese

LN = 'Lingala'

Lingala

SN = 'Shona'

Shona

DSB = 'Lower Sorbian'

Lower Sorbian

LEZ = 'Lezgian'

Lezgian

PFL = 'Palatinate German'

Palatinate German

KRC = 'Karachay-Balkar'

Karachay-Balkar

HAW = 'Hawaiian'

Hawaiian

PDC = 'Pennsylvania German'

Pennsylvania German

KAB = 'Kabyle'

Kabyle

XAL = 'Kalmyk'

Kalmyk

RW = 'Kinyarwanda'

Kinyarwanda

MYV = 'Erzya'

Erzya

TO = 'Tongan'

Tongan

ARC = ('Aramaic', True)

Aramaic

KL = 'Greenlandic'

Greenlandic

BJN = 'Banjar'

Banjar

KBD = 'Kabardian Circassian'

Kabardian Circassian

LO = 'Lao'

Lao

HA = 'Hausa'

Hausa

PAP = 'Papiamentu'

Papiamentu

TPI = 'Tok Pisin'

Tok Pisin

AV = 'Avar'

Avar

LBE = 'Lak'

Lak

MDF = 'Moksha'

Moksha

JBO = 'Lojban'

Lojban

WO = 'Wolof'

Wolof

NA = 'Nauruan'

Nauruan

BXR = 'Buryat (Russia)'

Buryat (Russia)

TY = 'Tahitian'

Tahitian

SRN = 'Sranan'

Sranan

IG = 'Igbo'

Igbo

NSO = 'Northern Sotho'

Northern Sotho

KG = 'Kongo'

Kongo

TET = 'Tetum'

Tetum

KAA = 'Karakalpak'

Karakalpak

AB = 'Abkhazian'

Abkhazian

LTG = 'Latgalian'

Latgalian

ZU = 'Zulu'

Zulu

ZA = 'Zhuang'

Zhuang

TYV = 'Tuvan'

Tuvan

CDO = 'Min Dong'

Min Dong

CHY = 'Cheyenne'

Cheyenne

RMY = 'Romani'

Romani

CU = 'Old Church Slavonic'

Old Church Slavonic

TN = 'Tswana'

Tswana

CHR = 'Cherokee'

Cherokee

ROA_RUP = 'Aromanian'

Aromanian

TW = 'Twi'

Twi

GOT = 'Gothic'

Gothic

BI = 'Bislama'

Bislama

PIH = 'Norfolk'

Norfolk

SM = 'Samoan'

Samoan

RN = 'Kirundi'

Kirundi

BM = 'Bambara'

Bambara

MO = 'Moldovan'

Moldovan

SS = 'Swati'

Swati

IU = 'Inuktitut'

Inuktitut

SD = ('Sindhi', True)

Sindhi

PNT = 'Pontic'

Pontic

KI = 'Kikuyu'

Kikuyu

OM = 'Oromo'

Oromo

XH = 'Xhosa'

Xhosa

TS = 'Tsonga'

Tsonga

EE = 'Ewe'

Ewe

AK = 'Akan'

Akan

FJ = 'Fijian'

Fijian

TI = 'Tigrinya'

Tigrinya

KS = ('Kashmiri', True)

Kashmiri

LG = 'Luganda'

Luganda

SG = 'Sango'

Sango

NY = 'Chichewa'

Chichewa

FF = 'Fula'

Fula

VE = 'Venda'

Venda

CR = 'Cree'

Cree

ST = 'Sesotho'

Sesotho

DZ = 'Dzongkha'

Dzongkha

TUM = 'Tumbuka'

Tumbuka

IK = 'Inupiak'

Inupiak

CH = 'Chamorro'

Chamorro

MUL = 'International'

International

SH = 'Serbo-Croatian'

Serbo-Croatian

AZB = ('South Azerbaijani', True)

South Azerbaijani

MAI = 'Maithili'

Maithili

LRC = ('Northern Luri', True)

Northern Luri

GOM = 'Goan Konkani'

Goan Konkani

OLO = 'Livvinkarjala'

Livvinkarjala

JAM = 'Patois'

Patois

TCY = 'Tulu'

Tulu

ADY = 'Adyghe'

Adyghe

pybabelnet.pos module

This module contains the POS enum.

class pybabelnet.pos.POS(*args, **kwds)

Bases: aenum.NoAliasEnum

Universal POS tag set.

tag(self)

The POS tag character.

Return type:str
classmethod from_tag(cls, tag)

Construct a POS from a POS tag character.

Parameters:tag (str) – The POS tag character.
Returns:The corresponding POS.
Return type:Optional[POS]
static compare_by_pos(pos1, pos2)

Compare POSes.

Parameters:
  • pos1 (POS) – First POS.
  • pos2 (POS) – Second POS.
Returns:

Compare result.

Return type:

int

ADJ = 'a'
ADV = 'r'
ADP = 'p'
AUX = 'v'
CCONJ = 'c'
DET = 'd'
INTJ = 'i'
NUM = 'u'
NOUN = 'n'
PROPN = 'n'
PRON = 'o'
PART = 'l'
PUNCT = 't'
SCONJ = 'c'
SYM = 's'
VERB = 'v'
X = 'x'
ADJ_ADP = None
ADJ_PRON = None
ADP_ADJ = None
ADP_ADP = None
ADP_CCONJ = None
ADP_DET = None
ADP_NOUN = None
ADP_NUM = None
ADP_PART = None
ADP_PRON = None
ADP_PROPN = None
ADP_X = None
ADV_PRON = None
AUX_PRON = None
CCONJ_ADJ = None
CCONJ_ADP = None
CCONJ_ADV = None
CCONJ_AUX = None
CCONJ_CCONJ = None
CCONJ_DET = None
CCONJ_INTJ = None
CCONJ_NOUN = None
CCONJ_NUM = None
CCONJ_PART = None
CCONJ_PRON = None
CCONJ_PROPN = None
CCONJ_VERB = None
CCONJ_X = None
NOUN_ADJ = None
NOUN_NOUN = None
NOUN_PRON = None
NOUN_PUNCT = None
PART_ADJ = None
PART_ADV = None
PART_AUX = None
PART_NOUN = None
PART_PART = None
PART_PRON = None
PART_VERB = None
PRON_PRON = None
PRON_VERB = None
VERB_ADP = None
VERB_ADV = None
VERB_PRON = None
X_NOUN = None
X_PRON = None
X_X = None
VERB_NOUN = None
PROPN_DET = None
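Several tags above share a character (AUX and VERB are both 'v', NOUN and PROPN are both 'n'), which is presumably why POS derives from aenum.NoAliasEnum. The sketch below uses only the standard library's enum.Enum to show what would happen otherwise: members with equal values collapse into aliases. StdlibPOS is a hypothetical name used for illustration.

```python
from enum import Enum


class StdlibPOS(Enum):
    """Hypothetical subset of the tag set, built on the stdlib Enum."""
    AUX = 'v'
    NOUN = 'n'
    PROPN = 'n'  # same value as NOUN, so the stdlib treats it as an alias
    VERB = 'v'   # same value as AUX, so the stdlib treats it as an alias


# Iteration skips aliases, so only the first member per value is listed.
distinct_names = [member.name for member in StdlibPOS]
# Alias names resolve to the first member defined with that value.
verb_is_aux = StdlibPOS.VERB is StdlibPOS.AUX
```

With a no-alias enum, VERB and AUX stay distinct members even though both carry the tag 'v'.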

pybabelnet.resources module

This module contains the ResourceID classes related to BabelNet.

class pybabelnet.resources.BabelExternalResource(*args, **kwds)

Bases: aenum.AutoNumberEnum

External resources linked from BabelNet.

DBPEDIA = 1
YAGO = 2
GEONAMES = 3
FRAMENET = 4
VERBNET = 5
class pybabelnet.resources.ResourceID(id_str, source)

Bases: abc.ABC

A basic resource identifier.

Parameters:
id_str

str – ID of a ResourceID.

pos

Optional[POS] – POS of the resource ID, if available.

source

BabelSenseSource – Source of the resource ID.

language

Optional[Language] – Language of the resource ID, if available.

to_synsets(languages: Iterable[pybabelnet.language.Language] = None)

Convert the ID to a collection of BabelSynsets.

Keyword Arguments:
 languages (Optional[Iterable[Language]]) – The languages to populate the synsets with (default is None).
Returns:The corresponding synsets (in most cases, it will be just a single synset).
Return type:List[BabelSynset]
class pybabelnet.resources.ResourceWithLemmaID(title, language, source, pos=None)

Bases: pybabelnet.resources.ResourceID

A resource identifier with multiple parameters.

Parameters:
Keyword Arguments:
 

pos (Optional[POS]) – The item’s part of speech (default None).

title

The title of this ResourceWithLemmaID.

Return type:str
exception pybabelnet.resources.InvalidSynsetIDError(id_str)

Bases: RuntimeError

Exception for an invalid synset ID. It is raised when the format of the Babel synset identifier is invalid or not well formed.

Parameters:id_str (str) – Synset ID.
class pybabelnet.resources.SynsetID(id_str, source)

Bases: pybabelnet.resources.ResourceID

A unique identifier for a Synset.

Raises:InvalidSynsetIDError – Raised if the ID is invalid.
simple_offset

The offset without prefix (e.g. wn: or bn:).

Return type:str
is_valid

True if the SynsetID is valid.

Return type:bool
class pybabelnet.resources.BabelSynsetID(id_str)

Bases: pybabelnet.resources.SynsetID

A resource identifier with the specified BabelSynset ID. To obtain the corresponding synset, call to_synset().

Parameters:id_str (str) – ID of the resource.
Raises:InvalidSynsetIDError – Raised if the ID is invalid.

Examples:

rid = BabelSynsetID('bn:03083790n')
ID_PREFIX = 'bn:'

The BabelNet ID prefix (str).

ID_LENGTH = 12

The BabelNet ID string length (int).

is_valid

True if the SynsetID is valid.

Return type:bool
to_synset()

From a lightweight BabelSynsetID, create the corresponding BabelSynset.

Returns:The BabelSynset corresponding to this ID.
Return type:BabelSynset
outgoing_edges

The outgoing edges (BabelSynsetRelations) of the current synset.

Return type:List[BabelSynsetRelation]
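The documented ID_PREFIX and ID_LENGTH values suggest a cheap structural check on an ID string. The sketch below is a guess at what such a check might look like. looks_like_babel_synset_id and simple_offset are hypothetical helper names, the real is_valid property may test more than prefix and length, and the library raises InvalidSynsetIDError where this sketch uses ValueError.

```python
ID_PREFIX = 'bn:'   # value documented for BabelSynsetID.ID_PREFIX
ID_LENGTH = 12      # value documented for BabelSynsetID.ID_LENGTH


def looks_like_babel_synset_id(id_str: str) -> bool:
    """Structural check only: documented prefix and total string length."""
    return id_str.startswith(ID_PREFIX) and len(id_str) == ID_LENGTH


def simple_offset(id_str: str) -> str:
    """Strip the prefix, mirroring the description of simple_offset."""
    if not looks_like_babel_synset_id(id_str):
        raise ValueError(f'not a well-formed Babel synset ID: {id_str!r}')
    return id_str[len(ID_PREFIX):]


offset = simple_offset('bn:03083790n')  # ID taken from the example above
```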
class pybabelnet.resources.WordNetSynsetID(id_str, version=WN_30, mapping: Dict[pybabelnet.versions.WordNetVersion, List[str]] = None)

Bases: pybabelnet.resources.SynsetID

A resource identifier with the specified WordNet synset id.

Parameters:

id_str (str) – The synset ID string.

Keyword Arguments:
 
  • version (WordNetVersion) – The WordNet version (default WordNetVersion.WN_30).
  • mapping (Optional[Dict[WordNetVersion, List[str]]]) – Cross-version mapping (default None).
Raises:

InvalidSynsetIDError – Raised if the synset ID is invalid.

Examples:

rid = WordNetSynsetID('wn:00632820n')
version

WordNetVersion – WordNet version.

version_mapping

Optional[Dict[WordNetVersion, List[str]]] – Cross-version mapping.

ID_PREFIX = 'wn:'

The WordNet offset prefix (str).

ID_LENGTH = 12

ID string length (int).

is_valid

True if the SynsetID is valid.

Return type:bool
to_version(target_version)

Obtain a list of WordNetSynsetIDs corresponding to this WordNetSynsetID in the input WordNetVersion.

Parameters:target_version (WordNetVersion) – The target version to convert to.
Returns:Corresponding IDs.
Return type:List[WordNetSynsetID]
class pybabelnet.resources.FrameNetID(id_str)

Bases: pybabelnet.resources.ResourceID

A resource identifier with the specified FrameNet resource id.

Parameters:id_str (str) – ID of the resource.

Examples:

rid = FrameNetID('4183')
class pybabelnet.resources.GeoNamesID(id_str)

Bases: pybabelnet.resources.ResourceID

A resource identifier with the specified GeoNames resource id.

Parameters:id_str (str) – ID of the resource.

Examples:

rid = GeoNamesID('3169071')
class pybabelnet.resources.MSTermID(id_str)

Bases: pybabelnet.resources.ResourceID

A resource identifier with the specified Microsoft Terminology resource id.

Parameters:id_str (str) – ID of the resource.

Examples:

rid = MSTermID('ms:63131')
class pybabelnet.resources.OmegaWikiID(id_str)

Bases: pybabelnet.resources.ResourceID

A resource identifier with the specified OmegaWiki resource id.

Parameters:id_str (str) – ID of the resource.

Examples:

rid = OmegaWikiID('ow:1499705')
class pybabelnet.resources.VerbNetID(id_str)

Bases: pybabelnet.resources.ResourceID

A resource identifier with the specified VerbNet resource id.

Parameters:id_str (str) – ID of the resource.

Examples:

rid = VerbNetID('vn:estimate-34.2')
class pybabelnet.resources.WikidataID(id_str)

Bases: pybabelnet.resources.ResourceID

A resource identifier with the specified Wikidata resource id.

Parameters:id_str (str) – ID of the resource.

Examples:

rid = WikidataID('Q4837690')
class pybabelnet.resources.WikipediaID(title, language, pos=None)

Bases: pybabelnet.resources.ResourceWithLemmaID

A resource identifier with the specified Wikipedia page title, language and POS.

Parameters:
  • title (str) – The Wikipedia page title.
  • language (Language) – The Wikipedia page language.
Keyword Arguments:
 

pos (Optional[POS]) – The POS (always noun) (default None).

Examples:

rid = WikipediaID('BabelNet', Language.EN)
rid = WikipediaID('BabelNet', Language.EN, POS.NOUN)
class pybabelnet.resources.WikiquoteID(title, language, pos=None)

Bases: pybabelnet.resources.ResourceWithLemmaID

A resource identifier with the specified Wikiquote page title, language and POS.

Parameters:
  • title (str) – The page title.
  • language (Language) – The page language.
Keyword Arguments:
 

pos (Optional[POS]) – The part of speech tag (default None).

Examples:

rid = WikiquoteID('Rome', Language.EN)
rid = WikiquoteID('Rome', Language.EN, POS.NOUN)
class pybabelnet.resources.WiktionaryID(id_)

Bases: pybabelnet.resources.ResourceID

A resource identifier with the specified Wiktionary id.

Parameters:id_str (str) – ID of the resource.

Examples:

rid = WiktionaryID('90930s1e1')

pybabelnet.sense module

This module defines BabelSenses and related data.

class pybabelnet.sense.BabelSense(lemma, language, pos, source, sensekey, synset, *, key_sense: bool = False, yago_url: str = None, phonetics: pybabelnet.data.phonetics.BabelSensePhonetics = None)

Bases: object

A sense in BabelNet, contained in a BabelSynset.

Parameters:
  • lemma (str) – The full lemma for this sense.
  • language (Language) – The language of this sense.
  • pos (POS) – The part-of-speech of this sense.
  • source (BabelSenseSource) – The source of this lemma: WordNet, Wikipedia, translation, etc.
  • sensekey (str) – The sensekey of the OmegaWiki, Wikidata, Wiktionary, GeoName, MSTERM or VerbNet sense to which this sense corresponds, if any.
  • synset (BabelSynset) – The synset this sense belongs to.
Keyword Arguments:
 
  • key_sense (bool) – True if it is a key sense (default False).
  • yago_url (Optional[str]) – A link to the corresponding YAGO URI (default None).
  • phonetics (Optional[BabelSensePhonetics]) – The set of the audio pronunciations.
sense_id

int – The id of the sense in the SQL database.

full_lemma

str – The full lemma for this sense.

simple_lemma

str – The simple lemma for this sense (without parentheses, etc.).

language

Language – The language of this sense.

pos

POS – The part-of-speech of this sense.

source

BabelSenseSource – The source of this lemma: WordNet, Wikipedia, translation, etc.

sensekey

str – The sensekey of the OmegaWiki, Wikidata, Wiktionary, GeoName, MSTERM or VerbNet sense to which this sense corresponds, if any.

synset

BabelSynset – The synset this sense belongs to.

synset_id

BabelSynsetID – The synset id this sense belongs to.

pronunciations

Optional[BabelSensePhonetics] – The set of the audio pronunciations.

synset

The synset the sense belongs to.

Return type:BabelSynset
license

The license for this Babel sense.

Return type:BabelLicense
is_key_sense

True if it is a key sense.

Return type:bool
is_automatic_translation

True if the sense is the result of an automatic translation.

Return type:bool
is_not_automatic_translation

True if the sense is NOT the result of an automatic translation.

Return type:bool
sense_str

A string-based representation of this BabelSense, alternative to the “canonical” one obtained using __str__. If the sense belongs to WordNet, this is a diesis-like (hash-separated) representation, e.g. 'car#n#1' or 'funk#n#3'; otherwise it is the page or the lemma.

Return type:str
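The 'car#n#1' examples suggest a lemma#tag#number layout. A minimal sketch of that formatting follows. sketch_sense_str is a hypothetical helper, and both the meaning of the three fields and the fallback to the bare lemma are assumptions read off the examples, not the library's confirmed behaviour.

```python
from typing import Optional


def sketch_sense_str(lemma: str, pos_tag: str,
                     sense_number: Optional[int] = None) -> str:
    """Build 'lemma#tag#number' when a sense number is known, else the lemma."""
    if sense_number is None:
        return lemma  # assumed fallback for senses without a WordNet number
    return f'{lemma}#{pos_tag}#{sense_number}'


car_sense = sketch_sense_str('car', 'n', 1)
```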
to_uri(resource) → Union[str, NoneType]

Return the URI of the sense for a given ExternalResource.

Parameters:resource (BabelExternalResource) – The external resource.
Returns:The URI to the external resource.
Return type:Optional[str]
class pybabelnet.sense.WordNetSense(lemma, language, pos, sensekey, synset, *, wordnet_offset: str = None, wordnet_synset_position: int = None, wordnet_sense_number: int = None, **kwargs)

Bases: pybabelnet.sense.BabelSense

A WordNet sense. Provides WordNet-specific methods.

Parameters:
  • lemma (str) – The full lemma for this sense.
  • language (Language) – The language of this sense.
  • pos (POS) – The part-of-speech of this sense.
  • source (BabelSenseSource) – The source of this lemma: WordNet, Wikipedia, translation, etc.
  • sensekey (str) – The sensekey of the OmegaWiki, Wikidata, Wiktionary, GeoName, MSTERM or VerbNet sense to which this sense corresponds, if any.
  • synset (BabelSynset) – The synset this sense belongs to.
Keyword Arguments:
 
  • wordnet_offset (Optional[str]) – The offset of the WordNet sense to which this sense corresponds, if any.
  • wordnet_synset_position (Optional[int]) – The position of the WordNet sense to which this sense corresponds.
  • wordnet_sense_number (Optional[int]) – The WordNet sense number, if any.
  • kwargs (Dict[str, Any]) – Optional parameters of BabelSense.
wordnet_offset

Optional[str] – The offset of the WordNet sense to which this sense corresponds, if any.

sense_number

Optional[int] – The WordNet sense number, if any.

position

Optional[int] – The position of the WordNet sense to which this sense corresponds.

sense_str

A string-based representation of this BabelSense, alternative to the “canonical” one obtained using __str__. If the sense belongs to WordNet, this is a diesis-like (hash-separated) representation, e.g. 'car#n#1' or 'funk#n#3'; otherwise it is the page or the lemma.

Return type:str
class pybabelnet.sense.BabelSenseComparator

Bases: object

Comparator for BabelSenses that:

  • puts WordNet senses first
  • sorts WordNet senses based on their sense number
  • sorts Wikipedia senses lexicographically
compare(b1, b2)
Parameters:
Returns:

Compare result.

Return type:

int
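The three ordering rules listed for BabelSenseComparator can be sketched as a classic comparator function used with functools.cmp_to_key. FakeSense and this compare function are hypothetical stand-ins. As a simplification, the sketch orders every non-WordNet sense lexicographically by lemma, whereas the text above mentions only Wikipedia senses.

```python
from dataclasses import dataclass
from functools import cmp_to_key
from typing import Optional


@dataclass(frozen=True)
class FakeSense:
    """Hypothetical stand-in for BabelSense / WordNetSense."""
    lemma: str
    from_wordnet: bool
    sense_number: Optional[int] = None


def compare(s1: FakeSense, s2: FakeSense) -> int:
    """Return a negative, zero or positive int, like a classic comparator."""
    if s1.from_wordnet != s2.from_wordnet:
        return -1 if s1.from_wordnet else 1  # WordNet senses come first
    if s1.from_wordnet:
        # Both are WordNet senses: order by sense number.
        return (s1.sense_number or 0) - (s2.sense_number or 0)
    # Neither is a WordNet sense: lexicographic order on the lemma.
    return (s1.lemma > s2.lemma) - (s1.lemma < s2.lemma)


# Made-up sample senses for illustration.
senses = [
    FakeSense('Zebra_(film)', from_wordnet=False),
    FakeSense('car', from_wordnet=True, sense_number=2),
    FakeSense('Auto', from_wordnet=False),
    FakeSense('car', from_wordnet=True, sense_number=1),
]
ordered = sorted(senses, key=cmp_to_key(compare))
```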

class pybabelnet.sense.BabelWordSenseComparator(word: str = None)

Bases: pybabelnet.sense.BabelSenseComparator

BabelSenseComparator for BabelSenses with precedence given to a certain word.

Keyword Arguments:
 word (Optional[str]) – The word whose sense numbers are used to sort the BabelSense.

pybabelnet.synset module

This module defines BabelSynsets and related data.

class pybabelnet.synset.SynsetType(*args, **kwds)

Bases: aenum.Enum

A kind of Synset – namely, named entity, concept or unknown.

NAMED_ENTITY = 'Named Entity'

A named entity is a word that clearly identifies one item.

CONCEPT = 'Concept'

A concept is an abstraction or generalization from experience.

UNKNOWN = 'Unknown'

Unknown.

class pybabelnet.synset.BabelSynset(synset_id, target_langs)

Bases: abc.ABC

A Babel synset in BabelNet with all the operations.

Parameters:
  • synset_id (BabelSynsetID) – The id of the synset.
  • target_langs (OrderedSet[Language]) – The language filter.
synset_id

BabelSynsetID – The ID of the synset.

translations

All translations between senses found in this BabelSynset.

Return type:Dict[BabelSense, Set[BabelSense]]
images

The images (BabelImages) of this BabelSynset.

Return type:List[BabelImage]
languages

The set of languages used in this Synset.

Return type:Set[Language]
pos

The part of speech of this Synset.

Return type:POS
sense_sources

The list of sense sources contained in the synset.

Return type:List[BabelSenseSource]
domains

The BabelDomains of this BabelSynset.

Return type:Dict[BabelDomain, float]
wordnet_offsets

The WordNet offsets (version 3.0) whose corresponding synsets this BabelSynset covers, if any.

Return type:List[WordNetSynsetID]
is_key_concept

True if the synset is a key concept.

Return type:bool
main_image

The best image (BabelImage) of this BabelSynset.

Return type:Optional[BabelImage]
categories(*languages)

Get the categories (BabelCategory) of this BabelSynset in the specified languages (if not specified, return all categories).

Parameters:languages (Tuple[Language, ..]) – The search languages.
Returns:The categories (BabelCategory) of this BabelSynset
Return type:List[BabelCategory]
main_sense(language: pybabelnet.language.Language = None) → Union[pybabelnet.sense.BabelSense, NoneType]

Get the main BabelSense by importance to this BabelSynset.

Keyword Arguments:
 language (Optional[Language]) – The language of the main sense (default None).
Returns:The main sense of this Babel synset.
Return type:Optional[BabelSense]
main_sense_preferably_in(language: Union[pybabelnet.language.Language, NoneType]) → Union[pybabelnet.sense.BabelSense, NoneType]

Get the main BabelSense by importance to this BabelSynset, preferably for a given language.

Parameters:language (Optional[Language]) – The preferred language of the main sense.
Returns:The main sense of this Babel synset, preferably in the given language.
Return type:Optional[BabelSense]
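
One way to read "preferably" is as a fallback rule, sketched below. This is an assumption about the behavior, not the library's implementation: senses are modeled as hypothetical (language code, lemma) pairs already sorted by importance, and the function falls back to the most important sense in any language when the preferred language has none.

```python
from typing import List, Optional, Tuple

# Hypothetical data shape: (language code, lemma), sorted by importance.
Sense = Tuple[str, str]


def sketch_main_sense_preferably_in(senses: List[Sense],
                                    language: Optional[str]) -> Optional[Sense]:
    """Pick the first sense in `language`, else the first sense overall."""
    if language is not None:
        for sense in senses:
            if sense[0] == language:
                return sense
    # Assumed fallback: most important sense regardless of language.
    return senses[0] if senses else None


senses = [("EN", "car"), ("IT", "automobile")]
print(sketch_main_sense_preferably_in(senses, "IT"))
print(sketch_main_sense_preferably_in(senses, "FR"))
```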
main_senses(language)

Collect distinct BabelSenses sorted by importance to this BabelSynset for a given language.

Parameters:language (Language) – The search language.
Returns:The senses of this Babel synset in a specific language sorted by importance.
Return type:List[BabelSense]
senses(language: pybabelnet.language.Language = None, source: pybabelnet.data.source.BabelSenseSource = None)

Get the senses contained in this Synset.

Keyword Arguments:
 
  • language (Optional[Language]) – The language used to search (default None).
  • source (Optional[BabelSenseSource]) – The source of the senses to be retrieved (default None).
Returns:

The senses of this synset.

Return type:

List[BabelSense]

senses_by_word(lemma, language, *sources, normalized=False)

Get the senses for the input word in the given language.

Parameters:
  • lemma (str) – Lemma of the sense.
  • language (Language) – Language of the sense.
  • sources (Tuple[BabelSenseSource, ..]) – Possible sources for the sense.
Keyword Arguments:
 normalized (bool) – Whether to use normalization (default False).

Returns:

The Senses for the input word in the given language.

Return type:

List[BabelSense]

to_uris(resource, *languages)

Return the URIs of the various senses in the given languages in the synset for a given ExternalResource.

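Parameters:
  • resource (ExternalResource) – The external resource.
  • languages (Tuple[Language, ..]) – The languages of interest.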
Returns:

The URIs to the external resource.

Return type:

List[str]

to_str(*languages)

Return the string representation of the BabelSenses of this BabelSynset only for a specific set of languages.

Parameters:languages (Tuple[Language, ..]) – The languages to use for the string representation.
Returns:A stringified representation of this Babel synset using only the senses in a specific set of languages.
Return type:str
lemmas(language, *admitted_types)

Return the lemmas in this BabelSynset sorted by relevance and type.

Parameters:
  • language (Language) – The language of interest.
  • admitted_types (Tuple[BabelLemmaType, ..]) – The types for the requested synset lemmas.
Returns:

The lemmas in the synset sorted by relevance and type.

Return type:

List[BabelLemma]

main_gloss(language: pybabelnet.language.Language = None) → Union[pybabelnet.data.gloss.BabelGloss, NoneType]

Get the main Gloss in the given language, if any.

Keyword Arguments:
 language (Optional[Language]) – The gloss language (default None).
Returns:The main Gloss of the synset.
Return type:Optional[BabelGloss]
glosses(language: pybabelnet.language.Language = None, source: pybabelnet.data.source.BabelSenseSource = None)

Collect all Glosses for this Synset.

Keyword Arguments:
 
  • language (Optional[Language]) – The gloss language (default None).
  • source (Optional[BabelSenseSource]) – The gloss source (default None).
Returns:

A list of BabelGlosses.

Return type:

List[BabelGloss]

main_example(language: pybabelnet.language.Language = None) → Union[pybabelnet.data.example.BabelExample, NoneType]

Get the main Example for this Synset.

Keyword Arguments:
 language (Optional[Language]) – The example language (default None).
Returns:The main Example.
Return type:Optional[BabelExample]
examples(language: pybabelnet.language.Language = None, source: pybabelnet.data.source.BabelSenseSource = None)

Collect all Examples for this Synset.

Keyword Arguments:
 
  • language (Optional[Language]) – The example language (default None).
  • source (Optional[BabelSenseSource]) – The example source (default None).
Returns:

A list of BabelExamples.

Return type:

List[BabelExample]

wordnet_offset_map_from(from_version)

Obtain a map from WordNetSynsetID of the input WordNetVersion to the current version of WordNet (3.0 as of 2016).

Parameters:from_version (WordNetVersion) – The source WordNet version.
Returns:A map from WordNetSynsetID to list of WordNetSynsetID.
Return type:Dict[WordNetSynsetID, List[WordNetSynsetID]]
wordnet_offset_map_to(to_version)

Obtain a map from the current version of WordNet (3.0 as of 2016) to WordNetSynsetIDs of the input WordNetVersion.

Parameters:to_version (WordNetVersion) – The target WordNet version.
Returns:A map from WordNetSynsetID to list of WordNetSynsetID.
Return type:Dict[WordNetSynsetID, List[WordNetSynsetID]]
outgoing_edges(*relation_types)

Collect all Synset edges incident on this Synset.

Parameters:relation_types (Tuple[BabelPointer, ..]) – The types of the edges connecting this synset to other synsets
Returns:The SynsetRelations incident on this Synset
Return type:List[BabelSynsetRelation]
synset_type

The kind of synset.

Return type:SynsetType
retain_senses(*pred)

Retain all the senses which pass the predicate tests.

Parameters:pred (Tuple[Callable[[BabelSense], bool], ..]) – The predicates used to decide whether to keep each sense in the synset.
Returns:True if at least one sense is left in the synset, False otherwise.
Return type:bool
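
The contract of retain_senses (filter in place, report whether anything is left) can be sketched on a plain list. The helper below is hypothetical and works on lemma strings rather than BabelSense objects; it also assumes that a sense must pass every predicate to be kept, which the description above does not state explicitly.

```python
from typing import Callable, List


def sketch_retain_senses(senses: List[str],
                         *pred: Callable[[str], bool]) -> bool:
    """Keep, in place, only the items that pass all predicates."""
    # Slice assignment mutates the caller's list instead of rebinding it.
    senses[:] = [s for s in senses if all(p(s) for p in pred)]
    return len(senses) > 0


lemmas = ["car", "automobile", "auto"]
kept_any = sketch_retain_senses(
    lemmas,
    lambda s: s.startswith("auto"),
    lambda s: len(s) > 4,
)
print(kept_any, lemmas)
```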
class pybabelnet.synset.BabelSynsetComparator(word: str = None, language: pybabelnet.language.Language = EN)

Bases: object

Comparator for BabelSynsets that:

  • puts WordNet synsets first
  • sorts WordNet synsets based on the sense number of a specific input word
  • sorts Wikipedia synsets lexicographically based on their main sense
Parameters:
  • word (Optional[str]) – The word whose sense numbers are used to sort the BabelSynsets corresponding to WordNet synsets.
  • language (Language) – The language used to sort senses (default Language.EN).
compare(b1, b2)
Parameters:
  • b1 (BabelSynset) – The first BabelSynset.
  • b2 (BabelSynset) – The second BabelSynset.
Returns:

Compare result.

Return type:

int

pybabelnet.versions module

This module contains version data.

class pybabelnet.versions.BabelVersion(*args, **kwds)

Bases: aenum.MultiValueEnum

BabelNet version enumeration.

release_date(self)

The release date of the version.

Return type:datetime.date
classmethod latest_version(cls)

Return the latest version of BabelNet.

Return type:BabelVersion
UNKNOWN = 'unknown'
PRE_2_0 = '< 2.0'
V2_0 = '2.0'
V2_0_1 = '2.0.1'
V2_5 = '2.5'
V2_5_1 = '2.5.1'
V3_0 = '3.0'
V3_1 = '3.1'
V3_5 = '3.5'
V3_6 = '3.6'
V3_7 = '3.7'
V4_0 = '4.0'
LIVE = 'LIVE'
class pybabelnet.versions.WordNetVersion(*args, **kwds)

Bases: aenum.Enum

A version of WordNet.

static from_string(s)
Parameters:s (str) – Version string.
Returns:WordNetVersion
Return type:WordNetVersion
WN_15 = '1.5'
WN_16 = '1.6'
WN_171 = '1.7'
WN_20 = '2.0'
WN_21 = '2.1'
WN_30 = '3.0'
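
A lookup such as from_string can be sketched with the standard library enum module. SketchWordNetVersion below is a hypothetical stand-in that copies the member values listed above; it uses enum.Enum rather than aenum.Enum, and it assumes that from_string is a plain lookup by value. How the real method treats an unknown string is not documented here; in this sketch it raises ValueError.

```python
from enum import Enum


class SketchWordNetVersion(Enum):
    """Hypothetical stand-in for pybabelnet.versions.WordNetVersion."""
    WN_15 = "1.5"
    WN_16 = "1.6"
    WN_171 = "1.7"
    WN_20 = "2.0"
    WN_21 = "2.1"
    WN_30 = "3.0"

    @staticmethod
    def from_string(s: str) -> "SketchWordNetVersion":
        # Enum lookup by value; raises ValueError for unknown strings.
        return SketchWordNetVersion(s)


version = SketchWordNetVersion.from_string("3.0")
print(version.name, version.value)
```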