Reference data

Glottolog’s reference data consists of bibliographical information in a set of BibTeX files, described with metadata in BIBFILES.ini.

This information can also be accessed via an instance of pyglottolog.Glottolog:

>>> from pyglottolog import Glottolog
>>> g = Glottolog()
>>> print(g.bibfiles['hh'].description)
The bibliography of HH, typed in between 2005-2020.
It has been annotated by hand (type and language).
It contains descriptive material from all over the world, mostly lesser-known languages.
>>> print(g.bibfiles['hh:s:Karang:Tati-Harzani'])
@book{s:Karang:Tati-Harzani,
    author = {'Abd-al-'Ali Kārang},
    title = {Tāti va Harzani, do lahja az zabān-i bāstān-e Āẕarbāyjān},
    publisher = {Tabriz: Tabriz University Press},
    address = {Tabriz},
    pages = {6+160},
    year = {1334 [1953]},
    glottolog_ref_id = {41999},
    hhtype = {grammar_sketch},
    inlg = {Farsi [pes]},
    lgcode = {Tati, Harzani [hrz]},
    macro_area = {Eurasia}
}

The objects representing reference data are described below.

class pyglottolog.references.BibFiles(bibfiles)[source]

Ordered collection of BibFile objects accessible by filename or index.

classmethod from_path(path, api=None)[source]

Load the BibTeX files matching <path>/bibtex/*.bib that are listed in <path>/BIBFILES.ini.

Return type:

pyglottolog.references.bibfiles.BibFiles

__getitem__(index_or_filename)[source]

Retrieve a bibfile by index or filename or an entry by qualified key.

Parameters:

index_or_filename (typing.Union[int, str]) – Either an int index, or a bibfile name, or a provider-qualified BibTeX key in the form <prov>:<key>.

Return type:

typing.Union[pyglottolog.references.bibfiles.BibFile, pyglottolog.references.bibfiles.Entry]

Returns:

A BibFile instance, or an Entry instance.
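The three lookup modes can be pictured with a small stand-in class. This is only a sketch: ToyBibFiles and its internals are hypothetical and not pyglottolog's implementation, and it assumes that a qualified key is split at its first colon.

```python
class ToyBibFiles:
    """Hypothetical stand-in for BibFiles; each bibfile is a plain dict of entries."""

    def __init__(self, bibfiles):
        # bibfiles: list of (name, {bibkey: entry}) pairs, kept in order
        self._items = list(bibfiles)
        self._by_name = dict(self._items)

    def __getitem__(self, index_or_filename):
        if isinstance(index_or_filename, int):
            # Positional access to a bibfile
            return self._items[index_or_filename][1]
        if ':' in index_or_filename:
            # Assumption: the provider is everything before the first colon
            prov, key = index_or_filename.split(':', 1)
            return self._by_name[prov][key]
        # Plain bibfile name
        return self._by_name[index_or_filename]


bibfiles = ToyBibFiles([('hh', {'s:Karang:Tati-Harzani': 'entry'})])
```

Because BibTeX keys such as s:Karang:Tati-Harzani contain colons themselves, the sketch splits only once, so the remainder stays intact as the key.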

to_sqlite(filepath='bibfiles.sqlite3', verbose=False)[source]

Return a database with the bibfiles loaded.

Return type:

pyglottolog.references.bibfiles_db.Database

roundtrip_all()[source]

Load and save all bibfiles with the current settings.

Return type:

list[None]

class pyglottolog.references.BibFile(fname, name=None, title=None, description=None, abbr=None, encoding='utf-8', normalize='NFC', sortkey=None, priority=0, url=None, curation=None, api=None)[source]

Represents a BibTeX file, storing a provider’s bibliography, providing easy access to its records.

Parameters:
  • fname (pathlib.Path) –

  • name (str) –

  • title (str) –

  • description (str) –

  • abbr (str) –

  • encoding (str) –

  • normalize (str) –

  • sortkey (str) –

  • priority (int) –

  • url (str) –

  • curation (str) –

  • api (typing.Any) –

name: str = None

Short name of the bibliography

title: str = None

Title of the bibliography

description: str = None

The provenance of the bibliography

url: str = None

URL pointing to the source of the bibliography

curation: str = None

Curation policy for the bibliography at Glottolog

__getitem__(item)[source]

Retrieve an entry by its BibTeX citation key.

Parameters:

item (str) – BibTeX citation key of an entry

Raises:

KeyError – if no matching Entry is contained in the BibFile

Return type:

pyglottolog.references.bibfiles.Entry

visit(visitor=None)[source]

Visit the entries of the bibfile, possibly manipulating them in place.

Parameters:

visitor (typing.Optional[typing.Callable[[pyglottolog.references.bibfiles.Entry], bool]]) –

property size: int

Size of the file in bytes.

property mtime: datetime

Modification time.

keys()[source]

List of provider-qualified keys of the bibfile

Return type:

list[str]

property glottolog_ref_id_map: dict[str, str]

Maps bibkey to glottolog_ref_id value.

update(fname, log=None, keep_old=False)[source]

Update the bibfile with the data from fname.

Parameters:
  • fname (typing.Union[str, pathlib.Path]) –

  • log (typing.Optional[logging.Logger]) –

load(preserve_order=None)[source]

Return entries as bibkey -> (entrytype, fields) dict.

save(entries)[source]

Write bibkey -> (entrytype, fields) map to file.
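The shape of the bibkey -> (entrytype, fields) mapping used by load() and save() can be illustrated with plain Python data. The to_bibtex helper below is a hypothetical toy serializer written for this illustration, not the writer pyglottolog uses; the field values are taken from the example record shown earlier.

```python
def to_bibtex(bibkey, entrytype, fields):
    """Render one record as BibTeX text (toy serializer, for illustration only)."""
    lines = ['    {} = {{{}}}'.format(name, value) for name, value in fields.items()]
    return '@{}{{{},\n{}\n}}'.format(entrytype, bibkey, ',\n'.join(lines))


# One bibkey mapped to an (entrytype, fields) pair
entries = {
    's:Karang:Tati-Harzani': ('book', {
        'address': 'Tabriz',
        'year': '1334 [1953]',
    }),
}

entrytype, fields = entries['s:Karang:Tati-Harzani']
rendered = to_bibtex('s:Karang:Tati-Harzani', entrytype, fields)
```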

__str__()[source]

Return str(self).

check(log)[source]

Run checks and report the result.

Parameters:

log (logging.Logger) –

Return type:

tuple[int, str]

show_characters(include_plain=False)[source]

Display character-frequencies (excluding printable ASCII).

class pyglottolog.references.Entry(key, type, fields, bib, api=None)[source]

Represents an entry in a BibFile, i.e. a bibliographical record.

Note

Entry instances are orderable. The ordering is the one used to compute MEDs (most extensive descriptions), i.e.

  • grammars are “better” than other document types,

  • more pages are “better” than fewer,

  • more recent is “better” than older.

>>> g = pyglottolog.Glottolog()
>>> g.bibfiles['hh:g:MacDonell:Sanskrit'] > g.bibfiles['hh:hv:Weijnen:Nederlandse']
True
>>> refs = g.refs_by_languoid(g.bibfiles['hh'])
>>> sorted(refs[0]['stan1295'])[-1].med_type.name
'long grammar'

type: str

BibTeX entry type

fields: dict

The metadata of the record

property weight: tuple[int, int, int, str]

The weight which determines ordering when computing MEDs.
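A tuple-valued weight makes entries comparable field by field. The sketch below shows the general technique with a hypothetical ToyEntry class. The meaning assigned to each tuple component (document type rank, pages, year, key) is an assumption based on the ordering rules listed above, not a description of pyglottolog's internals.

```python
from dataclasses import dataclass
from functools import total_ordering


@total_ordering
@dataclass(eq=False)
class ToyEntry:
    """Hypothetical record that is ordered by a weight tuple."""
    key: str
    doctype_rank: int  # assumed: higher means a "better" document type
    pages: int
    year: int

    @property
    def weight(self):
        # Tuples compare element by element, left to right
        return (self.doctype_rank, self.pages, self.year, self.key)

    def __eq__(self, other):
        return self.weight == other.weight

    def __lt__(self, other):
        return self.weight < other.weight


sketch = ToyEntry('sketch', doctype_rank=1, pages=300, year=2010)
short_grammar = ToyEntry('short', doctype_rank=2, pages=150, year=1990)
long_grammar = ToyEntry('long', doctype_rank=2, pages=400, year=1960)

# The last item after sorting is the "best" description
best = sorted([sketch, short_grammar, long_grammar])[-1]
```

In this sketch the document type dominates, page count breaks ties between equal types, and the year only matters when both are equal.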

property med_type: MEDType | None

The entry’s type on the MED scale.

property year_int: int | None

Year as number if possible.
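Year fields are free text, as the value 1334 [1953] in the example record shows, so extracting a number requires a parsing rule. The following is a minimal sketch of one plausible rule. The function year_as_int is hypothetical, and preferring a bracketed year over the first four-digit number is an assumption made for this illustration; the library's actual rule may differ.

```python
import re

BRACKETED_YEAR = re.compile(r'\[(\d{4})\]')
ANY_YEAR = re.compile(r'\b(\d{4})\b')


def year_as_int(value):
    """Return a four-digit year found in a BibTeX year field, or None."""
    if not value:
        return None
    # Assumption: a year in square brackets takes precedence
    match = BRACKETED_YEAR.search(value) or ANY_YEAR.search(value)
    return int(match.group(1)) if match else None
```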

property pages_int: int | None

Number of pages as int.

property publisher_and_address: tuple[Optional[str], Optional[str]]

Publisher and address values.

text()[source]

Return the text linearization of the entry.

Return type:

str

property id: str

The qualified entry ID, including the provider prefix.

classmethod lgcodes(string)[source]

Parse language codes from a string.

Return type:

list[str]
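Fields such as lgcode = {Tati, Harzani [hrz]} carry language codes in square brackets. A rough sketch of extracting them is shown below. The function parse_lgcodes and both patterns are hypothetical; in particular, the accepted code shapes (three lowercase letters, or four alphanumerics followed by four digits) and the comma-separated bracket contents are assumptions, not the library's actual grammar.

```python
import re

BRACKETS = re.compile(r'\[([^\]]+)\]')
# Assumed code shapes: three lowercase letters, or 4 alphanumerics + 4 digits
CODE = re.compile(r'^(?:[a-z]{3}|[a-z0-9]{4}[0-9]{4})$')


def parse_lgcodes(string):
    """Collect code-like tokens found inside square brackets."""
    codes = []
    for group in BRACKETS.findall(string):
        for token in group.split(','):
            token = token.strip()
            if CODE.match(token):
                codes.append(token)
    return codes
```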

static parse_ca(s)[source]

Read a trigger expression from a field value.

Parameters:

s (str) –

Return type:

typing.Optional[str]

languoids(langs_by_codes)[source]

Expand the language codes mentioned in a reference’s “lgcode” field to Languoid objects.

Parameters:

langs_by_codes (dict) –

Return type:

tuple[list, typing.Optional[str]]

doctypes(hhtypes)[source]

Ordered doctypes assigned to this entry.

Parameters:

hhtypes – OrderedDict mapping doctype names to doctypes

Returns:

list of values of hhtypes which apply to the entry, ordered by occurrence in hhtypes.
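The ordering rule can be sketched with plain data: the result follows the order of hhtypes, not the order in which types appear in the entry. The function ordered_doctypes is hypothetical, the doctype values are placeholder strings, and treating the hhtype field as a semicolon-separated string is an assumption made for this illustration.

```python
from collections import OrderedDict


def ordered_doctypes(hhtype_field, hhtypes):
    """Return the values of hhtypes named in the field, in hhtypes order."""
    # Assumption: multiple types in the field are separated by ';'
    assigned = {name.strip() for name in hhtype_field.split(';') if name.strip()}
    return [doctype for name, doctype in hhtypes.items() if name in assigned]


# Placeholder doctype objects; real values would be doctype instances
hhtypes = OrderedDict([
    ('grammar', 'Grammar'),
    ('grammar_sketch', 'Grammar sketch'),
    ('wordlist', 'Wordlist'),
])

result = ordered_doctypes('wordlist;grammar_sketch', hhtypes)
```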