Languoid data
All metadata related to a languoid (i.e. the content of the languoid’s INI file
and the classification - its relation to other languoids) is available from a
pyglottolog.languoids.Languoid instance.
- class pyglottolog.languoids.Languoid(cfg, lineage=None, id_=None, directory=None, tree=None, _api=None)[source]
Info on languoids is encoded in the INI files and in the directory hierarchy of
pyglottolog.Glottolog.tree. This class provides access to all of it.
Languoid formatting:
- Variables:
_format_specs – A dict mapping custom format specifiers to conversion functions. Usage:
>>> l = Languoid.from_name_id_level(pathlib.Path('.'), 'N(a,m)e', 'abcd1234', 'language')
>>> '{0:newick_name}'.format(l)
'N{a/m}e'
See also
https://www.python.org/dev/peps/pep-3101/#format-specifiers and https://www.python.org/dev/peps/pep-3101/#controlling-formatting-on-a-per-type-basis
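The custom format specifier mechanism can be illustrated with a minimal, self-contained sketch. The Node class below is a hypothetical stand-in, not pyglottolog's Languoid, and its newick_name conversion covers only the three characters that appear in the usage example; the real conversion function may handle more.

```python
class Node:
    """Hypothetical stand-in for a languoid-like object (not pyglottolog's Languoid)."""

    # Custom format specifier -> conversion function, mirroring the idea of _format_specs.
    _format_specs = {
        # Assumption: only "(", ")" and "," are replaced, as in the usage example.
        "newick_name": lambda node: node.name.translate(str.maketrans("(),", "{}/")),
    }

    def __init__(self, name):
        self.name = name

    def __format__(self, format_spec):
        # Dispatch known custom specifiers; defer everything else to the default.
        if format_spec in self._format_specs:
            return self._format_specs[format_spec](self)
        return super().__format__(format_spec)


label = "{0:newick_name}".format(Node("N(a,m)e"))
print(label)
```

Keeping the conversions in a dict means a new specifier can be added without touching __format__ itself.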
Refer to the factory methods for typical use cases of instantiating a Languoid:
Languoid.from_name_id_level()
- Parameters:
cfg (clldutils.inifile.INI) – INI instance storing the languoid’s metadata.
lineage (typing.Optional[list[tuple[str,str,str]]]) – List of ancestors (from root to this languoid).
id_ (typing.Optional[str]) – Glottocode for the languoid (or None, if directory is passed).
directory (typing.Optional[pathlib.Path])
tree (typing.Optional[pathlib.Path])
_api – Some properties require config data, which is accessed through a Glottolog API instance.
- classmethod from_dir(directory, nodes=None, _api=None, **kw)[source]
Create a Languoid from a directory, named with the Glottocode and containing md.ini.
This method is used by pyglottolog.Glottolog to read Languoids from the repository’s languoids/tree directory.
- Parameters:
directory (pathlib.Path)
nodes (typing.Optional[dict[str,tuple[str,pyglottolog.languoids.models.Glottocode,pyglottolog.config.LanguoidLevel]]])
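The directory convention that from_dir relies on (a directory named with the Glottocode, containing md.ini) can be sketched with the standard library alone. This is not pyglottolog's implementation: read_languoid_dir is a hypothetical helper, and the [core] section with name and level keys is an assumption about the INI layout.

```python
import configparser
import pathlib
import tempfile


def read_languoid_dir(directory):
    """Return (glottocode, name, level) for a directory that contains md.ini.

    Assumption: the INI file has a [core] section with "name" and "level" keys.
    """
    directory = pathlib.Path(directory)
    cfg = configparser.ConfigParser(interpolation=None)
    with (directory / "md.ini").open(encoding="utf-8") as fp:
        cfg.read_file(fp)
    # The directory name doubles as the Glottocode.
    return directory.name, cfg["core"]["name"], cfg["core"]["level"]


# Build a throwaway directory to demonstrate the convention.
with tempfile.TemporaryDirectory() as tmp:
    lang_dir = pathlib.Path(tmp) / "abcd1234"
    lang_dir.mkdir()
    (lang_dir / "md.ini").write_text(
        "[core]\nname = Example\nlevel = language\n", encoding="utf-8")
    info = read_languoid_dir(lang_dir)

print(info)
```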
- classmethod from_name_id_level(tree, name, id_, level, **kw)[source]
This method is used in pyglottolog.lff to instantiate Languoids for new nodes encountered in “lff”-format trees.
- newick_node(nodes=None, template=None, maxlevel=None, level=0)[source]
Return a newick.Node representing the subtree of the Glottolog classification starting at the languoid.
- Parameters:
template – Python format string accepting the Languoid instance as single variable named l, used to format node labels.
nodes (typing.Optional[dict[str,pyglottolog.languoids.languoid.Languoid]])
- Return type:
newick.Node
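How a template with a single variable named l drives node labels can be sketched without the newick package. The Node dataclass and to_newick function are hypothetical, and the maxlevel handling shown here is an assumed reading of that parameter, not confirmed behaviour of newick_node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical languoid-like node (not pyglottolog's Languoid)."""
    id: str
    name: str
    children: list = field(default_factory=list)


def to_newick(node, template="{l.id}", maxlevel=None, level=0):
    """Serialize the subtree below `node`, formatting each label from `template`."""
    # The node is passed to the template as the single variable "l".
    label = template.format(l=node)
    # Assumed semantics: stop descending once maxlevel is reached.
    if node.children and (maxlevel is None or level < maxlevel):
        inner = ",".join(
            to_newick(child, template, maxlevel, level + 1) for child in node.children)
        return "({}){}".format(inner, label)
    return label


family = Node("fami1234", "Family", [Node("lang1234", "LangA"), Node("lang5678", "LangB")])
newick = to_newick(family) + ";"
print(newick)
```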
- write_info(outdir=None)[source]
Write Languoid metadata as INI file to outdir/<INFO_FILENAME>.
- Parameters:
outdir (typing.Optional[pathlib.Path])
- Return type:
pathlib.Path
- property glottocode: Glottocode
Alias for id
- property category: str | None
Languoid category.
Category name from pyglottolog.config.LanguageType for languoids of level “language”, “Family” or “Pseudo Family” for families, “Dialect” for dialects.
- property isolate: bool
Flag signaling whether the languoid is an isolate, i.e. has level “language” and is not member of a family.
- children_from_nodemap(nodes)[source]
A faster alternative to children when the relevant languoids have already been read from disc.
- Parameters:
nodes (dict[str,pyglottolog.languoids.languoid.Languoid])
- descendants_from_nodemap(nodes, level=None)[source]
A faster alternative to descendants when the relevant languoids have already been read from disc.
- Parameters:
nodes (dict[str,pyglottolog.languoids.languoid.Languoid])
- property children: list[pyglottolog.languoids.languoid.Languoid]
List of direct descendants of the languoid in the classification tree.
Note
Using this on many languoids can be slow, because the directory tree may be traversed and INI files read multiple times. To circumvent this problem, you may use a read-only pyglottolog.Glottolog instance, by passing cache=True at initialization.
- ancestors_from_nodemap(nodes)[source]
A faster alternative to ancestors when the relevant languoids have already been read from disc.
- Parameters:
nodes (dict[str,pyglottolog.languoids.languoid.Languoid])
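The idea behind the *_from_nodemap methods, answering tree queries from an in-memory mapping instead of re-reading the directory tree, can be sketched as follows. The mapping used here (id to lineage, as a list of ancestor ids from the root down) is a simplified, hypothetical structure; the real node map holds Languoid instances.

```python
# Hypothetical node map: id -> lineage, i.e. the ids of all ancestors, root first.
nodes = {
    "fami1234": [],
    "subg1234": ["fami1234"],
    "lang1234": ["fami1234", "subg1234"],
    "lang5678": ["fami1234", "subg1234"],
}


def ancestors(nodes, id_):
    # The stored lineage already is the ancestor list, root first.
    return list(nodes[id_])


def children(nodes, id_):
    # Direct children: nodes whose lineage ends with id_.
    return sorted(k for k, lineage in nodes.items() if lineage and lineage[-1] == id_)


def descendants(nodes, id_):
    # All descendants: nodes that have id_ anywhere in their lineage.
    return sorted(k for k, lineage in nodes.items() if id_ in lineage)


print(children(nodes, "fami1234"))
print(descendants(nodes, "fami1234"))
print(ancestors(nodes, "lang1234"))
```

Each query is a single pass over the dict, so no file system access is needed once the map has been built.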
- iter_ancestors()[source]
Yield ancestors going up the directory tree.
- Return type:
collections.abc.Generator[pyglottolog.languoids.languoid.Languoid,None,None]
- property ancestors: list[pyglottolog.languoids.languoid.Languoid]
List of ancestors of the languoid in the classification tree, from root (i.e. top-level family) to parent node.
Note
Using this on many languoids can be slow, because the directory tree may be traversed and INI files read multiple times. To circumvent this problem, you may use a read-only pyglottolog.Glottolog instance, by passing cache=True at initialization.
- property parent: Languoid | None
Parent languoid or None.
Note
Using this on many languoids can be slow, because the directory tree may be traversed and INI files read multiple times. To circumvent this problem, you may use a read-only pyglottolog.Glottolog instance, by passing cache=True at initialization.
- property family: Languoid | None
Top-level family the languoid belongs to or None.
Note
Using this on many languoids can be slow, because the directory tree may be traversed and INI files read multiple times. To circumvent this problem, you may use a read-only pyglottolog.Glottolog instance, by passing cache=True at initialization.
- property names: dict[str, list]
A dict mapping alternative name providers to lists of alternative names that the provider gives for the languoid.
- add_name(name, type_='glottolog')[source]
Add an alternative name.
- Parameters:
name (str)
type_ (str)
- update_names(names, type_='glottolog')[source]
Update alternative names of a specific type.
- Parameters:
names (collections.abc.Iterable[str])
type_ (str)
- Return type:
bool
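The provider-to-names mapping and an update that returns a bool can be sketched with plain dicts. This is an illustration only: the update_names function below is a hypothetical stand-alone helper, and the assumption that the boolean signals whether the stored names changed is not confirmed by the documentation above.

```python
def update_names(names, new_names, type_="glottolog"):
    """Replace the names stored for `type_`; return True if anything changed.

    Assumption: the boolean result signals whether the stored data was modified.
    """
    new = sorted(set(new_names))
    changed = sorted(names.get(type_, [])) != new
    if changed:
        names[type_] = new
    return changed


# Provider -> list of alternative names, as described for the names property.
names = {"glottolog": ["Example"]}
first = update_names(names, ["Example", "Exemple"])
second = update_names(names, ["Exemple", "Example"])
print(first, second, names)
```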
- property identifier: dict | SectionProxy
Alternative identifiers of the languoid.
- property sources: list[pyglottolog.languoids.models.Reference]
List of Glottolog references linked to the languoid
- Return type:
pyglottolog.references.Reference
- property endangerment: Endangerment | None
Endangerment information about the languoid.
- property classification_comment: ClassificationComment | None
Classification information about the languoid.
- property ethnologue_comment: EthnologueComment | None
Commentary about the classification of the languoid in Ethnologue.
- property macroareas: list[pyglottolog.config.Macroarea]
- Return type:
list of config.Macroarea
- property timespan: tuple[int, int] | None
Extinct languages are associated with a timespan
- property links: list[pyglottolog.languoids.models.Link]
Links to web resources related to the languoid
- update_links(domain, urls)[source]
Update the links section of the languoid for a particular domain.
- Parameters:
domain (str)
urls (collections.abc.Iterable[str])
- Return type:
bool
- property countries: list[pyglottolog.languoids.models.Country]
Countries a language is spoken in.
- property name: str
The Glottolog name of the languoid
- property latitude: float | None
The geographic latitude of the point chosen as representative coordinate of the languoid
- property longitude: float | None
The geographic longitude of the point chosen as representative coordinate of the languoid
- property hid: str | None
The languoid’s “H(arald)ID”, aka a “NOCODE” code.
- closest_iso(api=None, nodes=None)[source]
ISO 639-3 code assigned to the languoid or one of its ancestors in the classification (in case of dialects) or None.
- Parameters:
api (typing.Optional[pyglottolog.api.Glottolog])
nodes (typing.Optional[dict[str,pyglottolog.languoids.languoid.Languoid]])
- Return type:
typing.Optional[str]
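The lookup described for closest_iso (use the languoid's own ISO 639-3 code, otherwise fall back to the nearest ancestor that has one) can be sketched with two dicts. The data structures and the function signature are hypothetical and differ from the real method, which works on Languoid instances.

```python
def closest_iso(id_, iso_codes, lineage):
    """Return the ISO 639-3 code of `id_` or of its nearest ancestor, else None."""
    # Check the node itself first, then walk from the parent up to the root.
    for candidate in [id_] + list(reversed(lineage[id_])):
        if candidate in iso_codes:
            return iso_codes[candidate]
    return None


# Hypothetical data: lineage lists ancestors root first; only the language has a code.
lineage = {
    "fami1234": [],
    "lang1234": ["fami1234"],
    "dial1234": ["fami1234", "lang1234"],
}
iso_codes = {"lang1234": "abc"}

print(closest_iso("dial1234", iso_codes, lineage))
print(closest_iso("fami1234", iso_codes, lineage))
```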
- property iso_retirement: ISORetirement | None
Information about a retired ISO code related to the languoid.
- property fname: Path
The location of the languoid’s info file in the Glottolog tree directory.
- class pyglottolog.languoids.Glottocodes(fname)[source]
Registry keeping track of glottocodes that have been dealt out.
Some of the data available for languoids has enough internal structure to merit separate classes, simplifying access.
- class pyglottolog.languoids.Reference(key, pages=None, trigger=None, endtag='**', pattern=re.compile('\\\\*\\\\*(?P<key>[a-z0-9\\\\-_]+:[a-zA-Z.?\\\\-;*\\'/()\\\\[\\\\]!_:0-9\\\\u2014]+?)(?P<endtag>\\\\*\\\\*|\\\\(\\\\*\\\\*\\\\))(:(?P<pages>[0-9\\\\-f]+))?(<trigger "(?P<trigger>[^\\\\"]+)">)?'), old_pattern=re.compile('[^\\\\[]+\\\\[(?P<pages>[^]]*)]\\\\s*\\\\([0-9]+\\\\s+(?P<key>[^)]+)\\\\)'))[source]
A reference of a bibliographical record in Glottolog.
- Parameters:
key (str)
pages (typing.Optional[str])
trigger (typing.Optional[str])
endtag (str)
pattern (re.Pattern)
old_pattern (re.Pattern)
- property provider: str
The provider id.
- property bibname: str
The name of the bibtex file.
- property bibkey: str
The local bibtex key in the bib.
- classmethod from_match(match)[source]
Instantiate a reference from a regex match.
- Parameters:
match (re.Match)
- classmethod from_string(string, pattern=None)[source]
Parse a reference from a string.
- Parameters:
string (str)
pattern (typing.Optional[re.Pattern])
- classmethod from_list(list_, pattern=None)[source]
Turn list of strings into list of Reference instances.
- Parameters:
list_ (collections.abc.Iterable[typing.Union[pyglottolog.languoids.models.Reference,str]])
pattern (typing.Optional[re.Pattern])
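Parsing a reference string into key, pages and trigger can be sketched with a simplified regular expression adapted from the pattern default in the class signature. The example string is made up, parse_reference is a hypothetical helper, and splitting the key at the first colon into provider and bibkey is an assumption about how those properties are derived.

```python
import re

# Simplified pattern, adapted from the `pattern` default shown in the class signature.
PATTERN = re.compile(
    r"\*\*(?P<key>[a-z0-9\-_]+:[^*]+?)\*\*"
    r"(:(?P<pages>[0-9\-f]+))?"
    r'(<trigger "(?P<trigger>[^"]+)">)?'
)


def parse_reference(string):
    """Return the parts of a reference string as a dict (illustrative only)."""
    match = PATTERN.search(string)
    if match is None:
        raise ValueError("not a reference: {!r}".format(string))
    key = match.group("key")
    # Assumption: the provider id is the part of the key before the first colon.
    provider, _, bibkey = key.partition(":")
    return {
        "key": key,
        "provider": provider,
        "bibkey": bibkey,
        "pages": match.group("pages"),
        "trigger": match.group("trigger"),
    }


ref = parse_reference('**hh:s:Smith:Example**:10-12<trigger "grammar">')
print(ref)
```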
- class pyglottolog.languoids.Endangerment(status, source, comment, date)[source]
Info about the endangerment status of the languoid
- Parameters:
status (pyglottolog.config.AES)
source (pyglottolog.config.AESSource)
comment (str)
date (datetime.datetime)
- date: datetime.datetime
Date when the endangerment status was assessed
- class pyglottolog.languoids.EthnologueComment(isohid, comment_type, ethnologue_versions=<factory>, comment=None)[source]
Commentary about the classification of the languoid according to Ethnologue
- Parameters:
isohid (str)
comment_type (typing.Literal['spurious','missing'])
ethnologue_versions (list[str])
comment (str)
- comment_type: typing.Literal['spurious','missing']
Either
“spurious”, meaning the comment explains why the languoid in question is spurious and in which Ethnologue version(s) (as below) that is/was the case, or
“missing”, meaning the comment explains why the languoid in question is missing (as a language entry) and in which Ethnologue version(s) (as below) that is/was the case.
- ethnologue_versions: list[str]
Which Ethnologue version(s) from E16-E19 the comment pertains to, joined by slashes, e.g. E16/E17. In the case of comment_type=spurious, E16/E17 in the version field means that the code was spurious in E16/E17 but is no longer spurious in E18/E19. In the case of comment_type=missing, E16/E17 means that the code was missing from E16/E17 but is present in E18/E19. If the comment concerns a language for which versions would be the empty string, the string ISO 639-3 appears instead.
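The reading of the version field can be made concrete with a short sketch. parse_versions and ALL_VERSIONS are hypothetical names, and the field is handled here as a raw slash-joined string; the attribute itself is typed as list[str].

```python
# The Ethnologue versions the field can refer to, per the description of the attribute.
ALL_VERSIONS = ["E16", "E17", "E18", "E19"]


def parse_versions(value):
    """Split a version field such as "E16/E17" into its components."""
    return [v.strip() for v in value.split("/") if v.strip()]


versions = parse_versions("E16/E17")
# Versions in which the "spurious" or "missing" status no longer applies.
resolved_in = [v for v in ALL_VERSIONS if v not in versions]
print(versions, resolved_in)
```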
- class pyglottolog.languoids.ISORetirement(code=None, name=None, change_request=None, effective=None, reason=None, change_to=<factory>, remedy=None, comment=None)[source]
Information extracted from accepted ISO 639-3 change requests about retired ISO codes associated with the languoid.
- Parameters:
code (typing.Optional[str])
name (typing.Optional[str])
change_request (typing.Optional[str])
effective (typing.Optional[str])
reason (typing.Optional[str])
change_to (list[str])
remedy (typing.Optional[str])
comment (typing.Optional[str])
- code: typing.Optional[str] = None
Retired ISO 639-3 code
- name: typing.Optional[str] = None
Name of the retired ISO language
- change_request: typing.Optional[str] = None
Number of the ISO change request
- effective: typing.Optional[str] = None
Date of acceptance of the change request
- reason: typing.Optional[str] = None
Reason to retire the ISO code
- change_to: list[str]
List of ISO codes replacing the retired code
- remedy: typing.Optional[str] = None
What to do about the retired code
- class pyglottolog.languoids.ClassificationComment(sub=None, subrefs=<factory>, family=None, familyrefs=<factory>)[source]
Commentary on the classification of the languoid
- Parameters:
sub (typing.Optional[str])
subrefs (list[pyglottolog.languoids.models.Reference])
family (typing.Optional[str])
familyrefs (list[pyglottolog.languoids.models.Reference])
- sub: typing.Optional[str] = None
Commentary on the internal classification of the descendants of the languoid
- subrefs: list[pyglottolog.languoids.models.Reference]
References for the internal classification
- family: typing.Optional[str] = None
Commentary on the classification of the languoid within its family
- familyrefs: list[pyglottolog.languoids.models.Reference]
References for the family classification
- merged_refs(type_)[source]
Get unique sources referenced for the classification type, with accumulated page ranges.
- Parameters:
type_ (typing.Literal['sub','family'])
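Merging duplicate sources while accumulating their page ranges can be sketched on plain (key, pages) tuples. merge_refs is a hypothetical helper, the keys are made up, and joining page ranges with ";" is an assumption; the real method operates on Reference instances.

```python
def merge_refs(refs):
    """Merge (key, pages) pairs so that each key occurs once.

    Assumption: page ranges of duplicate keys are joined with ";".
    """
    merged = {}
    for key, pages in refs:
        collected = merged.setdefault(key, [])
        # Skip missing page info and page ranges that were already seen.
        if pages and pages not in collected:
            collected.append(pages)
    # Dicts keep insertion order, so keys appear in order of first occurrence.
    return [(key, ";".join(pages) or None) for key, pages in merged.items()]


result = merge_refs([("hh:a", "1-5"), ("hh:b", None), ("hh:a", "10-12"), ("hh:a", "1-5")])
print(result)
```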