anthology

Anthology

Anthology(datadir, verbose=True)

An instance of the ACL Anthology data.

Attributes:

Name Type Description
datadir PathLike[str]

The path to the data folder.

verbose bool

If False, will not show progress bars during longer operations.

collections instance-attribute

collections = CollectionIndex(self)

The CollectionIndex for accessing collections, volumes, and papers.

events instance-attribute

events = EventIndex(self, verbose)

The EventIndex for accessing events.

people instance-attribute

people = PersonIndex(self, verbose)

The PersonIndex for accessing authors and editors.

relaxng property

relaxng

The RelaxNG schema for the Anthology's XML data files.

sigs instance-attribute

sigs = SIGIndex(self)

The SIGIndex for accessing SIGs.

venues instance-attribute

venues = VenueIndex(self)

The VenueIndex for accessing venues.

find_people

find_people(name_def)

Find people by name.

Parameters:

Name Type Description Default
name_def ConvertableIntoName

Anything that can be resolved to a name; see below for examples.

required

Returns:

Type Description
list[Person]

A list of Person objects with the given name.

Examples:

>>> anthology.find_people("Doe, Jane")
>>> anthology.find_people(("Jane", "Doe"))       # same as above
>>> anthology.find_people({"first": "Jane",
                             "last": "Doe"})      # same as above
>>> anthology.find_people(Name("Jane", "Doe"))   # same as above
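Because `find_people` returns a list, a caller may get zero, one, or several matches. The following is a minimal sketch of one way to handle that. `find_unique_person` is a hypothetical helper, not part of the library, and `anthology` is assumed to be any object that exposes `find_people` as documented above.

```python
def find_unique_person(anthology, name_def):
    """Return the single person matching name_def, or None if nobody matches.

    Hypothetical helper, not part of the library. Raises LookupError when
    the name is ambiguous (more than one match).
    """
    matches = anthology.find_people(name_def)  # documented to return a list
    if not matches:
        return None
    if len(matches) > 1:
        raise LookupError(f"{len(matches)} people match {name_def!r}")
    return matches[0]
```

Raising on ambiguity is only one possible policy; returning the full list and letting the caller choose may suit interactive tools better.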

from_repo classmethod

from_repo(
    repo_url="https://github.com/acl-org/acl-anthology.git",
    path=None,
    verbose=True,
)

Instantiates the Anthology from a Git repo.

Parameters:

Name Type Description Default
repo_url str

The URL of a Git repo with Anthology data. If not given, defaults to the official ACL Anthology repo.

'https://github.com/acl-org/acl-anthology.git'
path Optional[PathLike[str]]

The local path for the repo data. If not given, automatically determines a path within the user's data directory.

None
verbose bool

If False, will not show progress bars during longer operations.

True
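The two ways of creating an instance, the constructor with a data folder and `from_repo`, can be wrapped in one function. This is a sketch under assumptions: `open_anthology` is a hypothetical helper, and the import path `acl_anthology` is assumed rather than stated on this page.

```python
def open_anthology(datadir=None, verbose=False):
    """Open the Anthology from a local data folder, or from the Git repo.

    Hypothetical convenience wrapper. The import path `acl_anthology` is an
    assumption about how the package is installed; adjust it if yours differs.
    """
    from acl_anthology import Anthology  # imported lazily; assumed module name

    if datadir is not None:
        # Use an existing data folder directly.
        return Anthology(datadir, verbose=verbose)
    # No folder given: rely on from_repo's defaults for repo URL and path.
    return Anthology.from_repo(verbose=verbose)
```

Passing `verbose=False` through both branches keeps progress bars out of scripts and batch jobs.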

get

get(full_id)

Access collections, volumes, and papers, depending on the provided ID.

Parameters:

Name Type Description Default
full_id AnthologyID

An Anthology ID that refers to a collection, volume, or paper.

required

Returns:

Type Description
Optional[Collection | Volume | Paper]

The object corresponding to the given ID.
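Since `get` may return a collection, a volume, a paper, or nothing, callers typically branch on the result. The sketch below assumes that the `Optional` return type means `None` for unknown IDs; `describe` is a hypothetical helper.

```python
def describe(anthology, full_id):
    """Return a short label for whatever full_id resolves to.

    Hypothetical helper. It relies only on the documented return type,
    Optional[Collection | Volume | Paper], and inspects the class name
    rather than importing the classes.
    """
    item = anthology.get(full_id)
    if item is None:  # assumed meaning of the Optional return type
        return f"{full_id}: not found"
    return f"{full_id}: {type(item).__name__}"
```

When the kind of object is known in advance, the more specific `get_volume` or `get_paper` methods narrow the return type and avoid the branch.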

get_event

get_event(event_id)

Access an event by its ID.

Parameters:

Name Type Description Default
event_id str

An ID that refers to an event, e.g. "acl-2022".

required

Returns:

Type Description
Optional[Event]

The event associated with the given ID.

get_paper

get_paper(full_id)

Access a paper by its ID.

Parameters:

Name Type Description Default
full_id AnthologyID

An Anthology ID that refers to a paper.

required

Returns:

Type Description
Optional[Paper]

The paper associated with the given ID.

get_person

get_person(person_id)

Access a person by their ID.

Parameters:

Name Type Description Default
person_id str

An ID that refers to a person.

required

Returns:

Type Description
Optional[Person]

The person associated with the given ID.

get_volume

get_volume(full_id)

Access a volume by its ID or the ID of a contained paper.

Parameters:

Name Type Description Default
full_id AnthologyID

An Anthology ID that refers to a volume or paper.

required

Returns:

Type Description
Optional[Volume]

The volume associated with the given ID.

load_all

load_all()

Load all Anthology data files.

Calling this function is not strictly necessary. If you access Anthology data through object methods or SlottedDict functionality, data will be loaded on the fly as required. However, if you know that your program will eventually load all data files (particularly the XML files), for example by iterating over all volumes/papers, loading everything at once with this function can result in a considerable speed-up.
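A full pass over all papers is the kind of workload the paragraph above describes. The following sketch shows where a `load_all` call would go; `count_papers` and its `preload` flag are hypothetical names introduced for illustration.

```python
def count_papers(anthology, preload=True):
    """Count all papers, optionally loading every data file up front.

    Hypothetical helper. A full iteration touches every data file anyway,
    which is the case where load_all() is described as worthwhile.
    """
    if preload:
        anthology.load_all()
    # papers() returns an iterator, so count lazily instead of building a list.
    return sum(1 for _ in anthology.papers())
```

For programs that touch only a few volumes, leaving `preload` off keeps the on-demand loading behaviour.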

papers

papers(full_id=None)

Returns an iterator over all papers.

Parameters:

Name Type Description Default
full_id Optional[AnthologyID]

If provided, only papers matching the given ID will be included.

None
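Because `papers` returns an iterator, a caller can take just the first few results without consuming the rest. A minimal sketch, with `first_papers` as a hypothetical helper:

```python
from itertools import islice


def first_papers(anthology, n, full_id=None):
    """Return up to n papers, optionally restricted to those matching full_id."""
    # islice stops after n items without consuming the rest of the iterator.
    return list(islice(anthology.papers(full_id), n))
```

This is handy for spot checks, for example looking at a handful of papers from one volume before running a longer job.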

resolve

resolve(name_spec)

Resolve a name specification (e.g. as attached to papers) to a natural person.

Parameters:

Name Type Description Default
name_spec NameSpecificationOrIter

A name specification, or an iterator over name specifications.

required

Returns:

Type Description
PersonOrList

A single Person object if a single name specification was given, or a list of Person objects with equal length to the input iterable otherwise.

Examples:

>>> paper = anthology.get("C92-1025")
>>> anthology.resolve(paper.authors)
[Person(id='lauri-karttunen', ...), Person(id='ronald-kaplan', ...), Person(id='annie-zaenen', ...)]
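The return shape of `resolve` depends on the input: one `Person` for a single name specification, a list otherwise. Code that wants a uniform shape can normalize it, as in this sketch (`resolve_all` is a hypothetical helper):

```python
def resolve_all(anthology, name_spec):
    """Always return a list of persons, whatever shape name_spec has.

    Hypothetical helper built on the documented return contract of resolve().
    """
    result = anthology.resolve(name_spec)
    # A single name specification yields one Person; wrap it for uniformity.
    return result if isinstance(result, list) else [result]
```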

volumes

volumes(collection_id=None)

Returns an iterator over all volumes.

Parameters:

Name Type Description Default
collection_id Optional[str]

If provided, only volumes belonging to the given collection ID will be included.

None
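The `collection_id` filter makes per-collection summaries straightforward. The sketch below counts volumes for each collection ID supplied by the caller; `volumes_per_collection` is a hypothetical helper, and the IDs are whatever collection IDs your data contains.

```python
def volumes_per_collection(anthology, collection_ids):
    """Map each collection ID to the number of volumes it contains."""
    return {
        # volumes() returns an iterator, so count without building a list.
        cid: sum(1 for _ in anthology.volumes(cid))
        for cid in collection_ids
    }
```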