Controlled Vocabulary

The data in AILLA are organized into collections that are made up of resources, or folders, that contain related media files. A collection must contain at least one resource, and a resource must contain at least one media file.

Collection: AILLA collections are organized around a particular person's research (e.g., the Kuna Collection of Joel Sherzer) or a particular research project (e.g., the Chatino Language Documentation Project Collection).

Resource: An AILLA resource is a folder or bundle of related media files. The organizing principles for resources can vary. Examples include (but are not limited to):

  • All the files from a single recording session or speech event (e.g., audio, video, transcription, notes, photographs);
  • All media files associated with a particular town or speaker (e.g., survey responses, traditional stories);
  • All media files associated with a particular theme (e.g., toponyms, ideophones, grammatical categories).

Media Files: The digital data files. AILLA divides media files into 6 content models: audio, video, basic images, large images, PDFs, and binary objects (non-PDF text-based files). Please see the list of allowed file format types here.
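
The containment rules above (a collection needs at least one resource, and a resource needs at least one media file) can be sketched as a small data model. This is only an illustration: the class names, field names, and the spellings of the six content models are assumptions made for this example, not AILLA's actual schema or software.

```python
from dataclasses import dataclass, field

# The six content models named above; the exact spellings are assumptions.
CONTENT_MODELS = {"audio", "video", "basic image", "large image", "PDF", "binary object"}


@dataclass
class MediaFile:
    filename: str
    content_model: str


@dataclass
class Resource:
    title: str
    media_files: list = field(default_factory=list)


@dataclass
class Collection:
    title: str
    resources: list = field(default_factory=list)


def validate_collection(collection):
    """Return a list of problems; an empty list means the structure is valid."""
    problems = []
    # A collection must contain at least one resource.
    if not collection.resources:
        problems.append(f"collection {collection.title!r} has no resources")
    for resource in collection.resources:
        # A resource must contain at least one media file.
        if not resource.media_files:
            problems.append(f"resource {resource.title!r} has no media files")
        for media in resource.media_files:
            if media.content_model not in CONTENT_MODELS:
                problems.append(
                    f"{media.filename!r}: unknown content model {media.content_model!r}"
                )
    return problems
```

A depositor could run a check like this over a draft inventory before submission, so that empty collections or empty resources are caught early.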

Controlled Vocabulary: AILLA has its own controlled vocabulary schemas for organizing information in the archive. Depositors will need to use the controlled vocabulary from 4 schemas (listed below) as they prepare their metadata. The AILLA-specific definition of each controlled vocabulary term for these 4 schemas follows.


Genre

Article: A work of writing included with others in a publication such as a journal or edited volume.

Book: A stand-alone written and published work, either analog or digital.

Ceremonial dialog: Conversation between two or more people as a feature of a ceremony.

Ceremony: The ritual observances and procedures performed to mark particular occasions.

Chant: A monotonous or repetitive song, typically an incantation or part of a ritual.

Commentary: A descriptive spoken account of an event or performance as it happens.

Conversation: People speaking informally to each other.

Correspondence: Communication by exchanging letters or emails.

Curse: A solemn utterance intended to invoke a supernatural power to inflict harm or punishment on someone or something.

Dataset: A collection of related sets of information that is composed of separate elements.

Debate: A formal discussion on a particular topic in a public meeting or legislative assembly, in which opposing arguments are put forward.

Description: A spoken or written representation or account of an event, place, custom, phenomenon, thing, etc.

Dispute: A disagreement, argument, or debate.

Document: A piece of written, printed, or electronic matter that provides information or evidence or that serves as an official record.

Drama: A play for theater, church, radio, or television.

Educational material: Resources meant for teaching and learning.

Elicitation: Process of asking a series of questions meant to draw out and tease apart information.

Ethnography: The description of the customs, procedures, and workflows of individual peoples and cultures.

Field notes: Notes taken while conducting field work.

Grammar: A description of the structure of a language.

Grammar sketch: A short description of the structure of a language.

Greeting & leave-taking: Rituals associated with beginning and ending communicative interactions.

Handout: Printed information to accompany a lecture or presentation.

History: Past events connected to someone, something, or some place.

Image: A physical likeness or representation of a person, place, thing, animal, etc. as in a drawing or photograph.

Instructions: Information telling how something should be done.

Instrumental music: Music without accompanying singing.

Interview: A formal meeting in which one or more persons ask questions of another person or other people.

Lexicon: A vocabulary or dictionary.

Map: A diagram or representation of an area of land, etc., showing physical features, bodies of water, roads, etc.

Meeting: An assembly of people for a purpose.

Myth: A traditional or legendary story.

Narrative: A spoken or written account of connected events; a story.

Oratory: Formal, public speech.

Permissions: The process of getting informed consent.

Photograph: An image made with a camera.

Poetry: A rhythmical written or spoken literary composition.

Prayer: Spiritual communication with God or another object of worship.

Presentation: A formal speech or lecture, either written or spoken.

Procedure: An established way of doing something.

Proverb: A short, pithy saying or story in general use.

Questionnaire: A set of questions for the purpose of a survey or study.

Reader: A book designed for reading practice.

Recipe: Instructions for preparing a particular dish or food.

Ritual song: A song that is part of a ritual.

Song: A poem set to music and/or meant to be sung.

Speech play: Verbal art such as jokes, metaphor, parallelism and other narrative manipulations of speech.

Testimony: A formal written or spoken statement, such as a personal history, a religious experience, a legal statement, a declaration.

Thesis: A long essay or dissertation involving personal research, written by a candidate for a college degree.

Unintelligible speech: Ritually unintelligible speech; glossolalia; speaking in tongues.

Whistled speech: Whistling that emulates speech; whistled communication.

Wordlist: A list of words.

Participant Roles

Actor, performer: A person who acts or performs a part in a dramatic production, play, skit, or religious pageant.

Analyst: A person who analyzes a recording, text or dataset.

Annotator: A person who annotates a recording, text or dataset.

Author: The writer of a book, article, report, poem, etc.

Collector: A person or organization that was responsible for or that oversaw the collection of the materials contained in an archival collection; the person or organization whose name is on a collection.

Compiler: A person who produces a list or dataset by assembling information or material from other sources.

Consultant: A person who provides expert information on a topic.

Contributor: A person who contributed in some way to creation of the material. Use only if there is not a more specific role.

Creator: The creator of a resource. Use only when nothing more specific is appropriate.

Data technician, keyboarder: A person who entered the data (e.g., into a database program, word processing file, etc.).

Depositor: A person who deposits a collection in an archive. Frequently this person is the same as the collector; sometimes this person is the heir or representative of the collector.

Digitizer: A person or institution that digitized an analog artifact.

Editor: A person who edits a book, journal, document, or file.

Illustrator: A person who draws or creates pictures or illustrations.

Interlocutor: A person who takes part in a conversation or dialog.

Interpreter: A person who interprets or translates oral speech or signed language.

Interviewer: A person who conducts an interview or elicitation either free form or following a questionnaire.

Photographer: A person who takes photographs.

Publisher: A business that publishes a book, article, map, image, recording, etc.

Recorder: A person who makes an audio recording.

Research participant: A research assistant, collaborator, or subject. Use only when nothing more specific is appropriate.

Researcher: A person who conducts, oversees, or is in charge of the research.

Responder: A person who answers interview or elicitation questions.

Sign language signer: A person who signs the sign language under investigation.

Signatory: A person who signs a document.

Singer: A person who sings a song.

Speaker: A person who speaks the research language.

Sponsor: A person or organization that funded the research.

Transcriber: A person who transcribes an audio or video recording.

Translator: A person who translates a recording (audio or video), transcription or other text.

Videographer: A person who makes a video recording.
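
Three of the roles above (Contributor, Creator, and Research participant) are meant to be used only when nothing more specific applies. One possible way to review a draft role list for this is sketched below. The function name and the input shape (a mapping from person to a set of role terms) are hypothetical, and the check is a review heuristic rather than an AILLA rule: a person may legitimately hold a specific role and also have contributed in some other way.

```python
# Roles whose definitions say to use them only when no more specific role fits.
GENERIC_ROLES = {"Contributor", "Creator", "Research participant"}


def flag_generic_roles(roles_by_person):
    """Map each person to the generic roles listed alongside a specific role.

    roles_by_person: dict mapping a person's name to a set of role terms.
    People with only generic roles, or only specific roles, are not flagged.
    """
    flagged = {}
    for person, roles in roles_by_person.items():
        generic = roles & GENERIC_ROLES
        specific = roles - GENERIC_ROLES
        if generic and specific:
            flagged[person] = generic
    return flagged
```

For example, a person listed as both Speaker and Contributor would be flagged for a second look, while a person listed only as Creator would not.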

Media Content Type

Annotation: A text that contains added comments, notes, explanations, etc.

Commentary: A description of or comment on another recording, text, event, etc., similar to a director's commentary on a movie or a book review.

Context: Context for an event, such as introductory remarks before a performance.

Guide: Index, table of contents, gloss codes, abbreviation list, etc.

Illustration: Any image (hand-drawn or computer rendered) other than a photograph.

Interlinearization: Annotation that includes transcription, morpheme breakdowns, glosses, and translations.

Interpretation: A re-telling or interpretation of the primary text.

Photo: Photograph.

Primary text: The original recording (audio, video) of a speech event or the entirety of an original written document.

Sample: A one-minute sample of a primary recording.

Transcription: A written representation of speech or signed language.

Transcription & translation: A written representation of speech plus the translation of that speech into another language.

Translation: The translation of a primary text or a transcription. This can be written or audio/video recorded.

Original Media Type

audio:cassette - Audio: analog magnetic tape enclosed in a cassette.

audio:DAT - Audio: magnetic Digital Audio Tape enclosed in a cassette.

audio:digital - Born-digital audio; audio recorded directly to digital media.

audio:minidisc - Audio: Magneto-optical disc-based audio storage; small disk enclosed in a plastic case.

audio:open-reel - Audio: Reel-to-reel magnetic tape audio recording; tape moves from one reel to another.

image:analog - Analog images: photographs printed on paper, prints, negatives, or slides.

image:digital - Digital images of any born-digital format: jpg, jpeg, gif, tiff, png.

other - Anything not listed.

text:archivable digital - Born-digital textual files that can be archived without first converting to another format: csv, eaf, html, pdf/a, trs, txt, xml.

text:Excel - A Microsoft Excel file.

text:manuscript - Analog text: handwritten or typed.

text:published - Analog text: Previously published books, articles.

text:Shoe,Toolbox - Any kind of Shoebox or Toolbox database file.

text:unarchivable digital - Born-digital textual files in proprietary formats that must be converted to non-proprietary formats.

text:unpublished - Analog text: printed materials that were not published and that are unlikely to have been preserved elsewhere.

text:Word - A Microsoft Word file.

text:WordPerfect - A WordPerfect file.

unknown - Unknown original medium; not specified by depositor/creator.

video:avi - An avi video file.

video:cassette - Video: analog magnetic tape enclosed in a cassette.

video:digital - Born-digital video, any unlisted format (mov, wma, …).

video:dv - DV or mini-DV, digital video on tape enclosed in a cassette.

video:mov - An mov video file.

video:mpeg - Archivable video formats in the mpeg family.
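
Most Original Media Type terms follow a category:subtype pattern, with "other" and "unknown" standing alone. A minimal sketch of checking a metadata value against the list above and splitting it into its parts follows. The function name is hypothetical, and exact, case-sensitive matching is an assumption; AILLA's own tools may treat case or spacing differently.

```python
# The terms exactly as listed in the Original Media Type schema above.
ORIGINAL_MEDIA_TYPES = {
    "audio:cassette", "audio:DAT", "audio:digital", "audio:minidisc",
    "audio:open-reel",
    "image:analog", "image:digital",
    "other",
    "text:archivable digital", "text:Excel", "text:manuscript",
    "text:published", "text:Shoe,Toolbox", "text:unarchivable digital",
    "text:unpublished", "text:Word", "text:WordPerfect",
    "unknown",
    "video:avi", "video:cassette", "video:digital", "video:dv",
    "video:mov", "video:mpeg",
}


def split_media_type(term):
    """Validate a term and return (category, subtype).

    subtype is None for the standalone terms "other" and "unknown".
    Raises ValueError if the term is not in the controlled vocabulary.
    """
    if term not in ORIGINAL_MEDIA_TYPES:
        raise ValueError(f"not an Original Media Type term: {term!r}")
    category, separator, subtype = term.partition(":")
    return category, (subtype if separator else None)
```

Splitting on the first colon only keeps subtypes such as "Shoe,Toolbox" and "archivable digital" intact.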