Metadata - cataloging archive materials

AILLA Metadata Spreadsheet

Video Tutorials about the AILLA Metadata Spreadsheet

Metadata is catalog information about each resource. Some of it is used in searches, some defines property rights, and some documents how the resource was created. Complete metadata makes resources easier to accession, easier to protect, and more useful over time. We recommend that you use the AILLA Metadata Spreadsheet (or some other metadata collection method) to manage your language documentation corpus as it is created. We require that you use the AILLA Metadata Spreadsheet when you submit your collection to AILLA.

The best way to understand what metadata is and what information we want from you is to browse the archive and look at a variety of examples. The AILLA Metadata Spreadsheet contains a description of each of the metadata fields.

How to complete AILLA's Metadata Spreadsheet

ALL deposits must be accompanied by the AILLA Metadata Spreadsheet. There are six tabs or sheets:

1. Collection: The contextual information about the collection or project. You will fill out only one row on this tab. See Collections below for more information.

2. Languages: Information about the languages in your collection. See Languages below for more information.

3. Resources: Files are bundled together in a folder in some sort of meaningful way; that folder is called a resource. You will fill out one row per resource. See Resources below for more information.

4. Media Files: Fill out one row for every single file. Note that Column B “Resource name” corresponds to Column A “Title” on the Resources tab.

5. Contributors: Fill out one row for every person who participated in the project and wants to be named. Do not include people who wish to remain anonymous.

6. Terms: A list of our controlled vocabularies.
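
The link between the Media Files tab and the Resources tab can be checked before submission. The following is a minimal sketch, not an AILLA tool: it assumes the two tabs have already been read into lists of dictionaries, one per row. Only the column names "Title" and "Resource name" come from the description above; the "File name" column, the function name, and all sample values are hypothetical.

```python
# Hypothetical rows standing in for two tabs of the spreadsheet.
# Only "Title" (Resources) and "Resource name" (Media Files) come from the
# description above; "File name" and every value are made up for illustration.
resources = [
    {"Title": "Story of the fox"},
    {"Title": "Word list 2015"},
]
media_files = [
    {"File name": "fox_story.wav", "Resource name": "Story of the fox"},
    {"File name": "fox_story.pdf", "Resource name": "Story of the fox"},
    {"File name": "wordlist.wav", "Resource name": "Word list 2014"},  # mistyped title
]


def unmatched_media_files(resources, media_files):
    """Return media-file rows whose "Resource name" matches no resource "Title"."""
    titles = {row["Title"].strip() for row in resources}
    return [row for row in media_files if row["Resource name"].strip() not in titles]


# Report every file that points at a resource title that does not exist.
for row in unmatched_media_files(resources, media_files):
    print(f'{row["File name"]}: no resource titled "{row["Resource name"]}"')
```

A check like this catches mistyped resource names early, since a file whose "Resource name" matches no "Title" cannot be placed in any resource.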

Collections

If this is your first deposit, you will be creating a new collection. This is an organizing layer for your materials that will help future users view your work as a coherent whole. Please take a few minutes to browse some collection pages to get an understanding of this concept. We encourage you to include collection overview materials, such as a bibliography, a summary of the project, maps, etc., anything that will help future users understand and make effective use of your corpus.

Languages

All collections will include materials focusing on one or more indigenous languages of Latin America. These are the subject languages of resources and the collection languages of a collection. The materials themselves may be in the subject languages, or may include other languages. For example, an article about Mapudungun may be written in Spanish, or a speaker of Portuguese may elicit vocabulary from a bilingual speaker of Xingú. In these cases, Spanish and Portuguese would be media languages, while Mapudungun and Xingú would be both subject languages and media languages, since they appear in the files and are also the focus of the files. AILLA staff will review your list of languages and identify which ones already exist in our collections; records will be created for any new languages.
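
The distinction between subject languages and media languages can be illustrated with a short sketch. This is only an illustration of the two roles described above, not part of the AILLA Metadata Spreadsheet; the function name and the data layout are assumptions.

```python
def language_roles(subject, media):
    """Map each language to the roles it plays in one resource.

    `subject` holds the languages the resource is about; `media` holds the
    languages that actually appear in the files. Both are sets of names.
    """
    roles = {}
    for lang in sorted(subject | media):
        lang_roles = []
        if lang in subject:
            lang_roles.append("subject")
        if lang in media:
            lang_roles.append("media")
        roles[lang] = lang_roles
    return roles


# The article example from the text: written in Spanish, about Mapudungun,
# with Mapudungun also appearing in the file.
article = language_roles(
    subject={"Mapudungun"},
    media={"Spanish", "Mapudungun"},
)
print(article)
```

In this sketch Spanish ends up with the media role only, while Mapudungun carries both roles, matching the example in the paragraph above.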

Resources

Files are bundled together in a folder, called a resource, in some meaningful way, such as a recording with its transcription and translation files and some photographs. Some bundles consist of a single item, such as a journal article. Others have hundreds of items. For example, a book based on several performances of verbal art has these elements:

  • all the pages in the book, scanned into several hundred tiff files;
  • audio files in two formats (wav and mp3) for each performance;
  • additional photographs of the performers.

Depositors decide how to organize their materials into Resources, and provide descriptions of how the items in each Resource are related. Users can then use this information to reassemble and reuse the Resource.

AILLA's metadata schema conforms to the standards for language resources defined by both the International Standards for Language Engineering and the Open Language Archives Community. We use the United States Library of Congress MODS and MADS metadata schemas.