The Language Archive (TLA) is a unit of the Max Planck Institute for Psycholinguistics concerned with digital language resources and tools. Its major features are:
- A large data archive holding resources on languages worldwide. Much of the data consists of annotated audio and video recordings, but several other data types, such as time series (eye tracking, brain images), are included as well. For some languages, the archive also contains language acquisition data, data obtained in psychological experiments, and similar material.
Perhaps the best-known part of the archive concerns data on smaller languages and cultures, as typically obtained in field work. As of the end of 2011, we hold data on about 200 languages; data on some 60 of these come from research projects of the program “Documentation of Endangered Languages” (DOBES).
- TLA is involved in a wide variety of research projects that entail infrastructure and software development. As a result, we have been able to develop several tools for creating, managing and exploring linguistic resources. The “Language Archiving Technology” (LAT) suite of tools and web services allows users, for example, to annotate and analyse recordings and to create and manipulate online multimedia lexical databases. Annotated recordings can be organized into “sessions”, which are described and organized by metadata in standard formats. Using LAT web applications, all data can then be integrated into a structured, sustainable repository with differentiated access levels and user management, and accessed online with dedicated LAT tools for searching and presentation.
- Our long-standing expertise in archiving and software development has provided the basis for our participation in trend-setting international projects and collaborations that aim to develop lasting and functional infrastructures for the digital humanities in general. To name only a few: ISLE, DOBES, CLARIN (EU, D, NL), EUDAT, DASISH, Radieschen, TextGrid, AVATecH, INNET. We also support institutions worldwide that want to establish a LAT-based repository of their own, and we organize and participate in education and training activities.