Resources in our repository exist in different languages, for different types of materials, as well as for different categories of material. Thanks to the use of standards for the storage and description of these resources, they can all be listed from a single access point, as demonstrated here.
UCLA Phonetics Lab Data
Index of Languages, Index of Sounds, Map Index, and material relevant to Peter Ladefoged's books A Course in Phonetics and Vowels and Consonants.
Localized Dictionaries for Mozilla Thunderbird
The XPI files are essentially ZIP archives with some additional requirements imposed by Mozilla.
Unzip one and you get two files, xxxxxx.dic and xxxxxx.aff, which are in the "standard" MySpell format.
and the links from it will tell you how to interpret them.
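As a rough illustration of the unzip step described above, the following Python sketch uses the standard zipfile module to find and extract the .dic and .aff files from an XPI archive. The function names and the example file name are hypothetical, and the sketch assumes only that the XPI is a regular ZIP archive; where the dictionary files sit inside the archive may differ from one add-on to another.

```python
import zipfile
from pathlib import Path

# Suffixes of the two MySpell files mentioned above.
DICTIONARY_SUFFIXES = (".dic", ".aff")


def list_dictionary_members(xpi_path):
    """Return the sorted names of .dic and .aff members inside an XPI (ZIP) archive."""
    with zipfile.ZipFile(xpi_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(DICTIONARY_SUFFIXES)
        )


def extract_dictionary(xpi_path, target_dir):
    """Extract only the .dic and .aff members into target_dir and return their paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(xpi_path) as archive:
        members = [
            name for name in archive.namelist()
            if name.lower().endswith(DICTIONARY_SUFFIXES)
        ]
        # ZipFile.extract returns the path of the file it wrote.
        return [Path(archive.extract(name, path=target)) for name in members]


if __name__ == "__main__":
    # "example-dictionary.xpi" is a placeholder name, not a real download.
    xpi = Path("example-dictionary.xpi")
    if xpi.exists():
        print(list_dictionary_members(xpi))
```

Calling extract_dictionary("example-dictionary.xpi", "out") would then leave the two dictionary files under the out directory, ready to be inspected with a text editor.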
Dictionaries list at linguistlist
Links to over 200 dictionaries of specific languages and a collection of multilingual dictionaries, as well as acronym dictionaries, thesauri, and dictionaries of specialized terms. It also includes dictionary projects (e.g. The Euro Wordnet Project).
Chinese word frequencies based on film and television subtitles.
Dutch word frequencies based on film and television subtitles.
According to the authors: "The best definitions and explanations for over one million topics." Matches from dictionaries, Wikipedia, and WordNet are all grouped on one screen.
Dictionary for street slang.
Various dictionary resources.
Four Letter Words
This small project is an attempt to give a spatial overview of the entirety of this part of English language heritage, as well as to explore and visualize relations between all four letter words.
WordCount™ is an interactive presentation of the 86,800 most frequently used English words.
English (US) word frequencies based on film and television subtitles.
Lexical resources at the LEAD (Université de Bourgogne)
List of lexical databases at Orthorélie
Glossaries of terms
Specialized lexicons, glossaries, and dictionaries (very long list)
Diccionario de la lengua Española
Nuevo Tesoro Lexicográfico de la Lengua Española
La Real Academia Española
Royal Academy of Spanish Language
A multilingual pseudoword generator.
[From the website] On this webpage you will find an annotated reference system to find everything related to Corpus Linguistics that is available on the Internet: Corpora, Concordances, Corpus Linguistics research efforts and events, software for tagging, annotation etc.
Devoted to Corpora (Bookmarks for Corpus-based Linguists)
[From the website] These annotated links (c. 1,000 of them) are meant mainly for linguists and language teachers who work with corpora, not computational linguists/NLP (natural language processing) people, so although the language-engineering-type links here are fairly extensive, they are not exhaustive (for such info, you'll have to look elsewhere). Stuff here also represent my personal interests and biases (which will be obvious in some of my descriptive notes) and consequently there may be gaps, errors and omissions which you are welcome to tell me about. The English language bias on these pages will, I hope, be forgiven.
ELDA (Language Resources Distribution)
[From the website] Our catalogue of language resources currently gathers around 700 spoken and written language resources. It can be accessed from the ELRA web site and from the ELDA web site. The identification and the collection of existing language resources is part of our regular activity. The new resources we have collected, once the catalogue has been updated, are announced on some mailing lists, as well as in the ELRA members' news and in the quarterly ELRA newsletter.
EURALEX (European Association for Lexicography)
[From the website] EURALEX is the European Association for Lexicography: an international association which was founded in 1983, with the aims of furthering all aspects of the broad field of lexicography, and of promoting the exchange of ideas and information. It is committed to the development of lexicography in all European languages (as well as other non-European languages). EURALEX's interests include dictionaries of all kinds (monolingual, bilingual, and multilingual, general and specialist, in book and in machine-readable form); metalexicography, the theory of lexicography, and the history of lexicography; the praxis of dictionary-making; dictionary use; terminology and terminography; corpus lexicography; computational lexicography and dictionaries for natural language processing; and lexicology in general.
Linguistic Data Resources on the Internet
A topically organized list of language data resources on the Internet.
Archives for Language and Machine Learning
[From the website] SIGLEX, a Special Interest Group on the Lexicon of the Association for Computational Linguistics, provides an umbrella for research interests on lexical issues ranging from lexicography and the use of online dictionaries to computational lexical semantics. SIGLEX is also the umbrella organization for SENSEVAL, evaluation exercises for Word Sense Disambiguation.
History of the English Language
List of links about the English language and its historical changes.
Une Histoire de la langue française @ Globe-Gate
Collection of nearly 100 links related to French, its dialects, and its historical changes.