The resources in our repository span different languages, types of material, and categories of material.
Because these resources are stored and described according to standards, they can all be listed from a single access point, as demonstrated here.
data
repository
links
1_parts_of_words
  english
2_words
  across_the_board
  chinese
  dutch
  english
  french
  spanish
3_nonwords
  across_the_board
4_running_text
  across_the_board
  english
  french
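As a rough illustration of listing such a layout from one place, here is a minimal sketch. It assumes the repository is available as a local directory tree with one folder per category and one subfolder per language, as in the listing above. The function name `list_resources` and the path `repository` are hypothetical and not part of the original site.

```python
from pathlib import Path


def list_resources(root):
    """Map each category folder under root to the sorted names of its subfolders."""
    listing = {}
    for category in sorted(Path(root).iterdir()):
        if category.is_dir():
            # Subfolders are assumed to be language names or "across_the_board".
            listing[category.name] = sorted(
                child.name for child in category.iterdir() if child.is_dir()
            )
    return listing


if __name__ == "__main__":
    # "repository" is a placeholder path for a local copy of the tree.
    if Path("repository").is_dir():
        for category, languages in list_resources("repository").items():
            print(category, languages)
```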
UCLA Phonetics Lab Data
Index of Languages, Index of Sounds, Map Index, and material relevant to Peter Ladefoged's books A Course in Phonetics and Vowels and Consonants.
Localized Dictionaries for Mozilla Thunderbird
The XPI files are essentially ZIP archives with some special requirements for Mozilla.
Unzip one and you get two files, xxxxxx.dic and xxxxxx.aff, which are in the "standard" MySpell format.
The site lingucomponent.openoffice.org and the links from it explain how to interpret them.
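The unzip step described above can be sketched with Python's standard `zipfile` module. This is a minimal sketch, not an official Mozilla procedure: the function name `extract_myspell_files` and the file name `my_dictionary.xpi` are hypothetical, and the code makes no assumption about where inside the archive the .dic and .aff files sit, so it scans every member.

```python
import os
import zipfile


def extract_myspell_files(xpi_path, dest_dir):
    """Extract the .dic and .aff members of an XPI (ZIP) archive into dest_dir.

    Returns the paths of the extracted files.
    """
    extracted = []
    with zipfile.ZipFile(xpi_path) as archive:
        for name in archive.namelist():
            # Keep only the two MySpell files; skip the Mozilla packaging files.
            if name.lower().endswith((".dic", ".aff")):
                extracted.append(archive.extract(name, dest_dir))
    return extracted


if __name__ == "__main__":
    # "my_dictionary.xpi" is a placeholder for a downloaded dictionary package.
    if os.path.exists("my_dictionary.xpi"):
        for path in extract_myspell_files("my_dictionary.xpi", "unpacked"):
            print(path)
```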
Dictionaries list at linguistlist
Links to over 200 dictionaries of specific languages and a collection of multilingual dictionaries, as well as acronym dictionaries, thesauri, and dictionaries of specialized terms. It also includes dictionary projects (e.g., the EuroWordNet project).
Answers.com
According to the authors: "The best definitions and explanations for over one million topics." Matches from dictionaries, Wikipedia, or WordNet are all grouped on one screen.
Four Letter Words
This small project is an attempt to give a spatial overview of the entirety of this part of English language heritage, as well as to explore and visualize relations between all four-letter words.
wordcount.org
WordCount™ is an interactive presentation of the 86,800 most frequently used English words.
corpus-linguistics.de
[From the website] On this webpage you will find an annotated reference system to find everything related to Corpus Linguistics that is available on the Internet: Corpora, Concordances, Corpus Linguistics research efforts and events, software for tagging, annotation etc.
Devoted to Corpora (Bookmarks for Corpus-based Linguists)
[From the website] These annotated links (c. 1,000 of them) are meant mainly for linguists and language teachers who work with corpora, not computational linguists/NLP (natural language processing) people, so although the language-engineering-type links here are fairly extensive, they are not exhaustive (for such info, you'll have to look elsewhere). Stuff here also represent my personal interests and biases (which will be obvious in some of my descriptive notes) and consequently there may be gaps, errors and omissions which you are welcome to tell me about. The English language bias on these pages will, I hope, be forgiven.
ELDA (Language Resources Distribution)
[From the website] Our catalogue of language resources currently gathers around 700 spoken and written language resources. It can be accessed from the ELRA web site and from the ELDA web site. The identification and the collection of existing language resources is part of our regular activity. The new resources we have collected, once the catalogue has been updated, are announced on some mailing lists, as well as in the ELRA members' news and in the quarterly ELRA newsletter.
EURALEX (European Association for Lexicography)
[From the website] EURALEX is the European Association for Lexicography: an international association which was founded in 1983, with the aims of furthering all aspects of the broad field of lexicography, and of promoting the exchange of ideas and information. It is committed to the development of lexicography in all European languages (as well as other non-European languages). EURALEX's interests include dictionaries of all kinds (monolingual, bilingual, and multilingual, general and specialist, in book and in machine-readable form); metalexicography, the theory of lexicography, and the history of lexicography; the praxis of dictionary-making; dictionary use; terminology and terminography; corpus lexicography; computational lexicography and dictionaries for natural language processing; and lexicology in general.
Linguistic Data Resources on the Internet
A topically organized list of language data resources on the Internet.
SIGLEX
[From the website] SIGLEX, a Special Interest Group on the Lexicon of the Association for Computational Linguistics, provides an umbrella for research interests on lexical issues ranging from lexicography and the use of online dictionaries to computational lexical semantics. SIGLEX is also the umbrella organization for SENSEVAL, evaluation exercises for Word Sense Disambiguation.

