Categories used

In this repository, we try to provide access to material of interest to psycholinguists. Accordingly, the data material listed in our repository is organized using the levels of analysis described below. Separate pages detail the types of material available within each category and the standards used for the storage and description of these resources.

1. Parts of words

Statistics attached to parts of a word. Examples are Bigram or Trigram Frequency (e.g., the number of times that the bigram "ar" is found in English words) and Syllable Frequency (e.g., the number of times that the syllable "tar" is found in English words).
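As an illustration, the sketch below counts letter n-grams over a word list. It is a minimal example, not the method used by any resource listed here: the function name `ngram_counts` and the five-word toy lexicon are assumptions made for illustration, and every occurrence is counted once per word form, without weighting by word frequency. Syllable frequency is not covered, since it requires a syllabified lexicon.

```python
from collections import Counter


def ngram_counts(words, n=2):
    """Count how often each letter n-gram occurs across a list of words.

    Hypothetical helper: counts every occurrence in every word form
    (no weighting by word frequency).
    """
    counts = Counter()
    for word in words:
        word = word.lower()
        # Slide a window of n letters across the word.
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


# Toy word list, for illustration only (not a real lexicon).
toy_lexicon = ["tar", "star", "art", "target", "cat"]

bigrams = ngram_counts(toy_lexicon, n=2)
trigrams = ngram_counts(toy_lexicon, n=3)

print(bigrams["ar"])    # occurrences of the bigram "ar" in the toy list
print(trigrams["tar"])  # occurrences of the trigram "tar" in the toy list
```

With a real lexicon, the same counts are often weighted by the frequency of each word rather than counted once per word form.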

View all Parts of Words statistics listed in our repository...

2. Words


Statistics attached to a word in full. Examples are Word Frequency, Age of Acquisition (i.e., age at which a word has been acquired), Lexical Neighborhood (i.e., number of words that share all letters but one with the string).

These statistics are typically found in lexical databases or in tables that list a somewhat limited number of words, along with a limited set of attached variables, such as age of acquisition or frequency.
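The Lexical Neighborhood measure can be sketched in a few lines. The example below assumes that a neighbor has the same length as the string and differs from it by exactly one letter in the same position, which is one common reading of "share all letters but one"; the function name `neighbors` and the toy lexicon are hypothetical.

```python
def neighbors(target, lexicon):
    """Return the words in `lexicon` that differ from `target` by one letter.

    Assumption: a neighbor has the same length as the target and differs
    by a single substituted letter in the same position.
    """
    target = target.lower()
    found = set()
    for word in lexicon:
        word = word.lower()
        if len(word) != len(target) or word == target:
            continue
        # Count the positions at which the two strings disagree.
        mismatches = sum(a != b for a, b in zip(word, target))
        if mismatches == 1:
            found.add(word)
    return sorted(found)


# Toy word list, for illustration only (not a real lexicon).
toy_lexicon = ["cat", "cot", "cut", "hat", "cart", "dog", "can"]

cat_neighbors = neighbors("cat", toy_lexicon)
print(cat_neighbors, len(cat_neighbors))
```

The same function also works for a nonword such as "cag", because the target string does not need to be in the lexicon.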

View all Words statistics listed in our repository...

3. Nonwords


A fairly recent trend is to create databases for nonwords, that is, letter strings that could be English words but do not match any real word of the language. Such strings are particularly useful for evaluating readers' knowledge of print-to-sound associations. They also make it possible to avoid material that inevitably differs in lexical properties when studying stages of processing at which the lexical status of the string is irrelevant (typically early perceptual processes).

View all Nonwords statistics listed in our repository...

4. Running text


Statistics about words in sentences (syntax, prosody, etc.).

View all Running text statistics listed in our repository...

5. Visual Material

Psycholinguistic studies sometimes also involve the presentation of visual material for naming. For instance, picture naming tasks have been used to test the source of age-of-acquisition effects [ref Bonin et al.].

View all Visual Material statistics listed in our repository...

6. Associations (print-to-sound, sound-to-print)

Statistics that reflect the connection between a word in one modality (for instance, the written word) and its equivalent in another (for instance, the spoken word). They typically reflect the regularity or consistency with which parts of words are translated from print to sound or the other way around. For instance, consistency estimates provide a coefficient that reflects the probability with which a given segment is found with a given pronunciation in English words. The Body-Rime consistency estimate, for instance, reflects the consistency of pronunciation for each possible body of English (a body is made up of the nucleus and coda of a syllable, that is, the vowel and final consonants of a one-syllable word).
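A body-rime consistency coefficient of this kind can be sketched as a simple proportion. The example below is an assumption-laden illustration rather than the procedure used by any listed resource: it counts each word once (no frequency weighting), the function name `body_consistency` is hypothetical, and the pronunciation codes ("Int", "aInt") are an ad hoc ASCII notation chosen for this example.

```python
from collections import Counter, defaultdict


def body_consistency(entries):
    """Map each (body, rime) pair to the share of words with that body
    that are pronounced with that rime.

    `entries` is an iterable of (word, body, rime) triples. Each word
    counts once; a frequency-weighted variant is also possible.
    """
    rimes_by_body = defaultdict(Counter)
    for _word, body, rime in entries:
        rimes_by_body[body][rime] += 1
    return {
        (body, rime): count / sum(rimes.values())
        for body, rimes in rimes_by_body.items()
        for rime, count in rimes.items()
    }


# Toy entries with hand-coded pronunciations, for illustration only.
toy_entries = [
    ("mint", "int", "Int"),
    ("hint", "int", "Int"),
    ("lint", "int", "Int"),
    ("tint", "int", "Int"),
    ("pint", "int", "aInt"),  # the exception to the dominant pronunciation
]

consistency = body_consistency(toy_entries)
print(consistency[("int", "Int")])   # share of "-int" words rhyming with "mint"
print(consistency[("int", "aInt")])  # share of "-int" words rhyming with "pint"
```

A coefficient close to 1 indicates that a body is almost always pronounced the same way, while lower values flag bodies with competing pronunciations.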

View all Associations statistics listed in our repository...

7. Datasets


Datasets provide information about the performance of human participants or computer models on a set of words. They are helpful for testing predictions before running a carefully constructed experiment. As a result, they reduce the risk of running an experiment that yields null results, and hence increase productivity.

View all Performance Measures datasets listed in our repository...