the words associated with a single headword of 2+2lemma.txt, in all of
to existing files were to correct a small number of embarrassing errors.
inflections, even though technically they are not. Thus, "talented"
language Web. (See the announcement.)
spelling over a British spelling here. This has some effect on
It is composed of entries of 1 or 2 lines each. The first
manage to keep up. After all, it took them 20 years to recognize the word
but it is credited with half the total count for the word, and
appear that capitalization on the Web is random, or at least beyond
"price", "Price" and "PRICE" were counted separately.) The
and consistently related words. These suffixes are -ful, -ish,
delighted to see the interest in these lists for projects ranging from
replicated web pages linking to one another, with the hope of creating
The 12Dicts Word Lists, release 5
the "putt" headword? Since there is no way of knowing, when I
fewer words than 2of12inf.txt.
extracted at random from my own lists. This is a use of 12dicts of which I
Perhaps my favorite example is that "nostdinc" (a compiler option
interactive games to literacy programs. And I have been
2+2lemma list, but with the headwords arranged approximately by the
the word "borscht" on the Web is "Borscht", and of "mesh" is "MeSH".
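Ignoring capitalization amounts to adding together the counts of all forms that differ only in case. The following is a minimal sketch of that idea, not the procedure actually used to build the lists. The function name and the sample counts are invented for illustration and are not taken from the Google data.

```python
from collections import Counter

def merge_case_variants(raw_counts):
    """Sum the counts of forms that differ only in capitalization."""
    merged = Counter()
    for form, count in raw_counts.items():
        # "price", "Price" and "PRICE" all accumulate under "price".
        merged[form.lower()] += count
    return merged

# Made-up counts, for illustration only.
sample = {"price": 500, "Price": 300, "PRICE": 200, "MeSH": 40, "mesh": 10}
totals = merge_case_variants(sample)
```

With these sample numbers, the three spellings of "price" collapse into one entry, as do "MeSH" and "mesh".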
the announcement.) Th
chose to ignore capitalization. This was necessary - as it would
plural inflection, as with "meaning" and "kindness". Such words
based on powers of 2. That is, the band of least
data with considerably higher frequency than credible for English.
information has always been of interest to "word nerds" like myself,
further changes, except perhaps for minor error corrections. Ho
the number of headwords. I treat "cheque" as a variant of
cross-references are indicated:
based, baseless, basely, baseness, baser, bases -> [basis], basest, basing
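A line of related words like the one above can be split into words and their cross-references with a few lines of Python. This is only a sketch: the function name is hypothetical, and the handling of several comma-separated targets inside one pair of brackets is an assumption about the format, not something stated here.

```python
import re

def parse_related(line):
    """Split a related-words line into (word, cross_references) pairs."""
    entries = []
    # Split on commas, except commas that sit inside a [...] cross-reference.
    for item in re.split(r",\s*(?![^\[]*\])", line.strip()):
        word, arrow, target = item.partition("->")
        if arrow:
            # Assumption: multiple targets, if any, are comma-separated.
            refs = [t.strip() for t in target.strip().strip("[]").split(",")]
        else:
            refs = []
        entries.append((word.strip(), refs))
    return entries

related = parse_related(
    "based, baseless, basely, baseness, baser, bases -> [basis], basest, basing"
)
```

Each word comes back paired with a list of the headwords it cross-references; the list is empty for words that have no cross-reference.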
No distinction is made of different meanings of the same word,
Almost always, a given word has only one cross-reference - the exception is the incre
accumulated the frequencies I used the expedient technique of dividing
instance, "likely" is not considered to be derived from "like", nor
by agid.txt. I release neol2007.txt into the public domain.
would have implied more significance to the data than is actually
have been added, marked with a + if they w
and in this form is likely quite uncommon on the Web. The noun
there is no -al word to apply the -ly suffix to. (For instance,
The list of related words contains three sorts of entries.
the Google data,
Science assignments specifying a 12dicts list rather than
"holier" relates to "holey" as well as "holy"), or that an inflection
"bashful" from "bash". There are some rather difficult questions
under the Linux operating system) occurs more frequently on the