ReadmeR5.html

present, contains an alphabetized list of related words.  A simple example:

word is spelled in accordance with normal English spelling
observed include "john", "china", "bush", "yahoo" and "august".
denote "Scrabble inflections".)  Depending on your application for
interactive games to literacy programs.  And I have been
"wind", or "crooked", the past tense of "crook").  
that this list is formatted as a collection of word sets, each set

composed of a headword and some number (possibly zero) of closely related
is somewhat arbitrary.  I have consistently chosen an Amer
separately. "wind" the noun and "wind" the verb are considered as a
2+2lemma list, but with the headwords arranged approximately by the

further changes, except perhaps for minor error corrections.  However,
and in this form is likely quite uncommon on the Web.  
terror") and the growing importance of the Internet in our daily lives.
smaller ones as well.)  Many of these words relate to two of the
"bashful" from "bash".  There are some rather difficult questions
but it is credited with half the total count for the word
plural inflection, as with "meaning" and "kindness".  Such words
    based, baseless, basely, baseness, baser, bases -> [basis], basest, basing
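
A related-words line such as the one above can be split mechanically. The following is only a minimal sketch in Python, not part of 12dicts itself. It assumes that related words are separated by commas, and that "word -> [other]" marks a word which is also listed under the headword "other", with a single target inside the brackets. The names CROSS_REF and parse_related are invented for this illustration.

```python
import re

# Assumed notation: "word -> [other]" marks a word that is also listed
# under the headword "other". A single target per bracket is assumed.
CROSS_REF = re.compile(r"^(?P<word>.+?)\s*->\s*\[(?P<target>[^\]]+)\]$")


def parse_related(line):
    """Split one related-words line into (word, cross_reference) pairs.

    cross_reference is None for words that carry no "-> [...]" marker.
    """
    entries = []
    for item in line.strip().split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma or an empty line
        match = CROSS_REF.match(item)
        if match:
            entries.append((match.group("word"), match.group("target")))
        else:
            entries.append((item, None))
    return entries


sample = "    based, baseless, basely, baseness, baser, bases -> [basis], basest, basing"
related = parse_related(sample)
```

With the sample line, "bases" comes back paired with "basis", and every other word is paired with None. A bracket that listed several targets separated by commas would need a more careful splitter than this one.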

2of12inf.txt and 2+2lemma.txt), and a section of additional hyphenated
words which you might choose to add as appropriate to the other lists
god of war rather than to the unit.)  "art" illustrates the other
alternate headword. There are two specific situations which might not be obvious
extracted at random from my own lists. This is a use of 12dicts of which I
supplied by Google on the frequency of English words on the World Wide Web.

Sometimes, the choice of which variant to treat as
have the same inflection ("putting" derives both from "putt" and "put";
Google distinguished words on the basis of capitalization, so that
advertising bias is illustrated by the surprisingly high frequency of
certain ambiguities - should the word "putting" count for the "put" or
me know what you're doing. (Oh, and please put "12di
Google frequency data, and my procedures for processing it, too
are always made headwords, even when the relationship to the or
computer bias is illustrated by words such as "click", "online", "icon"
buzzwords of the 21st century.

Words ending with the suffix -ability/ibility are t

A note on "licensing": 2+2lemma.txt and 2+2gfreq.txt were
lists and their features updated to release 5.

the count evenly between all the possible headwords.  This assumes
 Perhaps my favorite example is that "nostdinc" (a compiler option
are towards advertising and marketing, computers and pornography.  The

The 2+2gfreq list

Here are some other notes on the determination of what words are related.

am publishing the file neol2007.txt, which contains newly popular
the semblance of a frequently referenced site.  At any rate, one
order of their frequency of use.  The "g" in the name stands for

The list 2+2lemma.txt contains the words in the 2of12inf.txt

    the Google data might be somewhat higher than the frequency in
    /usr/dicts/words for their input.  Keep up the good work, and let
    further inaccuracies have been introduced by my own procedures.  
    certain ambiguities - should the word "putting" count for the "put"
    language does not remain static, and the 2007 editions of the 12dicts
    resulting data was sorted by frequency, and then grouped into bands
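
The even split of an ambiguous inflection's count between its possible headwords, followed by a sort by frequency, might look roughly like the sketch below. This is an illustration in Python under stated assumptions, not the procedure actually used to build the list: the function name credit_headwords, the dictionary layout, and all of the counts are invented, and the grouping into bands is left out.

```python
from collections import defaultdict


def credit_headwords(form_counts, form_to_headwords):
    """Credit each form's count to its headword(s), splitting evenly.

    form_counts maps a word form to its raw count. form_to_headwords
    maps an ambiguous form to all headwords it may belong to; a form
    that is missing from it is treated as its own headword.
    Returns (headword, total) pairs, most frequent first.
    """
    totals = defaultdict(float)
    for form, count in form_counts.items():
        headwords = form_to_headwords.get(form, [form])
        share = count / len(headwords)  # even split between the candidates
        for headword in headwords:
            totals[headword] += share
    # Sort by descending total, breaking ties alphabetically.
    return sorted(totals.items(), key=lambda pair: (-pair[1], pair[0]))


# Invented counts, for illustration only.
form_counts = {"put": 900, "putt": 10, "putting": 100}
form_to_headwords = {"putting": ["put", "putt"]}
ranked = credit_headwords(form_counts, form_to_headwords)
```

Here "putting" contributes half of its count to "put" and half to "putt", so the rarer headword gets more credit than it would from a split weighted by actual usage.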

    The 2+2lemma list is not formatted as a simple list of words.
    activity is the development of CAAPR and ABCD, both of which may be

    In the previous editions of 12dicts, I suggested that you write

    Since the previous release of 12dicts, I have been fooling

    Finally, British forms of words in
    to existing files were to correct a small number of embarrassi

