Just the Word & WORDLE… a match made in [lexical] heaven

Just the Word (http://193.133.140.102/JustTheWord/) is gaining popularity with practitioners as well as researchers. WORDLE is a wonderful graphic interface for illustrating corpus frequency statistics. Few people are aware of the ADVANCED feature on WORDLE, or of how to use it to ‘mash up’ input from a site like Just the Word.

Here is an example WORDLE based on high frequency collocates of RESEARCH using the pattern analysis of the BNC from Just the Word.  I replaced the root RESEARCH with a bullet to make it less cluttered.

[Image: Wordle of Research collocates]

And here is how I did this:

WORDLE has an ‘advanced’ button (top right) that takes you to http://www.wordle.net/advanced – from here, you can specify not only the ‘size’ of the words, but also the colour.

For example, from Just the Word I generated the collocates of ‘RESEARCH’. I then did a little Excel ‘magic’: I sorted all the collocates by pattern and kept only those with a frequency between 100 and 1000 (so that the WORDLE is not dominated by one or two really high frequency items). I then selected a different colour for each PATTERN. Because RESEARCH was the common root, I replaced it with a ‘bullet’ so that the repeated word does not dominate the graphic. Finally, I pasted the data into the ADVANCED feature. See http://www.wordle.net/show/wrdl/2168943/Research_collocates

Here is the original filtered data from JTW.  (I copied the JTW output, put it into EXCEL and then executed a few formulae to repeat the PATTERN and cluster data.)

COLLOCATION FREQUENCY CLUSTER PATTERN
carry out research 155 cluster 1 V obj *research*
conduct research 132 cluster 1 V obj *research*
undertake research 122 cluster 2 V obj *research*
do research 358 cluster 3 V obj *research*
research show 380 cluster 1 *research* subj V
research suggest 131 cluster 1 *research* subj V
research have 745 cluster 4 *research* subj V
recent research 171 cluster 1 ADJ *research*
further research 190 cluster 9 ADJ *research*
more research 115 cluster 9 ADJ *research*
medical research 242 cluster 9 ADJ *research*
much research 102 cluster 9 ADJ *research*
own research 153 cluster 9 ADJ *research*
scientific research 240 cluster 9 ADJ *research*
social research 182 cluster 9 ADJ *research*
such research 111 cluster 9 ADJ *research*
market research 425 cluster 1 N *research*
Cancer research 114 cluster 2 N *research*
research into 708 cluster 2 *research* PREP
research on 644 cluster 2 *research* PREP
research in 840 cluster 2 *research* PREP
research by 164 cluster 2 *research* PREP
research at 151 cluster 2 *research* PREP
research department 103 cluster 1 *research* N
research group 205 cluster 1 *research* N
research institute 214 cluster 1 *research* N
research team 151 cluster 1 *research* N
research unit 178 cluster 1 *research* N
research study 135 cluster 2 *research* N
research work 132 cluster 2 *research* N
research method 141 cluster 3 *research* N
research programme 316 cluster 3 *research* N
research project 482 cluster 3 *research* N
research grant 185 cluster 5 *research* N
research council 446 cluster 7 *research* N
research center 344 cluster 7 *research* N
research finding 128 cluster 7 *research* N
research laboratory 189 cluster 7 *research* N
research student 137 cluster 7 *research* N
result of research 117 cluster 4 N PREP *research*
center for research 109 cluster 5 N PREP *research*
research and development 359 cluster 1 *research* and N
our research 148 cluster 1 article *research*
some research 140 cluster 1 article *research*
this research 262 cluster 1 article *research*
their research 171 cluster 1 article *research*
my research 111 cluster 1 article *research*
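
If you would rather script this step than use Excel, rows in the layout above can be split into their four fields with a few lines of Python. This is only a sketch: it assumes every row reads as phrase, frequency, the literal word "cluster", a cluster number, and then the pattern, exactly as in the listing above. The names Collocation and parse_row are illustrative and are not part of Just the Word.

```python
import re
from typing import NamedTuple


class Collocation(NamedTuple):
    """One row of the filtered data (illustrative structure)."""
    phrase: str
    frequency: int
    cluster: int
    pattern: str


# Assumed row layout: "<phrase> <frequency> cluster <number> <pattern>"
ROW = re.compile(
    r"^(?P<phrase>.+?)\s+(?P<freq>\d+)\s+cluster\s+(?P<cluster>\d+)\s+(?P<pattern>.+)$"
)


def parse_row(line: str) -> Collocation:
    """Split one data row into phrase, frequency, cluster and pattern."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised row: {line!r}")
    return Collocation(
        phrase=match["phrase"],
        frequency=int(match["freq"]),
        cluster=int(match["cluster"]),
        pattern=match["pattern"],
    )


# Two rows copied from the data above.
for row in (
    "carry out research 155 cluster 1 V obj *research*",
    "research into 708 cluster 2 *research* PREP",
):
    print(parse_row(row))
```

The header line does not fit this layout, so it should be skipped before parsing.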

Here is the data coded for WORDLE, which I pasted into the ADVANCED feature. Each line has the form phrase:number:HEX, where the number is the FREQUENCY and the HEX value is the HTML colour code. Note that I’ve replaced the word RESEARCH with a bullet. (The last three patterns in the table above are not included.)

carry out•:155:4411AA
conduct•:132:4411AA
undertake•:122:4411AA
do•:358:4411AA
•show:380:00FF48
•suggest:131:00FF48
•have:745:00FF48
recent•:171:6280AA
further•:190:6280AA
more•:115:6280AA
medical•:242:6280AA
much•:102:6280AA
own•:153:6280AA
scientific•:240:6280AA
social•:182:6280AA
such•:111:6280AA
market•:425:62FF48
Cancer•:114:62FF48
•into:708:6280FF
•on:644:6280FF
•in:840:6280FF
•by:164:6280FF
•at:151:6280FF
•department:103:0080FF
•group:205:0080FF
•institute:214:0080FF
•team:151:0080FF
•unit:178:0080FF
•study:135:0080FF
•work:132:0080FF
•method:141:0080FF
•programme:316:0080FF
•project:482:0080FF
•grant:185:0080FF
•council:446:0080FF
•center:344:0080FF
•finding:128:0080FF
•laboratory:189:0080FF
•student:137:0080FF
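
The coding step can also be scripted instead of done by hand in Excel. The sketch below is one possible way to do it in Python, not the method used for this post: it takes (phrase, frequency, pattern) tuples, keeps those inside a frequency range, swaps the root word for a bullet, and appends the colour for the pattern. The colours are copied from the coded data above. The function name to_wordle_lines and the choice to leave out any pattern without a colour are assumptions made for illustration.

```python
import re

BULLET = "\u2022"  # the bullet character that stands in for the root word

# Colour per pattern, copied from the coded data above.
PATTERN_COLOURS = {
    "V obj *research*": "4411AA",
    "*research* subj V": "00FF48",
    "ADJ *research*": "6280AA",
    "N *research*": "62FF48",
    "*research* PREP": "6280FF",
    "*research* N": "0080FF",
}


def to_wordle_lines(rows, root="research", low=100, high=1000):
    """Turn (phrase, frequency, pattern) tuples into phrase:number:HEX lines."""
    # Match the root as a whole word, together with any spaces around it.
    root_re = re.compile(r"\s*\b" + re.escape(root) + r"\b\s*", re.IGNORECASE)
    lines = []
    for phrase, frequency, pattern in rows:
        if not low <= frequency <= high:
            continue  # outside the chosen frequency range
        colour = PATTERN_COLOURS.get(pattern)
        if colour is None:
            continue  # assumption: patterns without a colour are left out
        label = root_re.sub(BULLET, phrase)
        lines.append(f"{label}:{frequency}:{colour}")
    return lines


# Three rows taken from the filtered data.
sample = [
    ("carry out research", 155, "V obj *research*"),
    ("research show", 380, "*research* subj V"),
    ("market research", 425, "N *research*"),
]
for line in to_wordle_lines(sample):
    print(line)
```

For these three rows the printed lines match the corresponding entries in the coded data above, and the result can be pasted straight into the ADVANCED box.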

Neat, eh?