March 2015

ARTICLES
USING THE CORPUS OF CONTEMPORARY AMERICAN ENGLISH TO LEARN ABOUT NEAR SYNONYMS: COLLOCATIONS
Roger W. Gee, Holy Family University, Philadelphia, Pennsylvania, USA

Near synonyms, such as small/little and strong/powerful, are words that have similar meanings, but different uses or distributions. Because they have different uses, near synonyms often cannot be interchanged. At times it’s difficult to explain to students the differences between near synonyms, but the Corpus of Contemporary American English (Davies, 2008-) may be used to learn about the ways in which near synonyms vary and to provide material for teaching.

The Corpus of Contemporary American English (COCA) is a freely available corpus, though registration is required. It contains 450 million words of American English in use from 1990–2012, and it is balanced in two ways. First, it is balanced by genre; the corpus contains five equal-sized sections of spoken, fiction, magazine, newspaper, and academic texts. Second, it is balanced by year: COCA includes 20 million words for each year from 1990–2012 (Davies, 2008-). These features of balance are important when using the corpus for instructional purposes.

The interface for COCA allows for quite sophisticated searches, but basic searching is rather straightforward. To get started with COCA, a Brief Tour is available under Help. The tour guides the user through various options on the search interface. As seen in Figure 1, help for each section of the interface is available by clicking on the question mark on the right side of the interface. Also, there are several introductions to COCA available on YouTube.


Figure 1. COCA basic search interface

One of the ways in which near synonyms differ is in their collocates, or the words with which they commonly occur. With COCA, it’s easy to find the collocates of each near synonym and to compare the collocates of two words. Referring again to Figure 1, there are four options under Display. “List” is the first option, and it can be used to find collocates of a single word.

In Figure 2, the “List” option is chosen, the word little is entered as the word to search for, and “collocates” has been selected. These settings will result in a list of words that occur frequently with little in a window, or span, of four words to the left and four words to the right. These words are collocates of little, and the reader will recall that near synonyms often have different collocates. The five most frequent collocates for little are given in Table 1.


Figure 2. COCA interface to search for collocates of little


Table 1. Collocates of little

Note that the frequency of each collocate is also given. Bit occurs with little 31,859 times in COCA. There is a rapid fall-off in frequency for the other collocates; girl occurs only 8,957 times with little, and nervous only 869 times.
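For readers who want to see what a collocate search does, here is a minimal Python sketch of counting the words that fall within a span of four words on either side of a search word. It is an illustration only, not COCA's own implementation: the function name `collocates`, the simple tokenizer, and the tiny sample text are all assumptions, and the sketch neither respects sentence boundaries nor filters out function words such as "a".

```python
from collections import Counter
import re


def collocates(text, node, span=4):
    """Count words within `span` tokens to the left or right of `node`."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        left = tokens[max(0, i - span):i]    # up to `span` words before the node
        right = tokens[i + 1:i + 1 + span]   # up to `span` words after the node
        counts.update(w for w in left + right if w != node)
    return counts


# Tiny illustrative sample; a real search runs over the whole corpus.
sample = (
    "It's just kind of a funny little thing. "
    "He's got a very, very funny little book. "
    "He developed a funny little half smile."
)

for word, freq in collocates(sample, "little").most_common(3):
    print(word, freq)
```

On a corpus as large as COCA the same kind of count produces the frequencies reported for each collocate, although the corpus interface may apply further filtering that this sketch does not attempt.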


Figure 3. COCA interface to compare collocates of little and small

A useful feature of COCA is the ability to see how frequent a collocate is with each of two near synonyms, compared to the overall frequency of the two words. Figure 3 shows the interface settings for this contrast. This time the “Compare” option was chosen, and in the second search string field, “small” was entered. The relative frequency of collocates for little and small is given in Table 2. When examining Table 2, it is important to note that none of the top 10 collocates for little appears as one of the most frequent collocates for small, and vice versa. That is, while in COCA funny appears 355 times as a collocate of little, it never appears as a collocate of small. Likewise, saucepan appears as a collocate of small 632 times, but never as a collocate of little.
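The idea of comparing a collocate's frequency with each word against the overall frequency of the two words can be sketched in a few lines of Python. This is a hedged illustration, not the formula COCA uses: the function `compare_collocates`, the 0.5 smoothing for zero counts, and the two overall frequencies are assumptions made for the example. Only the collocate counts (355 and 632) come from the text above.

```python
def compare_collocates(counts_w1, counts_w2, total_w1, total_w2):
    """Rank collocates of word 1 by how strongly they prefer it over word 2."""
    rows = []
    for collocate, f1 in counts_w1.items():
        f2 = counts_w2.get(collocate, 0)
        # Assumed smoothing: treat a zero count as 0.5 so the ratio stays finite.
        raw_ratio = f1 / max(f2, 0.5)
        # Adjust for how frequent the two search words are overall.
        score = raw_ratio / (total_w1 / total_w2)
        rows.append((collocate, f1, f2, round(score, 2)))
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Collocate counts reported in the article.
little = {"funny": 355}
small = {"saucepan": 632}

# Placeholder overall frequencies (NOT real COCA figures); substitute real ones.
TOTAL_LITTLE = 1_000
TOTAL_SMALL = 1_000

for row in compare_collocates(little, small, TOTAL_LITTLE, TOTAL_SMALL):
    print(row)
for row in compare_collocates(small, little, TOTAL_SMALL, TOTAL_LITTLE):
    print(row)
```

Dividing by the ratio of the overall frequencies keeps a very common search word from appearing to dominate every collocate simply because it occurs more often.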

Table 2. Relative frequency of collocates for little and small


These lists provide a starting point for teaching the collocates of near synonyms. It is easy to make a matching exercise with the word lists. In addition, clicking on the W1 number for each collocate provides a list of sentences containing that collocate. So, for funny, clicking on “355” results in 355 sentences that could be used to create exercises. Some sentences would be too difficult or otherwise inappropriate for use with students, but here are five examples I selected from among the first 20 sentences:

  • It's just kind of a funny little thing.
  • He's got a very, very funny little book.
  • He developed a funny little half smile.
  • Funny little monkeys, aren't they?
  • I found this funny little chair, marked "Snug-seat," at a tag sale.

Choosing sample sentences for several collocates of both little and small would provide authentic material that could be turned into a variety of sentence-level exercises.
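One way to turn selected sentences into a sentence-level exercise is a simple gap-fill, in which the near synonym is blanked out and students choose between the two words. The Python sketch below is one possible approach under stated assumptions: the function name `make_gap_fill` and the blank format are hypothetical, and the sentences are the examples listed above, copied by hand rather than downloaded from COCA.

```python
import re


def make_gap_fill(sentences, targets):
    """Blank out the first target word in each sentence; return (prompt, answer) pairs."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in targets) + r")\b",
        re.IGNORECASE,
    )
    items = []
    for sentence in sentences:
        match = pattern.search(sentence)
        if match is None:
            continue  # skip sentences that contain neither target word
        prompt = sentence[:match.start()] + "_____" + sentence[match.end():]
        items.append((prompt, match.group(0).lower()))
    return items


sentences = [
    "He developed a funny little half smile.",
    "Funny little monkeys, aren't they?",
    'I found this funny little chair, marked "Snug-seat," at a tag sale.',
]

for number, (prompt, answer) in enumerate(make_gap_fill(sentences, ["little", "small"]), 1):
    print(f"{number}. {prompt} (little / small)")
```

Adding sentences for collocates of small to the same list would give students items where each of the two words is the expected answer.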

Here is a second example using the near synonyms strong/powerful. The 10 collocates with the highest relative frequency for strong and for powerful are given in Table 3. The first thing to note is that, again, none of the most frequent collocates are shared between the two words, though some collocates of one word do appear infrequently with the other. For example, showing, which appears 217 times with strong, appears once with powerful, and weapon, which appears as a collocate of powerful, occurs once with strong.

Table 3. Collocates of strong and powerful

Another important feature to note is that several of the collocates of powerful belong to the semantic sets of machines and weapons, while those of strong do not. That is, the collocates of each near synonym tend to have different semantic associations. Using COCA to investigate the semantic associations of the collocations of near synonyms will be the topic of a future newsletter article.

References

Davies, M. (2008-). The corpus of contemporary American English: 450 million words, 1990-present. Retrieved from http://corpus.byu.edu/coca/.

Gee, R. (2015). Created by the author from http://corpus.byu.edu/coca/ (accessed February 15, 2015).


Roger W. Gee is a professor in the School of Education at Holy Family University, where he is the director of the Masters in TESOL and Literacy Program. He is interested in the use of language corpora in teacher education.
