ALIS Newsletter - October 2015 (Plain Text Version)

ARTICLES

A NEW ACADEMIC VOCABULARY LIST

The role of academic vocabulary in the school success of ELLs is well established (Nagy & Townsend, 2012). Simply stated, those who have sufficient vocabulary knowledge are better readers and writers than those who do not. By extension, those who read and write well perform better in school overall, and they are much better equipped to pass the gate-keeping tests of higher education (e.g., ACT, SAT, GRE, GMAT, LSAT, MCAT), thus improving their opportunities for economic success (Gardner, 2013). The same is true for native-language learners. These facts have fueled the extensive interest in vocabulary learning and teaching in academic settings. However, part of the problem with vocabulary training is that there are several million lexemes of English (words with different meanings), and many new words enter the language every year. Where does one begin to learn or teach from such an enormous pool of words?

With this in mind, my colleague and I analyzed a corpus of more than 120 million running words of academic English and found that a relatively small number of distinct words (3,015) were much more common across the nine major disciplines of our academic corpus than they were in general English (Gardner & Davies, 2014). We formalized these words into the Academic Vocabulary List (AVL). This new list differs in many ways from the Academic Word List (AWL; Coxhead, 2000), which has served language education well for more than 15 years. I highlight these differences in what follows.

  1. The AWL was based on a corpus of 3.5 million words of academic English, mostly from New Zealand. The AVL was based on a corpus of 120 million words of academic English, primarily from the United States.

  2. Word counts in the AWL were based on word families (base word forms with their inflections and transparent derivatives), whereas the AVL counts were based on lemmas (base word forms distinguished by part of speech, that is, noun, verb, adjective, or adverb, together with their simple inflections). The example in Table 1 illustrates the difference.

    The value of considering lemmas over word families is clear in this example. The noun proceeds (n), pronounced with stress on the first syllable, and the verb proceeds (v), pronounced with stress on the second syllable, would be erroneously grouped into the same word family, but they are correctly treated as different lemmas. Likewise, the noun proceedings (meaning records or minutes), the noun procedure (meaning technique), and the adjective procedural (meaning technical or routine) are counted as their own lemmas instead of as parts of the same word family. Also, knowing whether a particular word is functioning as a noun, a verb, or an adjective helps to constrain the possible meanings of that word. For example, it makes a big difference if we know that study is a noun (e.g., They will complete the study) instead of a verb (e.g., She will study for the exam).

    Table 1. Example of word family versus lemmas

    One Word Family (AWL):
      Proceed, Proceeds, Procedural, Procedure, Procedures, Proceeded, Proceeding, Proceedings

    Four Lemmas (AVL):
      1. Proceed (v), Proceeds (v), Proceeding (v), Proceeded (v)
      2. Proceeds (n)
      3. Proceedings (n)
      4. Procedure (n), Procedures (n)


    Note: yellow here and hereafter indicates academic core words

  3. The AWL was built on top of the General Service List (GSL; West, 1953), a list of 2,000 high frequency word families of English. In other words, any academic words appearing on the GSL were not considered in the AWL (e.g., company, market, account, business, capital, exchange, interest). In contrast, words on the AVL were derived purely by statistics—that is, all words were considered if they appeared statistically more often in academic materials than in other registers of English, and if they had sufficient coverage across nine disciplines of academic English: (1) education; (2) humanities; (3) history; (4) social science; (5) philosophy/religion/psychology; (6) law/political science; (7) science/technology; (8) medicine/health; and (9) business/finance. The result is that the AVL contains academic words at all levels of frequency, making it possible to more effectively address the core academic vocabulary needs of learners at essentially any level of proficiency.

    To illustrate this point, Table 2 provides examples of AVL lemmas at three different frequency bands. It is clear that AVL lemmas in the first column are much more frequent and basic than those in the second and third columns. Despite the relative differences in frequency and sophistication, all of these lemmas are what I refer to as being “saturated with academic sense.” In other words, these words are likely to occur in many different academic disciplines, thus validating their “core” status.

    Table 2. AVL lemmas at three different frequency bands

    Rank  AVL Lemma    POS | Rank  AVL Lemma        POS | Rank  AVL Lemma       POS
    1     study        n   | 1501  bridge           v   | 2996  unidirectional  j
    2     group        n   | 1502  individualism    n   | 2997  redirection     n
    3     system       n   | 1503  noteworthy       j   | 2998  reversion       n
    4     social       j   | 1504  impetus          n   | 2999  obtainable      j
    5     provide      v   | 1505  experimentation  n   | 3000  privation       n
    6     however      r   | 1506  sequential       j   | 3001  inborn          j
    7     research     n   | 1507  continuation     n   | 3002  bimonthly       r
    8     level        n   | 1508  attributable     j   | 3003  capitalistic    j
    9     result       n   | 1509  disparate        j   | 3004  circumscribed   j
    10    include      v   | 1510  safeguard        v   | 3005  targeting       n
    11    important    j   | 1511  suppression      n   | 3006  unusable        j
    12    process      n   | 1512  subset           n   | 3007  unpalatable     j
    13    use          n   | 1513  markedly         r   | 3008  causally        r
    14    development  n   | 1514  concurrent       j   | 3009  prioritization  n
    15    data         n   | 1515  degrade          v   | 3010  overemphasis    n
    16    information  n   | 1516  incompatible     j   | 3011  imprimatur      n
    17    effect       n   | 1517  tenet            n   | 3012  coherently      r
    18    change       n   | 1518  unify            v   | 3013  component       j
    19    table        n   | 1519  indispensable    j   | 3014  tangential      j
    20    policy       n   | 1521  intended         j   | 3015  relevancy       n

    (n = noun; v = verb; j = adjective; r = adverb)

  4. AVL lemmas were subsequently grouped into word families to meet certain learning, teaching, and research purposes. Unlike the AWL, however, AVL families maintain their lemma distinctions within the word families, as the comparison in Table 3 shows. In the AVL case, we can see that only three lemmas are actually on the AVL—control (noun), control (verb), and uncontrolled (adjective). The two red lemmas, controller (noun) and controlled (adjective), are specialized (technical) academic words in the disciplines of science and medicine, respectively. The four gray words, while part of the control family from a purely linguistic perspective, are not found statistically more often in academic materials than they are in general English. The numbers next to the words indicate the frequency of the words in the academic corpus, thus giving some indication of the relative importance of the words. Such detail is very useful for teachers, learners, researchers, and materials writers.

Table 3. Example of AWL word family versus AVL word family

AWL Example for Control Family

control, controlled, controller, controlling, controls, uncontrollable, uncontrollably, uncontrolled

AVL Example for Control Family

Lemma           POS  Discipline  Frequency
control         n                45690
control         v                19621
controller      n    Sci         1780
controlled      j    Med         1392
uncontrolled    j                425
controlling     j                353
uncontrollable  j                337
controllable    j                329
uncontrollably  r                64
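
To make the contrast between the two counting schemes concrete, here is a minimal Python sketch that uses the control family frequencies from Table 3. The function names are my own, chosen for illustration; this is not how either list was actually compiled. A word-family count collapses everything into a single number, while a lemma count keeps the noun and the verb apart.

```python
from collections import defaultdict

# Frequencies in the academic corpus, copied from Table 3 (control family).
# Each entry is (lemma, part of speech, frequency).
CONTROL_FAMILY = [
    ("control", "n", 45690),
    ("control", "v", 19621),
    ("controller", "n", 1780),
    ("controlled", "j", 1392),
    ("uncontrolled", "j", 425),
    ("controlling", "j", 353),
    ("uncontrollable", "j", 337),
    ("controllable", "j", 329),
    ("uncontrollably", "r", 64),
]


def family_total(entries):
    """Word-family view: a single count for the whole family."""
    return sum(freq for _, _, freq in entries)


def lemma_counts(entries):
    """Lemma view: one count per (word, part of speech) pair."""
    counts = defaultdict(int)
    for word, pos, freq in entries:
        counts[(word, pos)] += freq
    return dict(counts)


total = family_total(CONTROL_FAMILY)
by_lemma = lemma_counts(CONTROL_FAMILY)

print("family total:", total)
print("control (n):", by_lemma[("control", "n")])
print("control (v):", by_lemma[("control", "v")])
```

In the family view, the rare adverb uncontrollably inherits the weight of the whole family; in the lemma view, its low frequency remains visible.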


Knowing what is common or core in academic materials (the AVL) allows us to be more precise in determining what is specialized in those materials. Table 4 shows how academic vocabulary can be focused on in terms of academic core words (AVL—yellow), discipline core words (from the general core of English—magenta), and discipline technical words (within specific disciplines—red). All of these are derived statistically from the academic corpus.
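
The two statistical conditions described above (higher frequency in academic than in general English, and sufficient coverage across the nine disciplines) can be sketched roughly as follows. This is an illustration under stated assumptions, not the procedure we actually ran: the thresholds min_ratio, min_disciplines, and min_share are placeholders, and all of the frequencies are invented for the example.

```python
# The nine disciplines of the academic corpus, as named in the article.
DISCIPLINES = [
    "education", "humanities", "history", "social science",
    "philosophy/religion/psychology", "law/political science",
    "science/technology", "medicine/health", "business/finance",
]


def frequency_ratio(academic_pm, general_pm):
    """Ratio of per-million frequency in academic vs. general English."""
    if general_pm == 0:
        return float("inf")
    return academic_pm / general_pm


def disciplines_covered(by_discipline_pm, academic_pm, min_share):
    """Count disciplines where the word reaches a share of its overall rate."""
    floor = academic_pm * min_share
    return sum(1 for d in DISCIPLINES if by_discipline_pm.get(d, 0.0) >= floor)


def is_academic_core(academic_pm, general_pm, by_discipline_pm,
                     min_ratio=1.5, min_disciplines=7, min_share=0.2):
    """Placeholder thresholds (assumptions), not values stated in the article."""
    ratio_ok = frequency_ratio(academic_pm, general_pm) >= min_ratio
    covered = disciplines_covered(by_discipline_pm, academic_pm, min_share)
    return ratio_ok and covered >= min_disciplines


# Invented frequencies (per million words), for illustration only.
evenly_spread = {d: 100.0 for d in DISCIPLINES}
one_discipline = {d: 2.5 for d in DISCIPLINES}
one_discipline["science/technology"] = 880.0

print(is_academic_core(100.0, 40.0, evenly_spread))   # academic and widespread
print(is_academic_core(100.0, 10.0, one_discipline))  # academic but technical
print(is_academic_core(50.0, 60.0, {d: 50.0 for d in DISCIPLINES}))  # general
```

A word that passes the ratio test but fails the coverage test, like the second case, is a candidate for a discipline technical word instead of an academic core word.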

Finally, our dynamic web interface contains important information about all words on the AVL, and also allows users to input any text and receive detailed information about the academic core words (AVL) and discipline-specific words (technical) in that text. This tool is particularly well suited for many academic reading and writing purposes.
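
As a rough idea of what profiling a text against the list involves, the sketch below checks lemma and part-of-speech pairs against a small sample of AVL entries taken from Table 2. It is not the code behind the web interface. It assumes the text has already been lemmatized and tagged by some other tool, the tag "x" is a hypothetical placeholder for any part of speech outside n, v, j, and r, and the sample is far too small to say whether a missing pair is absent from the full AVL.

```python
# A handful of (lemma, POS) pairs from Table 2; the full AVL has 3,015 lemmas.
AVL_SAMPLE = {
    ("study", "n"), ("group", "n"), ("system", "n"), ("social", "j"),
    ("provide", "v"), ("however", "r"), ("research", "n"), ("data", "n"),
}


def profile(tagged_tokens, avl=AVL_SAMPLE):
    """Return the AVL hits in a tagged text and the share of tokens they cover."""
    hits = [(lemma, pos) for lemma, pos in tagged_tokens
            if (lemma.lower(), pos) in avl]
    coverage = len(hits) / len(tagged_tokens) if tagged_tokens else 0.0
    return hits, coverage


# "They will complete the study": study is tagged as a noun.
noun_sentence = [("they", "x"), ("will", "x"), ("complete", "v"),
                 ("the", "x"), ("study", "n")]

# "She will study for the exam": study is tagged as a verb.
verb_sentence = [("she", "x"), ("will", "x"), ("study", "v"),
                 ("for", "x"), ("the", "x"), ("exam", "n")]

noun_hits, noun_coverage = profile(noun_sentence)
verb_hits, verb_coverage = profile(verb_sentence)

print(noun_hits, noun_coverage)
print(verb_hits, verb_coverage)
```

Because the lookup key includes the part of speech, the noun study matches the sample entry while the verb study does not, which is the lemma distinction described earlier.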

Table 4. Academic vocabulary levels

References

Coxhead, A. (2000). A new academic word list. TESOL Quarterly, 34, 213–238.

Gardner, D. (2013). Exploring vocabulary: Language in action. London, England: Routledge.

Gardner, D., & Davies, M. (2014). A new academic vocabulary list. Applied Linguistics, 35(3), 305–327.

Nagy, W., & Townsend, D. (2012). Words as tools: Learning academic vocabulary as language acquisition. Reading Research Quarterly, 47(1), 91–108.

West, M. (1953). A general service list of English words. London, England: Longman, Green.


Dee Gardner is a professor of Applied Linguistics and TESOL at Brigham Young University in Provo, Utah. He specializes in vocabulary, reading, and applied corpus linguistics.