Corpus research

In my role as a lexicographer, I’ve become very familiar with corpora as a means of researching language usage. Outside of my work on dictionaries, I carry out corpus research to feed into a whole range of projects, both for my own writing and to produce reports for other writers.

I had the fascinating task of researching new trends in language use for the 4th edition of Collins COBUILD English Usage (Collins, 2019). I’ve given numerous talks about using corpus tools in ELT and I often find myself dragged in to answer language queries on social media! (See a couple of the resulting posts about outdated secretaries and messing things up over on my blog.)

I’ve carried out extensive research using the Cambridge Learner Corpus to feed into a variety of ELT projects for Cambridge University Press over the years. I’ve written about my learner corpus research here and summarized a talk I gave about my work researching Spanish learner errors at the IVACS conference here.

I’ve carried out research and written common error notes and activities for a number of books, both for the international market and for specific local markets. I’ve worked especially with CUP Spain, researching errors typical of Spanish speakers to inform coursebooks aimed at the Spanish market:

  • Complete (ESS; Key, Preliminary & First)
  • Interactive (ESS)
  • Empower (ESS; levels A2, B1 & C1)
  • English in Mind (international editions)
  • English in Mind (ESS editions)
  • Cambridge Learner’s Dictionary

I carried out the research for most of the Common Mistakes at … series of books. I also authored the Common Mistakes at Proficiency and IELTS Advanced titles.

I’ve carried out research and produced reports for CUP authors on projects including:

  • Objective PET
  • Objective First
  • Objective IELTS
  • Complete CAE
  • IELTS Trainer
  • English 365
  • Vocabulary in Practice
  • Business Benchmark