Language Test development

SurveyLang brings together six of the world’s leading organisations in language assessment (Cambridge ESOL, CIEP, Goethe-Institut, Instituto Cervantes, Universidad de Salamanca, and CVCL Università per Stranieri di Perugia) to develop and deliver the Language Test instruments for the European Survey on Language Competences.

As well as drawing on these organisations’ considerable experience, SurveyLang implemented innovative processes for collaborative language test development to meet the complex demands of this fast-paced and large-scale pan-European survey. Martin Robinson at Cambridge ESOL coordinates this exciting area of work.

SurveyLang’s approach to test development reflects the long and continuing interaction between SurveyLang partners and the Council of Europe Language Policy Division on the initiatives which developed the CEFR. This history of collaboration underpins our claim to a uniquely detailed understanding of the CEFR levels and practical expertise in relating assessments to them.

The initial test design

The Commission specified that the Survey use the CEFR as the framework against which to measure language learning outcomes. Mindful of the need to ensure relevance for 15-year-olds in a school setting, SurveyLang has adopted a socio-cognitive model based on the CEFR’s model of language use and learning. It identifies two dimensions: the social (functional language use in real life) and the cognitive (language as a developing set of competences, skills and knowledge).

Using this model, SurveyLang defined testable abilities at each proficiency level (A1–B2 of the CEFR). To ensure the test construct could be implemented comparably across languages, these abilities were mapped to specific task types, drawing chiefly on those task types SurveyLang partners have used successfully in their exams.

The sets of testable abilities and sample tasks for each proficiency level were then the subject of consultation and a Pilot Study before a final set of task types was agreed on for Pretesting, the Field Trial and the Main Study.

The routing test

The development of 'targeted testing' was another example of innovation in SurveyLang’s approach to the delivery of the European Survey on Language Competences.

SurveyLang developed a short routing test (approximately 15 minutes) to be administered in advance of the Field Trial and Main Study test sessions. Depending on the score obtained on the routing test, students were assigned to one of three overlapping versions of the Language Test, which together cover the CEFR levels assessed: an ‘easy’ test (A1-A2), an ‘intermediate’ test (A2-B1) and a ‘difficult’ test (B1-B2). This helped ensure that candidates received tests closely matched to their ability, thereby increasing the validity of the test results.
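The routing logic can be pictured as a simple threshold-based assignment. The sketch below is purely illustrative: the function name, the 0–20 score scale and the cut scores are assumptions for the example, not the values actually used in the Survey.

```python
# Illustrative sketch of targeted-testing routing.
# ASSUMPTIONS: the 0-20 routing-score scale and the cut scores (8, 14)
# are hypothetical; the Survey's actual cut scores are not given here.

def route_to_test(routing_score: int) -> str:
    """Assign a student to one of three overlapping Language Test
    versions based on their routing-test score."""
    if routing_score < 8:
        return "easy (A1-A2)"
    elif routing_score < 14:
        return "intermediate (A2-B1)"
    else:
        return "difficult (B1-B2)"

# Example assignments at three ability levels:
print(route_to_test(5))   # easy (A1-A2)
print(route_to_test(10))  # intermediate (A2-B1)
print(route_to_test(18))  # difficult (B1-B2)
```

Because each version overlaps its neighbour by one CEFR level, a student routed near a cut score still receives tasks at a level they can attempt, which is what makes the targeting robust to small routing errors.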

Pilot Study

The Pilot Study provided SurveyLang with important feedback on the proposed task types and enabled language partners to test their collaborative item writing procedures. This helped ensure the sharing of best practice while also contributing to the cross-language comparability of tasks.

Cross-language vetting was also successfully piloted and has been adopted in the main item writing stage of the Survey. This process meant that language tasks were vetted by at least two other language partners against a list of criteria to ensure that tasks, items and options were operating correctly, and that topics, content and level of difficulty were appropriate. See Pilot Study and Pretesting for a more detailed overview.

Item writing and pretesting

After the Pilot Study phase, test specifications were finalised and task types to be used in the Survey were agreed with the Commission, participating countries and other important stakeholders.

A lengthy process of item writing, editing and cross-language vetting was completed in preparation for pretesting in October 2009. Schools in countries participating in the survey, as well as other selected educational institutions, took the pretests. Following this, extensive analysis of the level and quality of test tasks and items took place. Further editing of tasks was carried out, with the best quality tasks being selected for the Field Trial.

Field Trial

The Field Trial in early 2010 involved a sample of approximately 40 schools for each tested language in each participating country. As with the pretesting phase, a cycle of analysis, editing and re-approval took place before the best quality tasks were selected for use in the Main Study.

Main Study

The Main Study in early 2011 involved a sample of 71 schools from each participating country for each tested language. Schools and students were sampled to ensure they were nationally representative of the target testing population. A more detailed account of the processes used can be found on the sampling page.

For information on how the analysis and results are used, see analysis and results and standard setting.