14-15 December 2018, Medipol University - Item Writing II: Assessing Reading Skills 

by Pinar Demiral Gündüz & Berna Akpinar Aslan


Day 1 – Morning

Session 1: Four Skills Exam by OSYM

The first session of the Forum was given by Ömer Faruk Yıldız, who presented the test specifications of a new four-skills examination. By incorporating a variety of tasks at varying levels of cognitive processing, the exam aims to create a backwash effect that shifts learners’ focus away from exam-centred study and rote learning. The exam will be aligned to the Common European Framework of Reference for Languages (CEFR) and will draw on the European Qualifications Framework and the Turkish Qualifications Framework. Initially, it will be administered in e-exam centres in Ankara, Istanbul and Izmir. 

The target group is young adults who will enter higher education programs, as well as candidates for master's and doctoral programs. Moreover, organizations that require a four-skills proficiency exam for their employees will also be able to have their candidates take the exam. 

All educational institutions and organizations will be able to set their own required scores. For this purpose, a test specifications document will be provided so that institutions and organizations can define their own score requirements. 

The test construct is as follows: 

•Reading: 40 items based on 6 written texts. The questions will comprise 12 B1 items, 16 B2 items, and 12 C1 items. 

•Listening: 40 items based on 4 spoken texts. The items will follow the same level distribution as described above for Reading. 

•Writing: There will be 3 writing tasks, each based on input and a prompt. The first task will be at B1 level, the second at B2 level, and the third at C1 level. 

•Speaking: There will be 2 speaking tasks. Task 1 will have 15 graded questions, with target levels ranging from B1 to C1. The second task will be at the level of B1/C1. 

The task types will vary and will not be limited to multiple choice questions. Besides traditional multiple choice questions, there will also be multiple response items, diagram/mind-map completion, paragraph ordering, paragraph/sentence completion, identification of irrelevant information, and summary completion. 

In the final part of the session, the participants were given one sample reading task, and were asked to give feedback based on the criteria. 


Session 2: Automatic Text Analysis Facilitating Text Selection in Reading Assessment by Dr. Aylin Ünaldı

Dr. Aylin Ünaldı’s session focused on automatic text analysis as an aid to text selection. Although the session dealt with analysing texts to ensure that they are at the right level of difficulty for an exam, the tools the presenter referred to are also useful for materials writers. 

Selecting texts at the appropriate level for a specific purpose is vital to ensure the content validity of the exam, as well as to make certain that the level of the tests or core materials remains consistent over the years. 

However, it should also be noted that various factors other than the text itself can affect how challenging a reading exam is. These include whether test-takers have background knowledge of the topic, the task types that are used with the reading input, and even the motivation of the test-takers. 

The presenter referred to the more widely used tools, such as Flesch-Kincaid Grade Level and Flesch Reading Ease, Lexile by MetaMetrics, and Vocabulary Profiler (Lextutor) by Heatley & Nation. Among these easy-to-use tools, Lexile could give a more accurate picture of text difficulty since it brings Flesch readability analysis and Vocabulary Profiler together in one tool. 

However, the presenter suggested that these tools fall short of determining the level of difficulty accurately because they rely on a limited range of factors: the frequency of vocabulary items, the length of sentences and the number of passive sentences. In fact, even when sentences are the same length, their level of difficulty can differ greatly. A tool that takes other factors into account can therefore determine the level of difficulty much more accurately. For instance, noun phrases can greatly increase the level of challenge. Rhetorical patterns and the use of linkers were given as further factors that can affect the difficulty level. Coh-Metrix is a tool that can analyse these and other factors. 

The presenter suggested that each institution start by analysing its course materials to clearly define level expectations, and then analyse its exams and anchor them to the analysis of the course materials. Because Coh-Metrix gives a very detailed analysis that can be too complicated to deal with, the presenter suggested that institutions identify a few factors and check these consistently. For example, narrativity, syntactic simplicity, referential cohesion and deep cohesion are aspects of a text that can greatly affect its readability. 
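To make concrete why surface-level readability formulas capture only a narrow range of factors, the sketch below computes the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level scores. Both depend on just two ratios: words per sentence and syllables per word. The function names and the syllable counter are illustrative assumptions, not part of the session; the syllable counter in particular is a rough vowel-group heuristic, so real tools will give somewhat different scores.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (a rough heuristic, an assumption)."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing "e" as silent, except in "-le" endings such as "table".
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (sentences, words, syllables) for a non-empty English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    sentences, words, syllables = text_counts(text)
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level: higher scores indicate harder text."""
    sentences, words, syllables = text_counts(text)
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    sample = "The cat sat on the mat. The dog ran."
    print(round(flesch_reading_ease(sample), 2))
    print(round(flesch_kincaid_grade(sample), 2))
```

Because neither formula looks at noun phrases, linkers or cohesion, two texts with the same sentence and word lengths receive the same score regardless of how demanding they are to read.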


Day 1 - Afternoon 

Focus Group Discussions

FOAI XI participants worked in three groups, namely A1-A2, B1, and B2, and took part in discussions which focused mainly on the agenda summarized below:

•the major considerations in assessing reading at A1-A2, B1, and B2 levels

•what to include in a reading test specification (i.e. text source/genre, text length, lexical profile, etc.)

•the most commonly used task types in assessing reading

•a discussion on how reliable, valid, and practical these common task types are considering the level objectives and needs

•alternative tasks for reading assessment

The groups then briefly analysed sample tasks used at different institutions. As usual, the participants acknowledged that every institution has specific needs and shared their opinions and suggestions for revision. The discussions were fruitful, as they offered a unique opportunity to see samples from a variety of institutions and to discuss what could be done to further improve reading assessment in one’s own institution, and why. 

Day 2 – Morning

The focus groups got together and prepared a presentation based on their discussion on Day 1. Three focus group presentations* were then delivered to all FOAI participants, which enabled everyone to see the differences and similarities across levels. The highlights of the presentations, including agreed points and major variations, can be summarised as follows:

•There is a need for a test specification detailing what to consider when producing each reading test, so that any item writer in charge of producing a test can easily understand what they need to do. There should be two types of specifications: a detailed one for assessors and a simpler one for other stakeholders.

•Authenticity of texts is considered important, although it remains a major concern and challenge in lower-level reading assessment. 

•Common task types include multiple choice, short answer responses, sentence completion, gap-fill, and matching. Each task type has its own strengths and shortcomings, so the choice of tasks should take the needs of the institution into account.

•There seems to be a need for, and a desire to include, more alternative forms of reading assessment, such as reading-into-writing and reading-into-speaking tasks. Some participants shared creative activities of this kind. 

For further details regarding the FOAI XI sessions and the presentations of the focus groups and Dr. Aylin Ünaldı, you may visit the FOAI website.