Research related to the National Certificates of Language Proficiency (NC)




The Finnish National Certificates of Language Proficiency (NC) is a language proficiency testing system for adults. It covers nine languages (English, Spanish, Italian, French, Swedish, Sami, German, Finnish and Russian) and three test levels: basic (proficiency levels 1-2), intermediate (proficiency levels 3-4) and advanced (proficiency levels 5-6). Every test includes four subtests: speaking, listening comprehension, writing, and reading comprehension. Assessment is criterion-referenced, and the test-taker receives a certificate that reports a proficiency level rating for each skill on a scale of 1 to 6.

In a high-stakes test such as the NC, test development is an ongoing process in which research and development must be integrated. To ensure that the test is reliable and ethically sound, research and development activities focus on the following key areas:

  • Validity: how well do the proficiency level ratings of a test-taker’s performance describe their language skills in speaking, listening comprehension, writing, and reading comprehension?
  • Reliability: how consistent are the tests from one testing round to another and from one language to another?
  • Impact: how do test-takers and other interested parties, such as authorities and employers, regard and evaluate the test results?
  • Practicability: how functional, ethical, and effective are the test practices in relation to the resources required?

Some examples of research and development concerning these key areas:

Item bank

Building an item bank is an integral part of test development. Tests taken in different testing rounds should measure the same construct, and the results from different rounds should be mutually comparable. The item bank contains test items whose validity has been established both qualitatively and quantitatively. Items are calibrated onto a common difficulty scale by means of IRT (item response theory) methodology (Wright & Stone, 1979; Wright & Masters, 1982). The item bank provides a sound basis for standardizing test items and examining the consistency of the tests.
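The idea of calibrating items from different rounds onto one scale can be illustrated with a minimal sketch. It assumes the dichotomous Rasch model (the model family treated by Wright & Stone) and a simple mean shift over shared anchor items; the item identifiers, difficulty values and function names are hypothetical, and the NC's actual calibration procedure is not described here.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are in logits on the same scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def link_to_bank(new_difficulties, bank_difficulties):
    """Shift a new round's item difficulties onto the bank scale.

    Uses the mean difference over anchor items that appear in both
    calibrations (an assumed, simplified linking method).
    """
    anchors = sorted(set(new_difficulties) & set(bank_difficulties))
    if not anchors:
        raise ValueError("no anchor items shared with the bank")
    shift = sum(
        bank_difficulties[item] - new_difficulties[item] for item in anchors
    ) / len(anchors)
    return {item: d + shift for item, d in new_difficulties.items()}


# Hypothetical values: A1 and A2 are anchor items, N1 is a new item.
bank = {"A1": -1.0, "A2": 0.5}
new_round = {"A1": -0.6, "A2": 0.9, "N1": 1.4}
linked = link_to_bank(new_round, bank)
```

After linking, the new item N1 is expressed on the same scale as the banked items, so its difficulty can be compared directly with items from earlier rounds.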


Standardization of test items involves negotiating the performance requirements for each level of the examination. On the one hand, the standardization process reflects values and conceptions about language, and therefore questions of education policy; on the other hand, it relies on psychometric methods to make the determination of proficiency levels as transparent and appropriate as possible. Standard-setting methods (Kane 2002; Cizek & Bunch 2007) form the basis for standardization research, which focuses on speaking and reading comprehension tasks. The research aims to define the skill level of each test task and to set the level boundaries on the item difficulty scale accordingly.
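Once level boundaries have been agreed, placing a task on a proficiency level is a lookup on the difficulty scale. The sketch below shows only that final step; the cut scores are invented for illustration and are not the NC's actual boundaries, and the function name is hypothetical.

```python
from bisect import bisect_right

# Hypothetical boundaries (in logits) between levels 1|2, 2|3, 3|4, 4|5, 5|6.
# These values are illustrative assumptions, not published NC cut scores.
CUT_SCORES = [-2.0, -1.0, 0.0, 1.0, 2.0]


def level_for(difficulty):
    """Return the proficiency level (1-6) whose band contains the value.

    A value exactly on a boundary is assigned to the higher level.
    """
    return bisect_right(CUT_SCORES, difficulty) + 1


task_levels = {task: level_for(d) for task, d in
               {"task_a": -2.5, "task_b": 0.3, "task_c": 2.4}.items()}
```

The judgement-based part of standard setting, that is, deciding where the boundaries should lie, is exactly what such code cannot supply; it only makes the resulting decision rule explicit.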

Conformity and consistency of rating

The conformity of ratings in speaking and writing and the consistency of individual raters are maintained through continuous training and feedback. Rating data are analyzed with IRT (item response theory) methodology, which requires carefully constructed assessment data in which raters are linked to each other. The analyses provide raters with individual feedback on their rating behavior. Long-term information about individual raters is collected and used to further develop both feedback practices and rater training.
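The requirement that raters be linked to each other can be checked before any analysis is run. The following sketch assumes that two raters count as linked when they have rated at least one performance in common, and that the whole design is usable when every rater can be reached from every other through such links; the data layout and function name are hypothetical.

```python
from collections import defaultdict


def raters_are_linked(ratings):
    """Check whether all raters form one connected network.

    ratings: iterable of (rater, performance) pairs. Raters who rated
    the same performance are treated as directly linked.
    """
    raters_by_performance = defaultdict(set)
    for rater, performance in ratings:
        raters_by_performance[performance].add(rater)

    # Build the rater-to-rater adjacency from shared performances.
    neighbours = defaultdict(set)
    for raters in raters_by_performance.values():
        for rater in raters:
            neighbours[rater] |= raters

    all_raters = set(neighbours)
    if not all_raters:
        return True

    # Depth-first traversal from an arbitrary rater.
    start = next(iter(all_raters))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for other in neighbours[current] - seen:
            seen.add(other)
            stack.append(other)
    return seen == all_raters


linked_design = [("R1", "p1"), ("R2", "p1"), ("R2", "p2"), ("R3", "p2")]
broken_design = [("R1", "p1"), ("R2", "p1"), ("R3", "p2")]
```

In the second design, rater R3 shares no performance with anyone else, so R3's severity could not be compared with that of R1 and R2 on a common scale.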

Language proficiency and test-takers

When taking the NC, test-takers are asked to fill in a background questionnaire covering their sex, date of birth, education, use of the test language in everyday situations, and self-assessment of language proficiency. After the test, they are asked to fill in a second questionnaire about the test tasks and the organization of the examination. The collected data are used to examine the background profiles of test-takers and to investigate, for example, the connection between background variables and language proficiency. The findings are used in developing the testing system, but they also reflect the language proficiency of Finnish adults and provide information relevant to language policy.

The National Certificates corpus

The NC corpus is compiled from test data. The data in the corpus are both quantitative and qualitative: the quantitative data comprise test-takers’ assessments and background information (e.g. sex, mother tongue, education, use of the tested language in various situations), and the qualitative data comprise their speaking and writing performances. The data cover nine languages: English, Spanish, Italian, French, Swedish, Sami, German, Finnish and Russian. The corpus is built on a continuous basis, and new data are added annually. The user interface is designed to allow versatile use of the corpus for both teaching and research purposes. The role of the corpus in research is twofold: it is both a source of data and an object of research. The NC corpus is available as a database on the web, and it is included in the Finnish Social Science Data Archive (http://www.fsd.uta.fi/).