test constructor
Recently Published Documents


TOTAL DOCUMENTS

7
(FIVE YEARS 1)

H-INDEX

1
(FIVE YEARS 0)

2020 ◽  
Vol 11 (3) ◽  
pp. 289-306
Author(s):  
Harvey Goldstein ◽  
Michele Haynes ◽  
George Leckie ◽  
Phuong Tran

The presence of randomly distributed measurement errors in scale scores such as those used in educational and behavioural assessments implies that careful adjustments are required to statistical model estimation procedures if inferences are required for ‘true’ as opposed to ‘observed’ relationships. In many cases this requires the use of external values for ‘reliability’ statistics or ‘measurement error variances’ which may be provided by a test constructor or else inferred or estimated by the data analyst. Popular measures are those described as ‘internal consistency’ estimates and sometimes other measures based on data grouping. All such measures, however, make particular assumptions that may be questionable but are often not examined. In this paper we focus on scaled scores derived from aggregating a set of indicators, and set out a general methodological framework for exploring different ways of estimating reliability statistics and measurement error variances, critiquing certain approaches and suggesting more satisfactory methods in the presence of longitudinal data. In particular, we explore the assumption of local (conditional) item response independence and show how a failure of this assumption can lead to biased estimates in statistical models using scaled scores as explanatory variables. We illustrate our methods using a large longitudinal data set of mathematics test scores from Queensland, Australia.
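As a rough illustration of the quantities this abstract discusses, the sketch below computes a conventional internal-consistency estimate (Cronbach's alpha) and applies the classical correction for attenuation to a regression slope. It is not the authors' framework: the function names and the score table are hypothetical, and the correction assumes a single error-prone predictor with independent random errors, which is exactly the kind of assumption the paper says may fail.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Internal-consistency estimate for a respondents-by-items score table.

    scores: list of rows, one per respondent, each a list of k item scores.
    """
    k = len(scores[0])
    # Variance of each item across respondents.
    item_variances = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of the aggregated (summed) scale score.
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def disattenuate_slope(observed_slope, reliability):
    """Classical correction: with one error-prone predictor and independent
    random errors, observed slope is approximately reliability * true slope."""
    return observed_slope / reliability


# Hypothetical data: 4 respondents, 3 items (made up for illustration).
scores = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5]]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
print(disattenuate_slope(0.40, alpha))
```

If item responses are not conditionally independent, alpha computed this way can misstate the reliability, and the corrected slope inherits that error.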


2019 ◽  
Vol 73 (5) ◽  
pp. 101-115
Author(s):  
Viktor E. Bondarenko

A computerized adaptive test presents items according to the student's knowledge level, so fewer items are given to each student. The end of such a test is also determined by the student's knowledge level, which allows an instructor to reduce testing time. Such tests are usually constructed on the basis of Item Response Theory (IRT), whose models use statistical data about students' knowledge levels and item difficulty. For new tests, no such statistics exist. For these cases, this paper proposes estimating the complexity of items from expert judgments obtained with a modified analytic hierarchy process (AHP). The modification allows experts to estimate the complexity of items from a collection of item characteristics, and it can remove experts' inadequate estimates of items or their characteristics. The method allows experts to classify all items into clusters according to their complexity at the first stage of testing, when item usage statistics are absent. A test constructor, on the basis of a network of decision tables, implements the algorithm for selecting items from the different clusters. Once tutors have tested a sufficient number of student groups and recorded test usage statistics, the test constructor receives these statistics and can use IRT models to estimate the complexity of the test items. Students' knowledge levels are assessed with an adaptive test based on a network of decision tables, which determines the algorithm for using items from the different clusters. The adaptive test is built as a computer system on the Java platform using the Android Studio programming environment. Its interface is suitable for students as well as for a constructor, and it allows the constructor to change the item-use algorithm if the collected item usage statistics show this is necessary.
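The expert-based step can be pictured with standard AHP. This is not the modified procedure the paper proposes, whose details the abstract does not give. The sketch below derives complexity weights from a reciprocal pairwise-comparison matrix using the row geometric-mean approximation, then groups the ranked items into clusters. The function names, the item names, the comparison values, and the equal-sized clustering rule are all assumptions made for illustration.

```python
from math import prod


def ahp_weights(pairwise):
    """Priority weights from a reciprocal pairwise-comparison matrix,
    using the row geometric-mean approximation of standard AHP."""
    n = len(pairwise)
    geo_means = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def cluster_by_complexity(names, weights, n_clusters):
    """Rank items by weight and cut the ranking into consecutive groups.

    Returns at most n_clusters groups, least complex first. Equal-sized
    cuts are an assumption, not the paper's clustering rule.
    """
    ranked = sorted(zip(names, weights), key=lambda pair: pair[1])
    size = -(-len(ranked) // n_clusters)  # ceiling division
    return [[name for name, _ in ranked[i:i + size]]
            for i in range(0, len(ranked), size)]


# Hypothetical expert judgments: entry [i][j] says how many times more
# complex item i is than item j (so [j][i] is the reciprocal).
pairwise = [
    [1.0, 0.5, 0.25],
    [2.0, 1.0, 0.5],
    [4.0, 2.0, 1.0],
]
weights = ahp_weights(pairwise)
print(cluster_by_complexity(["q1", "q2", "q3"], weights, 2))
```

Once enough response data have been collected, weights like these would be replaced by IRT-based estimates, as the abstract describes.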


2014 ◽  
Vol 4 (1) ◽  
Author(s):  
Rudi Camerer

The testing of intercultural competence has long been regarded as the field of psychometric test procedures, which claim to analyse an individual's personality by specifying and quantifying personality traits with the help of self-answer questionnaires and the statistical evaluation of these. The underlying assumption is that what is analysed and described as a candidate's personality can be treated as an indicator of that same person's practical performance in intercultural encounters. From the point of view of a test constructor for language competence, all intercultural tests of this type raise basic questions concerning their construct and predictive validity. Against this background, this article firstly examines the shortcomings of personality-based tests of intercultural competence. Secondly, based on relevant parts of the CEFR as well as on the work of numerous contributors to the international debate, a practicable construct of intercultural communicative competence is suggested. Special attention is paid to the concept of politeness in intercultural encounters and the role of English as a lingua franca (ELF). Thirdly, a basic outline of a criterion-based test of intercultural competence in English is provided. The test procedures on which this article draws have been extensively piloted and are part of a training package including test specifications, course materials and teacher-training material.


2003 ◽  
Vol 20 (1) ◽  
pp. 57-87 ◽  
Author(s):  
Abdoljavad Jafarpur

2001 ◽  
Vol 66 ◽  
pp. 91-99
Author(s):  
Arie Hoeflaak

The use of video in foreign language teaching is considered to be a powerful tool by many teachers and researchers. It seems, however, that a sound 'video teaching methodology' has not yet been fully developed. This article sets out to present some reflections on the advantages of the use of video. We will then briefly describe some elements from two more or less theoretical studies, Lang (1995) and particularly Paivio (1986), and discuss the results of other experiments that we found in the literature. Finally, we will put forward some tentative ideas about experiments that we will prepare on the basis of the most important findings of other experiments. The main idea is that information is best processed if it is presented in a redundant way, e.g., both by an audio and a video channel. Many experiments claim that reversed subtitling (subtitles not in L1, but in L2 or FL) is the most successful visual support for foreign language learners. Our experimental design will be organized as follows. Subjects (pre-university students and first-year university students of French) will be divided into four experimental groups to be tested under four different conditions: 1) Image, sound (French spoken text), no subtitles; 2) Image, no sound, French subtitles; 3) No image, sound, subtitles; 4) Image, sound, subtitles. We hypothesize that condition 4 will yield the best result, but before conducting the experiment, we will have to examine three aspects: 1) The assessment format: subjects might consider open questions unclear, whereas, in closed questioning (true-false, multiple choice, cloze), items might be biased by the test constructor. 2) Clarifying the distinction between high, medium, and low redundancy. 3) Bi- or multimodal information input may lead to cognitive overload.


1989 ◽  
Vol 14 (3) ◽  
pp. 279-290 ◽  
Author(s):  
Jos J. Adema ◽  
Wim J. van der Linden

Recently, linear programming models for test construction were developed. These models were based on the information function from item response theory. In this paper another approach is followed. Two 0-1 linear programming models for the construction of tests using classical item and test parameters are given. These models are useful, for instance, when classical test theory has to serve as an interface between an IRT-based item banking system and a test constructor not familiar with the underlying theory.
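To make the idea of a 0-1 test-construction model concrete, the sketch below treats each item as a binary include/exclude decision and searches all subsets of a fixed length by brute force. The objective (sum of item discriminations) and the constraint (a bound on mean item p-value) are illustrative assumptions, not the two models given in the paper, and the item bank is made up. A real item bank would need an integer programming solver rather than enumeration.

```python
from itertools import combinations


def select_items(items, length, p_min, p_max):
    """Pick `length` items maximizing total discrimination, subject to
    p_min <= mean p-value <= p_max.

    items: list of (name, p_value, discrimination) tuples.
    Returns (chosen_items, score), or (None, -inf) if no subset is feasible.
    """
    best, best_score = None, float("-inf")
    for subset in combinations(items, length):  # each subset = one 0-1 vector
        mean_p = sum(p for _, p, _ in subset) / length
        if not (p_min <= mean_p <= p_max):
            continue  # violates the difficulty constraint
        score = sum(r for _, _, r in subset)
        if score > best_score:
            best, best_score = subset, score
    return best, best_score


# Hypothetical item bank: (name, classical p-value, item-total correlation).
bank = [("a", 0.9, 0.5), ("b", 0.5, 0.4), ("c", 0.3, 0.3), ("d", 0.6, 0.2)]
chosen, score = select_items(bank, length=2, p_min=0.42, p_max=0.65)
print([name for name, _, _ in chosen], round(score, 2))
```

Because only classical statistics appear in the inputs, a test constructor unfamiliar with IRT can specify targets in terms they already use.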

