Test (student assessment)
A test or an examination (or "exam") is an assessment, often administered on paper or on a computer, intended to measure a test-taker's or respondent's (often a student's) knowledge, skills, aptitudes, or classification in other domains (e.g., beliefs). Tests are often used in education, professional certification, counseling, psychology (e.g., the MMPI), the military, and many other fields. The measurement that is the goal of testing is called a test score: "a summary of the evidence contained in an examinee's responses to the items of a test that are related to the construct or constructs being measured." [Thissen, D., & Wainer, H. (2001). Test Scoring. Mahwah, NJ: Erlbaum. Page 1, sentence 1.] Test scores are interpreted with regard to a norm or criterion, or occasionally both. The norm may be established independently, or by statistical analysis of a large number of subjects.
A standardized test is one that is administered and scored in a consistent manner to ensure legal defensibility. [North Central Regional Educational Laboratory [http://www.ncrel.org/sdrs/areas/issues/students/earlycld/ea5lk3.htm] ] A large proportion of formal testing is standardized. A standardized test with important consequences for the individual examinee is referred to as a high-stakes test.
The basic component of a test is an "item". Items are often colloquially referred to as "questions," but not every item is phrased as a question; an item may instead be a true/false statement or, in a performance test, a task to be carried out.
The earliest known standardized tests (which included both practical and written components) are the Chinese Imperial Examinations which began in 587. [Feng, Y. (1994). From the Imperial Examination to the National College Entrance Examination: the Dynamics of Political Centralism in China's Educational Enterprise. ASHE Annual Meeting Paper. [http://www.eric.ed.gov/ERICDocs/data/ericdocs2sql/content_storage_01/0000019b/80/13/64/3b.pdf] ]
In Europe, school examinations were traditionally conducted orally: students had to answer questions posed by teachers in Latin, and the teachers graded them on their answers. The first written exams in Europe were held at Cambridge University, England, in 1792, by professors who were paid a piece rate and realized that written exams would earn them more money.
Types of items
Many possible item formats are available for test construction, including multiple-choice, free-response, performance or simulation, true/false, and Likert-type items. There is no single "best" format; which is appropriate depends on the purpose and content of the test. For example, a test on a complex psychomotor task is better served by a performance or simulation item than by a true/false item.
A common type of test item is the multiple-choice question, in which the author of the test provides several possible answers (usually four or five) from which the test subjects must choose. [Haladyna, T. (2004). Developing and Validating Multiple-Choice Test Items. Erlbaum.] There is one right answer, usually represented by a single answer option, though it is sometimes divided across two or more options, all of which subjects must identify correctly.
Test authors generally create incorrect response options, often referred to as distracters, which correspond with likely errors. [Kehoe, Jerard (1995). Writing multiple-choice test items. Practical Assessment, Research & Evaluation, 4(9). Retrieved February 26, 2008 from http://PAREonline.net/getvn.asp?v=4&n=9 ] For example, distracters may represent common misconceptions that occur during the developmental process. The construction of effective distracters is a key challenge that must be faced in order to construct multiple-choice items that possess strong measurement properties.
The functioning of a multiple-choice question can be depicted graphically, as in Figure 1. The x-axis represents the examinee's ability, and the grey line maps ability to the probability of a correct response according to the item's response function.
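A curve of this shape is commonly modeled with a logistic item response function. The sketch below uses the three-parameter logistic (3PL) model; the parameter values are illustrative assumptions, not figures from this article:

```python
import math

def p_correct(theta, a=1.0, b=0.0, c=0.25):
    """Three-parameter logistic (3PL) item response function.

    theta: examinee ability
    a: discrimination (steepness of the curve)
    b: difficulty (ability level at the curve's inflection point)
    c: pseudo-guessing floor (chance of guessing correctly;
       0.25 is plausible for a four-option multiple-choice item)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Low-ability examinees approach the guessing floor; high-ability
# examinees approach certainty of answering correctly.
for theta in (-3.0, 0.0, 3.0):
    print(f"theta={theta:+.1f}  P(correct)={p_correct(theta):.2f}")
```

The probability never falls below the guessing floor `c`, which is one reason multiple-choice items yield less information per item than free-response items at the low end of the ability scale.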
An attractive feature of multiple-choice questions is that they are particularly easy to score. [Test Item Writing - From the University of Alabama at Birmingham [http://www.uab.edu/uasomume/cdm/test.htm] ] Machines such as optical-mark scanners can score large numbers of answer sheets quickly and consistently.
This format is not, however, appropriate for assessing all types of skills and abilities. Poorly written multiple-choice questions often create an overemphasis on simple memorization and deemphasize processes and comprehension. They also leave no room for disagreement or alternate interpretation, making them particularly unsuitable for humanities subjects such as literature and philosophy.
Free-response items
At the other end of the spectrum, scores may be awarded according to superficial qualities of the response, such as the presence of certain important terms. In this case, test subjects can fool scorers by writing a fluent stream of relevant-sounding terminology without demonstrating genuine understanding.
While free-response items have disadvantages, they are able to offer more differentiating power between examinees. [Vale, C.D., & Weiss, D.J. (1977). A Comparison of Information Functions of Multiple-Choice and Free-Response Vocabulary Items. Technical Report, University of Minnesota Psychometric Methods Laboratory. [http://stinet.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=ADA039255] ] However, this might be offset by the length of the item; if a free-response item provides twice as much measurement information as a multiple-choice item, but takes as long to complete as three multiple-choice items, is it worth it?
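The trade-off posed above can be made concrete by comparing measurement information gathered per unit of testing time. The figures below simply restate the hypothetical ratio from the text (twice the information, three times the duration); they are not empirical values:

```python
def info_rate(information, minutes):
    """Measurement information contributed per minute of testing time."""
    return information / minutes

# Hypothetical ratio from the text: one free-response item yields twice
# the information of a multiple-choice item but takes as long to
# complete as three multiple-choice items.
mc_info, mc_minutes = 1.0, 1.0
fr_info, fr_minutes = 2.0 * mc_info, 3.0 * mc_minutes

print(info_rate(mc_info, mc_minutes))  # 1.0 information unit per minute
print(info_rate(fr_info, fr_minutes))  # about 0.67 per minute
```

Under that assumed ratio, a fixed block of testing time filled with multiple-choice items yields more total information, even though each free-response item is individually more informative.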
Performance test or practical examination
Knowledge of "how to do" something does not lend itself well to either free-response or multiple-choice questions; it may be demonstrated outright only by a performance test. [Performance Testing Council - Why Performance Testing? [http://www.performancetest.org/whytest.html] ]
A practical examination may be administered by an examiner in person (in which case it may be called an "audition" or a "tryout") or by means of an audio or video recording.
General aptitude tests, such as the SAT in the United States, have attracted similar criticism.
Similarly, college entrance exams are criticized for not accurately predicting first-year university performance.
The content of the exam might not correspond with its intended use or representation. For example, the proportion of questions devoted to each topic on an exam might not match the emphasis those topics received in the course.
People are variously susceptible to stress. Some are virtually unaffected and excel on tests, while in extreme cases individuals can become very nervous and forget large portions of the exam material. To counterbalance this, examiners often combine test scores with other forms of assessment.
Through specialized training on material and techniques specifically created to suit the test, students can be "coached" on the test to increase their scores without actually significantly increasing knowledge of the subject matter. However, research on the effects of coaching remains inconclusive, and the increase might be simply due to practice effects. [Domino, G., & Domino, M.L. (2006). "Psychological Testing: An Introduction". Cambridge University Press. page 340 Preview available at [http://books.google.com/] ]
Although test organizers attempt to prevent it and impose strict penalties for it, academic dishonesty (cheating) can be used to obtain an advantage over other test-takers. On a multiple-choice test, lists of answers may be obtained beforehand. On a free-response test, the questions may be obtained beforehand, or the subject may write an answer that creates the illusion of knowledge. If students sit in proximity to one another, it is also possible to copy answers from other students, especially when a test-taker knows that a particular neighbor knows the material better. Despite such issues, tests are less susceptible to cheating than other tools of learning evaluation: laboratory results can be fabricated, and homework can be done by one student and copied by rote by others. The presence of a responsible test administrator, in a controlled environment, helps to guard against cheating.
* [http://www.testpublishers.org/faq.htm Association of Test Publishers FAQs]
* [http://www.ncme.org National Council on Measurement in Education]
* [http://www.apa.org/science/jctpweb.html Joint Committee on Testing Practices]
* GCSE and A-level — used in the UK except Scotland
* International General Certificate of Secondary Education (IGCSE) — international exams