Standards for Educational and Psychological Testing

March 31, 2022



Notes

(1) CUNY-New York State Initiative on Emergent Bilinguals (NYSIEB) is a collaborative project of the Research Institute for the Study of Language in Urban Society (RISLUS; http://web.gc.cuny.edu/dept/lingu/rislus/) and the PhD program in Urban Education (http://web.gc.cuny.edu/urbaneducation/index.html), funded by the New York State Education Department (http://www.nysed.gov/).

(2) Remember: A rubric is not an assessment method; it is a scoring device (or tool) to help make assessment results more reliable and therefore more valid.

(3) This General Home Language Reading Assessment Rubric can be found in A CUNY-NYSIEB Framework for the Education of Emergent Bilinguals with Low Home Literacy: 4–12 Grades by García, O., Herrera, L., Hesson, S. and Kleyn, T. (http://www.nysieb.ws.gc.cuny.edu/files/2013/05/CUNY-NYSIEB-Framework-for-EB-with-Low-Home-Literacy-Spring-2013-Final-Version-05-08-13.pdf). The rubric is in Appendix A.

(4) WIDA no longer uses this acronym definition because it no longer represents the organization's mission. The name is now simply WIDA.

References

AERA, APA, NCME (2014) Standards for Educational and Psychological Testing. Washington, DC: AERA.

Celic, C. and Seltzer, K. (2013) Translanguaging: A CUNY-NYSIEB Guide for Educators. New York: CUNY-NYSIEB, The Graduate Center.

García, O. (2009) Bilingual Education in the 21st Century: A Global Perspective. New York: John Wiley.

Ladson-Billings, G. (1994) Dreamkeepers: Successful Teachers of African American Children. San Francisco, CA: Jossey-Bass.

Menken, K. and Solorza, C. (2014) No Child Left Bilingual: Accountability and the elimination of bilingual education programs in New York City schools. Educational Policy 28 (1), 96–125.

Messick, S. (1989) Meaning and values in test validation: The science and ethics of assessment. Educational Researcher 18 (2), 5–11.

Stefanakis, E. (1999) Whose Judgment Counts? Assessing Bilingual Children, K-3. Portsmouth, NH: Heinemann.

Stefanakis, E. (2003) Multiple Intelligences and Portfolios: A Window into the Learner's Mind. Portsmouth, NH: Heinemann.

Stefanakis, E. (2011) Differentiated Assessment: Finding Every Learner's Potential. Wiley-Jossey Bass Series. San Francisco, CA: Jossey-Bass.

Stiggins, R. and Chappuis, J. (2011) An Introduction to Student-Involved Assessment for Learning (6th edn). New York: Pearson.


2 History: How Did We Get Here?

Themes from Chapter 2

(1) The history of testing EB students includes many examples of inappropriate testing and misuse of results.

(2) Despite more than a decade of warning from the measurement community, test scores from tests given in English, to students who don’t know English, are still being used for important education decisions.

(3) The current political climate favors accountability over validity.

Key Vocabulary

Class analysis argument
Cultural argument
Eugenics
Genetic argument
NCLB
Test fairness
Test misuse

PUMI Connection: Use

This chapter focuses primarily on the U (Use) of test scores for emergent bilinguals (EBs). By providing a history and presenting the damaging mistakes that have been made in the past, it is hoped we can prevent the misuse of test scores in the future, and better understand where we are today.

This chapter may seem rather gloomy, but its aim is to help explain the present. The author wanted to include a variety of historical perspectives regarding testing, which may be of particular interest to readers who wish to know how present practices took root. The first section reviews the history of test misuse among non-English-speaking and other groups of people, followed by some theoretical frameworks that contradict the old but strong genetic argument of school success/failure. These frameworks may also help EB educators understand and explain why, today, some students succeed and others don’t. The second section of the chapter provides a major-event history table documenting important and influential policies in the history of EB assessment over time. The final section emphasizes the changes in assessment policy since No Child Left Behind (NCLB) was introduced in the US.

A History of Misuse

A history of discrimination exists in US education whereby non-English-speaking children have been denied equal educational opportunity based on the use of standardized tests. Standardized tests in English, when presented to non-English-speaking students, raise several obvious validity concerns (a detailed discussion of validity is forthcoming in Chapter 3). The issue of fairness in testing has attracted scrutiny since the 1960s, yet federal and state mandates requiring the testing of students in a language they do not know are increasing.

Understanding test fairness was less of a concern before the 1960s. Many examples exist, pre-1960, of the misuse of intelligence tests to gatekeep or advance some racial groups over others. During this era, some educators, psychologists and others used intelligence test scores to describe American Indian and Mexican children as having negative qualities such as ‘dullness’. Lewis Terman,1 for example, who was most famous for his Stanford-Binet Intelligence test, first used in schools in 1916 (a version of it is still used today, 100 years later), was also a well-known eugenicist.2 At the same time as he became a champion for ‘gifted’ children, he also promoted a very dark social agenda for ‘other’ groups of children. Terman and other eugenicists claimed that the smartest or more fit people, such as the wealthy with European ancestry, were reproducing too slowly and in danger of being overwhelmed by more ‘feeble-minded’ races (non-wealthy and non-European). Terman also promoted the idea that America was being jeopardized from within by the rapid proliferation of people lacking intelligence and moral fiber, and warned that the unchecked arrival of immigrants from southern and eastern Europe would drag down the national stock (Leslie, 2000).

As Leslie (2000) reports, early eugenicists such as Terman successfully advocated for several laws aligned with their social agenda. Thirty-three states, including California, passed laws that led to the sterilization of about 60,000 men and women at mental institutions. Early eugenicists also affected immigration policy; in 1924, Congress set quotas that drastically cut immigration from eastern and southern Europe (Leslie, 2000).

In 1916, while Terman promoted his Stanford-Binet Intelligence test in schools, he also published a book called The Measurement of Intelligence. In this book, Terman discussed his findings after administering the Stanford-Binet test to Spanish speakers and unschooled African Americans. He took these findings as support for his preference for white European racial groups over others. In his words:

[a] high-grade or border-line deficiency…is very common among Spanish Indian and Mexican families of the Southwest and also among negroes. Their dullness seems to be racial or at least inherent in the family stocks from which they come…children of this group should be separated into separate classes…They cannot master abstractions but they can often be made into efficient workers…from a eugenic point of view, they constitute a grave problem because of their unusually prolific breeding.

Although this quote is extreme and old (it dates back to 1916), many still question the roots of intelligence tests used today, especially as entrance criteria for gifted programs and other similar programs. Such programs are known to under-represent minorities, with almost no EB representation. Because EBs come from non-dominant cultural and linguistic groups, they are most vulnerable to test misuse based on race or language. Figure 2.1 is a cartoon, dated 1922, showing the powerful assessor and the child as an object of the assessment. As depicted in the cartoon, the assessment is based on intelligence testing and psychological theories. The cartoon presents this as new and as an improvement on the ‘old method’, where students were sorted based on race and class. Which is better, the ‘old’ or the ‘new’ method?

Measurement misuse in order to advance one racial group over another can be traced back even further, to the use of craniometry in the 19th century. This now-laughable practice measured cranial features in order to classify intelligence and racial superiority as well as temperament and morality. Those who practiced craniometry believed that measurements of skull size and shape could determine traits such as intelligence and capacity for moral behavior. The British used such measurements to justify racist policies against Africans, Indians and the Irish. The Nazis and Belgians used similar craniometry methods to claim their superiority. In The Mismeasure of Man, Stephen J. Gould (1981) implies that craniometry in the 19th century gave way to intelligence testing in the 20th century.

In 1969, researchers Chandler and Plakos designed a research study to investigate how intelligence (IQ) tests were being used with Spanish-dominant Mexican-American children. They selected 47 Spanish-dominant students to be a part of the study, all of whom were enrolled in educable mentally retarded (EMR) classes after being assessed on the English-only IQ test. Chandler and Plakos retested all 47 students with the Spanish-language version. In most cases, the Spanish-dominant children were found not to be EMR; the decision to classify them as EMR was therefore based on an invalid use of the English IQ test scores. The study concluded that many children of Mexican descent were inappropriately placed in EMR classes.

RUBRIC

Excellent Quality (95-100%)

Introduction (45-41 points): The background and significance of the problem and a clear statement of the research purpose is provided. The search history is mentioned.

Literature Support (91-84 points): The background and significance of the problem and a clear statement of the research purpose is provided. The search history is mentioned.

Methodology (58-53 points): Content is well-organized with headings for each slide and bulleted lists to group related material as needed. Use of font, color, graphics, effects, etc. to enhance readability and presentation content is excellent. Length requirements of 10 slides/pages or less is met.

Average Score (50-85%)

Introduction (40-38 points): More depth/detail for the background and significance is needed, or the research detail is not clear. No search history information is provided.

Literature Support (83-76 points): Review of relevant theoretical literature is evident, but there is little integration of studies into concepts related to problem. Review is partially focused and organized. Supporting and opposing research are included. Summary of information presented is included. Conclusion may not contain a biblical integration.

Methodology (52-49 points): Content is somewhat organized, but no structure is apparent. The use of font, color, graphics, effects, etc. is occasionally detracting to the presentation content. Length requirements may not be met.

Poor Quality (0-45%)

Introduction (37-1 points): The background and/or significance are missing. No search history information is provided.

Literature Support (75-1 points): Review of relevant theoretical literature is evident, but there is no integration of studies into concepts related to problem. Review is partially focused and organized. Supporting and opposing research are not included in the summary of information presented. Conclusion does not contain a biblical integration.

Methodology (48-1 points): There is no clear or logical organizational structure. No logical sequence is apparent. The use of font, color, graphics, effects, etc. is often detracting to the presentation content. Length requirements may not be met.