Disparate Impact Theory

under

Title VI of the 1964 Civil Rights Act

and

Some Reasons to Believe

that

Boalt Broke the Law

Sky Woodruff

Contents

Introduction

I. The Dismantling of Affirmative Action in California and Its Universities

II. Disparate Impact under Title VI

A. The Court’s Uncertain Title VI Jurisprudence

1. Lau and Bakke: Title VI Includes Disparate Impact ... or Not

2. Guardians and Other Cases: Disparate Impact Is Available under Title VI Regulations, and Title VII’s Disparate Impact Framework Applies ... Whatever That Means

B. The Future of Disparate Impact under Title VI

III. Disparate Impact under Title VII

A. Plaintiff’s Prima Facie Case: Demonstrating Disparate Impact

1. Proof of Causation

2. The Requirement of Particularity

3. Selection Criteria

B. Defendant’s Case: Business Necessity, Job Relatedness, and Validation

1. Attacking the Sufficiency of Plaintiff’s Statistics

2. Business Necessity and Job Relatedness

3. Validation Strategies

4. Training Program Validity

C. Alternative Employment Practices

IV. The Disparate Impact of Boalt’s 1997 Admission Process

A. Boalt’s 1997 Admissions Process

1. How the Process Worked

2. General Critique of the 1997 Admissions Process

3. The Results of the 1997 Admissions Process

B. A Prima Facie Case

1. The Bottom Line

a. Z Test Results

b. Chi Square Results

2. The LSAT under OCR Guidelines

a. OCR’s Standards for Finding a Prima Facie Case of Disparate Impact

b. Per Se Violations under OCR Guidelines

C. Educational Necessity and Validity

1. The Admission Process and the LSAT

a. What Is Educationally Necessary and School Related

b. The Educational Necessity and Law School Relatedness of Boalt’s Criteria

2. Training Program Validity

3. General Problems with the LSAT

a. Favoring the Affluent

b. Poorly Assessing Qualification

c. Bias Caused by the Method of Test Construction

d. The Problem of Stereotype Threat

D. Alternative Practices

1. Restoring Consideration of Race

2. The New Directions "Character Index"

3. The Ad Hoc Task Force’s Recommendations

4. Consideration of UGPA but not LSAT

5. Race Norming of the LSAT

Conclusion

Appendix A: Z Score Calculations

Appendix B: Chi Square Calculations

 

Introduction

In 1997, the graduate and professional schools of the University of California implemented a resolution passed by the system’s Board of Regents in the summer of 1995 and ceased to consider the race of applicants, thereby beginning the end of affirmative action in higher education in the state. Just months before, the voters of California had followed the Regents’ lead and voted to end all affirmative action in public hiring, contracting, and education. The results at most of UC’s professional and graduate programs were somewhat disheartening, including at Boalt Hall, Berkeley’s law school. Boalt made 860 offers of admission: 122 to Asians, 46 to Latinos, 15 to African-Americans, 2 to Native Americans, and 675 to White applicants. Of those offered admission, 268 applicants opted to enroll; 215 of them were White, 38 were Asian, 14 were Latino, one was African-American, and none were Native American. Those numbers prompted a complaint to the Department of Education’s Office for Civil Rights, which began an investigation over the summer of 1997. The complaint alleged, in part, that Boalt had used a facially neutral policy that had disparate racial impacts in violation of Title VI of the 1964 Civil Rights Act, which prohibits discrimination in federally funded programs, including the UC schools. This paper investigates that same question, reaching the conclusion that Boalt did practice impermissible discrimination.
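The offer and enrollment counts above can be checked with a few lines of arithmetic. The following is only a descriptive sketch in Python, not the statistical analysis this paper relies on (the Z score and chi square calculations appear in Appendices A and B). The variable names and the `percent` helper are illustrative assumptions, and the counts are copied from the paragraph above.

```python
# Counts copied from the Introduction; group labels follow the paper's usage.
offers = {
    "White": 675,
    "Asian": 122,
    "Latino": 46,
    "African-American": 15,
    "Native American": 2,
}
enrolled = {
    "White": 215,
    "Asian": 38,
    "Latino": 14,
    "African-American": 1,
    "Native American": 0,
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Totals should match the 860 offers and 268 enrollees reported in the text.
total_offers = sum(offers.values())
total_enrolled = sum(enrolled.values())

# Yield: share of each group's admitted applicants who chose to enroll.
yield_rates = {g: percent(enrolled[g], offers[g]) for g in offers}

# Composition: each group's share of the enrolling class.
class_shares = {g: percent(enrolled[g], total_enrolled) for g in offers}

for g in offers:
    print(f"{g:17s} offers={offers[g]:3d} enrolled={enrolled[g]:3d} "
          f"yield={yield_rates[g]:5.1f}% class share={class_shares[g]:5.1f}%")
```

The sketch separates two questions that the raw counts blur together: how many members of each group were admitted, and how many of those admitted chose to attend. Neither figure says anything by itself about statistical significance.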

Part I briefly summarizes the path by which the UC Regents and the voters of California decided to end the state’s public affirmative action programs. Part II describes the history and present state of disparate impact theory under Title VI. Because the courts have held that a case of disparate impact under Title VI should follow the framework of such a case under Title VII, Part III summarizes the current jurisprudence of Title VII disparate impact theory. Part IV applies the facts of the situation to that framework, concluding that Boalt’s facially neutral policy had impermissibly discriminatory effects on racial minorities.

I. The Dismantling of Affirmative Action in California and Its Universities

The California Constitution no longer allows the consideration of race as part of affirmative action programs in public education, public employment, and public contracting; at least, that is the simple interpretation of a law that is poorly worded and unclear and that has not yet been tested in court or in legislatures as to its specific implications and applications. How the law came to be changed is a political tale. Like most political tales, it is the story of small men, personal and partisan agendas that have nothing to do with the law itself, short-run goals, the absence of meaningful dialog as prelude to policy formulation, ill-conceived policies, and products that, though facially clear, lack polish, direction, substance, or detail. Before the state electorate voted to amend the California Constitution, the Regents of the University of California (UC), led by Ward Connerly, a political appointee of a Republican governor with presidential aspirations (i.e., a governor looking to increase his national notoriety), and a man with no pedagogical knowledge to speak of but any number of personal opinions without philosophical framework or factual support, voted to end the consideration of race, religion, sex, color, ethnicity, or national origin as criteria in admission to UC schools. As a result of that vote, all UC schools, including UC Berkeley’s School of Law (Boalt Hall), reformulated their admission policies to eliminate all consideration of race. In 1997, the first year in which Boalt applied its reformulated policies, as many predicted, the number of African-Americans, Latinos, and Native Americans admitted to the entering class dropped dramatically. The number of those students who decided to enroll at Boalt was even lower. The Mexican American Legal Defense and Educational Fund (MALDEF) subsequently filed a complaint with the United States Department of Education’s Office for Civil Rights alleging that Boalt’s admission policy had had a disparate racial impact in violation of Title VI of the 1964 Civil Rights Act (CRA). That is the issue at the heart of this paper.

When California Governor Pete Wilson appointed Ward Connerly to the UC Board of Regents in 1993, the two had known each other for twenty-five years. They had met when Wilson, then a young legislator, was appointed to the state Assembly’s Committee on Urban Affairs and Housing and Connerly was working in the state Department of Housing and Community Development. After working together for several years, Wilson urged Connerly to leave government and enter private enterprise, which Connerly did, becoming a successful land use consultant. Their relationship did not end, however; as part of his business, Connerly has taken advantage of parts of the state’s housing law that Wilson wrote, and with his business contacts, Connerly has helped to raise millions of dollars for Wilson’s various campaigns. Considering their personal and professional relationship, then, one could hardly be surprised that Wilson appointed Connerly to the Board, a veritable club of the monied, powerful, and politically connected. What might nevertheless give one pause is the appointment of a man with a personal agenda but no expertise in the area in which he has been given power, a man known to play fast and loose with facts, who resorts to the tools of illogic and bias--appeals to emotion, popular appeals, poisoning the well, and ad hominem attacks--when he has no grounds for his arguments, and who passes such illogic off as a sound philosophical framework.

When Connerly joined the Board, he almost immediately carved out a unique place with what he himself has called a dogmatically rhetorical style that usually involved a near-harassing campaign of faxes and letters to other Regents expressing his opinions on the issues on the Board’s agenda. But on no issue was that style used to greater effect than on his resolutions, SP-1 and SP-2, to eliminate affirmative action in UC’s hiring, contracting, and admissions. Whether SP-1 and SP-2 were the result of personal inspiration or a concocted scheme between him and Wilson, the Regents passed them both in July of 1995, and the vote garnered Wilson’s fledgling, but ultimately abortive, presidential campaign the media attention it needed if he was to be able to solicit donations from outside the state. The requirement that UC’s graduate schools cease considering race and sex in their admissions processes did not become effective until January 1, 1997, but in the interim, the California electorate, much at the urging of Connerly and Wilson, followed the Regents’ lead.

Enter Tom Wood, Connerly’s and Wilson’s partner in the campaign for the Proposition with the doublespeak name--the California Civil Rights Initiative (CCRI)--and a veritable poster child for the hypothetical White men much abused by the supposedly "reverse discriminatory" effects of affirmative action. Wood and Glynn Custred, another frustrated academic, purportedly had the idea for something like Proposition 209 independently, but they joined to write the actual initiative in 1991. Together they tried and failed twice to gather sufficient signatures to get the measure on the state’s ballot. By 1993, they realized that they needed help, so they took their idea to well connected members of the Republican Party, who recognized that an initiative (or federal statute) ending affirmative action but framed as a civil rights bill would make a fantastically powerful wedge issue to divide Democratic voters. From the GOP they received financing, national public support and press, and general aid. What the campaign really needed, however, was an event to bring the issue of affirmative action from the background of societal dialog, where conservatives themselves had labored for years to put it, into the forefront of American politics; of course, the event had to be one that would help them frame and deliberately mischaracterize the elimination of affirmative action as a continuation of the country’s evolving civil rights struggle. The Regents’ decision to end affirmative action in the UC system provided just the needed event.

And so it made sense that Connerly became a leading figure in the campaign to pass the CCRI as well. Connerly, while making the colorblind society argument and saying that he had succeeded without any special help because of his race, nevertheless used his race to give the campaign a credibility it otherwise might have lacked. At the same time, Wood argued for the Initiative in terms of its fairness to White men long and unjustly burdened by affirmative action, using himself as an example. A wannabe academic, Wood had not received teaching positions, he claimed when he first started to receive media attention, because they had gone instead to less qualified minorities. It later became known that he had not even applied for one of the positions, and those hired might have been at least as qualified as he, if not more so. Apparently never considering the possibility that it was he who was unqualified to teach in any classroom anywhere in the country, or perhaps realizing exactly that, he turned his energy and time to the campaign for CCRI, using his perhaps fallacious experiences as examples of the dangers of affirmative action, adding to the anecdotal evidence of such cases, which is the only kind affirmative action foes ever seem to have.

Despite the perhaps purposefully and ultimately unhelpfully muddled language of the Proposition that the state’s Office of the Legislative Counsel allowed onto the ballot, despite the almost transparent agendas at work behind the measure, and perhaps because of a poorly run and badly underfunded opposition campaign, in November of 1996, 54% of those Californians who bothered to cast ballots voted to approve the CCRI. Other numbers, however, show that much of the state, at least by the end, had managed to see through the well orchestrated charade: 74% of African-American, 76% of Latino, and 61% of Asian-Pacific-American residents voted against the measure, as did 59% of those with incomes under $20,000. Similarly, three large counties (Los Angeles, Alameda, and Santa Clara), the City of Los Angeles, San Francisco, and the Bay Area generally all voted against CCRI. Their votes were not enough, though, so opponents had to turn to the courts.

By December 23, 1996, Judge Thelton Henderson had ruled that plaintiff Coalition for Economic Equity had demonstrated a probability of success on its claim that Proposition 209 violates the Equal Protection Clause of the Fourteenth Amendment of the United States Constitution under the Hunter v. Erickson and Washington v. Seattle Sch. Dist. No. 1 line of cases and granted a preliminary injunction against enforcement of the measure.

Defendants, of course, appealed and pulled perhaps the most mind-bogglingly reactionary panel of circuit judges one could hope/fear to encounter. That panel, unsurprisingly, reversed Judge Henderson; though it attempted (poorly) to find a way around the Hunter and Seattle jurisprudence, its decision is rather transparently grounded in the shockingly simpleminded belief that a popular enactment could not possibly violate the Equal Protection Clause and that a federal court should not interfere in the decisions of the voters of a state. By the end of the summer of 1997, the Ninth Circuit had voted not to grant an en banc rehearing of the panel decision. And the Supreme Court put the matter of 209’s constitutionality to rest temporarily by refusing to grant certiorari and letting the panel’s decision stand.

In the interim, however, Boalt Hall had proceeded to implement the Regents’ decision by adjusting its admission policy. Part IV.A describes the change and its results in detail, but put simply, the school’s admission policy in 1997 contained no consideration of race itself as a criterion, though it did allow candidates to describe adversities they had faced and overcome. The result of the new policy was a radical drop in the number of African-Americans, Latinos, and Native Americans admitted; the number of individuals who identified themselves as part of those groups who decided to attend Boalt was, of course, even lower. In part on the basis of those numbers, MALDEF filed a complaint with OCR raising the question that this paper explores: whether in complying with SP-1 and Proposition 209, Boalt implemented a facially neutral policy that nevertheless had discriminatory effects in violation of Title VI.

II. Disparate Impact under Title VI

The Supreme Court has only rarely addressed the scope of disparate impact theory under Title VI of the Civil Rights Act of 1964, and its jurisprudence in the area, though relatively clear in holding that one may bring a disparate impact claim against those who receive federal funding, has not, perhaps because of the rarity of pronouncements, provided a detailed approach to how to frame such a claim. Indeed, the Court has, since the passage of the Act, changed its mind several times about whether one may bring such a claim, whether one may bring it under the statute itself, or whether one may only bring it under Title VI’s implementing regulations. And considering the number of justices who have left the bench since the last decision on the issue, the Court may well change its mind again about whether one may bring such a claim at all, if given the chance. Thus, although the following is a fair summary of the law as it is, the Court may rewrite this chapter of civil rights law if the case against Boalt proceeds that far.

A. The Court’s Uncertain Title VI Jurisprudence.

Under its most recent decisions, Guardians Ass’n v. Civil Service Comm’n of the City of New York and Alexander v. Choate, both already more than a decade old, the Court said that the prohibitory language of Title VI reaches only intentional discrimination, but that plaintiffs can press disparate-impact-based claims of discrimination under the regulations that the Department of Education has issued pursuant to Title VI. How the agency then had the statutory authority to create such regulations is difficult to understand. On the other hand, when the Court decided Guardians, it still claimed to give "great deference" to agencies’ interpretations of the statutes Congress empowered them to implement; and the Chevron doctrine should have led the Court to defer to the agency’s interpretation of the ambiguous term "discrimination" in Title VI when it decided Alexander v. Choate. Thus, it is equally troubling that the Court would ignore those principles and give the somewhat ambiguous language of the statute a different meaning, even though agencies promulgated the regulations years before the Court first spoke on the issue, and Congress had never acted to overturn the agencies’ interpretation of Title VI. The following is a summary of the current status of disparate impact theory under Title VI and the twisted path by which the Court got there.

1. Lau and Bakke: Title VI Includes Disparate Impact ... or Not.

In the course of five years and in just two decisions, the Court managed to make and then change the law under Title VI. The statute states, briefly and ambiguously, "No person in the United States shall, on the ground of race, color, or national origin, be excluded from participation in, be denied the benefits of, or be subjected to discrimination under any program or activity receiving Federal financial assistance." The next section states that

Each Federal department and agency which is empowered to extend Federal financial assistance to any program or activity . . . is authorized and directed to effectuate the provisions of . . . this title with respect to such program or activity by issuing rules, regulations, or orders of general applicability which shall be consistent with achievement of the objectives of the statute authorizing the financial assistance in connection with which the action is taken.

In conformance with Congress’ command, the then-Department of Health, Education, and Welfare issued general regulations regarding federally funded schools as early as 1968. By 1970, it had made them more specific. They now read, in relevant part, as follows:

A recipient [of Federal funds], in determining the types of services, financial aid, or other benefits, or facilities to be provided under any such program, or class of individuals to whom, or the situation in which, such services, financial aid, other benefits, or facilities will be provided under any such program, or the class of individuals to be afforded an opportunity to participate in any such program, may not, directly or through contractual or other arrangements, utilize criteria or methods of administration which have the effect of subjecting individuals to discrimination because of their race, color, or national origin, or have the effect of defeating or substantially impairing accomplishment of the objectives of the program as respect individuals of a particular race, color, or national origin.

The relevant portions of the regulations had substantially the same form when the Court decided its first Title VI case, Lau v. Nichols, in 1974. In the decision, the majority did not treat the regulations and the statute as conflicting in any way: the regulations prohibited policies that had discriminatory effect, even when no purposeful design was present, and a violation of the regulations was a violation of the law. Those who wrote separately wondered "whether the regulations and guidelines promulgated by HEW go beyond the authority of § 601." However, they concluded that the regulations were reasonably related to the purposes of the Act, implying that the purpose of Title VI was to eliminate both intentional discrimination and policies with discriminatory effect. Indeed, in reasoning that was somewhat circular, they gave the regulations "great weight" in determining what the purpose of the legislation was.

That deference and perception of consistency did not last more than five years, as the Court’s decision in Regents of the University of California v. Bakke cast serious doubt on Lau’s interpretation of Title VI. The votes and opinions in Bakke are as strange as they are messy. Justice Powell, writing for himself and announcing the judgment of the Court, noted but ignored both Lau and the regulations and held, "In view of the clear legislative intent, Title VI must be held to proscribe only those racial classifications that would violate the Equal Protection Clause or the Fifth Amendment." Two years earlier, in Washington v. Davis, the Court had held that only purposeful discrimination by state and federal actors was subject to strict scrutiny under the Fourteenth and Fifth Amendments, respectively; in other words, according to the Court, the Constitution and, after Bakke, Title VI prohibit only intentional discrimination. Although Powell had voted with the majority in Davis, he had also voted with the majority in Lau.

Even stranger, Brennan and Marshall, who voted with the majority in Lau and dissented in Davis, agreed with Powell on the proper interpretation of Title VI. The decision, however, does not seem so strange, only short-sighted, when one recalls that Bakke was a "reverse discrimination" case in which a two-time applicant turned down by many other medical schools claimed to have been the victim of racial discrimination because UC Davis Medical School did not consider him for its special admissions program (thereby demonstrating an audacious refusal to admit only White men). The bloc of justices formed by Stevens, Burger, Stewart, and Rehnquist wanted to decide the case on statutory grounds by holding that the school had violated the law by excluding Bakke on the basis of race. For them, "the meaning of Title VI’s ban on exclusion is crystal clear: Race cannot be the basis of excluding anyone from participation in a federally funded program." Their reading of the law is understandable, if simpleminded, but such statements are ultimately unhelpful, since they do not say anything about how one proves such exclusion. At issue in the case was a program that, at least according to the Court, facially considered race, so proving exclusion on that basis was not difficult. However, one cannot deny the possibility that a recipient of federal funds could, through facially neutral means, effectively deny someone participation in the program on the basis of race, and the Stevens opinion, despite its survey of legislative history, provides no convincing reason why such discrimination should not be legally cognizable.

More difficult to understand, considering their defense of affirmative action in other cases, is Brennan and Marshall voting with Powell to interpret Title VI as coextensive with the Constitution and, therefore, prohibiting only intentional discrimination in federally funded programs. Perhaps they believed that only by adopting that position of Powell’s could they get the votes they needed to assure that affirmative action would remain a viable option for public schools in the future to remedy past societal discrimination. Thus, they were able to prevent the outcome desired by Stevens et al. only by effectively disempowering Title VI by reading the statute to prohibit only intentional discrimination in federally funded programs. As Brennan’s opinion acknowledged, the position that he, his bloc, and Powell staked out called Lau seriously into question. Only by later importing the principles of disparate impact under Title VII into the DoE’s regulations were they able to return some power to Title VI and make it an effective means of fulfilling the one Congressional purpose on which all agree: the elimination of segregation in federally funded programs.

2. Guardians and Other Cases: Disparate Impact Is Available under Title VI Regulations, and Title VII’s Disparate Impact Framework Applies ... Whatever That Means.

Having left the proper interpretation of Title VI somewhat unclear in its absurdly messy opinion(s) in Bakke, the Court again addressed the issue a few years later in Guardians Ass’n v. Civil Service Comm’n of New York. The votes in the latter case were equally messy, but those justices with an interest in actually preventing discrimination in federally funded programs managed to restore some remedial power to Title VI by cobbling together five votes for the proposition that the regulations promulgated under the Title prohibit policies with discriminatory effects, and that the regulations are valid, even if the Title itself does not prohibit the use of such policies. Justice White, who delivered the judgment of the Court and was one of those five, began his analysis by distinguishing any holding regarding Title VI in Bakke on the ground that the issue there was

whether Title VI forbids intentional discrimination in the form of affirmative action intended to remedy past discrimination, even though such affirmative action is permitted by the Constitution. Holding that Title VI does not bar such affirmative action if the Constitution does not is plainly not determinative of whether Title VI proscribes unintentional discrimination in addition to the intentional discrimination that the Constitution forbids.

It is sensible to construe Title VI, a statute intended to protect racial minorities, as not forbidding those intentional, but benign, racial classifications that are permitted by the Constitution, yet as proscribing burdensome, non-benign discriminations of a kind not contrary to the Constitution.

Put somewhat more clearly, in White’s opinion, the prohibitory reaches of Title VI and the Constitution are coextensive when applied to an affirmative action policy: both disallow only intentional racial discrimination; unlike the Constitution, however, Title VI itself also prohibits facially neutral policies that have discriminatory effects. Thus, Bakke did not actually overrule Lau.

From there, Justice White went on to reason that

[e]ven if Bakke must be taken as overruling Lau’s holding . . . none of the opinions of the five Justices whose opinions arguably compel this result considered whether the statute would permit regulations that clearly reached [disparate impact]. And no Justice in Bakke took issue with the view of the three concurring Justices in Lau, who concluded that even if Title VI itself did not proscribe unintentional discrimination, it nevertheless permitted federal agencies to promulgate valid regulations with such effect.

So, Justice White believed that Bakke’s ruling on Title VI applied only to affirmative action policies, that Title VI otherwise prohibited unintentional discrimination, and that the regulations under Title VI were valid and reached disparate impact, even if the statute did not. Unfortunately, the majority that agreed on the last point did not also agree on the first.

Indeed, of the nine justices, seven agreed that discrimination under Title VI does not encompass disparate impact, five agreed that the regulations promulgated under the Act validly do, and only Marshall agreed with White, presenting the most compelling argument. In his opinion, Justice Marshall summarized the history of administrative regulations issued pursuant to Title VI. He found that they had "uniformly and consistently interpreted the statute to prohibit programs that have a discriminatory impact and that cannot be justified on nondiscriminatory grounds." He also found that that interpretation originated in the same year as the Act, 1964; that Congress had never altered that interpretation, though it had been aware of it; that it had modeled other statutes on it without requiring an intent test; and that it had even rebuffed attempts to overturn that interpretation. He therefore concluded as follows:

In the face of a reasonable and contemporaneous administrative construction that has been consistently adhered to for nearly 20 years, originally permitted and subsequently acquiesced in by Congress, and expressly adopted by this Court in Lau, I would hold that Title VI bars practices that have a discriminatory impact and cannot be justified on legitimate grounds. I frankly concede that our reasoning in Bakke was broader than it should have been. . . . Whatever the precise relationship between Title VI and the Equal Protection Clause may be, it would have been perverse to construe a statute designed to ameliorate the plight of victims of racial discrimination to prohibit recipients of federal funds from voluntarily employing race-conscious measures to eliminate the effects of past societal discrimination.

Justices Stevens, Brennan, and Blackmun, who, considering the tone of their opinion, must have found Marshall’s arguments persuasive, nevertheless did not join his opinion, relying on stare decisis to justify their decision. "If a statute is to be amended after it has been authoritatively construed by this Court, that task should almost always be performed by Congress. Title VI must therefore mean what this Court has said it means, regardless of what some of us may have thought it meant before this Court spoke." Thus, they concluded that Bakke must govern, and "proof of invidious purpose is a necessary component of a valid Title VI claim." Apparently unwilling to eviscerate the statute completely, however, they went on to hold that the regulations implementing the statute are valid, even though they incorporate an effect test: "By prohibiting grant recipients from adopting procedures that deny program benefits to members of any racial group, the administrative agencies have acted in a reasonable manner to further the purposes of Title VI." They, with White and Marshall, thus generated the interpretation of Title VI that governs today and that would control any challenge to Boalt’s admission procedures.

The remaining justices disagreed, but their reasoning is less than convincing. In the face of Justice Marshall’s lengthy and detailed summary of the development of the administrative regulations implementing Title VI, in which he explicitly points out that the regulations predate even Lau by a decade, Powell wrote, "I know of no precedent whatever for asserting that [the deference earlier opinions had required the Court to give reasonable administrative interpretations] is proper after this Court has issued a definitive--and contrary--construction of its own." One cannot win a debate in which some participants ignore obvious and known facts to effectuate their own purposes.

Regardless of how the Court reached its interpretation of Title VI and its implementing regulations, a majority had agreed that the regulations validly prohibit recipients of federal funds from using policies that have disparate racial impact; it had, to that point, however, provided little guidance about how to structure a disparate impact case under Title VI. Justice White provided some hint in his summary of the case in Guardians. The plaintiffs in the case had initially brought claims under both Title VI and Title VII. The trial court held that they had proven a case under the latter, so it did not reach the former. The appellate court, however, found that an intervening decision by the Supreme Court had made part of the ruling untenable. On remand, the trial court agreed but then reached Title VI, concluding that proof of discriminatory effect was enough to establish a violation. It apparently never reconsidered the finding of disparate impact it had made under Title VII. And since the Court never disputed the findings of disparate impact, it implicitly accepted that a finding of disparate impact under Title VII principles would also make out a disparate impact case under Title VI.

The Court had been somewhat more explicit a few years earlier in Board of Education v. Harris, but that case involved legislation analogous to Title VI and not the Title itself. In Harris, the Court confronted a denial by the then-Department of Health, Education, and Welfare (HEW) of funds under the 1972 Emergency School Aid Act (ESAA); the Department based the denial on statistics it had developed during a compliance investigation under Title VI, showing a disproportionately high assignment of minority teachers to schools with a majority of students of color, as well as a disproportionately low assignment of teachers of color to schools with a minority of students of color. The Court first determined that the ESAA’s ineligibility provisions encompassed school policies with racially discriminatory effects. It then implicitly approved a Title VII-type framework in which the Department could make out a prima facie case of disparate impact using statistics, and though that showing was rebuttable, the "burden is on the party against whom the statistical case has been made." In support of that proposition, the Court cited two Title VII cases and said that that "burden could be carried by proof of ‘educational necessity,’ analogous to the ‘business necessity’ justification applied under Title VII . . . ." The decision had Title VI implications because one of the eligibility criteria under the ESAA was implementation of a desegregation plan approved by HEW as adequate under the Title. And though the Court did not reach the issue of whether the ESAA and Title VI were coextensive, since a violation of the latter could result in ineligibility under the former, the Title VII-type framework implied as appropriate under the ESAA would also be appropriate for the resolution of a disparate impact claim under Title VI.

The federal circuits and district courts have dealt much more with Title VI cases, allowing private plaintiffs to proceed under the regulations, and making explicit what the Supreme Court had only implied by importing Title VII’s burdens and defenses for disparate impact claims into the Title VI setting. Larry P. v. Riles is the Ninth Circuit’s primary statement on the outline of a Title VI disparate impact claim, and since any challenge to Boalt’s admissions practices would be heard there, a brief review of the court’s treatment of the issue seems worthwhile. The case dealt with the state’s use of an IQ test for the placement of students in special classes for the mentally retarded and the disproportionate placement of African-American students in such programs as a result of the practice. The court held, analogizing specifically to Title VII cases, that a plaintiff establishes a prima facie case by showing that the tests have a discriminatory impact on a protected group. The burden then shifts to the defendant to demonstrate that the requirement that caused the disproportionate impact was required by educational necessity. However, that is not even close to a full description of disparate impact as developed under Title VII; the courts on all levels seem to have had little opportunity to consider disparate impact claims under Title VI, so the jurisprudence remains underdeveloped, saying that Title VII principles apply but only occasionally discussing their adaptation in any amount of detail. Therefore, Part III sketches an outline of the issues, burdens, and defenses involved in Title VII disparate impact cases, particularly in light of the changes worked by the Civil Rights Act of 1991.

B. The Future of Disparate Impact under Title VI.

Perhaps the most important--and most unanswerable--question surrounding a disparate impact case against Boalt for its 1997 admissions process is whether the Court will continue to hold that one can even claim disparate impact under Title VI’s regulations. Considering the litany of decisions that have come out of the current Court that are hostile to affirmative action, not to mention claims of racial discrimination generally, the survival of disparate impact claims under those regulations is very much in doubt. When the Court decided Guardians and Alexander v. Choate, its membership included Justices Powell, Rehnquist, O’Connor, Marshall, Stevens, Brennan, Blackmun, White, and Chief Justice Burger. Of them, only Rehnquist, O’Connor, and Stevens remain on the bench. In Guardians, Rehnquist and O’Connor both opined that Title VI prohibits only intentional discrimination. Stevens voted that Title VI’s implementing regulations are valid, and therefore, that they permissibly prohibit facially neutral policies that have discriminatory effects. In any challenge to Boalt’s 1997 admission process, Scalia and Thomas can almost certainly be expected to vote with O’Connor and Rehnquist, despite stare decisis, on the basis of textualism alone, if not on the basis of their personal politics. Of the remaining current Justices, Kennedy, Breyer, Souter, and Ginsburg, Kennedy at least seems likely to vote with the others to overturn the Guardians Court’s interpretation of Title VI and its regulations. Thus, it certainly seems likely that the legal basis for this paper has been eliminated by changes on the Court. It nevertheless proceeds on the assumption that Guardians will remain good law.

The ‘91 CRA adds an interesting wrinkle to any rumination on the survival of disparate impact under Title VI. Recall that Brennan and those who joined with him in Bakke probably agreed with the others that Title VI is no broader than the Constitution, and therefore prohibits only intentional discrimination, because they believed that any other holding would permit White applicants to bring disparate impact claims against schools using affirmative action plans. Setting aside that under a Title VII framework, such plaintiffs could probably not even make out a prima facie case, § 116 of the Civil Rights Act of 1991 provides that none of the amendments to Title VII shall affect affirmative action that is in accordance with the law. Since a Title VI disparate impact case is supposed to follow the Title VII framework, one could infer from that part of the ‘91 CRA that as long as an affirmative action program is otherwise in compliance with federal law, it is not subject to a disparate impact attack on the basis of Title VI. That means that on a practical level, those who would, like Brennan, Marshall, Powell, and others, vote to restrict Title VI’s reach to intentional discrimination as a means of protecting affirmative action programs no longer have to take that position, as the general disparate impact framework immunizes otherwise valid affirmative action programs.

III. Disparate Impact under Title VII

Since the federal courts now consistently use Title VII’s framework of burdens and defenses for claims of disparate impact in Title VI cases, an overview of that framework is necessary to understand how the courts and the OCR would evaluate the challenge to Boalt’s admission processes. The most important recent development in the operation of disparate impact analysis under Title VII is the 1991 Civil Rights Act. The ‘91 Act overturned certain portions of Supreme Court jurisprudence in the area, reaffirmed some, and brought others into relief. At its most basic, the form of a disparate impact claim today is essentially as it was when the Court first dealt with the issue in Griggs. The plaintiff must first make out a prima facie case of discrimination by showing that the employment practice or policy challenged "selects applicants for hire or promotion in a racial pattern significantly different from that of the pool of applicants." If the plaintiff sustains that burden, the defendant must then show that the policy or practice is job related and consistent with business necessity. Even if the employer makes that showing, the plaintiff can still prevail by showing that another practice, policy, selection device, etc. would satisfy the employer’s legitimate business concerns but would not have as much or any discriminatory effect. The model is simple; the devil, of course, is in the details, as well as in converting the principles to the setting of Title VI.

A. Plaintiff’s Prima Facie Case: Demonstrating Disparate Impact.

The heart of plaintiff’s prima facie case is the identification of a selection device or employment policy or practice that has the effect of discriminating against a protected group and the provision of proof that that device, policy, or practice actually causes the discrimination. The central prohibition of Title VII in § 703(a) states: It shall be an unlawful employment practice for an employer--

(1) to fail or refuse to hire or to discharge any individual, or otherwise to discriminate against any individual with respect to his compensation, terms, conditions, or privileges of employment, because of such individual’s race, color, religion, sex, or national origin; or

(2) to limit, segregate, or classify his employees or applicants for employment in any way which would deprive or tend to deprive any individual of employment opportunities or otherwise adversely affect his status as an employee, because of such individual’s race, color, religion, sex, or national origin.

Obviously, that prohibition does not itself explicitly refer to discriminatory effects. Thus, the development of disparate impact theory resided in the courts until the ‘91 Act. In that legislation, Congress reacted to developments in the area, particularly to Wards Cove Packing Co., Inc. v. Atonio, by enacting the following language:

An unlawful employment practice based on disparate impact is established under this title only if--

(i) a complaining party demonstrates that a respondent uses a particular employment practice that causes a disparate impact on the basis of race, color, religion, sex, or national origin and the respondent fails to demonstrate that the challenged practice is job related for the employment in question and consistent with business necessity . . . .

Although the codification in the ‘91 Act was helpful in some ways, it provided no definition of causation, leaving intact the jurisprudence of the federal courts on the issue. The following section addresses that issue. Those that follow discuss the particularity requirement codified in §§ 703(k)(1)(A)(i) and (B)(i) and introduced in Wards Cove, as well as the problems presented by the various kinds of criteria employers use in making decisions.

1. Proof of Causation.

The first and most difficult step in a disparate impact case is proving that a specific employment practice had a significant adverse impact on the selection of members of the plaintiff’s protected group. The Court has characterized a disparate impact as both a "significantly different" selection rate and a "substantially disproportionate" disqualification rate. Plaintiffs normally proceed by introducing statistics that they believe show such a disparity. They may introduce such statistics as selection rates, pass/fail comparisons, population/workforce comparisons, and regression analyses, among others.

The most widely used means of showing that an observed disparity in outcome is sufficiently substantial to satisfy plaintiff’s burden of proving adverse impact is to show that the disparity is sufficiently large that it is highly unlikely to have occurred at random. Tests of statistical significance are commonly used in the social sciences to rule out chance as the cause of observed disparities. Courts thus generally find adverse impact where the selection rate of members of the protected group is statistically significantly different from the selection rate that would have been expected in the absence of discrimination. Tests of statistical significance determine the probability of obtaining the observed disparity (or a greater disparity) by chance. The 0.05 probability level (or below) is accepted by many courts as sufficient to rule out chance. Approximately the same conclusion flows from the standard-deviation model described in Castaneda v. Partida, in which an observed disparity of two or three standard deviations was said to be sufficient to rule out chance.

The Equal Employment Opportunity Commission (EEOC) and other enforcement agencies have developed their own test for when a disparity is statistically significant in the Uniform Guidelines on Employee Selection Procedures. The so-called "80 percent rule" states that

an employee selection criterion has an adverse impact, for purposes of a plaintiff’s prima facie case where members of a protected group . . . are selected at a rate less than four fifths (80%) that of the allegedly preferred counterparts. For example, if 50 percent of white applicants are hired, but only 35 percent of African-American applicants, then the relevant ratio would be 35/50, or 70 percent, and adverse impact would exist . . . .

Courts have criticized the 80 percent rule as being not as probative of statistical significance as other techniques and as prone to mislead when sample sizes are small.
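The arithmetic of the 80 percent rule can be sketched in a few lines of Python. The sketch is illustrative only: the function names are hypothetical, and the counts assume 100 applicants in each group so as to reproduce the 35 percent and 50 percent selection rates in the example quoted above.

```python
def selection_rate(selected: int, applicants: int) -> float:
    """Share of a group's applicants who were selected."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return selected / applicants


def impact_ratio(protected_selected: int, protected_applicants: int,
                 comparison_selected: int, comparison_applicants: int) -> float:
    """Protected group's selection rate divided by the comparison group's rate."""
    comparison_rate = selection_rate(comparison_selected, comparison_applicants)
    if comparison_rate == 0:
        raise ValueError("comparison group's selection rate is zero")
    protected_rate = selection_rate(protected_selected, protected_applicants)
    return protected_rate / comparison_rate


def shows_adverse_impact(ratio: float, threshold: float = 0.8) -> bool:
    """True when the ratio falls below four fifths (80 percent)."""
    return ratio < threshold


# Assumed counts: 35 of 100 protected-group applicants hired, 50 of 100 others.
ratio = impact_ratio(35, 100, 50, 100)
print(f"impact ratio = {ratio:.2f}")
print("adverse impact under the 80 percent rule:", shows_adverse_impact(ratio))
```

Because the rule looks only at the ratio of the two rates, it returns the same answer for 7 of 20 applicants as for 350 of 1,000, which is one way to see why courts distrust it when samples are small.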

Two of the statistical techniques to which courts have given greater approval are the "Z score" test and the "chi-square" test. The formula for the former, in the standard form that the following definitions describe, is:

$$Z = \frac{\text{ActualSelections}_{PC} - \text{ExpectedSelections}_{PC}}{\sqrt{\text{TotalSelections} \times \%PC \times \%NPC}}$$

In the equation, ActualSelections_PC is the number of members of plaintiff’s protected class actually selected, and %PC is the percentage of members of plaintiff’s protected class in the relevant labor market or selection pool. Conversely, %NPC is the percentage of others in the relevant labor market or selection pool. To calculate the value of ExpectedSelections_PC, multiply the total number of selections made (TotalSelections) by the percentage of the protected class members in the relevant labor market or selection pool. If the absolute value of the resulting Z score is less than 1.96, a court could conclude that there is not enough evidence to believe the disproportion is discriminatory because the relationship is not statistically significant at the 0.05 level. In other words, if the Z score is less than 1.96 in absolute value, there is a greater than 5 percent chance that the disproportion occurred randomly.
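A minimal sketch of the Z score calculation follows. It assumes the familiar binomial form, in which the difference between actual and expected selections of protected class members is divided by the square root of TotalSelections × %PC × %NPC, with the percentages expressed as proportions between 0 and 1. The function name and the figures (100 selections from a pool that is 30 percent protected class, of whom 18 were selected) are hypothetical.

```python
import math


def z_score(actual_pc: int, total_selections: int, pct_pc: float) -> float:
    """Standard deviations between actual and expected protected-class selections.

    pct_pc is the protected class's share of the relevant pool, as a
    proportion strictly between 0 and 1 (an assumption of this sketch).
    """
    if total_selections <= 0:
        raise ValueError("total_selections must be positive")
    if not 0.0 < pct_pc < 1.0:
        raise ValueError("pct_pc must be strictly between 0 and 1")
    pct_npc = 1.0 - pct_pc                      # share of everyone else
    expected_pc = total_selections * pct_pc     # ExpectedSelections_PC
    std_dev = math.sqrt(total_selections * pct_pc * pct_npc)
    return (actual_pc - expected_pc) / std_dev


# Hypothetical figures: 18 protected-class members among 100 selections,
# drawn from a pool that is 30 percent protected class.
z = z_score(actual_pc=18, total_selections=100, pct_pc=0.30)
print(f"Z = {z:.2f}")
print("significant at the 0.05 level:", abs(z) >= 1.96)
```

On these assumed figures the expected number of selections is 30 and the Z score is roughly -2.62, so the shortfall lies beyond the 1.96 threshold.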

The chi square test also compares differences in proportions, though somewhat differently. The simplest version compares just two classes, though one can construct more complex models. To perform a two-class analysis, one begins by setting up a four-cell table as follows:

 

                Selected        Not Selected
Class 1         a               b
Class 2         c               d

The variables a, b, c, and d represent the number of people in each group. The model assumes that if class status is unrelated to performance on the test, the proportion of each class selected and not selected will be the same, which, expressed mathematically, would be:

$$\frac{a}{a+b} = \frac{c}{c+d}$$

The chi-square formula, in its standard form for a four-cell table, is this:

$$\chi^2 = \frac{(a+b+c+d)\,(ad-bc)^2}{(a+b)(c+d)(a+c)(b+d)}$$

If χ² is greater than 3.84, then at the 95 percent confidence level, one can reject the theory that class status had no statistically significant role in the outcome; if χ² is greater than 6.635, then at the 99 percent confidence level, a court can conclude that plaintiff has made out a prima facie case because class status played a statistically significant role in the outcome. But in using the statistical techniques, plaintiffs still have to identify the proper numbers to plug into the formulae.
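The two-class chi-square calculation can be sketched as follows. The sketch assumes the uncorrected Pearson statistic for a four-cell table, computed as N(ad - bc)² divided by the product of the row and column totals, where N = a + b + c + d. The function name and the counts are hypothetical, and the two critical values are the ones given in the text.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Uncorrected Pearson chi-square for the four-cell table

                 Selected   Not Selected
        Class 1     a            b
        Class 2     c            d
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column total must be nonzero")
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical counts: 50 of 100 in Class 1 selected, 35 of 100 in Class 2.
chi2 = chi_square_2x2(a=50, b=50, c=35, d=65)
print(f"chi-square = {chi2:.2f}")
print("exceeds 3.84 (95 percent level):", chi2 > 3.84)
print("exceeds 6.635 (99 percent level):", chi2 > 6.635)
```

On these assumed counts the statistic is about 4.60, which clears the 95 percent threshold but not the 99 percent threshold.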

As part of the ‘91 Act’s failure to discuss the methods of proof of causation, it also does not identify the appropriate statistical samples to use in making out a prima facie case. Thus, one has to look to the cases that preceded the Act. In its early disparate impact cases, the Court focused on the comparison between the actual applicant pool and those selected. In making out a prima facie case, the plaintiff must show that "the tests in question select applicants for hire or promotion in a racial pattern significantly different from that of the pool of applicants." In later cases, however, the Court began to express concern about the nature of the applicant pool, especially its qualifications. For example, in Watson v. Fort Worth Bank and Trust, in making clear that an employer could undermine a prima facie case by attacking weaknesses in plaintiff’s statistics, it stated that "statistics based on an applicant pool containing individuals lacking minimal qualifications for the job would be of little probative value."

In Wards Cove, the Court expanded on that aside in Watson, turning it into a more substantive alteration of the requirements of a prima facie showing in a disparate impact case:

The "proper comparison [is] between the racial composition of [the at-issue jobs] and the racial composition of the qualified . . . population in the relevant labor market." It is such a comparison--between the racial composition of the qualified persons in the labor market and the persons holding at-issue jobs--that generally forms the proper basis for the initial inquiry in a disparate-impact case. Alternatively, in cases where such labor market statistics will be difficult if not impossible to ascertain, we have recognized that certain other statistics--such as measures indicating the racial composition of "otherwise qualified applicants" for at-issue jobs--are equally probative for this purpose.

Plaintiffs in disparate impact cases usually challenge the very qualifications that the Court is so concerned that applicants meet. The decision is therefore usually understood by emphasizing the "otherwise" in the Court’s language in Wards Cove, implying a permissible focus only on applicants who satisfy qualifications other than those challenged. Such a focus would be consistent with the particularity requirement that the Court imposed in Wards Cove and that Congress seemingly codified in the ‘91 Act, since plaintiffs could dispute the effects of one qualification but still have to base the statistical sample on those who meet the others (if there are any). However, so parsing the applicant pool may be a prohibitively difficult task for plaintiffs. Additionally, according to the Wards Cove Court, before plaintiffs may use that alternative showing, they first must be able to demonstrate that ascertaining labor market figures would be difficult or impossible, and the Court has provided no guidance on how plaintiffs may make such a showing. Moreover, the ‘91 Act acknowledges that there may be situations in which identifying the offending qualification is impossible, permitting plaintiff to deal with the employer’s decisionmaking process as a whole, and making a focus on qualifications in the statistical samples unnecessary.

2. The Requirement of Particularity.

As touched on briefly in the preceding section, both the Court in Wards Cove and the Congress in the ‘91 Act required plaintiffs in disparate impact cases to identify the particular business practice that had the discriminatory effect. In the former, the Court stated that "a plaintiff must demonstrate that it is the application of a specific or particular employment practice that has created the disparate impact under attack." In the latter, Congress required plaintiffs alleging disparate impact to demonstrate that an employer "uses a particular employment practice that causes a disparate impact." In clarifying that requirement, the Wards Cove Court held that the plaintiffs in the case could not use so-called "bottom line statistics" offensively. Bottom line statistics "look at the results of an employer’s selection process rather than at a single component . . . ." Thus, plaintiffs in Wards Cove could not make out a prima facie case by showing that two job categories had starkly different racial compositions; rather, they had to show that some particular employment practice operated to cause a statistically significant disproportion. That the job categories themselves were de facto segregated was not enough.

In an earlier case, Connecticut v. Teal, the Court similarly held that an employer could not use bottom line statistics defensively. There, the employer used a test in making promotion decisions, promoting those who passed and not promoting those who failed. The test therefore operated as a pass/fail barrier. It also had a statistically proven disparate impact, and the employer had not validated it. However, at the bottom line, the employer eventually promoted disproportionately more African-Americans than Whites. The Court nevertheless held that the employer could not use bottom line statistics indicating racial balance in its work force to insulate itself from claims that it used a test that had proven adverse impacts, at least when the employment practice is a test used as a pass/fail barrier.

The Court and Congress seem, however, to have carved out exceptions to both the offensive and defensive use of bottom line statistics. The dissenters in Teal, who included Rehnquist and O’Connor, argued that employers could, within the reasoning of the majority in the case, still defend with bottom line statistics a multicomponent decisionmaking process that includes consideration of a scored test, if the test does not constitute a pass/fail barrier. Presumably, their compatriots Scalia, Thomas, and Kennedy might agree with Rehnquist and O’Connor, making theirs the position of a majority of the Court. And though those five might attempt to find a way to avoid the result, consistency would seem to require allowing plaintiffs to use bottom line statistics offensively when employers use such a decisionmaking process, and it results in a disproportionately constituted work force.

Congress also carved out an exception for the offensive use of bottom line statistics in the ‘91 Act. Section 703(k)(1)(B)(i) states:

With respect to demonstrating that a particular employment practice causes a disparate impact . . . , the complaining party shall demonstrate that each particular challenged employment practice causes a disparate impact, except that if the complaining party can demonstrate that the elements of a respondent’s decisionmaking process are not capable of separation for analysis, the decisionmaking process may be analyzed as one employment practice.

Apparently attempting to give meaning to that provision of the statute, the legislative history provides the following guidance:

When a decision-making process includes particular, functionally integrated practices which are components of the same criterion, standard, method of administration, or test, such as the height and weight requirements designed to measure strength in Dothard v. Rawlinson . . . , the particular, functionally-integrated practices may be analyzed as one employment practice.

Federal courts have added little meaningful construction to that section, so what makes the elements of a decisionmaking process acceptably difficult to differentiate for the purposes of the ‘91 Act remains unclear. Two cases do provide some guidance in the area. In Caron v. Scott Paper Co., plaintiffs challenged their employer’s method of deciding whom to lay off.

First, each department head divided the jobs in his department--including those to be eliminated--into "job groups." Each job group was composed of jobs that required the same or similar skills on the part of the employees performing those jobs.

After these initial steps were taken, Plaintiffs were evaluated by a team of co-workers using six subjective factors and one objective factor. No single person or group graded all employees. A final rating was developed for each employee by members of the respective department’s evaluation team. Employees selected for termination were those with the lowest scores in their job group.

The court decided to treat the combined subjective and objective criteria as a single subjective process. In dealing with the same process in a later case, the same court held that the components were not separable and that limiting "the disparate impact model to a discrete component of the selection process would completely exempt the situation where the adverse impact is caused by the interaction of two or more components of the process." Thus, the court was willing to allow the use of bottom-line statistics to make out a prima facie case in the proper circumstances.

3. Selection Criteria.

Many of the principles in the preceding discussion are best adapted to, and many of the Court’s most important disparate impact cases deal with, the use of so-called "objective criteria" in an employment decisionmaking process. Scored tests, diploma requirements, and height and weight minima are generally considered objective criteria. However, many employers use interviews, essays, or the general impressions of supervisors in their decisionmaking processes and base their decisions on their subjective impressions of applicants. The Supreme Court has said that such subjective criteria can cause adverse impacts cognizable as actionable discrimination under Title VII. In so holding, it added that "[h]owever one might distinguish ‘subjective’ from ‘objective’ criteria, it is apparent that selection systems that combine both types would generally have to be considered subjective in nature."

In Watson, O’Connor and three other justices believed that the extension of disparate impact theory to the use of subjective employment criteria could encourage employers to adopt quotas or preferences to counterbalance the discriminatory effects of such practices. To "safeguard" against that possibility, the plurality attempted to alter the "evidentiary standards." First, they redefined "business necessity" as "legitimate business reasons." Second, they made the employer’s burden only one of producing evidence of the existence of legitimate business purposes for the use of the criteria under attack. Third, they rejected any requirement that employers validate subjective criteria to satisfy their burden of production. The majority in Wards Cove essentially adopted all of those principles, and the ‘91 CRA overruled that decision, eliminating the necessity but allowing the possibility of distinguishing between subjective and objective criteria for the purpose of making out a prima facie case. But even if all of the preceding and following principles of a disparate impact case apply to an attack on the use of subjective criteria, some questions remain.

Among the questions the courts and Congress have yet to answer are how a plaintiff must establish the disparate impact of subjective criteria, what the proper groups for comparison are, and whether the focus should be on how the employer applies the criteria or on the impact of applying those criteria at all. Perhaps the only answer currently available to such questions is to conform as closely as possible to disparate impact principles as articulated, to the extent that such articulations are consistent or capable of harmonization; and that is the approach in Part IV to adapting Title VII’s disparate impact principles to a Title VI case against Boalt.

B. Defendant’s Case: Business Necessity, Job Relatedness, and Validation.

Employers can defend themselves against claims of using employment practices that have discriminatory effects in two ways. In cases in which plaintiffs use statistics, employers can attack the sufficiency of those statistics, thereby preventing plaintiffs from even making out a prima facie case. Otherwise, employers must demonstrate the business necessity and job relatedness of the practice under attack. A common means of making that demonstration is through statistical validation. Of course, as in other areas of Title VII jurisprudence, the meanings of many of those terms and how to make those demonstrations are less than clear.

1. Attacking the Sufficiency of Plaintiff’s Statistics.

An employer can prevent a plaintiff from even making out a prima facie case by challenging the propriety of plaintiff’s statistics (when plaintiff attempts to prove disparate impact through the use of statistics). The most successful criticism of a plaintiff’s statistics is that they are based on too small a sample size; if justified, that criticism minimizes both the statistical and practical significance of a plaintiff’s statistics. Additionally, a defendant can rebut any inference of discrimination by providing the court more sophisticated statistics that show that the apparent disproportions are not statistically significant.
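The force of the small-sample criticism can be illustrated numerically. The sketch below holds a hypothetical disparity constant (a protected class that makes up 30 percent of the pool but only 20 percent of those selected) and varies only the number of selections. It assumes the binomial standard-deviation calculation and the 1.96 threshold; the function names and figures are illustrative, not drawn from any case.

```python
import math


def binomial_z(actual: int, total: int, pool_share: float) -> float:
    """Standard deviations between actual and expected selections."""
    expected = total * pool_share
    std_dev = math.sqrt(total * pool_share * (1.0 - pool_share))
    return (actual - expected) / std_dev


def significance_by_sample_size(pool_share: float, selected_share: float,
                                sizes: list[int]) -> list[tuple[int, float, bool]]:
    """For each sample size, return (size, Z score, significant at 0.05?)."""
    results = []
    for n in sizes:
        actual = round(n * selected_share)  # protected-class selections at this size
        z = binomial_z(actual, n, pool_share)
        results.append((n, z, abs(z) >= 1.96))
    return results


# Same proportional shortfall, three hypothetical sample sizes.
for n, z, significant in significance_by_sample_size(0.30, 0.20, [10, 50, 100]):
    print(f"selections = {n:3d}   Z = {z:6.2f}   significant = {significant}")
```

On these assumed figures the identical proportional shortfall is not significant with 10 or 50 selections and becomes significant only at 100, which is why an attack on sample size can defeat a prima facie case without disputing the underlying rates.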

2. Business Necessity and Job Relatedness.

If a plaintiff can establish a prima facie case of disparate impact, Title VII now requires the defendant employer "to demonstrate that the challenged practice is job related for the position in question and consistent with business necessity." By defining the term "demonstrate" to mean meeting the burdens of production and persuasion, the ‘91 CRA overturned that portion of Wards Cove that said the employer’s burden was only one of production. Additionally, just as the O’Connor plurality in Watson had characterized the employer’s burden as one of producing a "legitimate business reason" for its challenged practice, the Wards Cove Court characterized the burden as requiring the production of a "business justification." The ‘91 CRA amended Title VII by codifying the language of cases before Watson and Wards Cove; it also stated in its purposes that it was intended to restore the pre-Wards Cove meaning of those terms and prohibited courts from examining any legislative history but a specially prepared interpretive memorandum.

Besides being an obviously unusual lawmaking technique, Congress’ approach to overruling Wards Cove’s characterization of the burden of an employer in a disparate impact case is not as helpful as it could be, as the Court’s decisions before Wards Cove, which do include Watson, do not themselves provide a consistent construction of the terms business necessity and job related. So, one has to scan the cases themselves. In Wards Cove, in getting to its statement that an employer need only produce a business justification, the Court differentiated that formulation by stating that "a mere insubstantial justification" would not suffice, but at the same time, "there is no requirement that the challenged practice be ‘essential’ or ‘indispensable’ to the employer’s business." The Court thereby presented three possible interpretations of the employer’s burden: provide any justification for the challenged practice, provide a business justification for the practice, or prove that the practice is essential to the business. Since the ‘91 Act rejects the first two formulations, one could infer from Wards Cove that "consistent with business necessity" means that the employer must prove that the challenged requirement(s) is essential to the functioning of the business. However, such a reading might not be consistent with other pre-Wards Cove cases, nor does it give meaning to the full statutory language, as the employment requirement must also be "job related for the position in question."

In Watson, the Court nowhere mentions business necessity, stating only that a practice is job related if the employer can produce "legitimate business reasons" for it. As extremely tangential support for that position, it cites New York City Transit Authority v. Beazer and Washington v. Davis. In the former, the Transit Authority (TA) barred drug users, including those on methadone maintenance, from employment. Plaintiffs complained that the policy had a disproportionate impact on Black and Hispanic employees. The Court held that the TA’s goals of safety and efficiency were legitimate and permitted the exclusion from employment of any form of drug user in safety-sensitive positions. It added that the policy was job-related for all other (nonsafety-sensitive) positions because the exclusion "significantly served" the TA’s goals of safety and efficiency. Thus, the Court effectively defined job-relatedness as significantly serving a legitimate business goal. That cannot, however, provide a basis for using Watson’s as the definitive pre-Wards Cove reading of the terms "business necessity" and "job related," as neither Watson nor Beazer gives any construction to "business necessity"; both deal only with whether the challenged employment practice is job related. Additionally, only a plurality joined in the part of the Watson decision that says "job related" means producing a legitimate business reason.

One might find more helpful, then, the Court’s decisions in Griggs, Albemarle Paper, and Dothard v. Rawlinson. In Griggs, the first Title VII disparate impact case, the Court held that the "touchstone" of whether an employment practice’s discriminatory effects are illegal "is business necessity. If an employment practice which operates to exclude Negroes cannot be shown to be related to job performance, the practice is prohibited." It then went on to add that the employer must show that the requirement bears "a demonstrable relationship to successful performance of the jobs for which it was used." The Court pointed to two things that undermined Duke Power’s ability to defend its high school diploma requirement and IQ test: the employer had adopted both "without meaningful study of their relationship to job-performance ability," and the evidence available showed that employees who had not taken the test or earned a high school diploma continued to perform adequately and to progress within the organization, implying that neither requirement was necessary to improve the quality of work (i.e., perform the business’ essential functions). It concluded by saying, "What Congress has forbidden is giving these devices and mechanisms controlling force unless they are demonstrably a reasonable measure of job performance. . . . What Congress has commanded is that any tests used must measure the person for the job and not the person in the abstract."

The Court more fully addressed the employer’s burden in Albemarle, in which plaintiffs challenged the use of a pair of standardized tests, concluding that the defendant had failed to meet the burden for several reasons. First, the employer had not validated the tests for the jobs at issue. Second, the paper company had made no effort to analyze the jobs in terms of the specific skills they might require. Third, the validation study that had been done had produced inconsistent results. Fourth, the study compared test scores with subjective supervisorial evaluation. Though that is a permissible technique, there was "no way of knowing precisely what criteria of job performance the supervisors were considering, whether each of the supervisors was considering the same criteria or whether, indeed, any of the supervisors actually applied a focused and stable body of criteria of any kind." In other words, when validation is based on supervisor rankings, the employer must make sure that the supervisors are all using the same relevant criteria. The Court also noted the EEOC’s concern, as expressed in its Guidelines on test validation, with the problem of bias creeping into subjective evaluations. One can, thus, take from the case that proof of job-relatedness is factual and empirical and not at all an easy burden for an employer. As for business necessity, one can take from the case the same point the Court made in Griggs: if those who have advanced in the company and performed satisfactorily on the job do poorly on the test, the test may not be necessary to the business.

Finally, in Dothard, the Court dealt with an employer’s burden of justifying the use of objective criteria that are not standardized tests. There, plaintiffs challenged the use of a height and weight requirement for employment as a prison guard. The employer said that height and weight were related to strength, an unspecified amount of which one would need to perform the job effectively; because of the nature of the position, safety was as important as efficiency. The Court said that the employer had provided no evidence correlating height and weight with the requisite amount of strength necessary to good job performance. It added that when safety is, in addition to efficiency, a necessity of the job, "a discriminatory employment practice must be shown to be necessary to safe and efficient job performance . . . ." One can conclude two things from the Court’s holdings. First, job-relatedness and business necessity are two different things. Not only did the defendant have to show that height and weight were related to strength (job related), it had to show that strength, and thus a practice that evaluated a candidate’s strength, was necessary to the safe and efficient performance of the job (business necessity). Second, both parts of defendant’s burden require empirical demonstration.

3. Validation Strategies.

Section 703(h) of the Civil Rights Act expressly authorizes employers to use "professionally developed tests," "provided that such test, its administration or action upon its results is not designed, intended or used to discriminate because of race, color, religion, sex, or national origin." The EEOC’s Guidelines on employment testing set out "detailed and relatively stringent technical standards for validity studies" for tests found to have significant disparate impacts on groups that Title VII protects. Although the Guidelines do not have the weight of promulgated regulations, the Court does give them deference and does cite to them when dealing with disparate impact cases that it believes require validation of employment practices. The message of the Guidelines and the cases is the same: tests with discriminatory effects "are impermissible unless shown, by professionally acceptable methods, to be ‘predictive of or significantly correlated with important elements of work behavior which comprise or are relevant to the job or jobs for which candidates are being evaluated.’"

There are three general methods of test validation: criterion related, content, and construct. An employer establishes criterion-related validity by showing that relatively successful performance on the test has a positive correlation with relative achievement on some measure of job performance. The three primary issues in establishing criterion validity are the selection of appropriate criteria; the selection, design, and reliability of each criterion’s measure; and the degree of correlativity necessary to establish validity. The criteria are the aspects of job performance the test is designed to predict; thus, an employer should choose skills necessary to the job. The most commonly used criterion measure is a subjective supervisory evaluation. Employers must make sure to examine the actual data to eliminate any distortions that might lead to an erroneous finding of validity. Although industrial psychologists consider a correlation coefficient of 0.20 sufficient to establish validity, some courts demand at least 0.30.
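To make the correlation thresholds concrete, the following is a minimal sketch of how a criterion-related validity coefficient might be computed and compared against the 0.20 and 0.30 benchmarks mentioned above. The test scores and supervisor ratings are hypothetical and purely illustrative, and the function and variable names are assumptions of this sketch, not anything drawn from the Guidelines or the cases.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical data: entrance-test scores and later supervisor ratings
# for the same eight employees (invented for illustration only).
test_scores = [52, 61, 47, 70, 58, 66, 49, 75]
ratings = [3.1, 3.4, 3.3, 3.9, 2.8, 3.6, 3.0, 3.5]

r = pearson_r(test_scores, ratings)

# Benchmarks taken from the text: 0.20 (industrial psychologists)
# and 0.30 (the floor some courts demand).
meets_psychologists_benchmark = r >= 0.20
meets_stricter_court_benchmark = r >= 0.30
print(f"r = {r:.2f}", meets_psychologists_benchmark, meets_stricter_court_benchmark)
```

A coefficient alone does not settle validity. As the text notes, the choice of criterion and the reliability of its measure matter as much as the size of the correlation.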

Content validation focuses on whether a test directly measures the knowledge, skills, or abilities that the job actually requires of a candidate. It does not compare on-the-job results to test results; the test should itself mirror the actions demanded on the job, so a good performance on the test correlates with ability to do the job. An examination is content valid if it "tests a representative sampling of specified job functions or the underlying knowledge or skills necessary to perform those functions." Five factors that the federal courts and the EEOC have identified as critical to good content validation are:

(1) a suitable job analysis;

(2) competence in test construction;

(3) test content related to the job’s content;

(4) test content representative of the job’s content; and

(5) a scoring system that selects those who can better perform the job.

An employer establishes construct validation when the test correlates significantly with a trait necessary to the performance of the job, such as intelligence. Employers rarely use it, so it has not seen sufficient litigation to give it much legal meaning.

4. Training Program Validity.

An occasionally recurring question in the courts has been the requirements for validating employment tests used not for the purposes of hiring or promotion but for entrance into a training program. The Court addressed the issue in Washington v. Davis, in which plaintiffs attacked a test for admission to a police training program as effectively discriminating against African-Americans. Plaintiffs argued, and the Court of Appeals found, that even if success on the test was demonstrably positively correlated with success in the training program, the crucial correlation to prove the validity of the test was, rather, between success on the test and successful job performance. The Supreme Court disagreed.

It is also apparent to us . . . that some minimum verbal and communicative skill would be very useful, if not essential, to satisfactory progress in the training regimen. Based on the evidence before him, the District Judge concluded that Test 21 was directly related to the requirements of the police training program and that a positive relationship between the test and training-course performance was sufficient to validate the former, wholly aside from its possible relationship to actual performance as a police officer.

At least one three-judge panel has followed the broad principle the Court enunciated in Washington v. Davis and held that an employer can validate a test used for entrance into a training program by proving a positive correlation between success on the test and performance in the program. Other district courts and circuits have read the decision more narrowly.

The Ninth Circuit, where any challenge to Boalt’s admission process would be heard, has held that "the skill, characteristic, or ability which the . . . [training program entrance] test purports to measure must be important to the requirements of the job. [An employer] may show the requisite relationship by demonstrating that the test correlates significantly with important elements of [training program] performance and that those elements of . . . performance are important to actual job performance." In reaching that holding, the court limited the decision in Washington v. Davis to its facts: the use of a verbal abilities test is valid to determine whether applicants to a training program possess the minimum skills necessary to understand and succeed in the academic portion of the training. The court read the case in that way because of its fear that if "employers were permitted to validate selection devices without reference to job performance, then non-job-related selection devices could always be validated through the simple expedient of employing them at both the pre-training and training stage."

Turning to the facts of the case, the court found that the employer had demonstrated a significant, indeed unusually high, correlation coefficient between test scores and academic performance in the training program as measured by a written examination at the end of the program. The employer thereby satisfied the first part of its burden: proving a correlation between the entrance exam and the program. Plaintiffs attacked the sufficiency of that showing on the ground that the validation study was unrepresentative of the applicant pool because the pool included only those who passed the entrance exam; thus, the validation study did not indicate whether those who failed the test could nevertheless successfully complete the training program. In support of the argument, they cited EEOC Guidelines that required that a criterion-related validity study must include a sample of subjects representative of the typical applicant group. The court agreed that such a sample would normally have to include applicants who passed and failed the test. It nevertheless found defendant’s validation study sufficient only because of the unusually high correlation between the entrance exam and performance in the program (0.60); it inferred from those results that those who failed the exam would not have succeeded in the program. However, it remanded to the trial court because the employer had not proven that the training given was manifestly related to the knowledge and skills that employees must have to perform their jobs satisfactorily.

The Fifth Circuit reached essentially the same conclusion in Ensley Branch, NAACP v. Seibels. It too limited Washington v. Davis’ holding on training program validity to cases in which employers use tests to ascertain whether applicants have minimum skills necessary to complete job-relevant training; since the employer in the case was using the results of the entrance exam to rank applicants and admitting only the highest scorers, a study showing a correlation between success on the test and performance in the program would not suffice. Instead, the court held that use of a test for ranking purposes, rather than as a device to screen out candidates without minimum skills, "is justified only if there is evidence showing that those with higher test scores do better on the job than those with lower test scores." The employer in the case had produced no such evidence. Indeed, the district court had found that the test was not a valid predictor of ability to pass the training curriculum. And although the validation studies showed a positive correlation between entrance exam score and grades in the program, the employer could not use grades as the criterion for comparison, as grades did not predict job performance. Thus, under Title VII, to use a positive correlation between performance on an entrance exam and performance in a training program to validate the test itself, an employer must be able to show one of the following: that the employer uses the test only to screen out those applicants without the minimum skills necessary to complete and benefit from the training; that the content of the test, the program, and the job are all the same; that the skills gained in the program are necessary to perform the job; or that performance in the training program predicts performance on the job (if the employer attempts to validate the test by comparing scores on the test to grades in the program).

C. Alternative Employment Practices.

Having made out a prima facie case, and defendant having demonstrated the business necessity and job relatedness of an employment practice with discriminatory effects, plaintiffs may still prevail by demonstrating the existence of an alternative employment practice that serves the business’ necessities, with less adverse impact, that the employer refuses to adopt. That demonstration must be "in accordance" with the law as it existed prior to Wards Cove, but as in other areas, the pre-Wards Cove law on the issue is not entirely clear.

Some cases, both before and after the 1991 Act, have suggested that the employer must show, as part of its burden of demonstrating job relatedness/business necessity, that there is no acceptable alternative practice that would adequately serve the employer’s legitimate interests with less adverse impact. The Uniform Guidelines at least partially endorse this concept by stating that an employer’s showing of job relatedness should include evidence that the employer conducted "an investigation of suitable alternative selection procedures and suitable alternative methods of using the selection procedure which have as little adverse impact as possible." And some courts have held that an employer’s failure to consider the degree of adverse impact resulting from practices such as the use of cutoff scores and test-score rankings, and the failure to implement reasonable alternatives having less adverse impact, may result in a finding of liability even where the challenged practice is shown to be job related. Other courts, however, have held that the burden is on the plaintiff to demonstrate the existence of comparably effective alternative selection devices having less adverse impact, and that employers are under no duty to conduct any search for alternative practices.

The Supreme Court has elucidated this part of a disparate impact case very little. Its statement on the issue in Albemarle Paper remains its most significant: "If an employer does then meet the burden of proving that its tests are ‘job related,’ it remains open to the complaining party to show that other tests or selection devices would also serve the employer’s legitimate interest in ‘efficient and trustworthy workmanship.’"

IV

The Disparate Impact of Boalt’s 1997 Admission Process

Having laid the legal groundwork to explore an answer, the question nevertheless remains whether the Office for Civil Rights has good reason to believe that the admissions process Boalt used in 1997 produced an adverse racial impact that violated Title VI’s regulations, assuming, of course, that the Court continues to hold that the regulations prohibiting the use of effectively discriminatory policies are a permissible reading of Title VI. In exploring that question, this part first summarizes the admission procedure at issue. It then examines whether the process as a whole produced statistically significant racially disproportionate results. Assuming the ability of a plaintiff to make out or the OCR to find a prima facie case, this part will then examine the school’s ability to prove the "educational necessity" and "educational relatedness" of the identified criteria and the process as a whole. The discussion will include the validity of the LSAT, particularly its validity as a tool for selecting entrants to a training program. Finally, this part examines the existence of other selection methods that would serve the school’s legitimate interests but have less disparate impact on racial minority groups.

A. Boalt’s 1997 Admissions Process.

In response to SP-1, the faculty of Boalt Hall made three general changes to the school’s admissions process. First, it abandoned the use of enrollment goals for underrepresented racial groups. Second, whereas the school had previously given an applicant’s Index score -- the combined total of an applicant’s LSAT score and UGPA, weighted according to undergraduate institution -- "the greatest weight," the faculty decided to accord that score "substantial weight." The admissions policy also gives "substantial consideration" to such things as "letters of recommendation, graduate training, special academic distinctions and honors, difficulty of the academic program successfully completed, work experience, and significant achievement in nonacademic activities or public service." Third, purportedly "to get a sense of" the applicant "as a person and as a potential student and graduate of Boalt Hall," the faculty expanded the permissible length of the required personal statement. More generally, in evaluating an applicant, the Admissions Committee is supposed to consider "disadvantages that adversely affected his or her performance" and "to select students who will attain the highest standards of professional excellence and integrity, and who will bring vision, creativity, and commitment to their professional endeavors." Essentially, then, the school has invested tremendous discretion in both the Admissions Office and Committee in making admissions decisions, and that discretion may have pushed in one of two directions. Either it may have encouraged improper and unexpected reliance on the numerical data, particularly the LSAT. Or it left decisionmakers free to make such unprincipled decisions that one cannot distinguish what factors led to any individual decision to admit or reject an applicant (other, perhaps, than those almost automatically admitted because of their Index scores).

1. How the Process Worked.

When an application arrived at Boalt during the 1997 admissions cycle, the school’s first step was to assign it an Index score. In computing index scores, whereas Boalt’s peer schools typically weight an applicant’s LSAT score at about 70%, Boalt gave the LSAT and UGPA equal weight in 1997. However, that year it also adjusted applicants’ UGPAs before computing the Index score. The school adjusted the UGPAs according to the age of the degree (to compensate for grade inflation) and the rank of the undergraduate school (to account for "quality of the student body and the grading patterns at the school"). Boalt used the LSAT to rank schools by averaging the LSAT percentile for all test-takers from the school and the LSAT percentile ranking of students at the school who have a UGPA of at least 3.6. The school then assigned all applications to one of five strata--A, B, C, D, and deny--based on the applicant’s Index score. Moving from stratum A to D, Index scores declined, as did the ratio of admitted to total students per stratum. Indeed, the Admissions Office admitted the majority of applicants in stratum A without further consideration and accorded the same treatment to some students in stratum B.

The Admissions Director read all applications of those whom the Admissions Office had not administratively admitted or rejected and recommended which should move on to the "Committee read" stage of the process. Although the faculty’s general pronouncements on diversity controlled the Director’s decisions about which applications to send to the Committee, the faculty did not, as indicated, provide detailed instructions about how to evaluate different types of claims of disadvantage and types of diversity, so the Director exercised a great deal of discretion in deciding which files to send to Committee. Committee members received some instructions explaining the Index number and providing guidance on evaluating grades by considering such factors as exceptionally high UGPA, grade trends, discrepancies within a transcript, graduate school grades, and disadvantage. If an applicant experienced such performance-affecting disadvantage as language barriers, low socioeconomic status (SES), or level of parental education, and the application showed the effects of the disadvantage lessening markedly over time, a Committee member could expect improved performance in law school. However, applications did not identify the applicant’s race, and the Committee received no instructions about how to evaluate claims of disadvantage attributable to an applicant’s race. Also, faculty-student teams within the Committee disagreed internally and among themselves about how to substantiate claims of racial disadvantage and how to make decisions generally.

2. General Critique of the 1997 Admissions Process.

As touched on above, for the purposes of analyzing Boalt’s 1997 admissions process for disparate impact, one might conclude that the process the faculty and administration adopted pushed those who made admissions decisions in one of two directions. Either, as the students who prepared the report called New Directions in Diversity argue, the unstructured discretion led those making decisions to rely inappropriately on the hard numbers. Or, considering the multiple additional factors reviewers were supposed to take into account and the lack of instruction about how to take them into account, one could conclude that, with the possible exception of students in the A and B strata admitted almost solely on the basis of their numbers, the process was so discretionary that one cannot be sure what factors led to any individual decision to admit or deny, so one should and may evaluate the process as a whole.

The authors of New Directions give a number of reasons why the unprincipled discretion granted by the 1997 admissions process might push decisionmakers toward overreliance on the numbers. First, and perhaps most obviously, the numbers provide a convenient way to make decisions. Especially when one considers the level of turmoil surrounding admissions decisions after SP-1 and Proposition 209, as well as the highly pitched (and poorly rationalized) rhetoric about admitting only the "most qualified" applicants, one might reasonably expect decisionmakers to do the seemingly least controversial (i.e. convenient) thing: rely almost exclusively on the purportedly objective factors. Second, the school told those making admissions decisions that one of the goals of the admissions process was to admit an experientially diverse class with a proven history of overcoming disadvantage, but it did not provide them clear criteria or procedures for identifying and evaluating those factors or for weighing them against other factors, such as the Index score. Without structure for discerning, measuring, or weighing those subjective factors, evaluators are more likely to place greater reliance on data they perceive as objective. Third, the New Directions authors point to the lack of a standardized policy for evaluating claims of disadvantage caused by race as pushing decisionmakers toward reliance on numbers. Although the application encouraged students to discuss disadvantages that may have inhibited past performance, those who wrote about experiences with racial prejudice may have been subjected to higher scrutiny. For instance, since some readers required substantiation of such claims, if an essay lacked support for the claims of racial disadvantage, those readers may have disregarded the essay altogether, leaving them with even less material to weigh against an applicant’s Index score. 
Finally, since the school established no numerical goals for the admission of diverse applicants (indeed, it seemingly did not differentiate among different types of diversity), readers lacked a firm mandate for admitting such students, likely causing them to place greater emphasis on Index scores to differentiate among applicants.

However, the school did obligate application readers to include subjective criteria in their considerations. So, if the Admissions Director, the committees, and the committee members did not rely solely on Index scores and did take the other, subjective criteria into account, then in doing a disparate impact analysis, one must consider the effects of those subjective criteria on the outcome of the decisionmaking process. I certainly would not argue that Boalt or other schools should not consider such factors as performance-affecting disadvantage, socio-economic status, character, letters of recommendation, work and other nonacademic experience, etc., since I believe such factors are probably more important than grades and LSAT scores in discerning which applicants will contribute to society and the profession. But to the extent the decisionmakers considered such factors, those factors affected the outcome of the admission process and are, thus, subject to scrutiny as part of a disparate impact analysis. As discussed in Part III.A.3, the areas of uncertainty are even larger in a case of subjective-criteria disparate impact than in cases involving claims that a particular objective criterion caused the discriminatory effects. But if one could discern the impact of the consideration of some of those criteria and show that the impact was statistically significant, the fact that the school did not even attempt to validate their use would weigh heavily in a finding that Boalt’s admission process had impermissible discriminatory effects.

For the purpose of evaluating possible disparate impact in Boalt’s 1997 admissions process, one might not need or want to demonstrate either that a particular objective factor(s), the LSAT and/or GPA, or one of the subjective criteria caused the statistically significant impact. Thus, the conclusions of the New Directions authors are neither the only nor the necessary ones to draw from Boalt’s 1997 admissions process. One could, just as reasonably, conclude that the unprincipled discretion delegated to application readers makes it impossible to discern a single factor of the process that may have caused the disparate impact. Since it may be unclear how the Admissions Director decided which applications to send to committee or how the committees as a whole or their individual members decided which applicants to admit, identifying the impact-causing factor may be impossible, meaning one may and should examine the process as a whole. Thus, in doing statistical analysis, one may use a "black box" or "bottom line" approach in determining whether a statistically significant disparate racial impact resulted from the process.

3. The Results of the 1997 Admissions Process.

The numbers that resulted from the 1997 admissions process at Boalt are, put nicely, not pretty. Boalt received 4171 applications from people seeking to enter the class that will graduate in 2000. Of those, 499 came from Asians, 316 from Hispanics, 256 from African-Americans, 28 from Native Americans, and an overwhelming 3072 from White applicants. Boalt made 860 offers of admission: 122 to Asians, 46 to Latinos, 15 to African-Americans, 2 to Native Americans, and 675 to White applicants. Of those offered admission, 268 applicants opted to enroll; 215 of them were White, 38 were Asian, 14 were Latino, one was African American, and none were Native American. That means that 73.6% of applicants and 78.5% of admittees were White, 11.96% of applicants and 14.2% of admittees were Asian, 7.5% of applicants and 5.3% of admittees were Hispanic, 6.1% of applicants and 1.7% of admittees were Black, and 0.67% of applicants and 0.2% of admittees were Native American.
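The arithmetic behind those percentages can be checked with a short script. This is only a sketch: the counts are the ones reported in the paragraph above, while the per-group admission rate (offers divided by applications) is an additional figure the script derives and is not one the text itself reports. The function names are assumptions of this sketch.

```python
# Applications and offers of admission by group, as reported in the text.
# ("Latino" here covers the group the text calls both Hispanic and Latino.)
applied = {
    "White/Other": 3072,
    "Asian": 499,
    "Latino": 316,
    "African-American": 256,
    "Native American": 28,
}
offers = {
    "White/Other": 675,
    "Asian": 122,
    "Latino": 46,
    "African-American": 15,
    "Native American": 2,
}

def share(counts, group):
    """Percentage of the total in `counts` that belongs to `group`."""
    return 100 * counts[group] / sum(counts.values())

def admit_rate(group):
    """Percentage of a group's applicants who received an offer."""
    return 100 * offers[group] / applied[group]

for group in applied:
    print(
        f"{group:17s} "
        f"share of applicants {share(applied, group):6.2f}%  "
        f"share of offers {share(offers, group):6.2f}%  "
        f"admission rate {admit_rate(group):6.2f}%"
    )
```

Comparing each group's share of applicants with its share of offers is the comparison the paragraph draws. The admission-rate column simply restates the same data from the applicant's point of view.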

B. A Prima Facie Case.

A decent prima facie case exists against Boalt’s 1997 admissions process. Recalling from Part III.A, a prima facie case under Title VII requires plaintiffs to show as an initial matter that a facially neutral policy had a statistically significant disparate impact; the law primarily contemplates plaintiffs making that showing for a particular decisionmaking criterion, but it also allows, in some circumstances, for plaintiffs to focus on the bottom-line results of the process as a whole. Here, the bottom-line data for Boalt’s admissions process show statistically significant disparate outcomes in offers made to African-Americans and Latinos, and such an analysis is appropriate and permissible in this matter because the elements of the decisionmaking process are not capable of separation. However, even if one focuses on the LSAT as the particular criterion that caused the impact, one can make out a prima facie case in several ways. As explained in greater detail in Part IV.B.2, a statistical analysis of the impact of the LSAT on different racial groups in the admissions process was not possible, but other means of making out a prima facie case based on Boalt’s use of the LSAT exist. OCR’s draft guidelines on Fairness in Testing allow it to find a prima facie case without the complex statistical analysis that would be required of the test itself when use of a test contributes to a statistically significant bottom-line disparity. Also, Boalt made several misuses of the LSAT in its admissions process that constitute per se violations under those guidelines; such per se violations allow OCR to find impermissible discrimination regardless of the validity or educational necessity of the test, or whether alternative practices exist.

1. The Bottom Line.

Use of bottom-line statistics is appropriate in analyzing Boalt’s admission process for disparate impact, and such statistics show that the process had impermissible discriminatory effects against African-Americans and Latinos. As discussed in Part III.A, the ‘91 Civil Rights Act and, seemingly, the Court have carved out exceptions to the general rule of particularity for the offensive use of bottom-line statistics. In Connecticut v. Teal, though the majority held that an employer may not use bottom-line statistics indicating racial balance in its work force to insulate itself from claims that it used a test that had proven adverse impacts, at least when the employment practice is a test used as a pass/fail barrier, the dissenters, who included Rehnquist and O’Connor, argued that employers could, within the reasoning of the majority, still defend with bottom line statistics a multicomponent decisionmaking process that includes consideration of a scored test, if the test does not constitute a pass/fail barrier. If Scalia, Thomas, and Kennedy agreed, theirs would be the position of a majority of the Court. Consistency would then seem to require allowing plaintiffs to use bottom line statistics offensively when employers use such a decisionmaking process and it results in a disproportionately constituted work force. Additionally, Congress carved out an exception for the offensive use of bottom line statistics in the ‘91 Act for when "the complaining party can demonstrate that the elements of a respondent’s decision making process are not capable of separation for analysis." A decisionmaking process is so constituted when it "includes particular, functionally integrated practices which are components of the same criterion, standard, method of administration, or test."

The decisionmaking process Boalt instituted for its 1997 admissions cycle conforms to the models the Court and Congress have established as permitting bottom-line analysis. Like the one that the dissenters in Teal said would permit a bottom line defense, Boalt’s was a multicomponent process that included a scored test not used as a pass/fail barrier. Also, because of the unprincipled discretion granted to application readers, one could reasonably conclude that, with the possible exception of the administrative admits, the elements of the process are not capable of separation; in other words, it may be impossible to tell which criterion or criteria formed the basis for an individual decision to admit or deny. And all of the elements were a functionally integrated part of a single method of administration that attempted to determine which applicants were "best qualified." Finally, like the process at issue in Graffam v. Scott Paper Co., in which the court held that plaintiffs could use bottom line statistics offensively, Boalt’s admissions process combined objective and subjective criteria to form a single subjective process in which multiple criteria interacted to lead readers to their decisions, thereby causing the disparate impact. For those reasons, the admissions process is subject to a bottom-line attack.

a. Z Test Results.

The following are the Z-test results for Boalt’s 1997 admission process at the bottom line. A Z score with an absolute value of more than 1.96 is enough to conclude, at the 95% confidence level, that a disparity did not result from chance. The Z scores for African-Americans (5.39) and Latinos (2.45) are enough to make out a prima facie case of disparate impact at the bottom line for those groups because they show that there is a less than 5% chance that the shortfall in offers to those groups is random.

 

Group             Applied  Accepted  % Protected Class in Pool (PC)  % Others in Pool (NPC)  Expected Selections  Z
White/Other       3072     675       73.65                           26.35                   633                  3.26
Asian             499      122       6.14                            93.86                   103                  1.99
African-American  256      15        7.58                            92.42                   53                   5.39
Latino            316      46        11.96                           88.04                   65                   2.45
Native American   28       2         0.67                            99.33                   6                    1.67
Total             4171     860
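
The table does not state the formula behind its Z scores. The following is a minimal sketch, assuming that the 860 offers are treated as independent draws and that each group's chance of receiving any one offer equals its share of the applicant pool; the function name and the round_expected option are illustrative and do not come from the paper or from OCR. With the expected count rounded to a whole number, as in the table, the sketch reproduces the reported Z scores to within roughly 0.01. The table reports magnitudes only; in the sketch a negative Z means fewer offers than expected. Computed from the Applied column, the pool shares are 11.96% for Asian applicants, 6.14% for African-American applicants, and 7.58% for Latino applicants. The reported Z scores and expected selections are consistent with those shares, which suggests that the percentage entries for those three rows are attached to the wrong rows in the table.

```python
import math

# Applied and accepted counts copied from the bottom-line table.
COUNTS = {
    "White/Other": (3072, 675),
    "Asian": (499, 122),
    "African-American": (256, 15),
    "Latino": (316, 46),
    "Native American": (28, 2),
}
TOTAL_APPLIED = 4171
TOTAL_ACCEPTED = 860


def z_score(applied, accepted, round_expected=False):
    """Signed Z score for one group's share of the offers.

    Assumption (not stated in the paper): each of the TOTAL_ACCEPTED offers
    falls on the group with probability equal to its share of the pool.
    """
    pool_share = applied / TOTAL_APPLIED
    expected = TOTAL_ACCEPTED * pool_share
    if round_expected:
        # The table reports whole-number expected selections.
        expected = round(expected)
    std_dev = math.sqrt(TOTAL_ACCEPTED * pool_share * (1 - pool_share))
    return (accepted - expected) / std_dev


for group, (applied, accepted) in COUNTS.items():
    share = 100 * applied / TOTAL_APPLIED
    z = z_score(applied, accepted, round_expected=True)
    print(f"{group}: pool share {share:.2f}%, Z = {z:+.2f}")
```

Without rounding the expected counts, the Z scores for African-Americans and Latinos shift slightly but remain beyond 1.96 in magnitude, so the conclusion for those groups does not depend on the rounding.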

       

b. Chi Square Results.

The results of a chi square test of the bottom line of Boalt’s 1997 admission process also show a likelihood of discrimination sufficient to make out a prima facie case. Using a two-by-two chi square test, one may conclude that a decisionmaking process resulted in impermissible discrimination at the 95% confidence level if χ² is greater than 3.841 and at the 99% confidence level if χ² is greater than 6.635. For African-American applicants, the result was 35.34; for Latinos, it was 7.28; for Native Americans, it was 2.35. The results for African-Americans and Latinos exceed even the 99% threshold, while the result for Native Americans falls below the 95% threshold. Thus, a prima facie case of disparate impact against African-Americans and Latinos exists at the bottom line.

 

                   Admit  Reject
African-Americans  15     241
Others             845    3070
χ² = 35.34

                   Admit  Reject
Latinos            46     270
Others             814    3041
χ² = 7.28

                   Admit  Reject
Native Americans   2      26
Others             858    3285
χ² = 2.35
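
The text does not say which form of the chi square statistic it uses. The following is a minimal sketch, assuming a two-by-two test with Yates's continuity correction; that choice is an assumption rather than something the paper states, made because it reproduces the three reported values from the admit and reject counts. The function name is illustrative, and the critical values are the ones quoted in the text.

```python
def yates_chi_square(group_admit, group_reject, other_admit, other_reject):
    """Chi square statistic for a 2x2 table with Yates's continuity correction.

    Assumption: the continuity correction is inferred from the reported
    results; the paper itself does not name it.
    """
    a, b, c, d = group_admit, group_reject, other_admit, other_reject
    n = a + b + c + d
    # Shrink the cross-product difference by n/2, never below zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    row_and_column_totals = (a + b) * (c + d) * (a + c) * (b + d)
    return n * corrected ** 2 / row_and_column_totals


# Critical values quoted in the text (one degree of freedom).
CRITICAL_95 = 3.841
CRITICAL_99 = 6.635

TABLES = {
    "African-Americans": (15, 241, 845, 3070),
    "Latinos": (46, 270, 814, 3041),
    "Native Americans": (2, 26, 858, 3285),
}

for group, cells in TABLES.items():
    stat = yates_chi_square(*cells)
    print(f"{group}: chi square = {stat:.2f}, "
          f"exceeds 95% threshold: {stat > CRITICAL_95}, "
          f"exceeds 99% threshold: {stat > CRITICAL_99}")
```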

Assuming that a court agreed that it is appropriate to examine the bottom line results of Boalt’s admission process, rather than a particular selection criterion, either of the foregoing would be more than sufficient to make out a prima facie case against that process. However, it is possible to level an attack against Boalt’s use of the LSAT as the particular impact-causing criterion without performing the same kind of statistical analysis of the effects of its use on the racial composition of the group of applicants it admitted in 1997.

2. The LSAT under OCR Guidelines.

If this were a perfect world, or if, as the economists like to say, people operated with perfect information, this section would begin with a statistical analysis of the LSAT scores of those who applied to Boalt in 1997. But this is not a perfect world (if it were, I would not be writing this paper), and information flow is not perfect. Thus, this section does not contain such an analysis, mostly because it is simply beyond the statistical skills I could master in the course of preparing this paper, but also because, in the wake of Proposition 209 and SP-1, I am not sure that Boalt has done such a racialized categorization of LSAT scores, nor am I sure that it should be preparing such records under those laws. This part of the paper nevertheless contains several ways to make out a prima facie case against Boalt on the basis of its use of the LSAT. According to the Office for Civil Rights’ draft guidelines on Fairness in Testing, OCR may or should conclude that there was discrimination in violation of Title VI if use of a test caused or contributed to a bottom-line disparate impact. And if the test so contributed to a disparate impact and was either the sole or principal criterion used in decisionmaking, or the decisionmaking process used the test for purposes for which it was not designed, then the use constituted a per se violation of Title VI. Boalt’s use of the LSAT in 1997 falls within the circumstances under which OCR may find a prima facie case of impermissible discrimination.

The OCR guidelines described in the following section are subject to a number of attacks, most deriving from the deviation of those guidelines from the format for making out a disparate impact case that the Court has constructed for Title VII. Education is, however, different from employment, and the Court has not given as much detailed attention to the nature of a disparate impact case under Title VI in the educational setting. When and if it devotes such attention to the matter, it may find that the differences between education and employment justify some departures from the Title VII format. Thus, I use the OCR guidelines not because they conclusively make out either a prima facie case of disparate impact or a per se violation of the law, but because they provide a starting point for thinking about how disparate impact law in the educational setting should be constructed.

a. OCR’s Standards for Finding a Prima Facie Case of Disparate Impact.

Focusing on the LSAT as the particular criterion that caused the disparate impact, as the New Directions authors suggest is appropriate, one could make out a prima facie case of disparate impact against Boalt’s use of the test in its 1997 admissions process under the OCR’s guidelines. Those guidelines provide that when a test is the sole or principal criterion in making educational decisions, investigators should determine whether the test caused or contributed to the statistical disparity in the denial of educational benefits or opportunities. The first step is to do a bottom-line analysis using a "Z test," as I did above. If that analysis indicates disparate impact at the bottom line, an investigator should examine the average (mean) scores for all relevant populations. If the mean scores are significantly different for the groups, and the difference is "in the same direction" as the disparity at the bottom line, the test is contributing to the disparity. The contribution of a test to a bottom-line disparity is enough to make out a prima facie case against the use of the test.

The bottom-line results of Boalt’s 1997 admissions process are provided above, so the first question is whether the average scores on the LSAT of the relevant populations are in the same direction as the bottom-line disparity. In 1990-91, the mean LSAT score (on the then-used 40-point scale) was 30.27 for Native Americans, 33.22 for Asian Americans, 25 for African-Americans, 30.13 for Hispanics, 29.70 for Mexican-Americans, and 34.35 for Whites. Thus, the average scores on the relevant test differ for each racial group in the same direction as the bottom-line results of the admission process, meaning that under OCR guidelines, Boalt’s use of the test violated Title VI.
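
The direction comparison can be sketched as follows. This is a rough illustration only, under several assumptions that the paper does not make: the White mean score stands in for the "White/Other" admissions category, the Hispanic mean stands in for the "Latino" category, and the White/Other group serves as the reference for both scores and admission rates. The sketch checks direction only. It cannot test whether the score differences are statistically significant, because the text gives no standard deviations or sample sizes for the 1990-91 means.

```python
# Mean 1990-91 LSAT scores (40-point scale) as quoted in the text.
MEAN_LSAT = {
    "White": 34.35,
    "African-American": 25.0,
    "Hispanic": 30.13,
    "Native American": 30.27,
}

# (applied, accepted) counts from the bottom-line table.
ADMISSIONS = {
    "White/Other": (3072, 675),
    "African-American": (256, 15),
    "Latino": (316, 46),
    "Native American": (28, 2),
}

# Hypothetical pairing of admissions categories with score categories.
SCORE_LABEL = {
    "White/Other": "White",
    "African-American": "African-American",
    "Latino": "Hispanic",
    "Native American": "Native American",
}

REFERENCE = "White/Other"  # assumed comparison group


def admit_rate(group):
    """Share of a group's applicants who received an offer."""
    applied, accepted = ADMISSIONS[group]
    return accepted / applied


def same_direction(group):
    """True when the score gap and the admit-rate gap have the same sign."""
    score_gap = MEAN_LSAT[SCORE_LABEL[group]] - MEAN_LSAT[SCORE_LABEL[REFERENCE]]
    rate_gap = admit_rate(group) - admit_rate(REFERENCE)
    return score_gap * rate_gap > 0


for group in ADMISSIONS:
    if group != REFERENCE:
        print(f"{group}: admit rate {admit_rate(group):.1%}, "
              f"same direction as score gap: {same_direction(group)}")
```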

b. Per Se Violations under OCR Guidelines.

A prima facie case under OCR standards is not the end of the matter, however. OCR also instructs that if the use of a test caused or contributed to a disparate impact, OCR should conclude that there was discrimination in violation of Title VI either if "1) the test is being used as the sole or principal criterion for an educational decision and the test was clearly not designed to be so used[, or if] 2) the test is clearly not being used for the purposes for which it was designed . . . ." Thus, OCR defines such uses of a test contributing to a disparate impact as per se violations of Title VI; users of such tests do not receive the opportunity to show the educational necessity of that use of the test, and those who challenge such use do not have to identify less discriminatory alternatives. So, the next question is whether Boalt’s use of the LSAT also constituted a per se violation of Title VI.

First, use of the LSAT is a per se violation if Boalt used it as the sole or principal criterion in its decisionmaking process, and the LSAT was not designed to be so used. This particular line of inquiry has proceeded on the assumption that the LSAT was the principal or particular criterion that caused the impact, since, as the New Directions authors suggest, the unprincipled delegation of discretion in decisionmaking that Boalt gave its admissions decisionmakers pushed them toward almost exclusive reliance on the LSAT. And the Law School Admission Council explicitly informs law schools that they should not place principal reliance on the LSAT in determining the qualifications of applicants. Therefore, if consideration of the LSAT did overwhelm all other factors in the admission process, Boalt committed a per se violation of Title VI.

Second, Boalt committed a per se violation if its use of the LSAT, a test that contributed to a disparate impact, was for purposes for which it was not designed. As just discussed, using the LSAT as the principal criterion for decisionmaking would itself be enough to constitute a per se violation under that OCR standard. Boalt also used the LSAT improperly by employing cut-off scores in assigning applicants to one of the five strata discussed in Part IV.A.1; those assignments in turn affected whether an applicant would be admitted. In part because the LSAT has a standard error of measurement (SEM), the LSAC warns law schools against using cut-off scores and making decisions based on fine distinctions in scores. Boalt nevertheless did both in assigning applicants to the five admission strata, since applicants with scores below a certain point would not be placed in certain strata, and the stratum in which the Admissions Office placed applicants influenced outcomes. Most placed in the A stratum were administratively admitted, many in the B stratum were administratively admitted, and the stratum in which an applicant was placed was likely to influence the decisions of application readers. Yet the Admissions Office assigned applicants to the strata on the basis of differences in LSAT scores of as little as one point. That was a misuse of the LSAT.

Additionally, Boalt misused the LSAT in the way it used the test to adjust applicants’ UGPAs. Boalt determined the quality of an undergraduate institution by averaging the LSAT percentile for all test-takers from the school and the LSAT percentile ranking of students at the school who have a UGPA of at least 3.6. Boalt then ranked schools on the basis of the resulting score and adjusted applicants’ grades based on the undergraduate institution they attended. That simply is not a use of the LSAT that LSAC contemplated or explicitly sanctioned as proper. And Boalt says it makes the adjustment to account for "both the quality of the student body and the grading patterns at the school," but that is not a use of the LSAT for which it has been validated. Thus, Boalt misused the LSAT in three ways, making possible the conclusion that Boalt’s 1997 admission process constituted a per se violation of Title VI under OCR’s guidelines; that conclusion precludes an educational necessity defense, which in turn makes unnecessary any demonstration of the existence of alternative practices that have less disparate impact but nevertheless serve the school’s legitimate interests.

C. Educational Necessity and Validity.

Assuming the establishment of a prima facie case of impermissible discriminatory effects by Boalt’s 1997 admission process, the next step would be to require Boalt to justify that process as a whole, its disparate-impact-causing components, or both. Under the requirements of Title VII, as federal courts have adapted them, Boalt would have to prove the "educational necessity" and "law school relatedness" of the process or impact-causing criterion. As to the process as a whole, Boalt would have to prove that the characteristics it instructed the admission decisionmakers to seek are related to some set of skills or characteristics necessary to performance as a law student, and that the process is a valid predictor of performance as a law student. As to the LSAT, Boalt would have to do more than simply show that the test is a valid predictor of first-year grades; it would have to show that first-year grades are the result of professors using the same set of criteria to judge performance, and that those criteria are actually necessary to perform well as a law student. Additionally, if one looks at law school as a training program, Boalt would have to do more than show that the LSAT has a statistically significant correlation with first-year grades; it would have to show that those grades are, in turn, valid predictors of graduation from law school and of performance as an attorney. And more generally, the LSAT is subject to criticism both for the way in which it is constructed and for the limited skill set it tests.

1. The Admission Process and the LSAT.

Consistent with the rest of the paper, this part examines the defensibility of both the admission process as a whole and the LSAT individually. Boalt has not validated its overall admission process, and that process is not sufficiently educationally necessary or related to the requirements of legal education. The LSAT, though valid for predicting first-year grades, is not necessarily educationally necessary or related to the requirements of legal education.

Those federal courts that have addressed educational necessity have required defendants to prove that their practices have a manifest relationship to their purposes. Thus, in Georgia State Conference of Branches of NAACP v. State of Georgia, the Eleventh Circuit found that Georgia’s grouping of students by ability was educationally necessary because the practice had a manifest relationship to classroom education in that the technique had pedagogical acceptance and was meeting students’ educational needs. Similarly, though with a different outcome, a New York court held in Sharif v. New York State Educ. Dept. that use of the SAT in awarding merit scholarships to high school students was not educationally necessary, as the SAT purportedly measured ability to succeed in college and, thus, did not have a manifest relationship to measuring past success.

Neither court, however, seems to have given the idea of educational necessity detailed consideration, and from their use of "relationship" and "relatedness," they seem to have considered relatedness and necessity as synonymous. In developing its "manifest relationship" standard, the Eleventh Circuit cursorily examined the language in Griggs and Dothard to pull out the manifest relationship language, so it provides a good starting point. Recall, though, from Part III.B.2 that the Court in Griggs found that the tests and the high school diploma requirements at issue there did not bear a manifest relationship to the jobs at issue because the employer had adopted the requirements without consideration of their relationship to job performance (and thus with no justification for believing that they were so related), and because employees who had neither received a high school diploma nor been able to succeed on the tests had nevertheless performed their jobs adequately and been promoted. Additionally, in Albemarle, the Court held that though the tests at issue there had been validated, and the validation study had shown that test scores correlated positively with good supervisorial evaluations, it was unclear what criteria supervisors were using, whether they were all using the same criteria, and whether those criteria were actually relevant to job performance. And in Dothard, though the Court was willing to accept defendant’s assertion that strength was necessary to the job of prison guard, it required some proof that height and weight, the requirements at issue, were actually related to strength. So, defendants must identify just what quality or qualities are supposedly necessary to perform satisfactorily in the position in question and show that the criteria they examine in making decisions are related to those qualities.

a. What Is Educationally Necessary and School Related.

Unfortunately, academics and courts have devoted essentially no time to the meaning of educational necessity and academic program relatedness. I cannot, in this paper, provide full understandings of the terms, but the following may at least provide a starting point for a dialog about whether the selection criteria that educational institutions use are necessary or related to the programs in the way that the law requires. Recall from Part III.B.2 that business necessity and job relatedness mean more than that the reason or business may justify the practices; those terms may, however, mean less than that the practices must be absolutely necessary or related to the employment in question in an exactingly precise way.

To translate the terms that the Court used in Griggs, Albemarle, and Dothard into the educational setting, one might first want to take at least a couple of steps back and try to understand some aspects of the nature of education, since it is not the same as employment. If nothing else differentiates them, education is not a boss giving an employee a job; it is a student, in some sense, hiring teachers to provide a service. Conceding that, one might inquire into the objectives of the parties for seeking and providing that service. Just to get them out of the way, both have certain financial interests. Teachers want jobs, schools want tuition to pay teachers, administrators want successful graduates who can make large donations or enhance the prestige of the institution. Those who run the institutions also have less material objectives: they want to advance the scholarship in the fields taught, both by training future thinkers on the subjects and providing active sounding boards and fresh ideas for present academics; they want, in the law school setting, for instance, to produce lawyers able to advance the law through litigation, legislation and other public policy work; and, again specific to law school, they want to produce graduates able to serve all who need access to the various legal structures, meaning clients ranging from multinational corporations to recent immigrants to major urban centers who cannot speak English. The students and educators also share an objective--one that, sufficiently broadly defined, includes the students’ financial interests: they want the educational service to provide the students with further options and opportunities so that graduates can pursue those things that interest them.

Yet another objective for the educational service, and one that is also often articulated as a means of reaching the others, is the admission of students who the educators think will be most likely to fulfill those other objectives, take advantage of the options and opportunities that the service can provide, and contribute to the ability of the other students to do all of those things. Thus, particularly at a school like Boalt, which is part of a much larger and more diverse educational/academic institution, one goal is to admit students who the educators believe will help to construct an overall learning, teaching, researching, writing, and (in law school) lawyering (in all its many forms) environment conducive to the fulfillment of all of those other goals.

Identification of those objectives of and for the parties to the educational process leads to a number of further inquiries. One is, quite obviously, what kind of an educational/academic environment is conducive to the fulfillment of the other objectives. Another is what kinds of qualities in students are likely to contribute to the fulfillment of all of those goals. And a third is what kinds of criteria are necessary and related to the qualities necessary in students to fulfill the objectives of the individual programs in a school and the academic/educational environment in general. In other words, how will schools identify those students most likely to help each other and the school succeed in their multiple missions?

I for one am dubious that most, if any, colleges and universities have ever, let alone recently, thought through those questions. And apparently, the world of academics who write on academics has not done so either, at least not in any systematic and conclusive way (not that such individuals could reach a consensus on the answers), or schools such as Boalt, which for whatever reasons change their admissions policies, would have a guide to which they could cite, and Boalt certainly cited to no such thing when it reformulated its policies. Rather, I suspect, schools continue to follow the same formulae and examine the same credentials that they have for decades, without ever pausing to (re)consider whether they are looking for the right things in applicants and whether those credentials actually help them discern those things. If the terms "educational necessity" and "related to the academic program in question" are to have any meaning, they must require more of a school than that the standardized tests it uses have been validated against academic performance and the other credentials it examines are the ones they or other schools have always used. And if they are going to continue to use those same credentials/criteria, schools should have to invest more in finding out if they are looking for the correct things in applicants according to the goals of an educational program, if the teachers are using the same standards to make the assessments of academic performance, if those assessments are actually telling them anything about how well they and the students are succeeding in accomplishing their mutual missions, and if the criteria are measuring things that are actually necessary to perform in the program. 
Put differently, those who are making the decisions are simply assuming that they are doing it right, and educational necessity and academic program relatedness should require more than assumptions--they should require some convincing justifications for why schools are making decisions in the way they are, especially if those decisions are having racially disparate impacts.

Brought back to where this began, in the language of Griggs, sure, the SAT, the LSAT, and the other things at which schools look may bear some demonstrable relationship to success in school, but those applicants whom schools admit on the basis of those criteria are not the only qualified people. What if those denied admission could perform satisfactorily in the schools, benefit from the programs, and participate in the successful attainment of the educational objectives? If they can, what does that say about the process of admission? In the language of Albemarle, sure, the MCAT and the GRE may be validated on the basis of a correlation between scores on them and grades in medical school and graduate programs, but how rigorously are schools ascertaining the bases on which professors are basing those grades to assure that all professors are using the same standards, and that the standards used are accurately assessing all of the qualities that the schools claimed they were using the selection criteria to discern? If they are not rigorously assuring those things, how can they justifiably claim that the criteria are necessary to the education and related to the program? And in the language of Dothard, sure, schools might well have identified all of the qualities and combinations of them that would lead to one of the many possible types of successful performance in the program, but can the schools demonstrate a correlation between those qualities and all of the criteria they are using to make admission decisions? Perhaps to demonstrate a sufficient relationship between the criteria used and the qualities sought, schools should not have to validate formally each criterion or their processes as wholes, but especially when their admission processes and the criteria they use cause disparate impacts, they should have to do more to justify their use, investigate them more fully, than they have as yet done.

b. The Educational Necessity and Law School Relatedness of Boalt’s Criteria.

Boalt sought in 1997 "to enroll students whose quality of mind and character suggest that they have the capacity to make a contribution to the learning environment of the Law School and to distinguish themselves in serving the needs of the public through the practice of law, the formulation of public policy, legal scholarship, and other law-related activities." As methods of evaluating those qualities, the school’s faculty gave substantial consideration both to the LSAT and UGPA and to "letters of recommendation, graduate training, special academic distinctions or honors, difficulty of the academic program successfully completed, work experience, and significant achievement in nonacademic activities or public service." It also considered disadvantages that adversely affected past performances. And it sought to enroll a student body with a broad set of interests, backgrounds, life experiences, and perspectives because it believed such diversity would best serve society and the educational and professional development of the students.

From the faculty’s Admission Policy, one could conclude it believes that it should admit applicants who have a quality of mind and character that suggest they will contribute to the school and the profession. One could also conclude that it believes that the LSAT, UGPA, and all of the other credentials are related to those qualifications. Without disputing that those qualifications might be necessary to being a successful law student, other qualities might be equally important, especially when one considers how diverse the legal profession is and how different legal education might be for two different students--one may enroll only in "bar classes," whereas another might pursue only a clinical program, when available. Applying the Griggs and Dothard considerations, it is not at all clear that the faculty had any justification for believing that those qualities were necessary, or that the credentials considered were manifestly related to those qualifications; in other words, even if the overall decisionmaking process could be validated, Boalt did not even attempt such validation. Indeed, one might wonder with what someone attempting to validate the process as a whole would seek to correlate the process. Correlating that quality of mind and character with first-year grades might be statistically possible but practically inapt, as one would then have to wonder whether first-year grades can actually indicate anything significant about one’s ability to contribute to the school or the profession. Thus, without arguing that Boalt was looking for the wrong qualities in applicants, it is unclear that the process as a whole that the school used was educationally necessary.

That same problem applies to the school’s use of the LSAT. As mentioned at various points in this paper, the LSAT has been validated in that an applicant’s score on the test is positively correlated with first-year grades. An LSAC study using 1990 data found that the median coefficient of correlation between LSAT scores and first-year grade point average was .41, which is statistically significant.

[L]aw schools do think first-year grades are relevant, because we use grades as our representations to the students and the profession of crucial qualities--although not all of them--relevant in evaluating professionals. If we are testing the right things in our grading and if the LSAT is testing the skills--reading, analysis, verbal reasoning--that help determine the outcomes of what we are testing, then, again, using the LSAT is legitimate.

The Ad Hoc Task Force does not indicate just what qualities first-year grades are supposed to represent to the students and the professionals, but whatever those qualities are, first-year grades, even according to the Task Force, are not themselves those qualities, nor are they the qualities that the faculty indicated it would look for in applicants: a quality of mind and character that would make applicants likely to contribute to the school and the profession. Thus, the mere fact that the LSAT is valid for predicting first-year grades does not mean that it is educationally necessary or related to law school ability, since nothing demonstrably or necessarily links it to the qualities the faculty sought in applicants. Indeed, one could conclude from the Task Force’s report that first-year grades, and thus the LSAT, are a representation of entirely other qualities. Applying the Court’s considerations in Albemarle, to be able to use first-year grades as such a representation of those qualities supposedly sought, the school would have to prove that all professors were basing grades on the demonstrated ability of students to contribute to the school and the profession, and that they were all using the same criteria in making that assessment. Absent such a showing, despite its validity, the LSAT is not educationally necessary or related to law school ability for the purposes of Boalt’s 1997 admission process.

2. Training Program Validity.

Even if, because it is a valid predictor of first-year grades, the LSAT were educationally necessary and related to law school ability for Boalt’s 1997 admission process under the general disparate impact analysis adapted from the Title VII context, it is not valid if one views law school as a training program. Courts have usually only applied the standards of training program validity when plaintiffs challenge selection criteria used for an educational opportunity that an employer offers as a means of selecting and preparing future employees; courts are willing to scrutinize such criteria more closely, in part, because if the employer were allowed to validate those criteria by comparing success on the entry requirements and success in the program, the employer could always justify use of the criteria by employing them at both the pre-training and training stages. Groves seems to be the only instance of a court applying it to an independent educational institution’s selection criteria. The court did not explain why it chose to require the stronger showing of the defendant, but it does provide precedent for such an approach to Boalt’s admissions.

However, there are reasons why it might have been appropriate in Groves but not here. The former involved the use of a minimum score on the ACT exam as a requirement for admission to undergraduate teacher training programs at state colleges and universities. One could argue that since the Alabama State Board of Education imposed the requirement and would be granting teacher certifications, and since most graduates who went on to teach would, in a sense, be state employees, the use of training program validity there was appropriate, but that none of the same circumstances exist in the instant matter in a way that would justify requiring Boalt to meet such higher validity standards. But in the matter of admissions, the Boalt faculty and administration are an extension of the State of California, which also decides who will be admitted to the State Bar and, thus, able to practice law. And considering the extent of state regulation of the legal profession and the fact that all lawyers are officers of the court, a number of reasons exist not to reject out of hand the suggestion that something more than the basic validation standards should be required of Boalt and schools like it. Thus, the more appropriate question might be why Boalt should not have to prove that its 1997 admission process meets the higher requirements of training program validity.

As discussed in Part III.B.4, under Title VII, to use a positive correlation between performance on an entrance exam and performance in a training program to validate the test itself, an employer must be able to show one of the following: that the employer uses the test only to screen out those applicants without the minimum skills necessary to complete and benefit from the training; that the content of the test, the program, and the job are all the same; or that performance in the training program predicts performance on the job (if the employer attempts to validate the test by comparing scores on the test to grades in the program).

The last showing is the one relevant to assessing Boalt’s admission process under the standards the Ninth Circuit established in Craig v. County of Los Angeles, which held that when a training program entrance exam is valid in that scores on it are positively correlated with performance in the program, the employer must additionally prove that performance in the program is positively correlated with job performance. The first showing is not relevant because, presumably, Boalt does not use the LSAT to screen out those who could not complete or benefit from the training; it uses it to select those most likely to contribute to the school and the profession, so some of those not admitted might still possess the quality of mind and character that would allow them to make those contributions. The second showing is not relevant to this inquiry because probably no test could adequately examine the skills necessary to law school and lawyering, though an attempt to develop such a test would be interesting. Thus, Boalt must make the showing required by Craig.

Of course, as the Ad Hoc Task Force wrote, law schools would like to believe that first-year grades are relevant, but that does not mean that they are valid predictors of ability to perform as a lawyer, or as any of the other types of professionals that law school graduates become. Even if they were, Boalt has done no study to validate them that way. And a survey of the literature on the subject shows that first-year grades are a poor predictor of later success in school or the profession. Thus, use of the LSAT was not valid for Boalt’s 1997 admission process, if one examines law school as a training program.

3. General Problems with the LSAT.

The generalized challenges to the use of the LSAT and other standardized tests are many, and some of them are quite trenchant. Three of them seem worth discussing in the context of a Title VI challenge, even though they would not themselves satisfy any of the requirements the Court, Congress, and OCR have laid out for a disparate impact case. The first is that, though the LSAT is a valid predictor of first-year grades, it is most highly correlated with family income. As a friend has pointed out, that should hardly come as a surprise, implying that that fact alone is not sufficient to convince people that standardized tests are not valid (in the nonstatistical sense) admission tools; however, to the extent that the LSAT is a coachable test and not a measure of innate qualities, that also means that those applicants whose families can afford to provide them with both special training and access to better educational institutions throughout their lives are more likely to do well on the test and, thus, gain admission to law school. The second is that, despite the arguments of affirmative action’s detractors that those who do best on putatively objective measures (LSAT and UGPA) are "best qualified" for opportunities such as law school, those who do poorly on the LSAT may nevertheless be more than minimally qualified to contribute to their schools and the legal profession. And a disproportionate number of those more than minimally qualified students that the LSAT screens out are racial minorities. The third is that the LSAT is simply not as objective as many would like to believe. The LSAT suffers from bias not just because the cognitive bias of its makers, of which affirmative action’s detractors are so dismissive, seeps into the questions, but because of the common method of constructing such tests.

a. Favoring the Affluent.

Anyone who has paid any attention to the debate about standardized tests has by now heard that one’s score on such tests, including the LSAT, is more likely to predict family income than any other attribute of the testtaker. Although that fact might not be surprising or even troubling by itself, considering the benefits the affluent enjoy in our society generally, it becomes more disturbing when one takes into account the coachability of the LSAT and the cost of such special training. Some studies have found that simply taking the LSAT more than once will increase one’s score, which implies that the test is measuring something other than aptitude. But studies have also shown that coaching courses can increase one’s score on the LSAT, further diminishing any claim that the test is measuring aptitude or merit. As a result, an entire industry has grown up around "test prep" courses, as applicants seek any edge they can find.

However, not all people seeking entry to law school are equally able to make use of such courses. Many are prohibitively expensive, and perhaps as a result, applicants of color are less likely to enroll in courses that could help them as much as, if not more than, their White colleagues. Thus, the larger points that often get missed when one hears that LSAT scores are more highly correlated with family income than any other attribute are that those who do best on the test are more likely and more able to take advantage of special training for the test, and that since coaching affects score, the test is, to a perhaps unknowable extent, measuring neither aptitude nor merit but one’s ability to master the skill of taking the test itself. The test, then, is hardly an appropriate measure to receive the kind of deference and weight that law schools usually give it in their admission decisions.

b. Poorly Assessing Qualification.

The argument that standardized tests are necessary and appropriate to admission decisions because they objectively measure qualification to perform in school has come under another form of challenge in a recent article by Linda Wightman. Examining data from law school applicants in 1990-91 and from Fall 1991 first-year students, she ran sophisticated statistical analyses of the effects of removing affirmative action from the admission process and relying solely on an LSAT/UGPA index score. She found that those students whom the statistical model predicted would not have been admitted to any law school (not just the law schools to which they applied) were just as likely to graduate and to pass the bar exam as those students whom the model predicted would have been admitted to law school; in other words, the LSAT and UGPA are not valid predictors of graduation or bar passage for any racial group. One can conclude two things from those findings. First, schools using affirmative action policies are not admitting unqualified students of color. Second, the heavy reliance that schools place on the LSAT and UGPA, even if they also have an affirmative action policy, is probably screening out students, especially students of color, who are perfectly qualified to do the work of law school and of the legal profession; indeed, one could also infer that a school like Boalt that now has no affirmative action policy and may be relying almost solely on the numbers is particularly screening out qualified students of color. As a Boalt professor and Chair of the Admission Committee put the matter in nonstatistical terms, "It’s not like these are the only 270 qualified people."

Consider, then, the full force of all of the foregoing. The LSAT is a valid predictor of first-year grades, but those grades have not been validated as predictors of later success in law school, graduation, or ability to practice law. It is a valid predictor of first-year grades, but those grades are not based on examinations that evaluate the full set of skills necessary to a diverse law practice. It is a valid predictor of first-year grades, but those grades are not necessarily representative of the qualities Boalt, or any other school, proclaims to seek in applicants, nor has Boalt demonstrated a positive correlation between grades and those qualities. It is a valid predictor of first-year grades, but it cannot predict graduation or bar passage, so it is not a full measure of qualification to attend law school or practice law. It is a valid predictor of first-year grades, but it disproportionately screens out applicants of color. Considering, then, that all the LSAT is good for is predicting first-year grades, though schools and most of society might be reluctant to abandon its use, one must at least question whether its use, let alone the heavy reliance that law schools, and particularly Boalt in 1997, place on it, can be justified on even a practical level. If it cannot tell someone reading a law school application anything meaningful about an applicant’s qualifications, then it is practically useless to that someone, and considering its discriminatory effects, its use is wholly unjust.

c. Bias Caused by the Method of Test Construction.

Perhaps one of the most frustrating elements of the debate over standardized tests is the failure or unwillingness of their supporters to consider the possibility that the tests may be biased. Indeed, listening to them, I begin to think that I am being transported to the Platonic realm of pure forms (i.e. Princeton, New Jersey) where philosopher-kings pluck from the heavens ideal questions able perfectly to determine who is most, next most, minimally, and not at all qualified to be a law student. Those supporters seem to forget that people design the LSAT, and they seem as (willfully?) uninformed as most about how such tests are constructed and how, wholly unrelated to the cognitive bias of those who construct the test, the process of construction itself might result in a biased test.

Literature on the LSAT indicates that the testmakers include wrong answers likely to attract certain testtakers, but for the test to differentiate testtakers consistently, those whom the wrong answers distract must not be those who are doing well on the rest of the test. If the testtaking population consists of two or more diverse cultures, the process of constructing a test that consistently differentiates testtakers can introduce bias.

Consider the impact of consistency specifications on two hypothetical items in a pretest. The first contains irrelevant difficulties for the majority group because the correct response to the item assumes familiarity with the culture of the minority group. Thus, candidates from the majority group will correctly answer the item if they are familiar with the minority culture but not necessarily if they possess the ability the test item was designed to identify. The second item contains material familiar to the majority culture and therefore irrelevant bias for the minority group. Similar inconsistent patterns will result for members of the minority group.

The pretest process will eliminate those items found to have produced inconsistent scoring patterns. However, in this example only the first item is likely to be eliminated. The inconsistent response patterns among ninety percent of the candidates will be unacceptable. Yet the inconsistent response patterns among ten percent of the candidates on the second item may go undetected in the pretest and the question could remain on the final test. Thus, although both questions contain irrelevant material that made the question inappropriate for one of the groups taking the test, only one of the questions is likely to be eliminated from the test. This means that there is subtle, systematic bias in tests which are constructed according to consistency specifications.

White’s point is that standardized test questions contain cultural material irrelevant to the purpose of the test itself. He provides a rather lengthy list, with accompanying examples from actual test questions, of possible kinds of material derived from majority culture that might interfere with minority performance: insensitivity to minority traditions, different interpretations of purposely ambiguous wording, ignorance of minority community values, assumptions contrary to those of minority group members, reinforcement of prejudicial stereotypes about minority group members, tests of specific legal knowledge, preferences for urban testtakers, insistence on harsh results, anti-labor sentiment, disregard for bilingual concerns, unnecessary confusion with nonstandard (e.g. Black standard) English, ignorance of the history of minority cultures, and distrust of popular revolutions. Culturally-based material harms all testtakers, but only those questions that contain majority-culture material are likely to remain on the test because the testtaking population consists mostly of members of the majority culture, and making the results of the test consistent means eliminating those questions that most testtakers answer incorrectly. Thus, the very way in which LSAC constructs the LSAT incorporates cultural bias into the test that interferes with the performance of members of minority cultures. For that reason, the LSAT cannot be an appropriate measure of even the most basic skills that law schools search for in applicants, regardless of what the test is valid for predicting.

d. The Problem of Stereotype Threat.

Dr. Claude Steele and others have identified a phenomenon they call "stereotype threat," which they believe negatively affects the performance on standardized tests of groups subject to negative societal stereotypes. They posit that in situations in which members of such groups are aware that others are assessing their intellectual abilities, the awareness of the assessment and, thus, the risk that failure or poor performance will fulfill the stereotype, affects their performance in the situation. To test the theory, Steele set up four studies involving the taking of the SAT. The first two varied the stereotype vulnerability of African-American testtakers by varying whether or not their performance was ostensibly diagnostic of ability and, as a result, whether they were subject to a stereotype threat. Reflecting the pressure of their vulnerability, African-Americans underperformed in relation to White testtakers in the ability-diagnostic condition but not in the nondiagnostic condition. A third study validated that ability diagnosticity cognitively activated the racial stereotype in the Black participants, motivating them not to conform to it or be judged by it. Study Four showed that the mere implication of the salience of the stereotype could impair Blacks’ performance even when the test was not ability diagnostic. The implication of the findings is that the societally pervasive stereotype that people of color and women are intellectually inferior to White men itself affects the performance of members of those groups on tests such as the LSAT that are purported to measure intellectual ability. They consistently underperform because of the threat that they will underperform, thereby fulfilling the stereotype; if not for the generally accepted characterization of such tests as true measures of intellectual ability, and therefore their capacity to confirm the stereotype, female and minority testtakers would perform as well as their White male counterparts on the tests. That being true, one should have a little more difficulty accepting the LSAT as an objective measure of the abilities of female and minority testtakers.

D. Alternative Practices.

Ultimately, the final part of a disparate impact case, responding to a defendant’s showing of educational necessity and law school relatedness by demonstrating the existence of alternative practices that have less impact, is the subject of a paper in itself. However, this part of this paper identifies some processes that might satisfy that requirement of a disparate impact case (though it is important to remember that plaintiffs should always be ready to make that showing, but a court will not require them to do so if defendants cannot prove educational necessity and law school relatedness). First, Wightman provides statistical evidence that socioeconomic status, selectivity of undergraduate institution, and undergraduate major (three considerations often suggested as substitutes for consideration of race as part of an affirmative action program), when used independent of or without knowledge of race, will not result in a group of admitted students both racially diverse and highly qualified. Thus, she suggests that affirmative action itself may be the only alternative practice that has no disparate impact and produces a highly qualified student body. Second, the New Directions authors suggest the creation of a "character index." Though interesting, and perhaps ultimately useful, it has not been shown to produce less disparate impact than the process Boalt used in 1997, or to produce as highly qualified a student body as explicit consideration of race. Third, the Ad Hoc Task Force recommended changes that it believed would produce lesser impact but still result in a qualified student body. Fourth, Boalt might have considered only UGPA. Though undergraduate grades are not as highly correlated with first-year law school grades as the LSAT, they have also been shown by statistical models to have less discriminatory effect. Finally, the school might adjust an applicant’s LSAT scores according to his race, though that alternative may itself be impermissible.

1. Restoring Consideration of Race.

In her article, Wightman considered the ability of alternatives to consideration of race to produce a racially diverse and highly qualified student body, specifically socioeconomic status (SES), selectivity of undergraduate school, and undergraduate major. Her analysis showed that none of those factors "produced a highly qualified, ethnically diverse student body when considered in the admission process without simultaneous consideration of race." Put differently, the analysis "failed to provide evidence that any of the three factors, when used independent of or without knowledge of race, would result in an admitted student pool that mirrors the ethnic diversity achieved under current admission practice." The important implication of Wightman’s findings for the present analysis is that the one alternative practice that can produce a highly qualified student body with less racially discriminatory effects is the one that Boalt was forced to abandon: consideration of race as one factor in the admission process as part of an affirmative action program.

The obvious response to that possibility is that Proposition 209, as well as SP-1, forbids the state to "grant preferential treatment" to anyone on the basis of race. However, Section H of 209 states, "If any part or parts of [the CCRI] are found to be in conflict with federal law or the United States Constitution, the [Initiative] shall be implemented to the maximum extent that federal law and the United States Constitution permit." Similarly, Section 6 of SP-1 provides, "Nothing in Section 2 [prohibiting use of race as a criterion for admission] shall prohibit any action which is strictly necessary to establish or maintain eligibility for any federal or state program, where ineligibility would result in a loss of federal or state funds to the University." If consideration of race as part of an affirmative action program is the only way to create a student body that is sufficiently diverse and highly qualified to serve a law school’s legitimate interests and allow it to avoid a finding of disparate impact under Title VI, then one could argue that federal law indirectly requires a law school to use affirmative action, as well as requiring a school to use affirmative action to remain eligible to receive federal funds. And if federal law requires Boalt to use affirmative action, then 209 conflicts with federal law, and the state may not implement 209 to prohibit Boalt from resuming its consideration of race as one factor in its admission process.

2. The New Directions "Character Index."

Responding to Regents’ language in SP-1 and Boalt’s language in its admission policy, the authors of the New Directions report attempted to develop some directions for integrating considerations of "character," socioeconomic disadvantage, and diversity of experience into the admission process. The index they proposed would primarily evaluate applicants on the basis of numerical criteria--an equally weighted combination of academic and socioeconomic indicators--that they would supplement with subjective information from the rest of an applicant’s file (letters of recommendation, personal statement, strength of undergraduate curriculum, extracurricular activities, etc.). The construction of the index would depend on applicants self-reporting about the individual factors that would go into the index. Applicants would receive a "plus" if they possess relative disadvantage or underrepresented life experience as compared to the overall applicant pool. The individual factors that would go into the index would be household wealth as a percentage of median applicant pool wealth; parental income as a percentage of median applicant pool income; level of parental education; primary language spoken at home; history of busing during K-12 schooling; disability; dependents; full-time work experience; number of hours worked during college; and neighborhood factors. Though the authors did not attempt to demonstrate that use of such a policy would increase racial diversity (i.e. have less racial impact), or that it could produce a qualified student body, the Character Index nevertheless provides an interesting starting point for the construction of a new way of administering law school admissions that responds to some of the concerns of opponents and proponents of affirmative action.

3. The Ad Hoc Task Force’s Recommendations.

Boalt itself seems already to have recognized that alternatives to its 1997 admission process existed; the report did not demonstrate whether those alternatives would result in less disparate impact, but it at least indicates that other methods could presumably satisfy the school’s interest in admitting a qualified student body. The Task Force recommended more outreach and recruiting prior to and after admission, including the appointment of an Assistant Director of Admissions for Student Recruitment.

With regard to actual admission processes, the group recommended cessation of the practices of adjusting UGPA and of identifying files as in one of the five admission strata. Additionally, it suggested that application readers consider score bands, which LSAC began reporting in 1997 (but after Boalt made its admission decisions), instead of the single numerical score an applicant received on the LSAT, and that the bands instead of the score be used in computing an applicant’s Index. It also agreed that Admission Committee members should read more files and attend workshops to help them read files in uniform ways that give full consideration to non-numerical indicators and contributions to diversity. To avoid overreliance on the LSAT, the Task Force recommended that the school experiment with submitting applications without the applicant’s LSAT score (band) included and identify for the Committee applicants who outperform their test scores. It also asked for more validity studies of the LSAT and greater cooperation with alumni to identify the nonacademic qualifications for law practice.

The Task Force also recommended changing the faculty policy to increase consideration of the "whole person" and to assure that all indicators are accorded appropriate weight. The policy should encourage applicants to discuss linguistic, social, and education barriers encountered and overcome. The policy should consider the relevance of disadvantage beyond using it as a way to explain weak academic performance in the past. And Boalt should request and consider information about an applicant’s SES.

Again, what effects those changes in policy might have remains unclear, but they are at least a recognition from the school itself that other alternative practices were available to it in 1997.

4. Consideration of UGPA but not LSAT.

Undergraduate GPA is not as strong a predictor of first-year law school grades as the LSAT alone or as the two combined. But the question for this part of the disparate impact analysis is whether an alternative practice exists that has less impact and can satisfy the decisionmaker’s interests. Since UGPA has a positive statistical correlation with first-year law school grades, and since it has practical significance as an indication of ability to perform in an academic setting, and since it would have less racially disparate impact if considered instead of, rather than in conjunction with, the LSAT, it is an alternative worth consideration.

5. Race Norming of the LSAT.

A final alternative practice would be for Boalt to adjust the LSAT scores of applicants to correct for its disparate racial results. The obvious and quick answer to that option is that it would be "preferential treatment" on the basis of race in violation of Proposition 209. However, Justice Powell, in Bakke, wrote that

Racial classifications in admissions conceivably could serve a fifth purpose, one which petitioner does not articulate: fair appraisal of each individual’s academic promise in light of some cultural bias in grading or testing procedures. To the extent that race and ethnic background were considered only to the extent of curing established inaccuracies in predicting academic performance, it might be argued that there is no "preference" at all.

In that case, there was no evidence that any of the factors Davis considered in its admission process suffered from bias, so the exception did not apply. But if one assumes based on the foregoing that the LSAT is culturally biased, one would have a decent argument that Boalt would not be granting anyone racially preferential treatment by adjusting applicants’ scores to remedy that bias. Thus, race norming of LSAT scores could be an effective and permissible alternative.

The less obvious and ultimately more difficult response to that alternative is that the ‘91 CRA prohibited employers from race norming employment related tests. Since a disparate impact case under Title VI is supposed to follow the model for Title VII, one could argue that it would be impermissible to recommend as an alternative admission practice something that Title VII now explicitly forbids an employer to do. It is an argument to which I have no response, except that perhaps that is one aspect in which a Title VI disparate impact case should differ from one under Title VII, since race norming LSAT results might be the simplest and most effective alternative practice.

Conclusion

Operating in the background of this paper is the question that could make most of it moot: whether the Supreme Court, if presented with an opportunity to decide whether disparate impact is available under Title VI or its regulations, such as a case against Boalt for its 1997 admissions process, would rule that the discrimination that Title VI prohibits in federally funded programs does not include facially neutral policies with discriminatory effects. Such a holding seems likely. Nevertheless, if the Court decided to let stare decisis govern, then under established principles and OCR guidelines on testing, Boalt did discriminate in 1997 in violation of Title VI. The bottom-line numbers, which are a permissible focus, show a prima facie case of disparate impact; under OCR guidelines, the bottom line combined with the school’s misuse of the LSAT is enough to attack the use of the test as a particular selection criterion. The admission process as a whole has not been validated or otherwise proven to be educationally necessary or law school related; though the LSAT is valid for predicting first-year grades, that alone is not enough under general disparate impact principles or the rules for training program validity. And the school itself admits that it could have used other practices that might have had less racially disparate impact.

[after-the-fact addendum to conclusion: Also operating in the background of this paper is the (seemingly slowing) process of eliminating affirmative action at public universities.[insert cite to texas and michigan] As one can see from the discussion of the rhetoric that informed the movement to eliminate affirmative action in UC admissions, advocates of elimination wanted to implement "neutral" and "objective" selection processes (like those that existed in the mythic pasts conceived of by such fevered "intellects") that would not "discriminate" on the basis of race against Whites. Boalt’s objective, then, was to construct such a process to replace its use of affirmative action. The result, as demonstrated by this paper, was effective discrimination against Latinos and African-Americans and in favor of ("preferring") Whites and Asians. For me, that implies at least one question that we all must answer before further eliminating affirmative action in public institutions: can we create selection processes that do not consider race that do not also effectively discriminate on the basis of race? I fear we cannot and should not even attempt to do so until we have some assurances that we can.]

Appendix A

Z Score Calculations

Formula
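
The formula itself is not reproduced in this copy. As a sketch only, and assuming the Z test referred to in Part IV.B.1.a is the pooled two-proportion test commonly used to compare selection rates (the text here does not confirm which variant was used), the statistic would take the form below. All symbols are illustrative: p_g and n_g are the admission rate and number of applicants for the group in question, p_c and n_c are the same quantities for the comparison group, and p-bar is the pooled admission rate.

```latex
% Assumed form: pooled two-proportion Z test (not confirmed by the source).
\[
  Z = \frac{p_g - p_c}{\sqrt{\bar{p}\,(1-\bar{p})\left(\dfrac{1}{n_g} + \dfrac{1}{n_c}\right)}},
  \qquad
  \bar{p} = \frac{n_g\,p_g + n_c\,p_c}{n_g + n_c}
\]
```

Under this form, an absolute value of Z greater than roughly two or three standard deviations is the range courts have generally treated as statistically significant.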

African-Americans

Latinos

Asians

Native Americans

Whites

Appendix B

Chi Square Calculations

Formula
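
The formula itself is not reproduced in this copy. As a sketch only, and assuming the chi square test referred to in Part IV.B.1.b is the standard Pearson test on a two-by-two table of group membership against admission outcome (an assumption; the text here does not say), the statistic would be computed as follows. The symbols are illustrative: O_ij is the observed count in each cell, E_ij is the count expected if admission were independent of group, R_i and C_j are the row and column totals, and N is the total number of applicants in the table.

```latex
% Assumed form: Pearson chi-square test of independence (not confirmed by the source).
\[
  \chi^2 = \sum_{i}\sum_{j} \frac{\left(O_{ij} - E_{ij}\right)^2}{E_{ij}},
  \qquad
  E_{ij} = \frac{R_i\,C_j}{N}
\]
```

For a two-by-two table the statistic has one degree of freedom.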

African-Americans

Latinos

Native Americans