In April 2015, the FBI made an admission that was nothing short of catastrophic for the field of forensic science. In an unprecedented display of repentance, the Bureau announced that, for years, the hair analysis testimony it had used to investigate criminal suspects was severely and hopelessly flawed. Among the areas determined to be flawed and in need of more research are: accuracy and error rates of forensic analyses, sources of potential bias and human error in interpretation by forensic experts, fingerprints, firearms examination, tool marks, bite marks, impressions (tires, footwear), bloodstain-pattern analysis, handwriting, hair, coatings (for example, paint), chemicals (including drugs), materials (including fibers), fluids, serology, and fire and explosive analysis.
The Innocence Project’s M. Chris Fabricant and legal scholar Tucker Carrington classify the kind of hair analysis the FBI performs as “magic,” and it is not hard to see why. By the Bureau’s own account, its hair analysis investigations were unscientific, and the evidence presented at trial unreliable. In more than 95 percent of cases, analysts overstated their conclusions in a way that favored prosecutors. The false testimony occurred in hundreds of trials, including thirty-two death penalty cases. Not only that, but the FBI also acknowledged it had “trained hundreds of state hair examiners in annual two-week training courses,” implying that countless state convictions had also been procured using consistently defective techniques.
But questions of forensic science’s reliability go well beyond hair analysis, and the FBI’s blunders aren’t the only reason to wonder how often fantasy passes for science in courtrooms. Recent years have seen a wave of scandal, particularly in drug testing laboratories. In 2013 a Massachusetts drug lab technician pled guilty to falsifying tests affecting up to 40,000 convictions. Before that, at least nine other states had produced lab scandals. The crime lab in Detroit was so riddled with malpractice that in 2008 the city shut it down. During a 2014 trial in Delaware, a state trooper on the witness stand opened an evidence envelope from the drug lab supposedly containing sixty-four blue OxyContin pills, only to find thirteen pink blood-pressure pills. That embarrassing mishap led to a full investigation of the lab, which found evidence completely unsecured and subject to frequent tampering.
There have also been scores of individual cases in which forensic science failures have led to wrongful convictions, the deficiencies usually unearthed by the Innocence Project and similar organizations. In North Carolina, Greg Taylor was incarcerated for nearly seventeen years thanks to an analyst who testified that the blood of a murder victim was in the bed of his truck. But later investigation failed to confirm that the substance was blood, or even of human origin. Forensic experts have used “jean pattern” analysis to testify that only a certain brand of blue jeans could leave its distinctive mark on a truck, as occurred in the trial of New Yorker Steven Barnes, who spent twenty years in prison for a rape and murder he didn’t commit.
Some wrongful convictions can never be righted—for example, that of Cameron Todd Willingham, who was convicted by a Texas court of intentionally setting the fire that killed his three young daughters. After the state executed Willingham, an investigative team at the Texas Commission on Forensic Science concluded that the arson science used to convict him was worthless, and independent fire experts condemned the investigation as a travesty. But those findings came too late to do Willingham any good.
The mounting horror stories, and the extent of corruption and dysfunction, have created a moment of crisis in forensic science. But the real question is not just how serious the problems are, but whether it is even possible to fix them. There are reasons to suspect that the trouble with forensics is built into its foundation—that, indeed, forensics can never attain reliable scientific status.
Some of the basic problems of forensic science are hinted at in the term itself. The word forensics refers to the Roman forum; forensics is the “science of the forum,” oriented toward gathering evidence for legal proceedings. This makes forensics unusual among the sciences, since it serves a particular institutional objective: the prosecution of criminals. Forensic science works when prosecutions are successful and fails when they are not.
That purpose naturally gives rise to a tension between science’s aspiration to neutral, open-ended inquiry on the one side and the exigencies of prosecution on the other. Likewise, while true understanding is predicated on doubt and revision, the forum must reach a definitive result. The scientist’s tentativeness is at odds with a judicial process built on up-or-down verdicts, a point the Supreme Court has emphasized in order to justify allowing judges wide deference as the gatekeepers of evidence.
It shouldn’t be controversial to point out that forensic science is not really a science to begin with, not in the sense of disciplines such as biology and physics. Forensic science covers whatever techniques produce physical evidence for use in law. These may be derived from various actual scientific disciplines, including medicine, chemistry, psychology, and others, but they are linked less by their inherent similarity than by their usefulness during investigation and prosecution. Law enforcement agencies themselves have invented a number of the techniques, including blood-spatter and bite-mark analysis.
Much forensic knowledge has thus developed by means unlike those of ordinary scientific research. Comparatively few major universities offer programs in forensic science; joint training in forensic sciences and policing is common. Forensic laboratories themselves are a disparate patchwork of public and private entities, with varying degrees of affiliation with police and prosecutors. The accountability of some subfields such as “forensic podiatry” (the study of footprints, gait, and other foot-related evidence) can be dubious, with judges taking the place of accreditation boards. In such a decentralized system, it can be difficult to keep track not only of whether forensic investigation is working well but also of how it even works in the first place.
The close association between forensics and law enforcement is particularly controversial. According to Frederic Whitehurst, a chemist and former FBI investigator, forensic scientists can “run into a sledgehammer” when they contradict prosecutors’ theories. “What we seem to know in the world of science is that there are some real problems in the world of forensic science,” Whitehurst told a reporter from the journal Nature. “We’d rather work on something cleaner.” It is easy to see why a chemist might consider forensics “unclean”; criminal investigations regularly flout scientific safeguards against bias. Analysts often know the identity of the suspect, potentially biasing results in favor of the police’s suspicions. Even more concerning, some crime labs are paid not by the case but by the conviction, creating a strong incentive to produce incriminating evidence.
Whitehurst’s comments echoed a major report in 2009 by the National Academy of Sciences (NAS), which painted a damning portrait of forensic practices. “Many forensic tests—such as those used to infer the source of tool marks or bite marks—have never been exposed to stringent scientific scrutiny,” the report concluded.
Fingerprint Analysis
One serious problem with those tests is that they allow for high levels of subjectivity. The NAS authors wrote that fingerprint analysis, for example, is “deliberately” left to human interpretation, so that “the outcome of a friction ridge analysis is not necessarily repeatable from examiner to examiner.” I saw this up close while working at the public defender’s office in New Orleans. Explaining his procedure for determining a match, a fingerprint examiner said in court that he would look at one, look at the other, and see if they match. When asked how he knew the two prints definitely matched, the examiner merely repeated himself. That very logic leads the FBI to claim fingerprint matches are “100 percent accurate.” Of course they are, if the question of a match is settled entirely by the examiner’s opinion. Without any external standard against which to check the results, the examiner can never be wrong.
Firearm Analysis
The NAS faulted a number of methods for this kind of shortcoming. Tool-mark and firearm analysis, for example, suffer the same weaknesses as fingerprint evidence, in that they depend strongly on unverified individual judgment. The report ultimately reached the forceful determination:
With the exception of nuclear DNA analysis . . . no forensic method has been rigorously shown to have the capacity to consistently, and with a high degree of certainty, demonstrate a connection between evidence and a specific individual or source.
That sentence should give any honest forensic examiner some sleepless nights.
DNA Analysis
But what about DNA? The report affirms that DNA maintains its place of integrity, the pinnacle of sound forensic science. It is not hard to see why DNA has long been the gold standard, deployed to convict and to exonerate the unfortunate defendants victimized by faultier methods of identification. DNA also has the advantage of producing falsifiable results; one can actually prove an interpretation incorrect, in contrast to the somewhat postmodern, eye-of-the-beholder sciences such as tool-mark and fingerprint analysis.
Yet forensic science involves both knowledge and practice, and while the science behind DNA is far from the prosecutorial voodoo of jeans and bite marks, its analysis must be conducted within a similar institutional framework. Analysts themselves can be fallible and inept; the risk of corruption and incompetence is no less pronounced simply because the biology has been peer-reviewed.
Such risk isn’t merely theoretical. While Florida exoneree Chad Heins had DNA to thank for the overturning of his conviction, DNA was also responsible for the conviction itself, with an analyst giving faulty testimony about DNA found at the site where Heins’s sister-in-law was murdered. Josiah Sutton was wrongfully convicted after a Houston analyst identified DNA found on a rape victim as an “exact match” for Sutton, even though one in sixteen black men shared the DNA profile in question. Earlier this year in San Francisco, thousands of convictions were thrown into doubt after a DNA technician and her supervisor were found to have failed a proficiency exam. In preparing evidence for a trial, the two had also covered up missing data and lied about the completeness of a genetic profile, despite having been disciplined internally for previous faulty DNA analyses.
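A rough calculation shows why a profile shared by one in sixteen men cannot support an “exact match” claim. The sketch below is illustrative only: the pool of 1,000 other possible sources is a hypothetical figure, not a number from the Sutton case, and treating every possible source as equally likely beforehand is a simplifying assumption.

```python
from fractions import Fraction

# From the text: one in sixteen men in the relevant group shared the profile.
match_frequency = Fraction(1, 16)

# HYPOTHETICAL: number of other men who could plausibly have left the sample.
# Illustrative only; this figure does not come from the case.
other_possible_sources = 1_000

# Expected number of those other men who would match purely by coincidence.
expected_coincidental_matches = match_frequency * other_possible_sources

# ASSUMPTION: before the test, every possible source is equally likely.
# The defendant is then one of (1 + coincidental matches) people who fit.
probability_defendant_is_source = 1 / (1 + expected_coincidental_matches)

print(float(expected_coincidental_matches))            # 62.5
print(round(float(probability_defendant_is_source), 4))  # 0.0157
```

Under these assumptions a match leaves the defendant with under a 2 percent chance of being the source, which is why the frequency of a profile matters as much as the match itself.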
DNA failures can border on the absurd, such as an incident in which German police tracked down a suspect whose DNA was mysteriously showing up every time they swabbed a crime scene, from murders to petty thefts. But instead of nabbing a criminal mastermind, investigators had stumbled on a woman who worked at a cotton swab factory that supplied the police. That case may seem comical, but a 2012 error in New York surely doesn’t. In July of that year, police announced that DNA taken off a chain used by Occupy Wall Street protesters to open a subway gate matched that found at the scene of an unsolved 2004 murder. The announcement was instantly followed by blaring news headlines about killer Occupiers. But officials later recanted, explaining that the match was a result of contamination by a lab technician who had touched both the chain and a piece of evidence from the 2004 crime. Yet the newspapers had already linked the words “Occupy” and “murder.” The episode demonstrates how the consensus surrounding DNA’s infallibility could plausibly enable government curtailment of dissent. Given the NYPD’s none-too-friendly disposition toward the Occupiers, one might wonder what motivated it to run DNA tests on evidence from protest sites in the first place.
The high degree of confidence placed in DNA is especially worrying because successful DNA analysis requires human institutional processes to function smoothly and without mistakes. The four authors of Truth Machine: The Contentious History of DNA Fingerprinting (2008) describe how DNA actually comes to be used in criminal proceedings: as “an extended, indefinitely complicated series of fallible practices through which evidence is collected, transported, analyzed, and quantified.” There are endless ways in which analysts can bungle their task. Furthermore, in the courtroom itself, DNA evidence must be contextualized and given significance. Even with well-conducted testing, poor explanation to a jury can enable a situation in which, as the geneticist Charalambos Kyriacou says, “Human error and misinterpretation could render the results meaningless.” A cautious approach is therefore valuable, even where DNA is concerned.
Facial Recognition Software
A police department that relies on facial recognition software has admitted that more than 90 percent of the people the software flagged were false positives. This means that nearly every person the system marks as a suspect is actually innocent, yet may be interrogated by police, or possibly worse, because of a wrong identification. According to a report from the Guardian, the South Wales Police scanned the crowd of more than 170,000 people who attended the 2017 Champions League final soccer match in Cardiff and falsely identified thousands of innocent people. The cameras flagged 2,470 people as criminals, but 2,297 of them were innocent and only 173 were correctly identified, so about 93 percent of the alerts were false positives. According to a Freedom of Information request filed by Wired, these are actually typical numbers for the facial recognition software used by the South Wales Police.
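The counts quoted above can be checked directly. The sketch below also separates two figures that are easy to conflate: the share of alerts that were wrong, and the share of innocent attendees who were flagged. It assumes that all of the roughly 170,000 attendees were scanned and that no genuine matches were missed; neither assumption comes from the reporting.

```python
# Figures reported in the text.
alerts = 2470         # people flagged by the system
false_alerts = 2297   # flagged people who were innocent
true_alerts = 173     # flagged people who were correctly identified

# ASSUMPTION: the whole crowd was scanned, and attendance was exactly 170,000.
crowd = 170_000

# The reported counts should add up.
assert false_alerts + true_alerts == alerts

# Share of alerts that were wrong (the false discovery rate).
false_discovery_rate = false_alerts / alerts

# Share of innocent attendees who were flagged (the textbook false positive rate).
# ASSUMPTION: everyone not correctly flagged was innocent (no missed matches).
innocent_attendees = crowd - true_alerts
false_positive_rate = false_alerts / innocent_attendees

print(f"{false_discovery_rate:.1%} of alerts were false")                 # 93.0%
print(f"{false_positive_rate:.2%} of innocent attendees were flagged")    # 1.35%
```

The contrast explains how a system can look accurate per person scanned while still being wrong about nearly everyone it flags: when genuine targets are rare in a large crowd, even a small per-person error rate produces far more false alerts than true ones.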
Similar numbers were released by the FBI in 2016, with the agency also admitting that its facial recognition database consisted mostly of innocent people, since it uses driver’s license and passport photos for its searches in addition to mug shots. In fact, there is a 50/50 chance that your picture is in a facial recognition database. Also in 2016, another study found that facial recognition software disproportionately targeted people with dark skin. The Free Thought Project reported that police will soon be scanning everything in their path with facial recognition software installed in their body cameras.
It would be unreasonable to expect any human endeavor to be completely without error, and one might wonder just how systemic the problems of forensic science truly are. The claim of crisis is far from universally shared. Forensic scientist John Collins calls this “a fabricated narrative constructed by frustrated defense attorneys, grant-seeking academics, and justice reform activists who’ve gone largely unchallenged.” Those who defend current practices say that the scandals are exceptions, that the vast majority of forensic scientists are diligent practitioners whose findings stand up under scrutiny. For every person exonerated, hundreds of convictions remain untouched.
But this defense actually points to one of the key problems with evaluating forensic science. The measures of its success are institutional: we see the failures of forensics when judges overturn verdicts or when labs contradict themselves. There is a circularity in the innocence cases, where the courts’ ability to evaluate forensic science is necessary to correct problems caused by the courts’ inability to evaluate forensic science. At no point, even with rigorous judicial review, does the scientific method come into play. The problem is therefore not that forensic science is wrong, but that it is hard to know when it is right.
Breaking the cycle of uncertainty has therefore been a key part of reform proposals. The NAS report recommended numerous steps to introduce objectivity and accountability, including the adoption of consistent standards in every subfield and the creation of a unified federal oversight entity. One can hear in the lengthy recommendations of the NAS committee members pleas for the introduction of basic quality control.
But so far changes have been sluggish. In fact, in some labs quality may be declining as state budget cuts have reduced resources available for forensics. In Congress, the Forensic Science and Standards Act, which would massively overhaul the field and introduce unprecedented scrutiny and coordination, has repeatedly stalled. Last year, in keeping with the NAS’s recommendations, the Department of Justice and the National Institute of Standards and Technology finally put together a forensic science commission to oversee the field and set protocols. But the commission is still in its infancy, and its effects remain to be seen.
The Supreme Court attempted to elucidate some standards in Daubert v. Merrell Dow Pharmaceuticals (1993) and two subsequent cases, which govern the admissibility of scientific evidence. The Court directed judges to weigh factors such as whether a technique is open to empirical testing and whether it is generally accepted in the field. But even as the Court ostensibly limited testimony to that which is sound and reliable, it undercut the ruling’s effectiveness by offering lower courts a high level of flexibility in their decision-making. Ironically, that hands-off approach may have helped to create the very nightmare that the Daubert Court feared, in which “befuddled juries are confounded by absurd and irrational pseudoscientific assertions.”
Nobody can state with certainty the degree of pseudoscience that clogs the American courts. But even if forensic science largely faces a “bad apples” problem, it may still be in bad shape. As legal scholar and forensic science specialist Daniel Medwed notes, “An absence of careful oversight can allow rogue scientists to flourish.” Even if there is no reason to doubt forensic podiatry itself, there might still be good reason to doubt forensic podiatrists. The localized, disparate, and unmonitored nature of so much forensic practice makes for massive nationwide inconsistency.
In fact, so long as forensic science remains forensic (that is, conducted to meet the demands of the forum rather than those of the scientific method), it is hard to see how it can warrant confidence. For countless reasons, law is a poor vehicle for the interpretation of scientific results. That people’s lives must depend on the interpretive decisions of judges and juries is in some respects unsettling to begin with. The chaotic state of forensic science, in theory and in practice, and the possibility that unsupported flimflam is passing itself off as fact make the everyday criminal justice process even more alarming.
Fire Forensic Analysis
According to John J. Lentini, author of the definitive book Scientific Protocols for Fire Investigation (CRC Press, second edition, 2012), the field is filled with junk science. “What does that pattern of burn marks over there mean?” he recalled asking a young investigator who joined him on one of his more than 2,000 fire investigations. “Absolutely nothing” was the correct answer. Most of the time fire investigators find nonexistent patterns, Lentini elaborated, or they think a certain mark means the fire burned “fast” or “slow,” allegedly indicated by the “alligatoring” of wood: small, flat blisters mean the fire burned slow; large, shiny blisters mean it burned fast. Nonsense, he said. It may take a while for a fire to get going, but once a couch or bed burns and reaches a certain temperature, you are not going to be able to discern much about its cause.
Lentini debunked the myth of window “crazing,” in which cracks supposedly indicate rapid heating caused by an accelerant (and thus arson). In fact, the cracks are caused by rapid cooling, as when firefighters spray water on the hot windows of a burning building. He also noted that burn marks on the floor are not necessarily the result of a liquid deliberately poured on it. When a fire consumes an entire room, the extreme heat burns even the floor, melts metal, and leaves burn marks under a doorway threshold, which many investigators assume implies the use of an accelerant. “Most of the ‘science’ of fire and explosive analysis has been conducted by insurance companies looking to find evidence of arson so they don’t have to pay off their policies,” Lentini explained to me when I asked how his field became so fraught with pseudoscience.
Itiel Dror of the JDI Center for the Forensic Sciences at University College London spoke about his research on “cognitive forensics”—how cognitive biases affect forensic scientists. For example, the hindsight bias can lead one to work backward from a suspect to the evidence, and then the confirmation bias can direct one to find additional confirming evidence for that suspect even if none exists. Dror discussed studies that show “that the same expert examiner, evaluating the same prints but within different contexts, may reach different and contradictory decisions.” Not just fingerprints. Even DNA analysis is subjective. “When 17 North American expert DNA examiners were asked for their interpretation of data from an adjudicated criminal case in that jurisdiction, they produced inconsistent interpretations,” Dror and his co-author wrote in a 2011 paper in Science and Justice.
No one knows how many innocent people have been convicted based on junk forensic science, but the National Research Council report recommends substantial funding increases to enable labs to conduct experiments to improve the validity and reliability of the many forensic subfields. Along with a National Commission on Forensic Science, which was established in 2013, it’s a start. Thus even as we try various fixes, rooting out bad apples and introducing oversight, a systemic and elementary problem remains: a science of the forum can never be science at all.