University of Washington statistician urges shift from profile probabilities to match probabilities in forensic DNA reporting

American Academy of Forensic Sciences — Forensic Biology (DNA) Session · February 9, 2026


Summary

Dr. Bruce Weir presented empirical and theoretical evidence that multiplying single-locus genotype frequencies can understate match probabilities as more loci are used, and he recommended the likelihood-ratio/match-probability framework and larger theta values to avoid overstating evidence.

Dr. Bruce Weir, a professor of biostatistics and director of the Institute of Public Health Genetics at the University of Washington, said forensic DNA reporting should emphasize match probabilities and likelihood ratios rather than product-based "profile probabilities." He told the American Academy of Forensic Sciences symposium that multiplying single-locus genotype frequencies as loci accumulate produces astronomically small numbers that are hard to interpret and can misstate evidentiary strength.

"I want to put match back into match probabilities," Weir said, arguing that a proper match probability compares a pair of profiles — the evidence profile and a person-of-interest profile — under competing propositions. He said the commonly reported random match probability, defined in a NIST example as the product over loci of genotype frequency estimates, rests on independence assumptions that cannot be tested when multi-locus profiles are essentially unique in databases.

Weir illustrated the problem with examples from CODIS and database analyses. He noted that CODIS profiles expanded from 13 loci to 20–24 loci, moving reported product-based probabilities from roughly 10^-15 to 10^-25 and toward 10^-30. "These numbers are getting smaller and smaller," he said, adding that they become difficult to convey credibly in court.
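
As a rough illustration of the product rule behind these figures, the sketch below multiplies single-locus genotype frequencies across a profile. The allele frequencies are hypothetical values chosen for illustration, not data from the talk, and the Hardy-Weinberg genotype formula (p^2 for homozygotes, 2pq for heterozygotes) is assumed as the per-locus estimate.

```python
from math import prod

# Hypothetical three-locus profile (illustrative values only, not from the talk).
# Each entry: (frequency of first allele, frequency of second allele, homozygous?)
profile = [
    (0.10, 0.20, False),
    (0.15, 0.15, True),
    (0.05, 0.30, False),
]


def genotype_frequency(p, q, homozygous):
    """Hardy-Weinberg estimate: p^2 for a homozygote, 2pq for a heterozygote."""
    return p * p if homozygous else 2 * p * q


def profile_probability(loci):
    """Product rule: multiply genotype frequencies across loci,
    which assumes the loci are independent."""
    return prod(genotype_frequency(p, q, hom) for p, q, hom in loci)


single_locus = [genotype_frequency(*locus) for locus in profile]
product_rule_value = profile_probability(profile)

# Each added locus multiplies in another factor below 1, so the product
# shrinks steadily as the number of loci grows.
running_products = [profile_probability(profile[:k]) for k in range(1, len(profile) + 1)]
print(single_locus, product_rule_value, running_products)
```

With per-locus factors of a few percent, thirteen loci already give a product many orders of magnitude below one, and each further locus pushes it lower, which is the pattern Weir described.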

Turning to empirical work, Weir presented simulations and comparisons using about 2,800 FBI profiles (about 350,000 pairwise comparisons) and larger Australian profile datasets. He said observed multi-locus match proportions often lie above the values predicted by multiplying single-locus match proportions — meaning the product rule can understate the true match probability and, therefore, overstate the strength of evidence against a person of interest.
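
The comparison described here can be sketched on a toy database: count how often pairs of profiles match at each locus, multiply those single-locus proportions, and compare the result with the proportion of pairs that match at every locus. The four profiles below are invented for illustration and deliberately built so that matches at one locus go together with matches at the other; they are not the FBI or Australian data, and the function name is hypothetical.

```python
from itertools import combinations
from math import prod

# Hypothetical toy database (not real data). Each profile is a tuple of
# genotypes, one per locus; alleles within a genotype are stored in sorted
# order so that equal genotypes compare equal.
profiles = [
    (("a", "b"), ("c", "c")),
    (("a", "b"), ("c", "c")),
    (("a", "a"), ("c", "d")),
    (("a", "a"), ("c", "d")),
]


def match_proportions(db):
    """Return (per-locus match proportions, all-locus match proportion)
    over every distinct pair of profiles in db."""
    pairs = list(combinations(db, 2))
    n_loci = len(db[0])
    per_locus = [
        sum(x[locus] == y[locus] for x, y in pairs) / len(pairs)
        for locus in range(n_loci)
    ]
    all_loci = sum(x == y for x, y in pairs) / len(pairs)
    return per_locus, all_loci


per_locus, observed = match_proportions(profiles)
predicted = prod(per_locus)  # what the product rule would predict
print(per_locus, observed, predicted)
```

In this contrived example the observed two-locus match proportion (2 of 6 pairs) is three times the product of the single-locus proportions (1/9), the same direction of discrepancy Weir reported.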

To address population structure and locus dependence, Weir discussed the population-structure parameter theta and showed that reported conservatism depends heavily on the theta value used. He said small theta values (0 or 0.001) leave observed match proportions above product-rule expectations in some comparisons, so the calculation is not conservative, while larger theta values (0.01 or 0.03) bring the product rule into safer, conservative territory. "If we want to make statements about match probabilities of the order of 10 to the minus 30, we would need a database that allowed 10 to the power of 30 comparisons or more," he said, adding, "I'm guessing theta of 10% would not be unreasonable."
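
The summary does not say which theta formula Weir used. As an assumption, the sketch below uses the widely cited Balding-Nichols single-locus match probabilities (the form recommended in the 1996 National Research Council report) to show how raising theta raises the match probability. It also works out how many profiles a database would need to supply 10^30 pairwise comparisons, given that n profiles yield n(n-1)/2 distinct pairs. The allele frequencies and function names are hypothetical.

```python
from math import isqrt


def theta_match_probability(theta, p_i, p_j=None):
    """Single-locus match probability with a theta correction
    (Balding-Nichols form, assumed here for illustration).
    Pass only p_i for a homozygote, p_i and p_j for a heterozygote."""
    denom = (1 + theta) * (1 + 2 * theta)
    if p_j is None:
        return (2 * theta + (1 - theta) * p_i) * (3 * theta + (1 - theta) * p_i) / denom
    return 2 * (theta + (1 - theta) * p_i) * (theta + (1 - theta) * p_j) / denom


def profiles_needed(comparisons):
    """Smallest n such that n profiles give at least `comparisons` distinct pairs."""
    n = (1 + isqrt(1 + 8 * comparisons)) // 2
    while n * (n - 1) // 2 < comparisons:
        n += 1
    return n


# Hypothetical heterozygote with allele frequencies 0.10 and 0.20.
thetas = [0.0, 0.001, 0.01, 0.03]
het_values = [theta_match_probability(t, 0.10, 0.20) for t in thetas]
n_for_1e30 = profiles_needed(10**30)
print(het_values, n_for_1e30)
```

At theta = 0 the formulas reduce to the usual 2pq and p^2 estimates. Each increase in theta gives a larger per-locus match probability and therefore a less extreme product across loci. The database-size calculation comes out at roughly 1.4 × 10^15 profiles, which shows why match probabilities near 10^-30 cannot be checked directly against data.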

Weir recommended adopting a likelihood-ratio framework that formally compares the probability of the observed pair of profiles under two hypotheses — that the person of interest is the source versus an alternative source — and said further empirical work is required to determine defensible parameter choices and reporting conventions. He described the current proposal as a "small, simple idea, big consequences" and framed the work as ongoing research rather than an immediate change in laboratory practice.
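
In the simplest single-source case, the likelihood ratio compares the probability of the matching pair of profiles if the person of interest is the source (taken as 1 here) with the probability if someone else is the source (the match probability). The sketch below makes that relationship explicit; the input value and the function name are hypothetical, and the talk summary does not describe Weir's own formulation in this detail.

```python
import math


def likelihood_ratio(match_probability, p_evidence_if_source=1.0):
    """LR = P(evidence | person of interest is the source)
          / P(evidence | an alternative source).
    The numerator defaults to 1, the usual simplification for a clean
    single-source match (an assumption of this sketch)."""
    if not 0 < match_probability <= 1:
        raise ValueError("match_probability must be in (0, 1]")
    return p_evidence_if_source / match_probability


# Hypothetical match probability, for illustration only.
lr = likelihood_ratio(2.7e-5)
log10_lr = math.log10(lr)
print(lr, log10_lr)
```

Because the ratio is the reciprocal of the match probability in this simple case, any understatement of the match probability translates directly into an overstated likelihood ratio.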

During a brief question period, an attendee asked whether inbreeding coefficients drive the observed dependence. Weir responded that dependence arises largely from how whole genomes are transmitted as packaged gametes and from recent relatedness in populations, so dependence is to be expected even under random mating.

Weir concluded by saying the approach requires much larger databases and further methodological work before being adopted in routine forensic reporting. The presentation left open next steps: further simulation studies, empirical data collection, and discussion about acceptable theta values and how to present match-strength metrics to courts.