JCVI researcher describes Forensic Microbiome Database and evidence that bacterial profiles can indicate sample origin

JCVI symposium presentation · February 9, 2026


Summary

Lauren of JCVI reported that bacterial profiles from human microbiome samples, analyzed with 16S rRNA sequencing and machine learning, often cluster by geographic location. The team launched a public Forensic Microbiome Database (FMD) with nearly 14,000 samples to support geosourcing and future forensic use cases.

Lauren, a researcher at JCVI, presented results from a National Institute of Justice-funded project that uses bacterial profiles from 16S rRNA sequencing to predict geographic origin and support forensic investigations. The work, now in its second year of a two-year NIJ award, produced a publicly accessible Forensic Microbiome Database (FMD) designed to let researchers and practitioners compare new samples against a broad set of publicly available data.

The study tested three proof-of-concept datasets: the Human Microbiome Project (HMP), a 2012 global gut microbiome study that included Amerindian villages in Venezuela, and an internally funded hair-shaft microbiome project. Using weighted principal-coordinate analysis, random-forest classifiers, and a targeted feature-selection step, the team reported that stool samples from two U.S. cities (Houston and St. Louis) clustered by location and that classification accuracy improved after selecting genera that best discriminated sites. "We were able to select precise geolocation indicators that accurately predict the geographic location," Lauren said.
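The feature-selection idea described above can be sketched in a few lines. This is an illustration under stated assumptions, not the team's pipeline: the data are synthetic, the function names are hypothetical, and a simple nearest-centroid classifier stands in for the random-forest classifiers mentioned in the presentation so the example needs only the Python standard library. It ranks genera by how well they separate two sites, keeps the top two, and compares classification accuracy using all genera against accuracy using only the selected subset.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the synthetic data is reproducible
N_GENERA = 20           # hypothetical number of genera per profile
# Hypothetical informative genera: index -> (mean at site A, mean at site B)
INFORMATIVE = {0: (0.40, 0.10), 1: (0.05, 0.30)}

def make_sample(site):
    """Draw one synthetic abundance profile for site 'A' or 'B'."""
    profile = []
    for g in range(N_GENERA):
        if g in INFORMATIVE:
            mean = INFORMATIVE[g][0 if site == "A" else 1]
            profile.append(max(0.0, rng.gauss(mean, 0.03)))
        else:
            # Uninformative genus: same distribution at both sites
            profile.append(max(0.0, rng.gauss(0.20, 0.05)))
    return profile

def make_dataset(n_per_site):
    labels = ["A", "B"] * n_per_site
    return [make_sample(site) for site in labels], labels

def separation(samples, labels, g):
    """Gap between site means for genus g, scaled by the average spread."""
    a = [s[g] for s, l in zip(samples, labels) if l == "A"]
    b = [s[g] for s, l in zip(samples, labels) if l == "B"]
    spread = (statistics.pstdev(a) + statistics.pstdev(b)) / 2
    return abs(statistics.mean(a) - statistics.mean(b)) / (spread + 1e-9)

def select_genera(samples, labels, k):
    """Return the indices of the k genera that best separate the sites."""
    ranked = sorted(range(N_GENERA),
                    key=lambda g: separation(samples, labels, g),
                    reverse=True)
    return ranked[:k]

def fit_centroids(samples, labels, genera):
    """Mean abundance per site, restricted to the chosen genera."""
    cents = {}
    for site in sorted(set(labels)):
        rows = [s for s, l in zip(samples, labels) if l == site]
        cents[site] = [statistics.mean(r[g] for r in rows) for g in genera]
    return cents

def predict(sample, cents, genera):
    """Assign the sample to the site with the nearest centroid."""
    def sq_dist(site):
        return sum((sample[g] - c) ** 2 for g, c in zip(genera, cents[site]))
    return min(cents, key=sq_dist)

def accuracy(train, train_labels, test, test_labels, genera):
    cents = fit_centroids(train, train_labels, genera)
    hits = sum(predict(s, cents, genera) == l
               for s, l in zip(test, test_labels))
    return hits / len(test)

train_x, train_y = make_dataset(40)
test_x, test_y = make_dataset(20)
selected = select_genera(train_x, train_y, k=2)
acc_all = accuracy(train_x, train_y, test_x, test_y, list(range(N_GENERA)))
acc_selected = accuracy(train_x, train_y, test_x, test_y, selected)
print("selected genera:", sorted(selected))
print("accuracy, all genera:", acc_all, "| selected genera:", acc_selected)
```

Selecting genera on the training split only, then scoring on a held-out split, avoids letting the test samples influence which features are kept.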

Not all body sites performed equally. Skin sites and oral-cavity samples showed the highest potential to classify geographic origin in weighted analyses, while stool and the posterior fornix initially showed lower potential but improved after feature selection. In the hair-shaft project, which is under peer review, Lauren reported that scalp hair showed greater geolocation potential than pubic hair.

The FMD is available online at fmd.jcvi.org and, according to the presentation, contains close to 14,000 individual samples across 20 body sites, with stool making up 57% of the current collection. The database aggregates samples from 33 studies; the HMP accounts for about 52% of samples and a twin gut study for about 22%. Lauren acknowledged that the resource is biased toward Western data and said the project is recruiting paired oral and stool samples from 100 healthy adult females in each of five distinct global regions to expand geographic representation and strengthen the prediction algorithm.
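For a rough sense of scale, the percentages above can be turned into approximate counts. The short calculation below assumes a round total of 14,000 samples (the presentation said "close to 14,000"), so every figure it produces is an estimate rather than a reported number. The stool share is a breakdown by body site, while the HMP and twin-study shares are breakdowns by source study, so the two sets do not add together.

```python
TOTAL_SAMPLES = 14_000  # assumption: "close to 14,000" rounded to a round figure

# Shares quoted in the presentation (stool is by body site, the others by study)
shares = {"stool": 0.57, "HMP": 0.52, "twin gut study": 0.22}
approx_counts = {name: round(TOTAL_SAMPLES * share)
                 for name, share in shares.items()}

# Share left for the remaining 31 of the 33 studies
other_studies_share = 1 - shares["HMP"] - shares["twin gut study"]

# Planned recruitment: 100 participants x 5 regions x 2 paired samples each
planned_participants = 100 * 5
planned_samples = planned_participants * 2

for name, count in approx_counts.items():
    print(f"{name}: about {count} samples")
print(f"other studies: about {other_studies_share:.0%} of samples")
print(f"planned recruitment: {planned_participants} participants, "
      f"{planned_samples} samples")
```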

Lauren demonstrated interactive tools on the FMD website, including taxonomic distribution plots, sample-similarity maps and a geographic overlay that links a user-supplied sample to the most similar entries in the database. In one test described during the presentation, a throat microbiome sample collected in Argentina matched most closely to an Argentinian stool sample in the FMD.
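A similarity lookup of this kind can be sketched as a nearest-neighbor search over abundance profiles. The presentation did not say which similarity measure the FMD uses, so the Bray-Curtis dissimilarity below is an assumption, chosen because it is a common measure for comparing microbial communities. The reference records, field names, region labels and abundance values are all invented for illustration and do not come from the database.

```python
def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two genus -> abundance mappings.

    Returns 0.0 for identical profiles and 1.0 for profiles with no
    genera in common.
    """
    genera = set(p) | set(q)
    numerator = sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in genera)
    denominator = sum(p.get(g, 0.0) + q.get(g, 0.0) for g in genera)
    return numerator / denominator if denominator else 0.0

def closest_matches(query, reference, top_n=3):
    """Return the top_n reference records most similar to the query profile."""
    ranked = sorted(reference,
                    key=lambda rec: bray_curtis(query, rec["profile"]))
    return ranked[:top_n]

# Invented toy reference set (hypothetical IDs, regions and abundances)
reference = [
    {"id": "ref-1", "region": "Region A", "body_site": "stool",
     "profile": {"Bacteroides": 0.6, "Prevotella": 0.1, "Faecalibacterium": 0.3}},
    {"id": "ref-2", "region": "Region B", "body_site": "stool",
     "profile": {"Bacteroides": 0.1, "Prevotella": 0.7, "Faecalibacterium": 0.2}},
    {"id": "ref-3", "region": "Region C", "body_site": "oral",
     "profile": {"Streptococcus": 0.7, "Prevotella": 0.2, "Veillonella": 0.1}},
]

# Invented query profile standing in for a user-supplied sample
query = {"Bacteroides": 0.15, "Prevotella": 0.65, "Faecalibacterium": 0.2}

matches = closest_matches(query, reference)
for rec in matches:
    score = bray_curtis(query, rec["profile"])
    print(rec["id"], rec["region"], rec["body_site"], round(score, 3))
```

The region attached to the closest record is only a candidate origin. As the Argentina example shows, the best match can come from a different body site than the query, so a real tool would need to report the dissimilarity score alongside the match.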

Lauren cautioned that the available data are limited and outlined next steps: continued algorithm refinement, broader data collection (including underrepresented regions), and testing how the inclusion of unhealthy individuals affects model performance. Lauren credited the project’s engineers (Toby Clark, Herrinder Singh and Chris Greco) and JCVI leadership, and acknowledged funding from the National Institute of Justice as well as internal JCVI support for the hair-shaft study. "The ultimate goal is to predict the geographic location of the sample," Lauren said.

Next steps announced at the presentation include continued database expansion, refinement of the public tools, and planned publication of the hair-shaft results. The team said collaborators are expected to collect more than 1,000 additional samples as part of ongoing recruitment.