Research team demonstrates proof-of-concept for quantitative footwear-impression analysis

Unknown (research presentation) · February 13, 2026

AI-Generated Content: All content on this page was generated by AI to highlight key points from the meeting. For complete details and context, we recommend watching the full video.

Summary

An unidentified presenter described an NIJ- and NIST-funded proof-of-concept that combines human-guided image markup, feature-based matching (via a Delaunay-triangulation maximum-clique algorithm), and case-relevant score distributions to give examiners quantitative context for footwear-impression comparisons.

An unidentified presenter described a proof-of-concept system intended to give forensic footwear examiners quantitative, reproducible context for impression comparisons. Funding, the presenter said, comes from the National Institute of Justice and "NIST internal funds." The research was framed as a direct response to recent national reviews that describe footwear identification as "largely subjective" and raise questions about its reliability and scientific validity.

The presenter said the project’s goal is to develop measurement-science underpinnings and algorithmic methods that practitioners can use in casework. The proposed pipeline accepts a crime‑scene impression and a test impression, extracts image features (with human markup when necessary), runs feature‑based matching to produce a numeric comparison score, and places that score within case‑relevant score distributions and ROC charts so examiners can see associated error rates.
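In rough outline, that pipeline could look like the sketch below. The function names (extract_features, match_features) and the overall structure are illustrative assumptions, not the team's actual implementation.

```python
import numpy as np

def compare_impressions(scene_img, test_img, extract_features, match_features,
                        mated_scores, nonmated_scores):
    """Produce a comparison score and place it in case-relevant score distributions."""
    # 1. Feature extraction (with human markup of the images when necessary).
    scene_feats = extract_features(scene_img)
    test_feats = extract_features(test_img)

    # 2. Feature-based matching yields a single numeric comparison score.
    score = match_features(scene_feats, test_feats)

    # 3. Place the score within case-relevant distributions: the share of non-mated
    #    pairs scoring at least as high (false-positive rate if called a match) and
    #    the share of mated pairs scoring lower (false-negative rate if called a
    #    non-match).
    fpr = float(np.mean(np.asarray(nonmated_scores) >= score))
    fnr = float(np.mean(np.asarray(mated_scores) < score))
    return {"score": score, "false_positive_rate": fpr, "false_negative_rate": fnr}
```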

The team described two classes of data used in experiments: staged crime‑scene impressions (examples mentioned: prints made in red paint and in mud) and "augmented" crime‑scene data, where test impressions are intentionally degraded by image‑processing routines to mimic crime‑scene conditions. The presenter emphasized that a case‑relevant ground‑truth dataset is necessary for producing meaningful score distributions.
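As an illustration of what such "augmented" data might involve, the sketch below degrades a clean impression with blur, noise, and a missing region. The specific degradations and parameter values are assumptions, since the presenter did not detail the image-processing routines used.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade_impression(img, blur_sigma=2.0, noise_std=0.05, occlusion_frac=0.2, rng=None):
    """Degrade a clean impression (float array in [0, 1]) to mimic crime-scene conditions."""
    rng = np.random.default_rng() if rng is None else rng
    out = gaussian_filter(img.astype(float), sigma=blur_sigma)   # loss of fine detail
    out = out + rng.normal(0.0, noise_std, size=out.shape)       # substrate/sensor noise
    # Blank out a random rectangle, as if part of the print were never deposited.
    h, w = out.shape[:2]
    oh, ow = max(1, int(h * occlusion_frac)), max(1, int(w * occlusion_frac))
    r, c = rng.integers(0, h - oh + 1), rng.integers(0, w - ow + 1)
    out[r:r + oh, c:c + ow] = out.mean()
    return np.clip(out, 0.0, 1.0)
```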

For feature extraction the group favors a hybrid human–computer approach. The presenter described a graphical markup tool in which human users label polygons and the software can auto‑adjust boundaries using image gradients; demo videos were referenced showing auto‑adjust, copy‑and‑paste of templates, path‑finding for repeated circular features, and automated overlays indicating detected boundaries.
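The gradient-based auto-adjust could work along the lines of the following sketch, which nudges each user-marked polygon vertex toward the strongest nearby image gradient. The actual tool's behavior was only shown in demo videos, so this mechanism is an assumption.

```python
import numpy as np
from scipy.ndimage import sobel

def auto_adjust_vertices(img, vertices, radius=5):
    """Snap each (row, col) polygon vertex to the strongest gradient in a local window."""
    img = img.astype(float)
    grad_mag = np.hypot(sobel(img, axis=0), sobel(img, axis=1))  # gradient magnitude
    h, w = grad_mag.shape
    adjusted = []
    for r, c in vertices:
        # Search a small window around the user-placed vertex.
        r0, r1 = max(0, r - radius), min(h, r + radius + 1)
        c0, c1 = max(0, c - radius), min(w, c + radius + 1)
        window = grad_mag[r0:r1, c0:c1]
        dr, dc = np.unravel_index(np.argmax(window), window.shape)
        adjusted.append((r0 + dr, c0 + dc))
    return adjusted
```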

On matching, the presenter said the team implemented several preliminary algorithms and highlighted a Delaunay‑triangulation maximum‑clique graph‑matching approach that yields a single numeric comparison score. As a demonstration, the presenter said the algorithm returned a score of "0.303" for a staged casework pair and showed how that score can be plotted against mated and non‑mated score distributions to obtain contingency‑table rates.
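The presenter did not walk through the algorithm's internals, but a generic maximum-clique correspondence-matching score can be sketched as follows: nodes are candidate feature correspondences, edges link geometrically consistent pairs, and the normalized size of the largest clique becomes the single comparison score. This is not the team's algorithm; in particular, their Delaunay-triangulation step is not reproduced here.

```python
import itertools
import numpy as np
import networkx as nx

def clique_match_score(pts_a, pts_b, tol=3.0):
    """Normalized maximum-clique correspondence score between two feature-point sets."""
    pts_a, pts_b = np.asarray(pts_a, float), np.asarray(pts_b, float)
    if len(pts_a) == 0 or len(pts_b) == 0:
        return 0.0
    # Nodes: every candidate correspondence (point i in A <-> point j in B).
    nodes = list(itertools.product(range(len(pts_a)), range(len(pts_b))))
    G = nx.Graph()
    G.add_nodes_from(nodes)
    # Edges: two correspondences are compatible if they preserve pairwise distance.
    for (i, j), (k, l) in itertools.combinations(nodes, 2):
        if i == k or j == l:
            continue  # a point cannot correspond to two different points
        d_a = np.linalg.norm(pts_a[i] - pts_a[k])
        d_b = np.linalg.norm(pts_b[j] - pts_b[l])
        if abs(d_a - d_b) <= tol:
            G.add_edge((i, j), (k, l))
    if G.number_of_edges() == 0:
        return 0.0
    # Largest mutually compatible set of correspondences = maximum clique
    # (found here by enumerating maximal cliques; exponential in the worst case).
    best = max(nx.find_cliques(G), key=len)
    return len(best) / min(len(pts_a), len(pts_b))
```

Under this formulation, mated impressions that share a large, geometrically consistent set of features score near 1, while unrelated impressions score near 0.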

Using the example threshold of 0.303, the presenter described implications for error rates: if the pair is labeled a match at that threshold, about 2.2% of non‑mated pairs in the case‑relevant dataset would have higher scores and therefore be false matches (a false‑positive rate on that order); conversely, if the pair is labeled a non‑match, approximately 75% of mated pairs would lie below that score and be mislabeled (a high false‑negative rate). The presenter also noted that score‑based likelihood ratios are another possible summary but cautioned about known problems and said examiners must be made aware of limitations.
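The arithmetic behind those rates is simple to reproduce. The score arrays below are synthetic placeholders (the team's case-relevant distributions were not shared), so the printed rates will not match the 2.2% and 75% figures from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
mated_scores = rng.beta(4, 6, size=1000)      # hypothetical mated-pair scores
nonmated_scores = rng.beta(2, 12, size=1000)  # hypothetical non-mated-pair scores

threshold = 0.303
# Calling the pair a match: false-positive rate = share of non-mated pairs
# scoring at least this high.
fpr = np.mean(nonmated_scores >= threshold)
# Calling the pair a non-match: false-negative rate = share of mated pairs
# scoring below the threshold.
fnr = np.mean(mated_scores < threshold)
print(f"FPR at threshold {threshold}: {fpr:.3f}; FNR: {fnr:.3f}")
```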

The presenter concluded that a comparison score, when presented with score distributions, ROC charts and a carefully described case‑relevant reference dataset, can help examiners make weight‑of‑evidence assessments. The team said the work remains at the proof‑of‑concept stage, with further experiments and tool refinement planned.