Witnesses urge red-teaming, shared datasets and secure-by-design standards for AI used in government systems

3789115 · June 13, 2025

Get AI-powered insights, summaries, and transcripts

Subscribe
AI-Generated Content: All content on this page was generated by AI to highlight key points from the meeting. For complete details and context, we recommend watching the full video. so we can fix them.

Summary

Witnesses told the House subcommittee that red teaming, transparency about training data, common benchmarks and secure-by-design practices are essential for trustworthy AI in federal systems and for enabling agencies to assess products without exposing sensitive data.

Witnesses before the House Homeland Security Subcommittee recommended a coordinated program of red teaming, shared benchmark datasets, and secure-by-design requirements to help federal agencies and private-sector customers evaluate AI security without exposing sensitive production data.

Jonathan Danbrodt, CEO of Cranium, described his company’s red‑teaming platform, saying it allows developers to “pull [a system] into a place where we can simulate that system, attack it, provide vulnerability mitigation support, and then make sure that that system, whether it's preproduction or postproduction, is being monitored.” Danbrodt said red teaming should be scalable and integrated into development workflows.

Multiple witnesses supported the idea of public datasets and synthetic-data pipelines to enable benchmarking and validation. Steve Fale of Microsoft cited benefits from common datasets for comparison and testing, referencing NIST benchmarking activity as an example that “fuels innovation.” He noted agencies are often hesitant to provide production data for testing and recommended common benchmarks and synthetic data where necessary.

Kiran Chinnagongon Nagari proposed a transparency model akin to labeling for consumer products: “When you buy food or when you buy a drug, there is label and ingredients on it. You know exactly what you're consuming, so you know what the impact of it is. We need a similar one for AI, understanding the entire bill of materials, including the data.”

Witnesses also discussed secure hosting and access controls, with Microsoft stressing default secure-by-design choices in platforms such as Azure AI Foundry and Copilot Studio to help prevent accidental data exposure. Gareth McLaughlin and others emphasized that model outputs and prompts should be treated as potentially untrusted and instrumented for monitoring.

Members asked about legal protections for red-team researchers and whether the federal government should fund or enable red-teaming and benchmarking programs. Witnesses recommended clearer acquisition models, workforce readiness programs, and technical support to allow agencies to adopt secure AI tools confidently.