Current
RAMPART: Adversarial Agent Safety Testing via pytest
Microsoft releases RAMPART, a pytest-based framework for automated adversarial safety testing of AI agents, allowing developers to define security scenarios as executable tests with configurable pass/fail thresholds in CI pipelines.
Signal
RAMPART: Adversarial Agent Safety Testing via pytest · GitHub · 2026-05-23
Microsoft released RAMPART, an open-source framework for automated adversarial safety testing of AI agents. Built on PyRIT, it enables developers to author adversarial scenarios as pytest tests, allowing security evaluations to run as pass/fail checks within CI pipelines. The tool supports configurable thresholds, such as requiring agents to maintain safety compliance across multiple reruns.
Context
RAMPART shifts agent safety evaluation from ad-hoc manual review or black-box benchmarking to deterministic, code-based testing. By leveraging pytest, it integrates safety checks into standard software development workflows, treating agent robustness as a verifiable property alongside functional correctness. This approach allows safety metrics to be tracked, versioned, and enforced alongside code changes.
Relevance
Addresses the operationalization of agent safety in continuous integration. Provides a mechanism to quantify safety metrics (e.g., pass rates) and enforce them via thresholds, reducing reliance on subjective assessment. Enables teams to reject agent updates that fail safety regression tests, establishing a quality gate for autonomous behavior.
Current State
Available as a Microsoft open-source project. Relies on PyRIT for underlying adversarial generation capabilities. Focuses on the test harness and CI integration layer rather than model training or runtime enforcement. Requires configuration of test scenarios and threshold parameters by the operator.
Open Questions
- How does RAMPART handle stateful multi-turn interactions in test scenarios?
- What is the computational overhead of running 1000 reruns in CI for large models?
- Does the framework support custom reward functions or only binary pass/fail outcomes?
- How are adversarial prompts generated and curated for specific agent toolsets?
Connections
RAMPART complements runtime governance tools like agent-governance-toolkit by providing pre-deployment validation for the policies enforced during execution. It aligns with agent-execution-sandboxing-infrastructure patterns if tests are executed in isolated environments, though the signal emphasizes the testing logic over the execution environment.