TLDR: ScenicProver is a new framework for verifying complex AI-enabled cyber-physical systems. It allows breaking down systems into components, defining contracts for their behavior, and combining evidence from testing, formal proofs, and assumptions. A case study on an autonomous emergency braking system showed it can achieve significantly stronger safety guarantees much faster than traditional monolithic testing by intelligently focusing verification efforts.
The world of cyber-physical systems, especially those powered by artificial intelligence, is becoming increasingly complex. From self-driving cars to smart factories, these systems integrate physical processes with computational elements, often relying on ‘black-box’ machine learning components. Ensuring their safety and reliability has been a monumental challenge for engineers and researchers.
Traditional methods of verifying these systems often treat them as a single, indivisible unit. This ‘monolithic’ approach struggles with the sheer complexity, the difficulty of modeling diverse real-world environments, and the fact that even minor updates to an AI component can invalidate all previous safety analyses. This is where a new framework, ScenicProver, steps in to offer a more effective solution.
ScenicProver, built upon the Scenic probabilistic programming language, introduces a novel approach to verification. Instead of tackling the entire system at once, it allows engineers to break down complex systems into smaller, manageable components. Each component, whether it’s a piece of interpretable code or a mysterious black-box AI model, can have clearly defined interfaces and ‘contracts’ that specify its expected behavior. These contracts use an advanced form of logic that can incorporate real-world expressions and conditions.
How ScenicProver Verifies Systems
The framework supports multiple ways to gather evidence that a component meets its contract:
-
Testing: Using simulations, ScenicProver can systematically test components in various scenarios, providing probabilistic guarantees about their performance.
-
Formal Proofs: For critical components or specific conditions, engineers can provide explicit mathematical proofs, often integrated with tools like Lean 4, to formally guarantee correctness.
-
Assumptions: In cases where formal proofs or extensive testing aren’t feasible, or when leveraging existing manufacturer guarantees (like sensor specifications), the framework allows for incorporating well-reasoned assumptions.
What truly sets ScenicProver apart is its ability to systematically combine these different types of evidence using ‘contract operators’. These operators allow for the composition, conjunction, and refinement of contracts, building a comprehensive safety argument for the entire system. A particularly innovative feature is the ‘weak merge’ operator, which helps reduce the burden of proof by allowing for more relaxed contract combinations when appropriate.
Also Read:
- Automating Dafny Program Verification with LLM-Powered Assertion Inference
- Real-DRL: Bridging the Gap for Safe AI in Physical Systems
A Real-World Example: Autonomous Emergency Braking
To demonstrate its power, the researchers applied ScenicProver to a case study: an automatic emergency braking (AEB) system in an autonomous vehicle, which uses sensor fusion from radar, laser, and camera systems. This system is a perfect example of a learning-enabled cyber-physical system with black-box components and complex environmental interactions.
By using ScenicProver, the team was able to leverage existing manufacturer guarantees for the radar and laser sensors. This meant they didn’t have to spend valuable computational resources testing scenarios where these sensors were already known to perform reliably. Instead, they could focus their testing efforts on the more uncertain conditions and the AI-powered camera system.
The results were striking. ScenicProver achieved a stronger probabilistic safety guarantee for the AEB system – 94.59% correctness – compared to 91.55% achieved by testing the system as a monolith, all within the same computational budget. In fact, the optimized approach with ScenicProver delivered a stronger result in just 15 minutes than what the monolithic testing could achieve in 8 hours. This significant improvement highlights the framework’s potential to accelerate the verification process and enhance the safety of autonomous systems.
The framework also automatically generates an ‘assurance case’, which is a structured argument tracking the provenance of all system-level guarantees. This is crucial for regulatory compliance and building trust in complex AI-driven systems. For more technical details, you can refer to the full research paper: ScenicProver: A Framework for Compositional Probabilistic Verification of Learning-Enabled Systems.
Looking ahead, the creators of ScenicProver plan to expand its capabilities, including integrating more advanced analysis techniques for neural networks, further automating complex contract operations, and addressing the critical ‘sim-to-real’ gap to ensure that simulation results translate accurately to real-world performance. This framework represents a significant step towards making AI-enabled cyber-physical systems safer and more trustworthy.


