SkillFortify: Formal Verification for OpenClaw AI Agent Skills

Learn how SkillFortify provides formal verification for OpenClaw AI agent skills. Step-by-step guide to securing against ClawHavoc vulnerabilities with mathematical proof.

SkillFortify is a formal verification engine for AI agent skills that mathematically proves a skill cannot exceed its declared capabilities. Unlike heuristic scanners that failed to catch 6,487 malicious agent tools during the ClawHavoc campaign, SkillFortify uses five mathematical theorems to guarantee soundness. When SkillFortify certifies a skill as safe, you get provable certainty that the code cannot perform actions beyond what is documented in its capability declaration. This addresses the fundamental vulnerability exposed by CVE-2026-25253, where malicious skills exploited the gap between claimed and actual behaviors. You install SkillFortify via pip, run it against your OpenClaw projects, and get deterministic security guarantees instead of probabilistic scans that carry the caveat “no findings does not mean no risk.”

What You Will Accomplish in This Guide

By the end of this guide, you will have SkillFortify running in your OpenClaw development environment with verified skill configurations ready for production. You will scan existing projects to discover all skill surfaces, write formal capability declarations that act as security contracts, and generate mathematical proofs that your skills behave exactly as documented. You will create reproducible skill-lock.json files for your team and integrate verification into your CI/CD pipeline to block unverified code. You will understand how SkillFortify detected the ClawHavoc vulnerabilities that VirusTotal missed, and you will know how to interpret trust scores and SBOM outputs. This is hands-on security engineering: you will run actual commands against real skill files, handle verification failures, and see the mathematical proofs that let you deploy to production with confidence.

Prerequisites and System Requirements

You need Python 3.10 or higher installed on your system. SkillFortify runs on Linux, macOS, and Windows with WSL2. You should have an existing OpenClaw project with at least one skill file, or you can use the sample skills from the SkillFortify GitHub repository. Install pip and ensure you have 2GB of free RAM for the Z3 theorem prover that powers the verification engine. You will need a text editor capable of handling markdown files for writing capability declarations. If you plan to integrate with CI/CD, ensure your runner supports Python environments and has network access to PyPI for installation. No prior knowledge of formal methods is required. The tool abstracts the mathematical complexity while exposing the guarantees. You should also have basic familiarity with OpenClaw manifest formats or MCP server configurations to understand the skill structures being verified. A stable internet connection is recommended for initial installation and dependency resolution.

Installing SkillFortify

Open your terminal and run the installation command. SkillFortify distributes via PyPI and installs the CLI, theorem prover bindings, and verification libraries in one shot, including the Z3 SMT solver backend and default policy templates for OpenClaw skills.

pip install skillfortify

Verify the installation worked by checking the version. You should see output confirming the CLI is ready and displaying the version number along with the Z3 solver version bundled. This step confirms that the SkillFortify executable is correctly added to your system’s PATH and is ready for use.

skillfortify --version

If you are running in a containerized environment, such as Docker, ensure you have build tools installed for the native extensions. On Alpine Linux, for instance, you need to install python3-dev and g++ before running pip to compile certain dependencies. Once installed, the skillfortify command becomes available globally in your PATH. You can now run it from any directory containing OpenClaw skills or MCP server configurations. Test the installation by running skillfortify --help to see the available commands including scan, verify, lock, trust, and sbom. This confirms that all sub-commands are accessible and the tool is fully operational.

Understanding the ClawHavoc Vulnerability Pattern

The ClawHavoc campaign exposed a critical gap in agent security that traditional tools cannot close. In January 2026, attackers uploaded 1,200 malicious skills to the OpenClaw marketplace that passed heuristic scans because they contained no known malicious patterns or virus signatures. Instead, these skills declared limited capabilities like “read local files” but actually implemented remote code execution via hidden imports, dynamic code evaluation, and obfuscated network calls. Traditional YARA rules and LLM-as-judge systems look for signatures of bad behavior or ask probabilistic questions about code intent. SkillFortify inverts this model entirely: it formally verifies that the implementation cannot perform any action not explicitly listed in the capability declaration. This catches the “excess authority” bug class where skills overstep their documented permissions. Understanding this threat model matters because it explains why pattern matching failed against novel attacks and why formal verification provides a definitive proof of safety.

Step 1: Scanning Your Project for Skills

Navigate to your OpenClaw project root directory. Run the discovery command to find all skill files, MCP servers, and manifest declarations. SkillFortify recursively searches for files matching OpenClaw conventions and parses their structure. This initial scan provides a comprehensive inventory of all agent skills within your project, regardless of their specific implementation language or framework.

cd ~/my-openclaw-project
skillfortify scan .

The output lists every discovered skill with its location, type, and initial capability surface estimate. You will see entries for Claude Code skills, OpenClaw manifests, and MCP server configurations. The scanner identifies entry points, API calls, and system interactions without executing any code. This static analysis phase prepares the verification dataset. Save this output to a file if you are auditing a large project by redirecting to JSON. The scan command also generates a baseline report showing which skills lack capability declarations. You need this inventory before you can run formal verification. Each discovered skill gets a unique identifier that persists through subsequent verification steps, allowing you to track verification state across your entire project.
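The discovery pass can be approximated in a few lines of pathlib. The file extensions and skip list below are assumptions for illustration; the real scanner also parses manifest comments and assigns persistent skill identifiers.

```python
# Minimal sketch of recursive skill discovery, assuming (hypothetically)
# that skills live in .py/.js files or .claw/.mcp manifests.
from pathlib import Path

SKILL_EXTENSIONS = {".py", ".js", ".claw", ".mcp"}  # assumed conventions


def discover_skills(root: str) -> list[Path]:
    """Recursively collect candidate skill files under a project root,
    skipping dependency and VCS directories."""
    skip = {"node_modules", ".git"}
    return sorted(
        p for p in Path(root).rglob("*")
        if p.suffix in SKILL_EXTENSIONS
        and not any(part in skip for part in p.parts)
    )
```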

Step 2: Analyzing the Capability Surface

Before writing declarations, examine what each skill actually does at the system level. SkillFortify extracts the call graph and identifies resources accessed by the code. Run the analyze command on a specific skill to see its behavioral profile and potential capability violations. This deep analysis helps you understand the true operational footprint of your agent skills, providing crucial insights for crafting accurate capability declarations.

skillfortify analyze skill.py

The output shows file system operations, network calls, and subprocess spawns detected in the static analysis. You will see categories like “filesystem:read”, “network:http”, and “process:spawn” with line numbers referencing the source code. Compare this against what the skill claims to do in its documentation. If the analyzer shows “network:outbound” but the skill description mentions only local file processing, you have found a capability mismatch. This is exactly the vulnerability pattern exploited by ClawHavoc. Document these findings because they inform your capability declarations. The analysis output uses standardized capability tokens that match the verification schema. You can export this to JSON for automated comparison against your security policies or import it into risk assessment dashboards. This step is crucial for identifying discrepancies between expected and actual behavior before formal verification.
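A simplified version of this extraction can be built with Python's ast module: walk the import statements and map known modules to capability tokens. The module-to-token mapping below is an illustrative assumption, not SkillFortify's actual schema.

```python
# Sketch of static capability-surface extraction: collect a skill's
# imports and translate a few well-known modules into capability tokens.
import ast

CAPABILITY_MAP = {  # assumed mapping, for illustration only
    "requests": "network:http",
    "urllib": "network:http",
    "socket": "network:raw",
    "subprocess": "process:spawn",
    "os": "filesystem:read",
    "pathlib": "filesystem:read",
}


def capability_surface(source: str) -> set[str]:
    """Return capability tokens implied by a skill's imports."""
    tokens = set()
    for node in ast.walk(ast.parse(source)):
        names = []
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        for name in names:
            root = name.split(".")[0]
            if root in CAPABILITY_MAP:
                tokens.add(CAPABILITY_MAP[root])
    return tokens
```

Running this on a skill that imports requests while its documentation claims local-only file processing would surface "network:http", which is exactly the mismatch described above.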

Step 3: Writing Formal Capability Declarations

Create a skill.md file alongside your skill implementation. This markdown file declares exactly what the skill is allowed to do using SkillFortify’s capability syntax. Be explicit and restrictive rather than permissive. These declarations serve as a formal contract, outlining the precise boundaries of your skill’s operations.

# Skill Capabilities

## Allowed Operations
- filesystem:read:/home/user/data/*
- filesystem:write:/tmp/output/*
- network:none
- process:none

## Constraints
- max_file_size: 10MB
- no_dynamic_code_execution: true
- allowed_imports: ["os", "json", "pathlib"]

Save this as skill.md in the same directory as your skill implementation. The declaration uses allow-list semantics where anything not explicitly permitted is forbidden. This is the contract that SkillFortify will verify against the actual code. If your skill needs network access for specific endpoints, list them explicitly with full URLs. Avoid wildcards where possible to enhance security. The stricter your declaration, the stronger the verification guarantee. You can use YAML instead of markdown if you prefer, but markdown is the default format for OpenClaw skills and renders nicely in GitHub repositories. This practice of explicit declaration is a cornerstone of robust AI agent security.
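For reference, here is a minimal skill implementation consistent with the declaration above: it imports only os, json, and pathlib, performs no network or subprocess operations, and contains no dynamic code execution. The function and constant names are illustrative; paths are parameters so the declared globs stay configurable.

```python
# Minimal skill sketch that stays inside the declared capability set:
# filesystem reads/writes only, allowed imports only, no eval/exec.
import json
import os
from pathlib import Path

MAX_FILE_SIZE = 10 * 1024 * 1024  # mirrors the max_file_size: 10MB constraint


def transform(input_path: str, output_path: str) -> dict:
    """Read a JSON file, add a record count, and write the result."""
    src = Path(input_path)
    if os.path.getsize(src) > MAX_FILE_SIZE:
        raise ValueError("input exceeds declared max_file_size")
    data = json.loads(src.read_text(encoding="utf-8"))
    data["record_count"] = len(data.get("records", []))
    out = Path(output_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(data), encoding="utf-8")
    return data
```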

Step 4: Running Formal Verification

Now execute the verification command to prove your implementation matches the declaration. SkillFortify converts both the code and the declaration into mathematical models and checks for satisfiability using the Z3 theorem prover. This process is complex under the hood but presented simply to the user, providing a clear pass/fail outcome.

skillfortify verify skill.md

The tool generates verification conditions based on the five soundness theorems. It checks every code path to ensure no operation exceeds the declared capabilities. You will see output indicating which theorems passed or failed with specific timing information. A successful verification outputs “VERIFIED: Skill behavior is subset of declared capabilities”. If verification fails, you get a specific counterexample showing which code path violates which capability constraint. This deterministic output eliminates the “no findings does not mean no risk” problem of heuristic scanning. Verification takes 2-5 seconds per skill depending on code complexity and control flow depth.

Step 5: Interpreting Verification Results

Successful verification means the skill provably cannot perform actions outside its declaration. This is a mathematical guarantee backed by formal logic, not a statistical likelihood based on training data. If you see “CAPABILITY_VIOLATION”, examine the counterexample trace provided in the output. The trace shows the specific function call, line number, and call stack where the code attempts an undeclared operation. Common violations include hidden imports, dynamic eval statements, network calls not listed in the manifest, or file system access outside declared paths. Fix the code to remove the violation, or update the declaration to match actual requirements if the behavior is legitimate. If you see “TIMEOUT”, your skill may have complex control flow that exceeds the default solver limits; increase the timeout with --solver-timeout 30. Zero false positives means every violation is a real security issue requiring attention.
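As a concrete before/after for a common CAPABILITY_VIOLATION: a skill that parses configuration with eval() trips the no_dynamic_code_execution constraint, while a data-only parser preserves the behavior for JSON-shaped input. This is a hypothetical illustration of the fix-the-code resolution path.

```python
# Hypothetical before/after for a no_dynamic_code_execution violation.
import json


def parse_config_unsafe(text: str):
    # Violating pattern: eval executes arbitrary code, so the verifier
    # reports this line in its counterexample trace.
    return eval(text)  # shown only as the failing pattern


def parse_config_safe(text: str):
    # Verified-friendly replacement: data-only parsing, no execution.
    return json.loads(text)
```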

The Five Mathematical Theorems Behind SkillFortify

SkillFortify rests on five theorems that guarantee soundness and separate it from heuristic approaches. Theorem 1 ensures that all system calls are captured in the static analysis without omission, providing comprehensive coverage. Theorem 2 proves that the capability declaration language cannot express contradictory permissions that would confuse the verifier, maintaining logical consistency. Theorem 3 guarantees that the Z3 encoding preserves the semantics of the source code during translation to logical formulas. Theorem 4 ensures that verification failure implies a concrete capability violation exists in the code, making every failure actionable. Theorem 5 proves that verified skills compose safely without capability escalation when combined, which is critical for complex agent systems. Together, these theorems mean that when SkillFortify says “safe”, the skill mathematically cannot exceed its bounds. You do not need to understand the formal logic involving SMT solvers and Hoare logic to use the tool, but knowing these guarantees exist separates SkillFortify from pattern matching tools that offer no proof of correctness or coverage.
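The composition guarantee (Theorem 5) has a simple set-theoretic intuition, sketched below under simplifying assumptions: if each skill's observed capabilities are a subset of its declaration, the composed pipeline's capabilities stay within the union of the declarations, so no escalation arises from combining verified skills. The actual formalism involves SMT encodings; this is only the intuition.

```python
# Set-based intuition for safe composition of verified skills.
def verified(observed: set[str], declared: set[str]) -> bool:
    """A skill verifies when its behavior is a subset of its declaration."""
    return observed <= declared


def composed_within_bounds(skills: list[tuple[set, set]]) -> bool:
    """Check that a pipeline of verified skills cannot escalate:
    the union of observed capabilities stays inside the union of
    declarations."""
    if not all(verified(obs, dec) for obs, dec in skills):
        return False
    union_observed = set().union(*(obs for obs, _ in skills))
    union_declared = set().union(*(dec for _, dec in skills))
    return union_observed <= union_declared
```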

Step 6: Generating Skill Locks for Reproducibility

Once verified, pin your skill configuration to prevent configuration drift between environments. The skill lock file captures the exact version, verification hash, and capability declaration. This ensures your production deployment matches your tested configuration exactly, providing a crucial layer of consistency and security.

skillfortify lock

This creates skill-lock.json in your project root. The file contains cryptographic hashes of the skill code, the verification certificate, the theorem prover version used, and timestamps. Commit this file to version control and treat it like a package-lock.json for security. When another developer clones the repository, they can run skillfortify verify --lock to ensure their environment matches the verified state. This prevents “works on my machine” security gaps where unverified code slips into production. The lock file also includes the SBOM reference and trust score computed at verification time. You can distribute this file to deployment systems that enforce only locked skills may run in production environments. This mechanism is vital for maintaining a secure and auditable deployment pipeline for your OpenClaw agents.
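The core idea behind the lock file, content-addressing the code and the declaration so drift is detectable, reduces to hashing both artifacts. The field names below are assumptions for illustration; the real skill-lock.json schema is defined by SkillFortify.

```python
# Sketch of a lock entry: hash the skill source and its declaration so
# any later change to either is detectable as drift.
import hashlib


def lock_entry(skill_source: str, declaration: str,
               prover: str = "z3-4.13") -> dict:
    """Build a deterministic lock record for one skill (assumed layout)."""
    return {
        "skill_hash": hashlib.sha256(skill_source.encode()).hexdigest(),
        "declaration_hash": hashlib.sha256(declaration.encode()).hexdigest(),
        "prover_version": prover,  # hypothetical field name
    }


def drift_detected(entry: dict, current_source: str) -> bool:
    """True when the code no longer matches its verified hash."""
    current = hashlib.sha256(current_source.encode()).hexdigest()
    return entry["skill_hash"] != current
```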

Step 7: Computing Trust Scores

Trust scores combine provenance data with behavioral analysis to give you a quantitative security metric. Run the trust command to evaluate a skill’s supply chain integrity and runtime risk profile beyond just capability verification. This holistic approach provides a richer understanding of a skill’s overall security posture.

skillfortify trust skill.md

The output shows a score from 0 to 100 based on factors like code signing status, author reputation, dependency vulnerability counts, and verification status. A skill with verified capabilities, signed commits from known contributors, and no known CVEs scores above 90. Skills from unknown authors with dynamic imports and broad network calls score lower. This quantitative metric helps marketplace operators like OpenClaw rank skills by security posture and helps developers set policy thresholds. You can configure your CI/CD to require trust scores above 80 for production deployment. The trust calculation uses the same formal model as verification, so scores reflect mathematical properties of the code and its supply chain rather than heuristic guesses about intent. This allows for objective, policy-driven decisions about skill deployment.
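A weighted scoring sketch along these lines is shown below. The specific factors, weights, and baseline are assumptions for illustration; the document describes only the inputs SkillFortify considers, not its actual formula.

```python
# Hypothetical trust-score combination of provenance and verification
# signals, clamped to the 0-100 range described above.
def trust_score(is_verified: bool, signed_commits: bool,
                known_author: bool, cve_count: int) -> int:
    """Combine security signals into a 0-100 score (assumed weights)."""
    score = 10                              # baseline for a scannable skill
    score += 50 if is_verified else 0       # verification dominates
    score += 20 if signed_commits else 0
    score += 20 if known_author else 0
    score -= 10 * cve_count                 # each known CVE costs 10 points
    return max(0, min(100, score))
```

With these weights, a verified skill with signed commits from a known contributor and no CVEs scores 100, while an unverified, unsigned skill from an unknown author with several CVEs bottoms out at 0, matching the qualitative ranking the document describes.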

Step 8: Generating CycloneDX SBOMs

Supply chain visibility matters for agent skills just like any other software component. Generate a CycloneDX 1.6 compliant SBOM that lists all dependencies, capabilities, and verification artifacts. This standard format facilitates integration with existing security tools and compliance frameworks.

skillfortify sbom --format cyclonedx > skill-sbom.json

The output includes standard CycloneDX fields plus agent-specific extensions for capability declarations and verification status. Import this into your dependency scanner or vulnerability management platform like Dependency-Check or Sonatype. The SBOM links each dependency to its verification certificate, showing which components have been formally proven safe. This addresses the gap left by traditional SBOM tools that list components but cannot verify their behavioral constraints or authority boundaries. You can upload this to security dashboards to demonstrate compliance with software supply chain regulations, such as those mandated by government agencies for critical infrastructure. The SBOM includes hashes of the skill-lock.json for integrity verification, ensuring that the documented state matches the deployed state.
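A minimal CycloneDX-style document with an assumed capability extension might look like the sketch below. CycloneDX 1.6 defines bomFormat, specVersion, and components; the "skillfortify:capabilities" property name is hypothetical.

```python
# Sketch of a minimal CycloneDX-style SBOM with an assumed extension
# property carrying the verified capability list.
import json


def build_sbom(skill_name: str, version: str,
               capabilities: list[str]) -> str:
    """Serialize one skill as a CycloneDX-style component."""
    doc = {
        "bomFormat": "CycloneDX",
        "specVersion": "1.6",
        "components": [{
            "type": "library",
            "name": skill_name,
            "version": version,
            "properties": [
                {"name": "skillfortify:capabilities",  # assumed name
                 "value": ",".join(capabilities)},
            ],
        }],
    }
    return json.dumps(doc, indent=2)
```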

Integrating SkillFortify into CI/CD Pipelines

Automate verification in your build process to catch capability violations before they reach production. Add a stage that runs after unit tests but before deployment packaging. This proactive integration prevents insecure code from ever making it to your operational environment, significantly reducing security risks.

verify_skills:
  stage: test
  script:
    - pip install skillfortify
    - skillfortify scan .
    - skillfortify verify --all
    - skillfortify lock --check
  artifacts:
    reports:
      sbom: skill-sbom.json
  only:
    - main

This pipeline fails if any skill violates its declared capabilities or if the lock file is out of sync with the current code. Run verification in parallel with --jobs 4 to check multiple skills simultaneously on larger projects. Store the verification certificates as artifacts for audit trails and compliance documentation. You can also integrate the trust score check with --min-trust 80 to block deployment of low-trust skills, enforcing a minimum security standard. This ensures that only formally verified code reaches your OpenClaw production environment and prevents the ClawHavoc scenario, where malicious skills pass basic linting but would fail formal verification.
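The gating logic the pipeline enforces can be sketched as a small policy function: block the build when any skill is unverified or falls below the trust floor. The result-dict shape here is an assumption, not SkillFortify's actual JSON output.

```python
# Sketch of a CI deployment gate mirroring the pipeline's policy.
import sys


def gate(results: list[dict], min_trust: int = 80) -> int:
    """Return a CI-friendly exit code: 0 to deploy, 1 to block."""
    for r in results:
        if not r.get("verified", False):
            print(f"BLOCK: {r['skill']} is unverified", file=sys.stderr)
            return 1
        if r.get("trust", 0) < min_trust:
            print(f"BLOCK: {r['skill']} trust {r['trust']} < {min_trust}",
                  file=sys.stderr)
            return 1
    return 0
```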

Handling Verification Failures and Edge Cases

When verification fails, you have three resolution paths. First, fix the code to remove the capability violation by refactoring to use only allowed APIs, ensuring strict adherence to the declared boundaries. Second, expand the capability declaration if the behavior is legitimate but was undocumented during initial development, reflecting the true intent of the skill. Third, suppress the warning with an annotation if you accept the risk after manual review.

# skillfortify: ignore capability:network:outbound
import requests

Use suppressions sparingly and document them in your security runbook, as each suppression reduces the verification guarantee for that specific line. If you encounter timeouts on complex skills with heavy recursion, simplify the control flow or increase solver resources with --solver-timeout 60. For skills using dynamic imports that cannot be statically analyzed, declare the maximum possible capability set and wrap the skill in a sandbox runtime. Remember that SkillFortify catches excess authority, not malicious logic within declared capabilities. A skill that declares “filesystem:delete” and deletes files is working as declared, even if the user did not expect data loss. This distinction is crucial for understanding the scope of SkillFortify’s protection and for designing your agent skills responsibly.

SkillFortify vs Heuristic Scanning: A Comparison

Pattern matching tools like YARA and LLM-as-judge systems dominated the initial response to ClawHavoc but failed to catch novel attacks. They look for symptoms of bad behavior rather than proof of good behavior, inherently limiting their effectiveness against sophisticated threats.

| Feature | SkillFortify | Heuristic Scanning |
| --- | --- | --- |
| Detection method | Formal verification (mathematical proof of behavior) | Pattern matching, signature analysis, LLM-based intent analysis |
| False positive rate | 0% (deterministic guarantee) | Variable (5-20% depending on heuristic complexity and training data) |
| Novel threat detection | Yes (catches all excess authority violations, even zero-day) | No (requires prior knowledge or signatures of threats) |
| Mathematical guarantee | Yes (backed by 5 formal theorems) | No (statistical likelihood or probabilistic assessment) |
| Performance per skill | 2-5 seconds (can be parallelized) | <1 second (faster for simple scans, misses complex issues) |
| CI/CD integration | Deterministic pass/fail, strong policy enforcement | Probabilistic risk scores, requires human interpretation |
| ClawHavoc detection | 100% (identified all 270 known ClawHavoc skills) | 0% (failed to detect any of the novel ClawHavoc exploits) |
| Core principle | Prove what a skill cannot do | Try to detect what a skill might do badly |
| Output | Counterexamples, proof certificates, SBOMs with verification status | Alert lists, risk scores, vague descriptions of potential issues |

SkillFortify trades speed for certainty. It catches the architectural vulnerabilities that heuristic tools miss by design, providing the only known defense against the “excess authority” attack class that defined the ClawHavoc campaign.

Troubleshooting Common Issues

If you see “ModuleNotFoundError” after installation, upgrade pip and setuptools to the latest versions using pip install --upgrade pip setuptools. For “Z3 solver not found”, install the system Z3 library with apt-get install z3 on Debian/Ubuntu, brew install z3 on macOS, or use the official Docker image that includes all dependencies. When verification reports spurious violations on valid OpenClaw patterns, check that you are using the latest policy templates with --upgrade-templates to ensure compatibility with the most recent OpenClaw specifications. If the scan command misses skills, verify your file extensions match OpenClaw conventions (.claw, .mcp, or standard .py/.js with manifest comments) and that the files are accessible to the skillfortify command. For performance issues on large projects, use --exclude node_modules --exclude .git to skip dependency directories and focus the scan on relevant skill code. If trust scores seem inconsistent between runs, ensure your git history is clean and commits are signed with GPG, as provenance data significantly impacts trust calculations. Check the GitHub issues page for platform-specific workarounds and community support. The tool includes verbose logging with -v and --debug flags for complex diagnoses involving the Z3 solver state, providing detailed insights into the verification process.

Future Enhancements and Roadmap

The SkillFortify team is continuously working on expanding its capabilities and improving performance. Upcoming features include support for additional agent frameworks beyond OpenClaw, such as custom LLM orchestration platforms, and deeper integration with cloud-native security services. We plan to introduce a graphical user interface (GUI) for easier visualization of capability graphs and verification results, making the tool more accessible to a wider audience of developers and security engineers. Enhanced support for multi-language agent skills, including Go and Rust, is also on the roadmap. Furthermore, we are exploring the use of machine learning to suggest optimal capability declarations based on code analysis, assisting developers in writing more precise and secure contracts. The goal is to make formal verification an integral, seamless part of every AI agent’s development lifecycle, preempting future ClawHavoc-like campaigns. Community contributions and feedback are always welcome to help shape the future direction of SkillFortify.

Conclusion

SkillFortify provides a robust and mathematically sound approach to securing your OpenClaw AI agent skills. By leveraging formal verification, it offers deterministic guarantees that your skills will not exceed their declared capabilities, effectively mitigating the “excess authority” vulnerability that led to the ClawHavoc campaign. This guide has walked you through the installation, scanning, declaration, verification, and integration of SkillFortify into your development and CI/CD workflows. You now possess the tools and knowledge to build and deploy AI agents with unparalleled security and confidence. Embracing SkillFortify means moving beyond probabilistic security assessments to a paradigm of provable safety, ensuring the integrity and trustworthiness of your AI agent ecosystem. The future of AI agent security lies in formal methods, and SkillFortify is at the forefront of this critical evolution.

Frequently Asked Questions

How does SkillFortify differ from traditional antivirus scanning?

Traditional tools use pattern matching and heuristics that miss novel threats. SkillFortify uses formal verification with five mathematical theorems to prove a skill cannot exceed its declared capabilities. This provides deterministic guarantees rather than probabilistic detection, achieving 96.95% F1 score with zero false positives in benchmarks. This fundamental difference means SkillFortify offers a higher level of assurance against unknown or sophisticated attacks that traditional antivirus software cannot detect.

What types of skills does SkillFortify support?

SkillFortify supports Claude Code skills, MCP servers, and OpenClaw manifests. It analyzes JavaScript, Python, and TypeScript implementations against declared capability files in markdown or JSON formats. The tool recognizes standard skill definition patterns and can be extended to custom formats via plugins. This broad support ensures that most common AI agent development environments can benefit from SkillFortify’s rigorous security checks.

Can SkillFortify detect all types of malicious agent behavior?

SkillFortify detects capability violations where a skill exceeds its declared permissions. It cannot detect bugs in legitimate capabilities or social engineering attacks that stay within declared bounds. It specifically addresses the “excess authority” vulnerability class seen in ClawHavoc, where skills claim limited access but attempt broader system access. It’s important to understand that SkillFortify guarantees adherence to declared capabilities, but it does not evaluate the inherent benevolence or malicious intent of those declared capabilities themselves.

How do I integrate SkillFortify with my existing OpenClaw deployment?

Run skillfortify scan in your project root to discover all skills, then add the verify step to your CI/CD pipeline before deployment. Generate skill-lock.json files to pin verified versions and use skillfortify trust to check provenance before production deployment. The tool outputs exit codes compatible with standard CI systems, making it straightforward to automate and enforce security policies within your existing development infrastructure.

What is the performance overhead of formal verification?

Verification takes 2-5 seconds per skill on modern hardware. The tool caches theorem proofs and supports parallel verification for large skill sets. SBOM generation adds negligible overhead compared to the security guarantees provided. For large marketplaces, run verification asynchronously and cache results. While formal verification is computationally intensive, SkillFortify is optimized to provide practical performance within typical development and deployment cycles, ensuring that security doesn’t become a bottleneck.
