🧮 Not all AI vulnerabilities are equally dangerous — but most organizations treat them as if they are. The OWASP AI Vulnerability Severity Scoring System (AIVSS) gives security teams a structured, repeatable method to score AI-specific vulnerabilities by real-world impact — so you can prioritize the risks that matter most and stop treating every finding as equally urgent. This 2026 guide explains every dimension of AIVSS with real scoring examples and a copy-paste template.
Last Updated: May 2, 2026
Every organization that deploys AI systems discovers vulnerabilities. Some of those vulnerabilities are genuinely catastrophic — a prompt injection flaw that allows an attacker to exfiltrate customer data, a poisoned training dataset that causes an AI to systematically discriminate against a protected class, a misconfigured agentic system that can be manipulated into taking irreversible real-world actions. Others are theoretical edge cases that would require an implausible chain of circumstances to produce any meaningful harm.
The problem is that most AI security programs treat all of these vulnerabilities the same way — they list them, flag them, and add them to a remediation backlog without any structured method for determining which ones to fix first. The result is security teams spending resources on low-impact theoretical risks while genuinely dangerous vulnerabilities remain unaddressed because they appeared on the same list.
The OWASP AI Vulnerability Severity Scoring System (AIVSS) is designed to solve exactly this problem. Developed by the Open Worldwide Application Security Project (OWASP) as part of its AI security framework, AIVSS provides a structured, multi-dimensional scoring methodology built for the characteristics of AI vulnerabilities. It goes beyond the traditional CVSS approach, which was built for conventional software vulnerabilities and does not adequately capture the probabilistic, context-dependent nature of AI security risks.
This guide explains every dimension of the AIVSS scoring methodology, walks through real scoring examples for common AI vulnerability types, and provides a practical copy-paste scoring template that security teams can use immediately to bring structured prioritization to their AI vulnerability management programs.
1. 🎯 Why Standard Vulnerability Scoring Fails for AI
Before examining AIVSS, it is essential to understand why the existing industry-standard vulnerability scoring approach — the Common Vulnerability Scoring System (CVSS) — is insufficient for AI vulnerabilities. CVSS was designed for conventional software vulnerabilities: buffer overflows, SQL injections, authentication bypasses — deterministic flaws where a specific input reliably produces a specific malicious output.
AI vulnerabilities are fundamentally different in three ways that CVSS does not adequately capture:
- Probabilistic Nature: An AI vulnerability may allow an attack to succeed 40% of the time rather than 100% of the time. CVSS has no dedicated metric for attack success rate, so an exploit that lands on every attempt and one that lands only occasionally can receive the same score.
- Context Dependency: The severity of an AI vulnerability depends heavily on how the AI is deployed and what it is authorized to do. A prompt injection vulnerability in a read-only AI assistant is very different from the same vulnerability in an agentic AI with access to email, databases, and financial systems — but CVSS treats both identically.
- Novel Attack Vectors: AI-specific attack types (data poisoning, model extraction, membership inference, adversarial examples) do not map cleanly to the CVSS attack vector taxonomy, which was built around network, adjacent, local, and physical access concepts that do not translate meaningfully to AI threat models.
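To make the probabilistic point concrete, one option is to measure how often an attack actually lands across repeated red-team trials. The sketch below is illustrative and not part of AIVSS itself: the function name `wilson_interval` and the trial counts are assumptions, and it uses the standard Wilson score interval to put a 95% range around an observed success rate.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for an observed attack success rate."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half


# Hypothetical red-team result: 40 successful injections in 100 attempts.
low, high = wilson_interval(40, 100)
print(f"observed 40%, plausible range {low:.0%} to {high:.0%}")
```

A range like this gives the scorer something firmer than a single anecdotal success when judging how reliably a finding can be exploited.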
The Core Gap: CVSS scores tell you how severe a conventional software vulnerability is. AIVSS scores tell you how severe an AI vulnerability is — accounting for the probability of exploitation, the AI system’s specific deployment context, the autonomy level of the system, and the real-world impact of successful exploitation across technical, business, and societal dimensions.
| Dimension | CVSS (Traditional) | AIVSS (AI-Specific) |
|---|---|---|
| Attack Model | Deterministic — same input, same output | Probabilistic — accounts for attack success rate variability |
| Context Sensitivity | Limited — deployment context not captured | High — deployment context and AI autonomy level are core scoring dimensions |
| AI Attack Vectors | Not covered — network/local taxonomy only | Native support for poisoning, extraction, injection, and adversarial attacks |
| Impact Dimensions | Confidentiality, Integrity, Availability | Technical + Business + Societal impact dimensions |
| Designed For | Software and network vulnerabilities | AI and machine learning system vulnerabilities |
2. 🏗️ The AIVSS Scoring Architecture
AIVSS scores AI vulnerabilities across three primary scoring groups — each capturing a different dimension of risk that is essential for accurate severity assessment. The final AIVSS score is a weighted composite of scores across all three groups, producing a numerical result on a 0–10 scale that maps to five severity levels.
Scoring Group 1: Exploitability
The Exploitability group captures how difficult it is for an attacker to successfully exploit the vulnerability. AIVSS uses four exploitability dimensions specifically calibrated for AI attack types:
| Dimension | What It Measures | Score Range | AI Example |
|---|---|---|---|
| Attack Vector | How the attacker accesses the AI system to execute the attack | Physical (0.2) to Network (0.85) | A public-facing chatbot scores Network (0.85) |
| Attack Complexity | Technical skill and resources required to execute the attack | High (0.44) to Low (0.77) | Simple jailbreak = Low (0.77) |
| Attack Requirements | Prerequisites the attacker must satisfy before the attack is possible | Requires training access (0.1) to None (0.85) | No account needed = None (0.85) |
| Privileges Required | What level of privilege the attacker needs in the target system | High (0.27) to None (0.85) | Anonymous user access = None (0.85) |
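The exploitability table can be expressed as a simple lookup, which keeps scorers from typing values by hand. This is a minimal sketch: the numbers are copied from this guide's tables and template, the dictionary and function names are hypothetical, and the unweighted mean used for the group subscore is an assumption because the guide does not state how dimensions are combined.

```python
from statistics import mean

# Values as listed in this guide; not checked against any external specification.
EXPLOITABILITY = {
    "attack_vector": {"network": 0.85, "adjacent": 0.62, "local": 0.55, "physical": 0.2},
    "attack_complexity": {"low": 0.77, "medium": 0.62, "high": 0.44},
    "attack_requirements": {"none": 0.85, "low": 0.62, "high": 0.44, "training_access": 0.1},
    "privileges_required": {"none": 0.85, "low": 0.62, "high": 0.27},
}


def exploitability_values(choices: dict[str, str]) -> dict[str, float]:
    """Translate label choices into numeric values, rejecting unknown entries."""
    values = {}
    for dimension, label in choices.items():
        try:
            values[dimension] = EXPLOITABILITY[dimension][label]
        except KeyError:
            raise ValueError(f"unknown entry {dimension!r}={label!r}") from None
    return values


# A public-facing chatbot with a simple jailbreak and no account required.
chatbot = exploitability_values({
    "attack_vector": "network",
    "attack_complexity": "low",
    "attack_requirements": "none",
    "privileges_required": "none",
})

# Assumption: unweighted mean as the group subscore.
subscore = mean(chatbot.values())
print(round(subscore, 2))
```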
Scoring Group 2: AI-Specific Context
This is the scoring group that most differentiates AIVSS from CVSS — capturing the AI deployment context dimensions that fundamentally affect the real-world severity of an AI vulnerability.
| Dimension | What It Measures | Score Range | AI Example |
|---|---|---|---|
| AI Autonomy Level | How autonomously the AI system operates — the higher the autonomy, the greater the potential impact of exploitation | Supervised (0.2) to Fully Autonomous (1.0) | Agentic AI with tool access = High Autonomy (0.85) |
| Data Sensitivity | The sensitivity classification of data the AI accesses or processes | Public (0.1) to Highly Sensitive (0.9) | Medical records access = Highly Sensitive (0.9) |
| Human Oversight | The degree of human review and approval applied to AI outputs before they take effect | Full oversight (0.1) to No oversight (0.9) | Auto-execute without review = No oversight (0.9) |
| Deployment Scale | The scale at which the AI system is deployed — affecting how many people are exposed to successful exploitation | Single user (0.1) to Millions of users (0.9) | Enterprise-wide deployment = Large scale (0.8) |
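To see how much of the score is driven by deployment context rather than by the flaw itself, the four context dimensions can be captured per system and compared. The sketch below assumes the scale values from this guide's template; `DeploymentContext`, `CONTEXT_SCALES`, and the two example systems are hypothetical names introduced for illustration.

```python
from dataclasses import asdict, dataclass

# Scale values as listed in this guide's scoring template.
CONTEXT_SCALES = {
    "autonomy": {"supervised": 0.2, "low": 0.4, "medium": 0.6, "high": 0.85, "fully_autonomous": 1.0},
    "data_sensitivity": {"public": 0.1, "internal": 0.3, "sensitive": 0.6, "highly_sensitive": 0.9},
    "human_oversight": {"full": 0.1, "partial": 0.4, "minimal": 0.7, "none": 0.9},
    "deployment_scale": {"single_user": 0.1, "team": 0.4, "large_enterprise": 0.8, "millions": 0.9},
}


@dataclass(frozen=True)
class DeploymentContext:
    autonomy: str
    data_sensitivity: str
    human_oversight: str
    deployment_scale: str

    def values(self) -> dict[str, float]:
        """Numeric value for each context dimension."""
        return {dim: CONTEXT_SCALES[dim][label] for dim, label in asdict(self).items()}


# Two hypothetical systems that could share the same underlying flaw.
agent = DeploymentContext("high", "highly_sensitive", "none", "millions")
faq_bot = DeploymentContext("supervised", "internal", "full", "team")

# Per-dimension gap between the two deployments.
gap = {dim: round(agent.values()[dim] - faq_bot.values()[dim], 2) for dim in CONTEXT_SCALES}
print(gap)
```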
Scoring Group 3: Impact
The Impact group captures the consequences of successful exploitation across three dimensions — technical, business, and societal — providing a more complete picture of real-world harm than the traditional CIA triad (Confidentiality, Integrity, Availability) used by CVSS.
| Impact Dimension | What It Captures | Examples of High Impact |
|---|---|---|
| Technical Impact | Damage to data confidentiality, system integrity, and service availability | Complete data exfiltration, model weight theft, system unavailability |
| Business Impact | Financial damage, regulatory penalties, reputational harm, and operational disruption | GDPR fines, customer churn from trust breach, AI system shutdown costs |
| Societal Impact | Harm to individuals, groups, and society beyond the immediate organization — including bias, discrimination, and safety risks | Discriminatory AI decisions at scale, safety-critical system failures, public trust in AI erosion |
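Impact values are chosen within labelled bands in this guide's scoring template (Critical 0.9 to 1.0, High 0.7 to 0.89, Medium 0.4 to 0.69, Low 0.1 to 0.39). A small helper can check that a label and a number agree. This is a sketch: the function name `impact_band` is hypothetical, and treating anything below 0.1 as "None" is an assumption, since the template lists "None (0.0)" only for Societal Impact.

```python
def impact_band(value: float) -> str:
    """Map a 0-1 impact value to the band labels used in the scoring template."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("impact value must be between 0.0 and 1.0")
    if value >= 0.9:
        return "Critical"
    if value >= 0.7:
        return "High"
    if value >= 0.4:
        return "Medium"
    if value >= 0.1:
        return "Low"
    return "None"  # assumption: values below 0.1 are treated as no impact


# Illustrative values for the three impact dimensions of one finding.
for dimension, value in {"technical": 0.75, "business": 0.95, "societal": 0.2}.items():
    print(dimension, value, impact_band(value))
```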
3. 📏 The AIVSS Severity Scale
The composite AIVSS score — calculated by applying defined weights to the scores across all three groups — produces a final numerical score on a 0–10 scale that maps to five severity levels. These levels drive remediation prioritization decisions in the same way that CVSS severity levels drive traditional vulnerability management.
| Score Range | Severity | Recommended Response | Example AI Vulnerability |
|---|---|---|---|
| 9.0 – 10.0 | Critical | Immediate remediation or system suspension — escalate to CISO within 24 hours | Prompt injection enabling data exfiltration from a public-facing agentic system with no human oversight |
| 7.0 – 8.9 | High | Remediation within 7 days — compensating controls implemented immediately | Training data poisoning in a high-stakes decision support system with limited human review |
| 4.0 – 6.9 | Medium | Remediation within 30 days — include in next sprint planning cycle | System prompt leakage in an internal chatbot with low data sensitivity and strong human oversight |
| 1.0 – 3.9 | Low | Remediation within 90 days — document and schedule for standard maintenance cycle | Theoretical model extraction attack requiring physical access to air-gapped deployment |
| 0.0 – 0.9 | Informational | Document for awareness — no remediation action required at this time | Theoretical hallucination risk in a fully human-reviewed, low-stakes internal tool |
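The severity table translates directly into a lookup that ticketing or reporting tools can call. The sketch below takes its thresholds and response text from the table above; the names `SEVERITY_LEVELS` and `classify` are hypothetical, and it assumes scores are rounded to one decimal place, so a value such as 8.95 that falls between the listed ranges lands in the lower band.

```python
# (lower bound, severity, recommended response), highest band first.
SEVERITY_LEVELS = [
    (9.0, "Critical", "Immediate remediation or system suspension"),
    (7.0, "High", "Remediation within 7 days"),
    (4.0, "Medium", "Remediation within 30 days"),
    (1.0, "Low", "Remediation within 90 days"),
    (0.0, "Informational", "Document for awareness"),
]


def classify(score: float) -> tuple[str, str]:
    """Return (severity, recommended response) for a 0-10 composite score."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("score must be between 0.0 and 10.0")
    for lower_bound, severity, response in SEVERITY_LEVELS:
        if score >= lower_bound:
            return severity, response
    raise AssertionError("unreachable: 0.0 is the lowest bound")


print(classify(7.8))
```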
4. 🔬 Real-World AIVSS Scoring Examples
The best way to understand AIVSS is to see it applied to concrete scenarios. The first two examples score the same vulnerability type, prompt injection, in two different deployments, showing why context-aware scoring is essential for accurate AI risk prioritization. The third scores a different vulnerability type, training data poisoning, in a healthcare setting. Each table lists the dimensions scored for that scenario.
Example 1: Prompt Injection in a Public Agentic System — Critical Severity
Scenario: A publicly accessible AI customer service agent has unrestricted access to the customer database, can send emails on behalf of customers, and has no human oversight gate before executing actions. A prompt injection vulnerability has been identified that allows an attacker to override the system prompt through indirect injection via a malicious document the agent reads.
| AIVSS Dimension | Assessment | Score |
|---|---|---|
| Attack Vector | Network — publicly accessible API | 0.85 |
| Attack Complexity | Low — indirect injection via document is well-documented | 0.77 |
| Attack Requirements | None — anonymous attacker can submit malicious documents | 0.85 |
| Privileges Required | None — no account required | 0.85 |
| AI Autonomy Level | High — agent executes actions without human approval | 0.85 |
| Data Sensitivity | Highly sensitive — full customer database access | 0.9 |
| Human Oversight | None — fully automated execution | 0.9 |
| Technical Impact | Critical: complete data access and email capability | 0.9 |
| Business Impact | Critical — regulatory breach, mass data exfiltration, reputational damage | 0.95 |
| Societal Impact | High — personal data of many individuals exposed | 0.8 |
| COMPOSITE AIVSS SCORE | CRITICAL | 9.4 |
Example 2: The Same Prompt Injection in an Internal Read-Only Tool — Medium Severity
Scenario: The same prompt injection vulnerability exists in an internal knowledge base assistant used by 50 employees. The AI can only read from a non-sensitive internal FAQ database. All outputs are reviewed by a human before any action is taken. The system is accessible only from inside the corporate network.
| AIVSS Dimension | Assessment | Score |
|---|---|---|
| Attack Vector | Adjacent — internal network only | 0.62 |
| Attack Complexity | Low — same injection technique | 0.77 |
| Attack Requirements | Requires corporate network access | 0.44 |
| AI Autonomy Level | Supervised: read-only, human reviews all outputs | 0.2 |
| Data Sensitivity | Low — internal FAQ only, no PII | 0.2 |
| Human Oversight | Full — human reviews every output | 0.1 |
| Technical Impact | Low — read-only access to non-sensitive data | 0.2 |
| Business Impact | Low — minimal regulatory or financial risk | 0.2 |
| COMPOSITE AIVSS SCORE | MEDIUM | 4.2 |
The same vulnerability type — prompt injection — scored 9.4 (Critical) in Example 1 and 4.2 (Medium) in Example 2. This is precisely the insight that AIVSS is designed to generate: context determines severity, and treating these two findings as equivalent would be a significant misallocation of remediation resources.
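The comparison can be reproduced in code, with one important caveat: this guide does not state the weights or the formula behind its composite scores. The sketch below therefore assumes equal weights across the three groups and an unweighted mean within each group. Under that assumption the ordering of the two examples is preserved, but the absolute numbers differ from the 9.4 and 4.2 shown in the tables. The names `composite` and `ASSUMED_WEIGHTS` are hypothetical.

```python
from statistics import mean

# Dimension scores as listed in the Example 1 and Example 2 tables.
EXAMPLE_1 = {
    "exploitability": [0.85, 0.77, 0.85, 0.85],
    "context": [0.85, 0.9, 0.9],
    "impact": [0.9, 0.95, 0.8],
}
EXAMPLE_2 = {
    "exploitability": [0.62, 0.77, 0.44],
    "context": [0.2, 0.2, 0.1],
    "impact": [0.2, 0.2],
}

# Assumption: equal group weights. The guide does not publish its weights.
ASSUMED_WEIGHTS = {"exploitability": 1.0, "context": 1.0, "impact": 1.0}


def composite(groups: dict[str, list[float]], weights: dict[str, float] = ASSUMED_WEIGHTS) -> float:
    """Weighted mean of per-group means, scaled to 0-10 and rounded to one decimal."""
    weighted = sum(weights[name] * mean(values) for name, values in groups.items())
    total_weight = sum(weights[name] for name in groups)
    return round(10 * weighted / total_weight, 1)


public_agent = composite(EXAMPLE_1)
internal_tool = composite(EXAMPLE_2)
print(public_agent, internal_tool)
```

Swapping in different weights is a one-line change, which makes it easy to test how sensitive a ranking is to the weighting an organization settles on.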
Example 3: Training Data Poisoning in a Healthcare AI — High Severity
Scenario: A healthcare AI used for preliminary patient symptom triage has been identified as potentially vulnerable to training data poisoning through a third-party dataset that was not verified before use. The model’s outputs are reviewed by a clinician before any clinical action is taken, but the volume of cases means clinicians rely heavily on the AI’s prioritization recommendations.
| AIVSS Dimension | Assessment | Score |
|---|---|---|
| Attack Vector | Network — third-party data supply chain | 0.85 |
| Attack Complexity | High — requires compromising third-party data source | 0.44 |
| AI Autonomy Level | Medium — clinician reviews but relies heavily on AI prioritization | 0.6 |
| Data Sensitivity | Highly sensitive — patient medical data | 0.9 |
| Technical Impact | High — systematic model bias could affect triage accuracy | 0.75 |
| Societal Impact | Critical — patient safety risk at population scale | 0.95 |
| COMPOSITE AIVSS SCORE | HIGH | 7.8 |
5. 📝 The AIVSS Copy-Paste Scoring Template
Use this template during your OWASP AI Testing Guide assessments, your LLM Red Team exercises, and your AI compliance audits to produce standardized AIVSS scores for every AI vulnerability finding. Complete one template per vulnerability identified.
AIVSS Vulnerability Scoring Template
Vulnerability Reference: [ID/Name]
AI System: [System name and version]
Date Scored: [Date]
Scored By: [Name/Team]
OWASP LLM/Agentic Category: [e.g., LLM01, LLM04, OWASP Agentic Risk 3]
GROUP 1 — EXPLOITABILITY
Attack Vector: [ ] Network (0.85) [ ] Adjacent (0.62) [ ] Local (0.55) [ ] Physical (0.2)
Attack Complexity: [ ] Low (0.77) [ ] Medium (0.62) [ ] High (0.44)
Attack Requirements: [ ] None (0.85) [ ] Low (0.62) [ ] High (0.44) [ ] Requires Training Access (0.1)
Privileges Required: [ ] None (0.85) [ ] Low (0.62) [ ] High (0.27)
GROUP 2 — AI-SPECIFIC CONTEXT
AI Autonomy Level: [ ] Fully Autonomous (1.0) [ ] High (0.85) [ ] Medium (0.6) [ ] Low (0.4) [ ] Supervised (0.2)
Data Sensitivity: [ ] Highly Sensitive (0.9) [ ] Sensitive (0.6) [ ] Internal (0.3) [ ] Public (0.1)
Human Oversight: [ ] None (0.9) [ ] Minimal (0.7) [ ] Partial (0.4) [ ] Full (0.1)
Deployment Scale: [ ] Millions of users (0.9) [ ] Large enterprise (0.8) [ ] Team (0.4) [ ] Single user (0.1)
GROUP 3 — IMPACT
Technical Impact: [ ] Critical (0.9–1.0) [ ] High (0.7–0.89) [ ] Medium (0.4–0.69) [ ] Low (0.1–0.39)
Business Impact: [ ] Critical (0.9–1.0) [ ] High (0.7–0.89) [ ] Medium (0.4–0.69) [ ] Low (0.1–0.39)
Societal Impact: [ ] Critical (0.9–1.0) [ ] High (0.7–0.89) [ ] Medium (0.4–0.69) [ ] Low (0.1–0.39) [ ] None (0.0)
COMPOSITE AIVSS SCORE: [0.0–10.0]
SEVERITY RATING: [ ] Critical [ ] High [ ] Medium [ ] Low [ ] Informational
RECOMMENDED REMEDIATION TIMELINE: [Immediate / 7 days / 30 days / 90 days / Document only]
COMPENSATING CONTROLS: [List any interim mitigations applied pending full remediation]
REMEDIATION OWNER: [Name/Team responsible for fix]
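Teams that track findings in a tool rather than a document may prefer the template as a structured record. The sketch below mirrors the template's fields in a dataclass and serializes it to JSON. The class name `AivssFinding`, the field names, and the sample identifiers are all hypothetical; the scores are taken from Example 1 in this guide.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AivssFinding:
    vulnerability_ref: str
    ai_system: str
    date_scored: str
    scored_by: str
    owasp_category: str
    exploitability: dict[str, float]
    ai_context: dict[str, float]
    impact: dict[str, float]
    composite_score: float
    severity: str
    remediation_timeline: str
    compensating_controls: list[str] = field(default_factory=list)
    remediation_owner: str = ""

    def __post_init__(self) -> None:
        # Reject composites outside the 0-10 scale used by the template.
        if not 0.0 <= self.composite_score <= 10.0:
            raise ValueError("composite_score must be between 0.0 and 10.0")


# Hypothetical record populated with the Example 1 scores.
finding = AivssFinding(
    vulnerability_ref="EXAMPLE-001",
    ai_system="customer-service-agent (illustrative)",
    date_scored="2026-05-02",
    scored_by="AI security team",
    owasp_category="LLM01",
    exploitability={"attack_vector": 0.85, "attack_complexity": 0.77,
                    "attack_requirements": 0.85, "privileges_required": 0.85},
    ai_context={"autonomy": 0.85, "data_sensitivity": 0.9, "human_oversight": 0.9},
    impact={"technical": 0.9, "business": 0.95, "societal": 0.8},
    composite_score=9.4,
    severity="Critical",
    remediation_timeline="Immediate",
    compensating_controls=["Disable outbound email tool pending fix"],
    remediation_owner="Platform security",
)

record = json.dumps(asdict(finding), indent=2)
print(record)
```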
6. 🔗 Integrating AIVSS into Your AI Security Program
AIVSS delivers the most value when it is integrated into your broader AI security program rather than used as a standalone scoring exercise. Here is how AIVSS connects to your existing governance and security architecture:
- AI Risk Assessment: Use AIVSS scores to populate the risk register component of your AI Risk Assessment — providing quantified severity ratings for each identified risk rather than qualitative high/medium/low estimates.
- Red Team Findings: Standardize the output of your LLM Red Teaming exercises with AIVSS scoring — enabling consistent comparison of findings across different red team exercises and different AI systems.
- Vendor Due Diligence: Request AIVSS-scored vulnerability disclosures from AI vendors as part of your AI Vendor Due Diligence process — providing a standardized basis for comparing the security posture of competing AI products.
- Compliance Audit Evidence: AIVSS scoring records provide structured audit evidence for your AI compliance audit — demonstrating that your organization applies a rigorous, standards-aligned methodology to AI vulnerability assessment and prioritization.
- AI System Cards: Include AIVSS scores for known vulnerabilities in your AI System Cards — providing downstream operators and users with quantified risk information about the AI systems they deploy.
- Monitoring Integration: Connect AIVSS severity thresholds to your AI Monitoring and Observability alerting framework — so that behavioral signals indicating exploitation of High or Critical AIVSS-rated vulnerabilities trigger immediate incident response.
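Two of these integration points, the risk register and the monitoring threshold, reduce to sorting and filtering scored findings. The sketch below is one way to do that, assuming findings are plain dictionaries; the IDs, the field names, and the constant `ALERT_THRESHOLD` are hypothetical, and the scores are the three composite scores from the examples in this guide.

```python
# Hypothetical findings carrying the composite scores from this guide's examples.
findings = [
    {"id": "F-1", "title": "Prompt injection, public agentic system", "score": 9.4},
    {"id": "F-2", "title": "Prompt injection, internal read-only tool", "score": 4.2},
    {"id": "F-3", "title": "Training data poisoning, healthcare triage", "score": 7.8},
]

ALERT_THRESHOLD = 7.0  # lower bound of the High band in the severity table


def build_register(items: list[dict]) -> list[dict]:
    """Order findings for the risk register, most severe first."""
    return sorted(items, key=lambda item: item["score"], reverse=True)


def alert_candidates(items: list[dict]) -> list[str]:
    """IDs of findings whose exploitation signals should page incident response."""
    return [item["id"] for item in build_register(items) if item["score"] >= ALERT_THRESHOLD]


register = build_register(findings)
alerts = alert_candidates(findings)
print([item["id"] for item in register], alerts)
```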
According to IBM’s Cost of a Data Breach Report 2025, organizations that apply structured vulnerability prioritization frameworks to their AI security programs resolve critical vulnerabilities 47% faster than those using ad-hoc prioritization — and experience significantly lower average breach costs when incidents do occur.
7. 🆚 AIVSS and the Broader OWASP AI Security Ecosystem
AIVSS does not exist in isolation — it is one component of OWASP’s comprehensive AI security framework that provides interlocking tools for AI vulnerability identification, testing, and scoring:
| OWASP AI Framework | Purpose | How It Works With AIVSS |
|---|---|---|
| OWASP LLM Top 10 | Identifies the ten most critical LLM vulnerability categories | Provides the vulnerability categories that AIVSS is used to score and prioritize |
| OWASP Agentic Top 10 | Identifies the ten most critical risks specific to agentic AI applications | AIVSS’s AI Autonomy dimension is specifically calibrated for agentic risk scoring |
| OWASP AI Testing Guide | Provides structured test plans for discovering AI vulnerabilities | AIVSS scores the vulnerabilities discovered through OWASP AI Testing Guide test execution |
| OWASP AIBOM Generator | Creates AI Bills of Materials for AI supply chain visibility | AIBOM components can be associated with AIVSS scores for component-level risk tracking |
🏁 Conclusion: Score It, Prioritize It, Fix What Matters
AI vulnerability management without structured severity scoring is not security; it is security theater. When every unscored finding is treated with equal urgency, teams are overwhelmed with remediation work yet remain exposed to their most dangerous vulnerabilities, because those were never distinguished from the theoretical edge cases on the same list.
AIVSS changes this. By capturing the dimensions that actually determine how dangerous an AI vulnerability is — the exploitability in the specific deployment context, the autonomy level of the AI system, the sensitivity of the data it accesses, the degree of human oversight in place, and the real-world impact across technical, business, and societal dimensions — AIVSS produces severity scores that reflect reality rather than theory. The result is security investment directed where it matters most, remediation effort focused on the vulnerabilities that create genuine risk, and an AI security program that delivers measurable protection rather than impressive-looking lists.
📌 Key Takeaways
| ✅ | Takeaway |
|---|---|
| ✅ | CVSS was not designed for AI vulnerabilities — AIVSS fills this gap with a scoring methodology built specifically for the probabilistic, context-dependent nature of AI security risks. |
| ✅ | AIVSS scores across three groups: Exploitability, AI-Specific Context, and Impact — producing a composite 0–10 score that maps to five severity levels. |
| ✅ | The AI-Specific Context group — covering AI Autonomy Level, Data Sensitivity, Human Oversight, and Deployment Scale — is what most differentiates AIVSS from conventional vulnerability scoring. |
| ✅ | The same vulnerability type can score Critical (9.4) in one deployment context and Medium (4.2) in another — demonstrating why context-aware scoring is essential for accurate risk prioritization. |
| ✅ | AIVSS Impact scoring captures three dimensions — Technical, Business, and Societal — providing a more complete picture of real-world harm than the traditional CIA triad. |
| ✅ | AIVSS integrates with OWASP LLM Top 10, OWASP Agentic Top 10, OWASP AI Testing Guide, and the OWASP AIBOM Generator to form a complete AI security assessment ecosystem. |
| ✅ | Organizations using structured vulnerability prioritization frameworks resolve critical vulnerabilities 47% faster than those using ad-hoc approaches, according to IBM research. |
| ✅ | The copy-paste AIVSS scoring template in this guide provides a standardized, repeatable method for scoring every AI vulnerability finding across red team exercises, audits, and ongoing security assessments. |
❓ Frequently Asked Questions: OWASP AIVSS
1. Is OWASP AIVSS only useful for security teams or can business stakeholders use it too?
Business stakeholders can and should use it. The AIVSS scoring output — a numerical severity rating with a plain-English risk summary — is designed to be boardroom-readable. A CISO can use the raw score to prioritize remediation budgets, while a legal team can use the same output to assess AI Liability exposure before a system goes live.
2. How is AIVSS different from the traditional CVSS score used for standard software vulnerabilities?
CVSS was built for conventional software vulnerabilities and does not account for the probabilistic, context-dependent behavior of AI systems. AIVSS adds AI-specific dimensions such as AI Autonomy Level and Human Oversight to the scoring model. These factors have no counterpart in traditional vulnerability scoring but are critical when assessing the blast radius of a rogue AI agent.
3. Can AIVSS scores change over time for the same vulnerability without any code change?
Yes, and this is one of its most important features. An AIVSS score moves whenever the deployment context changes. An agent granted new tool access, connected to a new MCP server, or moved into a higher-stakes environment inherits a higher severity score for the same underlying vulnerability, even if the model itself has not changed. The reverse also holds: adding a human review gate or narrowing data access lowers the score.
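One practical consequence is that findings need rescoring whenever the context dimensions change, not only when code changes. The sketch below compares two context snapshots and reports which dimensions moved. The function name `context_changes`, the snapshot keys, and the labels are hypothetical and follow the wording of this guide's template.

```python
def context_changes(before: dict[str, str], after: dict[str, str]) -> dict[str, tuple]:
    """Return each context dimension whose label differs between two snapshots."""
    dimensions = sorted(before.keys() | after.keys())
    return {
        dim: (before.get(dim), after.get(dim))
        for dim in dimensions
        if before.get(dim) != after.get(dim)
    }


# Hypothetical snapshots: the same assistant before and after gaining tool access.
last_quarter = {"autonomy": "supervised", "data_sensitivity": "internal", "human_oversight": "full"}
this_quarter = {"autonomy": "high", "data_sensitivity": "internal", "human_oversight": "none"}

changes = context_changes(last_quarter, this_quarter)
if changes:
    print("Rescore open findings; context changed:", changes)
```

Running a check like this on each deployment change gives a simple trigger for revisiting open findings.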
4. Does a high AIVSS score automatically trigger a mandatory remediation process?
Not by default, but it should in any mature governance program. Organizations should define internal AIVSS thresholds in their Corporate AI Policy that trigger escalation paths. For example, any score of 7.0 or above (High or Critical) requires a Human-in-the-Loop review gate before the system can remain in production. Without defined thresholds, scores are just numbers.
5. Can AIVSS be used to compare the risk profiles of two competing AI vendor products?
Yes — and this is an underused application. Running the same vulnerability scenarios through AIVSS for two competing vendor systems produces comparable, objective severity scores. This gives procurement teams a data-driven security comparison to include in their AI Vendor Due Diligence review — moving the conversation beyond marketing claims to measurable risk evidence.




