OWASP AIVSS Explained: How to Score AI & Agent Vulnerabilities (with the AIVSS Calculator) + Copy/Paste Template

🧮 Not all AI vulnerabilities are equally dangerous, but most organizations treat them as if they are. The OWASP AI Vulnerability Severity Scoring System (AIVSS) gives security teams a structured, repeatable method for scoring AI-specific vulnerabilities by real-world impact, so you can prioritize the risks that matter most. This 2026 guide explains every dimension of AIVSS with worked scoring examples and a copy-paste template.

Last Updated: May 2, 2026

Every organization that deploys AI systems discovers vulnerabilities. Some of those vulnerabilities are genuinely catastrophic — a prompt injection flaw that allows an attacker to exfiltrate customer data, a poisoned training dataset that causes an AI to systematically discriminate against a protected class, a misconfigured agentic system that can be manipulated into taking irreversible real-world actions. Others are theoretical edge cases that would require an implausible chain of circumstances to produce any meaningful harm.

The problem is that most AI security programs treat all of these vulnerabilities the same way — they list them, flag them, and add them to a remediation backlog without any structured method for determining which ones to fix first. The result is security teams spending resources on low-impact theoretical risks while genuinely dangerous vulnerabilities remain unaddressed because they appeared on the same list.

The OWASP AI Vulnerability Severity Scoring System (AIVSS) is designed to solve exactly this problem. Developed by OWASP (the Open Worldwide Application Security Project) as part of its AI security framework, AIVSS provides a structured, multi-dimensional scoring methodology built for the unique characteristics of AI vulnerabilities. It goes beyond the traditional CVSS approach, which was built for conventional software vulnerabilities and does not adequately capture the probabilistic, context-dependent nature of AI security risks.

This guide explains every dimension of the AIVSS scoring methodology, walks through real scoring examples for common AI vulnerability types, and provides a practical copy-paste scoring template that security teams can use immediately to bring structured prioritization to their AI vulnerability management programs.

1. 🎯 Why Standard Vulnerability Scoring Fails for AI

Before examining AIVSS, it is essential to understand why the existing industry-standard vulnerability scoring approach — the Common Vulnerability Scoring System (CVSS) — is insufficient for AI vulnerabilities. CVSS was designed for conventional software vulnerabilities: buffer overflows, SQL injections, authentication bypasses — deterministic flaws where a specific input reliably produces a specific malicious output.

AI vulnerabilities are fundamentally different in three ways that CVSS does not adequately capture:

  • Probabilistic Nature: An attack on an AI system may succeed 40% of the time rather than 100% of the time. CVSS has no mechanism for capturing this probabilistic dimension, so an attack that succeeds reliably is scored the same as one that succeeds only occasionally.
  • Context Dependency: The severity of an AI vulnerability depends heavily on how the AI is deployed and what it is authorized to do. A prompt injection vulnerability in a read-only AI assistant is very different from the same vulnerability in an agentic AI with access to email, databases, and financial systems, yet a CVSS base score treats both identically.
  • Novel Attack Vectors: AI-specific attack types such as data poisoning, model extraction, membership inference, and adversarial examples do not map cleanly to the CVSS attack vector taxonomy, which is built around network, adjacent, local, and physical access concepts that do not translate meaningfully to AI threat models.
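
To make the probabilistic point concrete, here is a minimal Python sketch of how a team might estimate an attack's success rate from repeated attempts. It is not part of AIVSS or CVSS: the function name `wilson_interval` and the trial counts are illustrative assumptions, and the calculation is the standard Wilson score interval for a proportion.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a success rate (z=1.96 is roughly 95% confidence)."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical red-team run: 40 successful injections out of 100 attempts.
low, high = wilson_interval(40, 100)
print(f"observed rate 0.40, plausible range {low:.2f} to {high:.2f}")
```

Reporting a range rather than a single percentage makes it clearer how much confidence a small number of attack attempts actually supports.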

The Core Gap: CVSS scores tell you how severe a conventional software vulnerability is. AIVSS scores tell you how severe an AI vulnerability is — accounting for the probability of exploitation, the AI system’s specific deployment context, the autonomy level of the system, and the real-world impact of successful exploitation across technical, business, and societal dimensions.

| Dimension | CVSS (Traditional) | AIVSS (AI-Specific) |
| --- | --- | --- |
| Attack Model | Deterministic: same input, same output | Probabilistic: accounts for attack success rate variability |
| Context Sensitivity | Limited: deployment context not captured | High: deployment context and AI autonomy level are core scoring dimensions |
| AI Attack Vectors | Not covered: network/local taxonomy only | Native support for poisoning, extraction, injection, and adversarial attacks |
| Impact Dimensions | Confidentiality, Integrity, Availability | Technical + Business + Societal impact dimensions |
| Designed For | Software and network vulnerabilities | AI and machine learning system vulnerabilities |

2. 🏗️ The AIVSS Scoring Architecture

AIVSS scores AI vulnerabilities across three primary scoring groups — each capturing a different dimension of risk that is essential for accurate severity assessment. The final AIVSS score is a weighted composite of scores across all three groups, producing a numerical result on a 0–10 scale that maps to five severity levels.

Scoring Group 1: Exploitability

The Exploitability group captures how difficult it is for an attacker to successfully exploit the vulnerability. AIVSS uses four exploitability dimensions specifically calibrated for AI attack types:

| Dimension | What It Measures | Score Range | AI Example |
| --- | --- | --- | --- |
| Attack Vector | How the attacker accesses the AI system to execute the attack | Physical (0.2) to Network (0.85) | A public-facing chatbot scores Network (0.85) |
| Attack Complexity | Technical skill and resources required to execute the attack | High (0.44) to Low (0.77) | Simple jailbreak = Low (0.77) |
| Attack Requirements | Prerequisites the attacker must satisfy before the attack is possible | Requires training access (0.1) to None (0.85) | No account needed = None (0.85) |
| Privileges Required | What level of privilege the attacker needs in the target system | High (0.27) to None (0.85) | Anonymous user access = None (0.85) |

Scoring Group 2: AI-Specific Context

This is the scoring group that most differentiates AIVSS from CVSS — capturing the AI deployment context dimensions that fundamentally affect the real-world severity of an AI vulnerability.

| Dimension | What It Measures | Score Range | AI Example |
| --- | --- | --- | --- |
| AI Autonomy Level | How autonomously the AI system operates; the higher the autonomy, the greater the potential impact of exploitation | Supervised (0.2) to Fully Autonomous (1.0) | Agentic AI with tool access = High Autonomy (0.85) |
| Data Sensitivity | The sensitivity classification of data the AI accesses or processes | Public (0.1) to Highly Sensitive (0.9) | Medical records access = Highly Sensitive (0.9) |
| Human Oversight | The degree of human review and approval applied to AI outputs before they take effect | Full oversight (0.1) to No oversight (0.9) | Auto-execute without review = No oversight (0.9) |
| Deployment Scale | The scale at which the AI system is deployed, which affects how many people are exposed to successful exploitation | Single user (0.1) to Millions of users (0.9) | Enterprise-wide deployment = Large scale (0.8) |

Scoring Group 3: Impact

The Impact group captures the consequences of successful exploitation across three dimensions — technical, business, and societal — providing a more complete picture of real-world harm than the traditional CIA triad (Confidentiality, Integrity, Availability) used by CVSS.

| Impact Dimension | What It Captures | Examples of High Impact |
| --- | --- | --- |
| Technical Impact | Damage to data confidentiality, system integrity, and service availability | Complete data exfiltration, model weight theft, system unavailability |
| Business Impact | Financial damage, regulatory penalties, reputational harm, and operational disruption | GDPR fines, customer churn from trust breach, AI system shutdown costs |
| Societal Impact | Harm to individuals, groups, and society beyond the immediate organization, including bias, discrimination, and safety risks | Discriminatory AI decisions at scale, safety-critical system failures, erosion of public trust in AI |
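
The weighted-composite idea described in this section can be sketched in a few lines of Python. This guide does not state the actual weights or formula, so everything about the arithmetic here is an assumption: the equal `GROUP_WEIGHTS`, the per-group averaging, and the function name `composite_score` are placeholders, and the sketch is not expected to reproduce the worked scores later in this guide. Treat it as an illustration of the mechanics only and substitute the official weighting before relying on any number it produces.

```python
from statistics import mean

# HYPOTHETICAL: equal group weights. The guide says the composite is weighted
# but does not state the weights, so replace these with the official values.
GROUP_WEIGHTS = {"exploitability": 1.0, "context": 1.0, "impact": 1.0}


def composite_score(groups: dict[str, dict[str, float]],
                    weights: dict[str, float] = GROUP_WEIGHTS) -> float:
    """Weighted mean of per-group averages, scaled to 0-10 (illustrative only)."""
    weighted_sum = 0.0
    for group, weight in weights.items():
        scores = groups[group]
        for name, value in scores.items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{group}/{name} must be within 0.0-1.0, got {value}")
        weighted_sum += weight * mean(scores.values())
    return round(10 * weighted_sum / sum(weights.values()), 1)


# Exploitability and context values come from the "AI Example" column of the
# tables in this section; the impact values are made up for illustration.
finding = {
    "exploitability": {"attack_vector": 0.85, "attack_complexity": 0.77,
                       "attack_requirements": 0.85, "privileges_required": 0.85},
    "context": {"autonomy": 0.85, "data_sensitivity": 0.9,
                "human_oversight": 0.9, "deployment_scale": 0.8},
    "impact": {"technical": 0.9, "business": 0.95, "societal": 0.8},
}
print(composite_score(finding))
```

Keeping the weights in one dictionary means a team can swap in the published values, or its own calibrated ones, without touching the scoring logic.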

3. 📏 The AIVSS Severity Scale

The composite AIVSS score — calculated by applying defined weights to the scores across all three groups — produces a final numerical score on a 0–10 scale that maps to five severity levels. These levels drive remediation prioritization decisions in the same way that CVSS severity levels drive traditional vulnerability management.

| Score Range | Severity | Recommended Response | Example AI Vulnerability |
| --- | --- | --- | --- |
| 9.0 – 10.0 | Critical | Immediate remediation or system suspension; escalate to CISO within 24 hours | Prompt injection enabling data exfiltration from a public-facing agentic system with no human oversight |
| 7.0 – 8.9 | High | Remediation within 7 days; compensating controls implemented immediately | Training data poisoning in a high-stakes decision support system with limited human review |
| 4.0 – 6.9 | Medium | Remediation within 30 days; include in next sprint planning cycle | System prompt leakage in an internal chatbot with low data sensitivity and strong human oversight |
| 1.0 – 3.9 | Low | Remediation within 90 days; document and schedule for standard maintenance cycle | Theoretical model extraction attack requiring physical access to air-gapped deployment |
| 0.0 – 0.9 | Informational | Document for awareness; no remediation action required at this time | Theoretical hallucination risk in a fully human-reviewed, low-stakes internal tool |
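
The severity scale above translates directly into a lookup. The following Python sketch encodes the five bands and their remediation timelines as listed in the table; the names `SEVERITY_BANDS` and `classify` are illustrative, and rounding to one decimal place before lookup is an assumption made so that values such as 8.95 do not fall between two bands.

```python
# (lower bound, severity, remediation timeline), ordered from highest to lowest.
SEVERITY_BANDS = [
    (9.0, "Critical", "Immediate"),
    (7.0, "High", "7 days"),
    (4.0, "Medium", "30 days"),
    (1.0, "Low", "90 days"),
    (0.0, "Informational", "Document only"),
]


def classify(score: float) -> tuple[str, str]:
    """Map a composite score (0-10) to its severity level and timeline."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score must be within 0.0-10.0, got {score}")
    score = round(score, 1)  # assumption: scores are reported to one decimal
    for lower_bound, severity, timeline in SEVERITY_BANDS:
        if score >= lower_bound:
            return severity, timeline
    raise AssertionError("unreachable: the last band starts at 0.0")


for example in (9.4, 7.8, 4.2):
    print(example, classify(example))
```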

4. 🔬 Real-World AIVSS Scoring Examples

The best way to understand AIVSS is to see it applied to concrete vulnerability scenarios. The first two examples show how the same vulnerability type, prompt injection, produces very different AIVSS scores depending on deployment context, which is exactly why context-aware scoring is essential for accurate AI risk prioritization. The third applies the same method to a different vulnerability type, training data poisoning.

Example 1: Prompt Injection in a Public Agentic System — Critical Severity

Scenario: A publicly accessible AI customer service agent has unrestricted access to the customer database, can send emails on behalf of customers, and has no human oversight gate before executing actions. A prompt injection vulnerability has been identified that allows an attacker to override the system prompt through indirect injection via a malicious document the agent reads.

| AIVSS Dimension | Assessment | Score |
| --- | --- | --- |
| Attack Vector | Network: publicly accessible API | 0.85 |
| Attack Complexity | Low: indirect injection via document is well-documented | 0.77 |
| Attack Requirements | None: anonymous attacker can submit malicious documents | 0.85 |
| Privileges Required | None: no account required | 0.85 |
| AI Autonomy Level | High: agent executes actions without human approval | 0.85 |
| Data Sensitivity | Highly sensitive: full customer database access | 0.9 |
| Human Oversight | None: fully automated execution | 0.9 |
| Technical Impact | High: complete data access and email capability | 0.9 |
| Business Impact | Critical: regulatory breach, mass data exfiltration, reputational damage | 0.95 |
| Societal Impact | High: personal data of many individuals exposed | 0.8 |
| COMPOSITE AIVSS SCORE | CRITICAL | 9.4 |

Example 2: The Same Prompt Injection in an Internal Read-Only Tool — Medium Severity

Scenario: The same prompt injection vulnerability exists in an internal knowledge base assistant used by 50 employees. The AI can only read from a non-sensitive internal FAQ database. All outputs are reviewed by a human before any action is taken. The system is accessible only from inside the corporate network.

| AIVSS Dimension | Assessment | Score |
| --- | --- | --- |
| Attack Vector | Adjacent: internal network only | 0.62 |
| Attack Complexity | Low: same injection technique | 0.77 |
| Attack Requirements | Requires corporate network access | 0.44 |
| AI Autonomy Level | Low: read-only, human reviews all outputs | 0.2 |
| Data Sensitivity | Low: internal FAQ only, no PII | 0.2 |
| Human Oversight | Full: human reviews every output | 0.1 |
| Technical Impact | Low: read-only access to non-sensitive data | 0.2 |
| Business Impact | Low: minimal regulatory or financial risk | 0.2 |
| COMPOSITE AIVSS SCORE | MEDIUM | 4.2 |

The same vulnerability type — prompt injection — scored 9.4 (Critical) in Example 1 and 4.2 (Medium) in Example 2. This is precisely the insight that AIVSS is designed to generate: context determines severity, and treating these two findings as equivalent would be a significant misallocation of remediation resources.
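
To see which dimensions drive the gap between the two examples, the per-dimension scores can be compared directly. The Python sketch below uses only the dimensions that both example tables list, with the values copied from those tables; the function name `rank_gaps` is illustrative, and the ranking is a simple difference, not an AIVSS calculation.

```python
# Scores copied from Example 1 and Example 2 (dimensions listed in both tables).
EXAMPLE_1 = {
    "Attack Vector": 0.85, "Attack Complexity": 0.77, "Attack Requirements": 0.85,
    "AI Autonomy Level": 0.85, "Data Sensitivity": 0.9, "Human Oversight": 0.9,
    "Technical Impact": 0.9, "Business Impact": 0.95,
}
EXAMPLE_2 = {
    "Attack Vector": 0.62, "Attack Complexity": 0.77, "Attack Requirements": 0.44,
    "AI Autonomy Level": 0.2, "Data Sensitivity": 0.2, "Human Oversight": 0.1,
    "Technical Impact": 0.2, "Business Impact": 0.2,
}


def rank_gaps(a: dict[str, float], b: dict[str, float]) -> list[tuple[str, float]]:
    """Return shared dimensions sorted by how much higher `a` scores than `b`."""
    gaps = [(dim, round(a[dim] - b[dim], 2)) for dim in a if dim in b]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


for dimension, gap in rank_gaps(EXAMPLE_1, EXAMPLE_2):
    print(f"{dimension:<20} +{gap:.2f}")
```

Attack Complexity is identical in both deployments, since the injection technique is the same. The largest gaps sit in the context and impact dimensions, which is where the two deployments actually differ.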

Example 3: Training Data Poisoning in a Healthcare AI — High Severity

Scenario: A healthcare AI used for preliminary patient symptom triage has been identified as potentially vulnerable to training data poisoning through a third-party dataset that was not verified before use. The model’s outputs are reviewed by a clinician before any clinical action is taken, but the volume of cases means clinicians rely heavily on the AI’s prioritization recommendations.

| AIVSS Dimension | Assessment | Score |
| --- | --- | --- |
| Attack Vector | Network: third-party data supply chain | 0.85 |
| Attack Complexity | High: requires compromising third-party data source | 0.44 |
| AI Autonomy Level | Medium: clinician reviews but relies heavily on AI prioritization | 0.6 |
| Data Sensitivity | Highly sensitive: patient medical data | 0.9 |
| Technical Impact | High: systematic model bias could affect triage accuracy | 0.75 |
| Societal Impact | Critical: patient safety risk at population scale | 0.95 |
| COMPOSITE AIVSS SCORE | HIGH | 7.8 |

5. 📝 The AIVSS Copy-Paste Scoring Template

Use this template during your OWASP AI Testing Guide assessments, your LLM Red Team exercises, and your AI compliance audits to produce standardized AIVSS scores for every AI vulnerability finding. Complete one template per vulnerability identified.

AIVSS Vulnerability Scoring Template

Vulnerability Reference: [ID/Name]
AI System: [System name and version]
Date Scored: [Date]
Scored By: [Name/Team]
OWASP LLM/Agentic Category: [e.g., LLM01, LLM04, OWASP Agentic Risk 3]

GROUP 1 — EXPLOITABILITY
Attack Vector: [ ] Network (0.85) [ ] Adjacent (0.62) [ ] Local (0.55) [ ] Physical (0.2)
Attack Complexity: [ ] Low (0.77) [ ] Medium (0.62) [ ] High (0.44)
Attack Requirements: [ ] None (0.85) [ ] Low (0.62) [ ] High (0.44) [ ] Requires Training Access (0.1)
Privileges Required: [ ] None (0.85) [ ] Low (0.62) [ ] High (0.27)

GROUP 2 — AI-SPECIFIC CONTEXT
AI Autonomy Level: [ ] Fully Autonomous (1.0) [ ] High (0.85) [ ] Medium (0.6) [ ] Low (0.4) [ ] Supervised (0.2)
Data Sensitivity: [ ] Highly Sensitive (0.9) [ ] Sensitive (0.6) [ ] Internal (0.3) [ ] Public (0.1)
Human Oversight: [ ] None (0.9) [ ] Minimal (0.7) [ ] Partial (0.4) [ ] Full (0.1)
Deployment Scale: [ ] Millions of users (0.9) [ ] Large enterprise (0.8) [ ] Team (0.4) [ ] Single user (0.1)

GROUP 3 — IMPACT
Technical Impact: [ ] Critical (0.9–1.0) [ ] High (0.7–0.89) [ ] Medium (0.4–0.69) [ ] Low (0.1–0.39)
Business Impact: [ ] Critical (0.9–1.0) [ ] High (0.7–0.89) [ ] Medium (0.4–0.69) [ ] Low (0.1–0.39)
Societal Impact: [ ] Critical (0.9–1.0) [ ] High (0.7–0.89) [ ] Medium (0.4–0.69) [ ] Low (0.1–0.39) [ ] None (0.0)

COMPOSITE AIVSS SCORE: [0.0–10.0]
SEVERITY RATING: [ ] Critical [ ] High [ ] Medium [ ] Low [ ] Informational

RECOMMENDED REMEDIATION TIMELINE: [Immediate / 7 days / 30 days / 90 days / Document only]
COMPENSATING CONTROLS: [List any interim mitigations applied pending full remediation]
REMEDIATION OWNER: [Name/Team responsible for fix]
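
For teams that record findings in code rather than in a document, the checkbox options for Groups 1 and 2 can be kept as a lookup table so that every finding uses the same labels and values. The Python sketch below copies the options from the template above; the names `TEMPLATE_OPTIONS` and `resolve_selections` are illustrative assumptions, and Group 3 is left out because the template gives ranges rather than fixed values for impact.

```python
# Option labels and values copied from Groups 1 and 2 of the template.
TEMPLATE_OPTIONS = {
    "Attack Vector": {"Network": 0.85, "Adjacent": 0.62, "Local": 0.55, "Physical": 0.2},
    "Attack Complexity": {"Low": 0.77, "Medium": 0.62, "High": 0.44},
    "Attack Requirements": {"None": 0.85, "Low": 0.62, "High": 0.44,
                            "Requires Training Access": 0.1},
    "Privileges Required": {"None": 0.85, "Low": 0.62, "High": 0.27},
    "AI Autonomy Level": {"Fully Autonomous": 1.0, "High": 0.85, "Medium": 0.6,
                          "Low": 0.4, "Supervised": 0.2},
    "Data Sensitivity": {"Highly Sensitive": 0.9, "Sensitive": 0.6,
                         "Internal": 0.3, "Public": 0.1},
    "Human Oversight": {"None": 0.9, "Minimal": 0.7, "Partial": 0.4, "Full": 0.1},
    "Deployment Scale": {"Millions of users": 0.9, "Large enterprise": 0.8,
                         "Team": 0.4, "Single user": 0.1},
}


def resolve_selections(selections: dict[str, str]) -> dict[str, float]:
    """Turn checked labels into numeric values, rejecting gaps and unknown labels."""
    missing = sorted(set(TEMPLATE_OPTIONS) - set(selections))
    if missing:
        raise ValueError(f"unscored dimensions: {missing}")
    values = {}
    for dimension, label in selections.items():
        options = TEMPLATE_OPTIONS.get(dimension)
        if options is None or label not in options:
            raise ValueError(f"invalid selection {label!r} for {dimension!r}")
        values[dimension] = options[label]
    return values


# Hypothetical finding for a public, fully automated agent.
print(resolve_selections({
    "Attack Vector": "Network", "Attack Complexity": "Low",
    "Attack Requirements": "None", "Privileges Required": "None",
    "AI Autonomy Level": "High", "Data Sensitivity": "Highly Sensitive",
    "Human Oversight": "None", "Deployment Scale": "Millions of users",
}))
```

Rejecting incomplete or misspelled entries up front keeps scores comparable across assessors, which is the point of using a standard template.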

6. 🔗 Integrating AIVSS into Your AI Security Program

AIVSS delivers the most value when it is integrated into your broader AI security program rather than used as a standalone scoring exercise. Here is how AIVSS connects to your existing governance and security architecture:

  • AI Risk Assessment: Use AIVSS scores to populate the risk register component of your AI Risk Assessment — providing quantified severity ratings for each identified risk rather than qualitative high/medium/low estimates.
  • Red Team Findings: Standardize the output of your LLM Red Teaming exercises with AIVSS scoring — enabling consistent comparison of findings across different red team exercises and different AI systems.
  • Vendor Due Diligence: Request AIVSS-scored vulnerability disclosures from AI vendors as part of your AI Vendor Due Diligence process — providing a standardized basis for comparing the security posture of competing AI products.
  • Compliance Audit Evidence: AIVSS scoring records provide structured audit evidence for your AI compliance audit — demonstrating that your organization applies a rigorous, standards-aligned methodology to AI vulnerability assessment and prioritization.
  • AI System Cards: Include AIVSS scores for known vulnerabilities in your AI System Cards — providing downstream operators and users with quantified risk information about the AI systems they deploy.
  • Monitoring Integration: Connect AIVSS severity thresholds to your AI Monitoring and Observability alerting framework — so that behavioral signals indicating exploitation of High or Critical AIVSS-rated vulnerabilities trigger immediate incident response.

According to IBM’s Cost of a Data Breach Report 2025, organizations that apply structured vulnerability prioritization frameworks to their AI security programs resolve critical vulnerabilities 47% faster than those using ad-hoc prioritization — and experience significantly lower average breach costs when incidents do occur.

7. 🆚 AIVSS and the Broader OWASP AI Security Ecosystem

AIVSS does not exist in isolation — it is one component of OWASP’s comprehensive AI security framework that provides interlocking tools for AI vulnerability identification, testing, and scoring:

| OWASP AI Framework | Purpose | How It Works With AIVSS |
| --- | --- | --- |
| OWASP LLM Top 10 | Identifies the ten most critical LLM vulnerability categories | Provides the vulnerability categories that AIVSS is used to score and prioritize |
| OWASP Agentic Top 10 | Identifies the ten most critical risks specific to agentic AI applications | AIVSS’s AI Autonomy dimension is specifically calibrated for agentic risk scoring |
| OWASP AI Testing Guide | Provides structured test plans for discovering AI vulnerabilities | AIVSS scores the vulnerabilities discovered through OWASP AI Testing Guide test execution |
| OWASP AIBOM Generator | Creates AI Bills of Materials for AI supply chain visibility | AIBOM components can be associated with AIVSS scores for component-level risk tracking |

🏁 Conclusion: Score It, Prioritize It, Fix What Matters

AI vulnerability management without structured severity scoring is not security — it is security theater. A backlog of unscored AI findings treated with equal urgency produces organizations that are simultaneously overwhelmed with remediation work and genuinely exposed to their most dangerous vulnerabilities because those vulnerabilities were never distinguished from the theoretical edge cases that appeared on the same list.

AIVSS changes this. By capturing the dimensions that actually determine how dangerous an AI vulnerability is — the exploitability in the specific deployment context, the autonomy level of the AI system, the sensitivity of the data it accesses, the degree of human oversight in place, and the real-world impact across technical, business, and societal dimensions — AIVSS produces severity scores that reflect reality rather than theory. The result is security investment directed where it matters most, remediation effort focused on the vulnerabilities that create genuine risk, and an AI security program that delivers measurable protection rather than impressive-looking lists.

📌 Key Takeaways

  • CVSS was not designed for AI vulnerabilities. AIVSS fills this gap with a scoring methodology built specifically for the probabilistic, context-dependent nature of AI security risks.
  • AIVSS scores across three groups (Exploitability, AI-Specific Context, and Impact), producing a composite 0–10 score that maps to five severity levels.
  • The AI-Specific Context group, covering AI Autonomy Level, Data Sensitivity, Human Oversight, and Deployment Scale, is what most differentiates AIVSS from conventional vulnerability scoring.
  • The same vulnerability type can score Critical (9.4) in one deployment context and Medium (4.2) in another, demonstrating why context-aware scoring is essential for accurate risk prioritization.
  • AIVSS Impact scoring captures three dimensions (Technical, Business, and Societal), providing a more complete picture of real-world harm than the traditional CIA triad.
  • AIVSS integrates with OWASP LLM Top 10, OWASP Agentic Top 10, OWASP AI Testing Guide, and the OWASP AIBOM Generator to form a complete AI security assessment ecosystem.
  • Organizations using structured vulnerability prioritization frameworks resolve critical vulnerabilities 47% faster than those using ad-hoc approaches, according to IBM research.
  • The copy-paste AIVSS scoring template in this guide provides a standardized, repeatable method for scoring every AI vulnerability finding across red team exercises, audits, and ongoing security assessments.

❓ Frequently Asked Questions: OWASP AIVSS

1. Is OWASP AIVSS only useful for security teams or can business stakeholders use it too?

Business stakeholders can and should use it. The AIVSS scoring output — a numerical severity rating with a plain-English risk summary — is designed to be boardroom-readable. A CISO can use the raw score to prioritize remediation budgets, while a legal team can use the same output to assess AI Liability exposure before a system goes live.

2. How is AIVSS different from the traditional CVSS score used for standard software vulnerabilities?

CVSS was built for conventional software vulnerabilities and does not account for the probabilistic, context-dependent behavior of AI systems. AIVSS adds AI-specific dimensions such as AI Autonomy Level and Human Oversight to the scoring model. These factors have no counterpart in traditional software scoring but are critical when assessing the blast radius of a rogue AI agent.

3. Can AIVSS scores change over time for the same vulnerability without any code change?

Yes — and this is one of its most important features. An AIVSS score can increase if the deployment context changes. An agent granted new tool access, connected to a new MCP server, or moved into a higher-stakes environment inherits a higher severity score for the same underlying vulnerability — even if the model itself has not changed.

4. Does a high AIVSS score automatically trigger a mandatory remediation process?

Not by default — but it should in any mature governance program. Organizations should define internal AIVSS thresholds in their Corporate AI Policy that trigger escalation paths — for example, any score above 7.0 requires a Human-in-the-Loop review gate before the system can remain in production. Without defined thresholds, scores are just numbers.
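
A threshold like this is easy to enforce mechanically. The Python sketch below flags any finding that scores above the threshold but has no human review gate recorded. The 7.0 value is the example figure from the answer above; the `Finding` fields and the function name `production_blockers` are hypothetical and would need to match however your organization actually tracks findings.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 7.0  # example threshold; set this in your own policy


@dataclass
class Finding:
    ref: str                 # vulnerability reference from the scoring template
    score: float             # composite AIVSS score, 0.0-10.0
    human_review_gate: bool  # True if a Human-in-the-Loop gate is in place


def production_blockers(findings: list[Finding],
                        threshold: float = REVIEW_THRESHOLD) -> list[str]:
    """Return references of findings above the threshold that lack a review gate."""
    return [f.ref for f in findings
            if f.score > threshold and not f.human_review_gate]


# Hypothetical findings, using the composite scores from the worked examples.
findings = [
    Finding("public-agent-injection", 9.4, human_review_gate=False),
    Finding("triage-data-poisoning", 7.8, human_review_gate=True),
    Finding("internal-faq-injection", 4.2, human_review_gate=False),
]
print(production_blockers(findings))
```

The comparison is strictly greater-than, matching "above 7.0" in the text. A team that wants the whole High band covered would use `>=` instead.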

5. Can AIVSS be used to compare the risk profiles of two competing AI vendor products?

Yes — and this is an underused application. Running the same vulnerability scenarios through AIVSS for two competing vendor systems produces comparable, objective severity scores. This gives procurement teams a data-driven security comparison to include in their AI Vendor Due Diligence review — moving the conversation beyond marketing claims to measurable risk evidence.

About the Author

Sapumal Herath

Sapumal is a specialist in Data Analytics and Business Intelligence. He focuses on helping businesses leverage AI and Power BI to drive smarter decision-making. Through AI Buzz, he shares his expertise on the future of work and emerging AI technologies. Follow him on LinkedIn for more tech insights.
