Every discipline that has absorbed new technology into its practice has faced a vocabulary problem. When financial derivatives entered corporate treasury, risk managers had to learn what a delta was. When cloud computing entered enterprise IT, audit teams had to understand what a shared-responsibility model meant. In both cases, the new vocabulary was eventually domesticated — absorbed into the existing control catalog with modest adjustments.
Artificial intelligence poses a different kind of vocabulary problem. It is not simply that GRC professionals need to learn new terms for new artifacts. It is that some AI concepts are genuinely incompatible with the semantic structure of existing governance language — they describe properties and behaviors that the existing vocabulary has no mechanism to represent accurately. And when the vocabulary fails, the control frameworks built on that vocabulary fail with it.
This essay provides a practical lexicon for GRC professionals who need to engage AI vocabulary precisely — followed by an analysis of why that vocabulary, however well mastered, is insufficient without new hybrid concepts that bridge AI technical reality and governance institutional logic. The foundational essay on AI governance nomenclature and the analysis of opacity, emergence, and velocity provide complementary context.
Why GRC Needs Its Own AI Dictionary
The vocabulary gap has practical consequences. A GRC professional who does not understand what model drift means cannot assess whether a model validation program is adequate. A board risk committee that does not understand what emergent behavior means cannot evaluate whether the organization's AI oversight structure is capable of detecting it. An internal auditor who does not understand the difference between a fine-tuned model and a foundation model cannot design a meaningful model governance audit.
The problem is compounded because AI vocabulary is not stable. New terms enter active use within months of new research publications; existing terms get repurposed, redefined, or superseded. The GRC professional's goal is not to track every development in AI research — that is the AI team's job. It is to maintain fluency in the vocabulary that has become load-bearing for governance: the terms that determine what controls are necessary, what risks are being accepted, and what accountability mechanisms are required.
The agentic AI executive brief provides additional context on why agentic systems specifically demand new governance vocabulary. The vocabulary survey below covers the terms that appear most frequently in governance, regulatory, and compliance contexts as of 2026.
Core Lexicon: A through M
Agentic Workflow
A multi-step AI-driven process in which the system autonomously plans, executes actions, and adapts its approach based on intermediate results, without human review of each step. GRC relevance: Agentic workflows collapse the human review window and require programmatic controls — guardrails, kill switches, action audit trails — rather than traditional approval-queue oversight. See The Four Blind Spots for the governance implications.
Alignment Problem
The technical and philosophical challenge of ensuring that an AI system's goals, values, and behaviors remain consistent with human intentions, especially as the system becomes more capable. GRC relevance: Alignment failure at the system level is the research community's framing of the risk that governance frameworks address at the organizational level through oversight mechanisms and behavioral monitoring.
Black Box
An AI system whose internal logic — the process by which it transforms inputs into outputs — is opaque and not directly inspectable by human reviewers. Most large foundation models are black boxes in the strict sense: their outputs cannot be fully explained by examining their architecture. GRC relevance: Black-box models require compensating controls (behavioral testing, output monitoring, adversarial evaluation) because traditional audit methods that rely on process inspection are unavailable.
Chain-of-Thought Reasoning
A prompting technique and training objective in which an AI model generates intermediate reasoning steps before producing a final answer, making its inference process partially visible. GRC relevance: Chain-of-thought outputs provide partial auditability but are not equivalent to a true audit trail — the model can produce plausible-sounding reasoning that does not accurately represent its actual computational process.
Constitutional AI
An alignment technique developed by Anthropic in which an AI model is trained to evaluate and revise its own outputs against a stated set of behavioral principles. GRC relevance: The alignment method a vendor uses is a due diligence question, because it affects how GRC teams assess behavioral stability across model updates.
Emergent Behavior
Capabilities or failure modes that arise in large AI models at scale and were not explicitly trained, anticipated, or designed. GRC relevance: Emergent behavior is the primary reason that ex ante risk enumeration is insufficient for AI systems — the failure modes the organization has not enumerated are frequently the ones that materialize. See the analysis of emergence for the governance implication.
Fine-Tuning
The process of further training a pre-trained foundation model on a domain-specific dataset to adapt its behavior for a particular use case. GRC relevance: Fine-tuned models inherit the risks of the underlying foundation model plus introduce new risks specific to the fine-tuning dataset — including data quality issues, bias amplification, and behavioral changes not visible in the original model's evaluation suite.
Foundation Model
A large AI model trained on broad data at scale that can be adapted to a wide range of downstream tasks, typically through fine-tuning or prompting. GPT-4, Claude, Gemini, and Llama are examples. GRC relevance: Foundation models are the base layer of most enterprise AI deployments — their characteristics, limitations, and update schedules propagate to all applications built on them.
Frontier Model
The most capable AI models at a given point in time, typically at or near the frontier of what current training methods can produce. GRC relevance: Frontier models attract the most intensive regulatory attention — EU AI Act GPAI provisions, US AI Action Plan high-capability provisions — and typically have the most uncertain emergent behavior profiles.
Grounding
The practice of connecting AI model outputs to verifiable, external sources of information — through retrieval, citations, or structured data access — to reduce hallucination. GRC relevance: Grounding is a primary architectural mitigation for hallucination risk in knowledge-intensive applications. It does not eliminate the risk but substantially reduces it.
Hallucination
The generation by an AI system of factually incorrect, fabricated, or nonsensical content presented with apparent confidence. The term is colloquial; the technical community increasingly prefers confabulation. GRC relevance: Hallucination is a known, architecturally predictable property of current language models. The governance response is architectural verification — not enhanced human review — because human reviewers cannot reliably detect well-formed false output without independently verifying every claim.
Human-in-the-Loop
An AI system design in which a human reviewer examines and approves AI-generated outputs or decisions before they take effect. Distinguished from human-on-the-loop (human monitors the system and can intervene) and human-in-command (human retains ultimate authority but does not review each output). GRC relevance: The appropriate level of human involvement depends on the stakes of the decision and the volume of outputs. Regulatory frameworks increasingly specify minimum human involvement requirements for high-stakes use cases.
Latent Space
The high-dimensional mathematical representation that a neural network uses to encode meaning, context, and relationships between concepts, derived from training. GRC relevance: Latent space analysis is used in bias detection and interpretability research to identify whether a model has encoded discriminatory patterns in its internal representations — relevant for AI fairness audits.
Mixture of Experts (MoE)
A neural network architecture in which different specialized sub-networks ("experts") handle different types of inputs, with a routing mechanism selecting which experts to activate for each query. GRC relevance: MoE architectures can produce inconsistent behavior on inputs that fall at the boundary between expert domains — a consideration for model risk management in applications where consistency across similar inputs is a compliance requirement.
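The boundary effect described above can be sketched with a toy router. This is a simplified illustration, not how any particular production model routes: real MoE layers typically route per token, often to more than one expert, using learned gate weights. The two-expert gate and the input vectors below are hypothetical.

```python
def route_top1(features, gate_weights):
    """Return the index of the expert with the highest gate score (top-1 routing)."""
    scores = [sum(w * x for w, x in zip(weights, features)) for weights in gate_weights]
    return max(range(len(scores)), key=lambda i: scores[i])

# Hypothetical gate: expert 0 scores the first feature, expert 1 scores the second.
gate_weights = [[1.0, 0.0], [0.0, 1.0]]

# Two inputs that are nearly identical but sit on opposite sides of the routing boundary.
input_a = [0.51, 0.49]
input_b = [0.49, 0.51]

expert_a = route_top1(input_a, gate_weights)
expert_b = route_top1(input_b, gate_weights)
print(f"input_a -> expert {expert_a}, input_b -> expert {expert_b}")
```

Because the two inputs are handled by different sub-networks, nothing in the architecture guarantees that their outputs will be consistent, which is why consistency has to be tested rather than assumed.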
Model Card
A standardized documentation artifact describing an AI model's intended uses, training data, performance characteristics, limitations, and known biases, typically published by the model developer. GRC relevance: Model cards are the primary vendor disclosure mechanism for model-level risk. They are surrogates for auditability, not audit trails — they describe intended behavior, not guaranteed behavior.
Model Drift
The degradation of an AI model's performance over time as the distribution of production data diverges from the training data distribution. Includes both concept drift (the underlying relationship between inputs and correct outputs changes) and data drift (the statistical distribution of input features shifts). GRC relevance: Drift can degrade a model that has not changed at all since it was validated, so a point-in-time validation is not sufficient evidence of ongoing control effectiveness. It is central to the internal controls blind spot; see The Four Blind Spots of Force-Fitting AI and the vendor accountability analysis in The Fight for AI Credit Justice.
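One common way to put a number on data drift is the Population Stability Index (PSI), which compares how a feature is distributed in a baseline sample and in a production sample. The sketch below is a minimal illustration under stated assumptions: the bin edges, the sample values, and the epsilon floor for empty bins are all hypothetical, and any alert threshold would need to be set by the model risk program itself. PSI measures data drift only; it says nothing directly about concept drift.

```python
import math

def population_stability_index(expected, actual, bin_edges, eps=1e-6):
    """PSI between a baseline sample and a production sample of one numeric feature."""
    def bin_fractions(values):
        counts = [0] * (len(bin_edges) + 1)
        for v in values:
            # Count how many edges the value has reached; with sorted edges this is the bin index.
            idx = sum(1 for edge in bin_edges if v >= edge)
            counts[idx] += 1
        total = len(values)
        # The eps floor keeps empty bins from producing log(0) or division by zero.
        return [max(c / total, eps) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

baseline = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]  # hypothetical validation-time values
shifted = [v + 0.4 for v in baseline]                           # hypothetical production values
edges = [0.25, 0.5, 0.75]

psi_same = population_stability_index(baseline, baseline, edges)
psi_shifted = population_stability_index(baseline, shifted, edges)
print(f"PSI, identical samples: {psi_same:.3f}")
print(f"PSI, shifted sample:    {psi_shifted:.3f}")
```

Identical samples give a PSI of zero; the shifted sample gives a clearly positive value. In a monitoring pipeline the same calculation would run on a schedule, with the result recorded as evidence that the control operated.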
Multimodal
AI systems capable of processing and generating multiple types of data — text, images, audio, video, code — within a single model architecture. GRC relevance: Multimodal models introduce broader attack surfaces and more complex failure mode taxonomies than text-only models — image generation failures, audio deepfake risks, and cross-modal hallucinations are all distinct risk categories.
Core Lexicon: N through Z
Neural Scaling Laws
Empirical observations that AI model performance improves predictably as model size, training data volume, and compute increase, following power-law relationships. GRC relevance: Scaling laws underpin the strategic and regulatory concern about frontier models — they predict that larger models will exhibit qualitatively different (and potentially harder to govern) capabilities. The EU AI Act's GPAI provisions are partly a response to scaling law implications.
Parameters
The numerical values learned during training that define a model's behavior: the "knowledge" encoded in the model's weights. Modern large language models range from billions to trillions of parameters. GRC relevance: Parameter count is often used as a rough proxy for model capability and risk. The EU AI Act bases its systemic-risk presumption on training compute (measured in FLOPs) rather than parameter count, but the underlying logic is similar.
Prompt Injection
An attack technique in which malicious instructions are embedded in input data (a document, a web page, an email) that an AI agent processes, causing the agent to execute the attacker's instructions rather than the user's. GRC relevance: Prompt injection is a critical security risk for agentic AI systems with access to external tools or data. It constitutes a new attack surface that traditional application security frameworks do not address.
Retrieval-Augmented Generation (RAG)
An AI architecture in which a language model retrieves relevant documents from an indexed knowledge base before generating its response, grounding the output in retrieved content rather than solely in training data. GRC relevance: RAG reduces but does not eliminate hallucination; it introduces new risks around retrieval source reliability, document currency, and attribution accuracy.
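The retrieval step, and the governance metadata it can carry, might be sketched as follows. This is an illustration only: the keyword-overlap ranking stands in for a real retriever (typically embedding-based), and the `Document` fields, source IDs, and review dates are hypothetical. No model is called; the sketch stops at the prompt that would be sent.

```python
from dataclasses import dataclass

@dataclass
class Document:
    source_id: str       # hypothetical identifier of a governed source
    last_reviewed: str   # hypothetical ISO date of the last currency review
    text: str

def retrieve(query, documents, top_k=2):
    """Rank documents by naive keyword overlap with the query (stand-in for a real retriever)."""
    query_terms = set(query.lower().split())
    scored = []
    for doc in documents:
        overlap = len(query_terms & set(doc.text.lower().split()))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

def build_grounded_prompt(query, retrieved):
    """Assemble a prompt that carries source IDs so answers can be attributed and checked."""
    context = "\n".join(
        f"[{doc.source_id}, reviewed {doc.last_reviewed}] {doc.text}" for doc in retrieved
    )
    return f"Answer using only the sources below and cite their IDs.\n{context}\nQuestion: {query}"

knowledge_base = [
    Document("policy-007", "2025-11-01", "vendor contracts must include model drift reporting clauses"),
    Document("hr-012", "2024-02-15", "employees accrue vacation days monthly"),
    Document("policy-003", "2025-06-30", "model validation is required before deployment"),
]

query = "what must vendor contracts include about model drift"
retrieved = retrieve(query, knowledge_base)
prompt = build_grounded_prompt(query, retrieved)
print(prompt)
```

Carrying the source ID and review date into the prompt is one way to make the three risks named above (source reliability, document currency, attribution accuracy) visible to whoever reviews the output.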
RLHF (Reinforcement Learning from Human Feedback)
A training technique in which human evaluators rate model outputs, and those ratings are used to train a reward model that guides further model refinement, shaping the model's behavior toward human preferences. GRC relevance: RLHF is the primary mechanism by which major AI providers shape their models' behavioral guardrails. Changes to RLHF training data or methodology between model versions are a source of behavioral instability that model risk programs should track.
Stochastic
Involving randomness. For AI, the term refers to the fact that language models sample their outputs from a probability distribution, so the same prompt does not always produce the same response. GRC relevance: Stochasticity means that model testing cannot exhaustively characterize model behavior. A model that produces acceptable output on a test set may still produce problematic output in production on the same or similar inputs.
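A short calculation shows how little a clean test run proves about a stochastic system. If a model passes n independent trials with zero failures, the largest per-call failure rate still consistent with that result at a given confidence level can be solved for directly. The sketch assumes independent trials with a constant failure probability, which real workloads only approximate; the test count and daily call volume are hypothetical.

```python
def max_failure_rate_after_clean_tests(n_trials, confidence=0.95):
    """Largest per-call failure rate still consistent with n clean trials at the given confidence."""
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    # If the true failure rate is p, the chance of n clean independent trials is (1 - p) ** n.
    # Setting that equal to (1 - confidence) and solving for p gives the upper bound.
    return 1 - (1 - confidence) ** (1 / n_trials)

n_tests = 300              # hypothetical size of a pre-deployment test set
daily_calls = 1_000_000    # hypothetical production volume

bound = max_failure_rate_after_clean_tests(n_tests)
print(f"Upper bound on failure rate: {bound:.4%}")
print(f"Failures per day still consistent with the test: up to {bound * daily_calls:,.0f}")
```

With 300 clean trials the bound is just under one percent, which at high volume still leaves room for thousands of failures per day. This is the quantitative reason pre-deployment testing has to be paired with production monitoring.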
Synthetic Data
Data generated by an AI system rather than collected from the real world, used to augment training datasets, test systems, or protect privacy. GRC relevance: Synthetic data introduces a data provenance question — models trained on synthetic data may inherit or amplify biases present in the generator model. For regulated industries, the question of whether synthetic training data satisfies data governance requirements is unresolved.
Temperature
A parameter that controls the randomness of an AI model's outputs — higher temperature produces more varied and creative responses, lower temperature produces more deterministic and consistent responses. GRC relevance: Temperature settings affect model behavior in ways that are relevant to compliance use cases: low-temperature settings are appropriate for factual query applications; higher temperatures introduce more variance and more hallucination risk.
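The effect of temperature can be seen in the standard softmax-with-temperature calculation, which converts a model's raw scores into sampling probabilities. The three scores below are hypothetical, and the sketch shows only the probability calculation, not a full decoding loop.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

At low temperature almost all probability mass sits on the top-scoring token, so repeated runs tend to agree. At higher temperature the lower-scoring candidates are sampled more often, which is where the additional variance comes from.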
Token
The basic unit of text that an AI language model processes — roughly a word fragment, word, or punctuation mark, depending on the tokenization scheme. Models have maximum context windows measured in tokens. GRC relevance: Context window limits determine how much information a model can process in a single interaction, affecting applications like document analysis, contract review, and compliance auditing where long documents must be processed in full.
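When a document exceeds the context window, one common workaround is to split it into overlapping chunks that each fit the budget. The sketch below uses whitespace-separated words as a stand-in for tokens, which is an assumption: real tokenizers split text differently, so actual token counts will differ. The budget and overlap values are hypothetical.

```python
def chunk_by_token_budget(text, max_tokens, overlap=0):
    """Split text into chunks of at most max_tokens, using whitespace words as a stand-in for tokens."""
    if max_tokens <= 0 or not 0 <= overlap < max_tokens:
        raise ValueError("need max_tokens > 0 and 0 <= overlap < max_tokens")
    tokens = text.split()
    step = max_tokens - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # the current chunk already reaches the end of the text
    return chunks

contract = " ".join(f"w{i}" for i in range(10))  # hypothetical ten-word document
chunks = chunk_by_token_budget(contract, max_tokens=4, overlap=1)
for i, chunk in enumerate(chunks):
    print(i, chunk)
```

Chunking has a governance cost: a clause that depends on text in another chunk may be analyzed without its context, so a chunked review is not equivalent to processing the document in full.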
Transformer
The neural network architecture underlying most current large language models, introduced in the 2017 paper "Attention Is All You Need." The transformer's attention mechanism enables models to process long-range dependencies in text. GRC relevance: Transformer architecture is not a direct GRC concern, but it is the foundation for understanding why properties like emergent behavior, hallucination, and context sensitivity arise — all of which have direct governance implications.
Trust and Safety (T&S) by Design
The practice of incorporating safety, content moderation, and abuse prevention into AI system architecture from the outset, rather than applying them as post-hoc filters. GRC relevance: T&S by design is becoming a regulatory expectation — the EU AI Act and FTC guidance both imply that safety measures should be architectural rather than cosmetic. It is also a vendor due diligence dimension: whether a provider has implemented T&S architecturally affects the resilience of its safety measures to adversarial inputs.
Zero-Shot Learning
The capability of an AI model to perform a task without having been trained on examples of that specific task, relying instead on what it learned in general training. Zero-shot capability is a defining feature of foundation models. GRC relevance: Zero-shot capability means that model risk evaluations cannot anticipate all use cases a model will be applied to. Users will apply the model to tasks it was not tested for, and its performance on those tasks is uncertain.
Key Acronyms for GRC-AI Practice
Acronym | Full Term | GRC Relevance
AGI | Artificial General Intelligence | A hypothetical AI system with human-level cognitive ability across all domains. Regulatory frameworks increasingly reference AGI thresholds as triggers for enhanced oversight requirements.
AIaaS | AI as a Service | Cloud-delivered AI capabilities. The shared-responsibility model for AIaaS (who owns what risk) is the central governance question for enterprise AI procurement.
GPAI | General Purpose AI | EU AI Act classification for foundation models with broad applicability. GPAI models with systemic risk (above compute thresholds) face enhanced obligations including adversarial testing and incident reporting.
 |  | The emerging sub-discipline at the intersection of GRC and AI. Encompasses AI model governance, AI risk management, responsible AI compliance, and AI regulatory compliance. See AI GRC governance roles.
LLM | Large Language Model | Foundation models trained primarily on text data at scale. The category encompasses GPT, Claude, Gemini, Llama, and most commercial AI platforms. LLM-specific risks (hallucination, prompt injection, model drift) are the primary focus of AI model risk frameworks.
MCP | Model Context Protocol | A protocol developed by Anthropic for structured communication between AI models and external tools. Relevant for GRC teams assessing agentic AI architectures and tool-use security.
MLOps | Machine Learning Operations | The set of practices, tools, and frameworks for deploying and maintaining machine learning models in production. Model governance programs should be integrated with MLOps pipelines: the operational infrastructure is where continuous monitoring and re-validation happen.
RAG | Retrieval-Augmented Generation | See lexicon entry above. A primary architectural mitigation for hallucination risk in knowledge-intensive applications.
RLHF | Reinforcement Learning from Human Feedback | See lexicon entry above. The training technique that shapes model behavioral guardrails.
SLA | Service Level Agreement | In the AI context, SLAs traditionally cover platform availability. Extending SLAs to cover output quality and drift is the frontier of AI vendor contract governance; see The Fight for AI Credit Justice.
T&S | Trust and Safety | The organizational function and design practice responsible for preventing harmful AI outputs and misuse. Increasingly a due diligence dimension in AI vendor assessment.
FLOP | Floating Point Operation | The compute unit used to measure model training scale. EU AI Act GPAI systemic risk threshold: 10²⁵ FLOPs for training. Used as a regulatory proxy for model capability.
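To give the 10²⁵ figure some scale, training compute for a dense transformer is often approximated as about 6 floating point operations per parameter per training token. That rule of thumb comes from the scaling-law literature and is an assumption here; it is not the method the EU AI Act prescribes for measuring compute. Both model configurations below are hypothetical.

```python
SYSTEMIC_RISK_THRESHOLD_FLOPS = 1e25  # threshold figure cited in the FLOP entry above

def estimate_training_flops(n_parameters, n_training_tokens):
    """Rough estimate: about 6 floating point operations per parameter per training token."""
    return 6 * n_parameters * n_training_tokens

# Hypothetical configurations: (parameter count, training tokens)
hypothetical_models = {
    "small-model": (7e9, 2e12),
    "large-model": (5e11, 1e13),
}

for name, (params, tokens) in hypothetical_models.items():
    flops = estimate_training_flops(params, tokens)
    status = "above" if flops > SYSTEMIC_RISK_THRESHOLD_FLOPS else "below"
    print(f"{name}: ~{flops:.1e} FLOPs, {status} the 1e25 threshold")
```

An estimate like this is useful for a first-pass vendor question ("is this model plausibly in scope?"), not as a substitute for the provider's own compute disclosure.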
Can Existing Governance Terminology Cover AI?
The lexicon above describes AI concepts. The more difficult question is whether the governance vocabulary deployed to manage those concepts — the language of control frameworks, risk registers, audit standards, and board oversight — is adequate to the task. The answer has three parts, corresponding to the three destabilizing features of AI analyzed in the foundational essay in this series.
Opacity and the Limits of Audit Vocabulary
Traditional audit vocabulary (traceability, walkthroughs, process inspection) presupposes that a decision can be reconstructed from its inputs, the rules applied, and the output produced. Auditability, in the conventional sense, requires a readable trail. Black-box AI models do not provide one. The forward pass through billions of parameters that produces an output is not a readable trail; it is a mathematical transformation that cannot be translated back into a human-comprehensible explanation of why a specific output was produced.
The governance vocabulary has responded to this with concepts like explainability and interpretability — but these are not synonyms for auditability, and treating them as such is an error with real consequences. An explainability tool like LIME or SHAP produces a local approximation of model behavior around a specific prediction. It does not produce an audit trail in the traditional sense. GRC frameworks that treat explainability output as audit evidence are using a surrogate that the underlying methodology does not support.
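The phrase "local approximation" can be made concrete with a toy surrogate. The sketch below is in the spirit of LIME but does not use the LIME or SHAP libraries: it queries a stand-in opaque model at points near one input, fits a distance-weighted straight line, and compares that line with the model both near and far from the input. The `black_box` function, the sampling radius, and the kernel width are all hypothetical.

```python
import math

def black_box(x):
    """Stand-in for an opaque model: the reviewer can query it but not read its logic."""
    return x ** 2

def local_linear_surrogate(model, x0, radius=0.1, n_points=21, kernel_width=0.05):
    """Fit y ~ intercept + slope * x on inputs near x0, weighted by closeness to x0."""
    xs = [x0 - radius + 2 * radius * i / (n_points - 1) for i in range(n_points)]
    ys = [model(x) for x in xs]
    # Gaussian kernel: perturbations closer to x0 count more in the fit.
    ws = [math.exp(-((x - x0) ** 2) / (2 * kernel_width ** 2)) for x in xs]
    w_total = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / w_total
    y_mean = sum(w * y for w, y in zip(ws, ys)) / w_total
    # Closed-form weighted least squares for a single feature.
    slope = (sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
             / sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs)))
    intercept = y_mean - slope * x_mean
    return intercept, slope

intercept, slope = local_linear_surrogate(black_box, x0=1.0)
for x in (1.05, 3.0):
    print(f"x={x}: model={black_box(x):.3f}, surrogate={intercept + slope * x:.3f}")
```

Near the explained input the surrogate tracks the model closely; far from it, the two diverge sharply. The explanation is faithful only in the neighborhood where it was fitted, which is why it cannot stand in for a trail of how the model actually reached its output.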
Emergence and the Limits of Risk Register Vocabulary
Risk registers enumerate known risks. They do not have a mechanism for representing risks that cannot be known in advance — risks that arise from properties of a system that could not be predicted from the system's design specifications. The vocabulary of risk management — likelihood, impact, residual risk, risk appetite — presupposes a universe of enumerable events. Emergent behavior in large AI models violates this presupposition.
The regulatory responses to this limitation — the EU AI Act's continuous post-market monitoring requirement, the FTC's emphasis on ongoing behavioral testing, the FSB's call for AI system-level risk assessment — all reflect a recognition that risk register vocabulary, applied to AI without modification, produces an analysis that is systematically incomplete. The specific failure modes that will materialize are precisely the ones that the risk register does not contain.
Scale and Speed and the Limits of Oversight Vocabulary
The vocabulary of oversight (review, approval, escalation, four-eyes, sign-off) presupposes a timescale on which human review is meaningful. Agentic AI systems operating at scale eliminate that timescale. The oversight vocabulary does not fail because human reviewers are careless or slow. It fails because the architecture does not present outputs for review: it executes sequences of actions autonomously, and by the time human review could occur, the action is complete.
The governance response requires new vocabulary that does not presuppose human review: programmatic guardrails (behavioral constraints encoded in the system environment, operating at the same velocity as the agent), capability boundaries (defined limits on what actions an agent may take without human authorization), and audit trails of agent action (post-hoc reconstructions of what the agent did, enabling accountability after the fact if not before). None of these concepts map cleanly onto existing oversight vocabulary.
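The three concepts above can be sketched as a single check that sits between an agent and the actions it requests. This is a minimal illustration, not a reference design: the class name, the action names, the amount limit, and the decision strings are all hypothetical, and a real deployment would enforce the check outside the agent's own process so the agent cannot bypass it.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Guardrail:
    allowed_actions: set   # capability boundary: actions the agent may take at all
    max_amount: float      # hypothetical limit above which human authorization is required
    halted: bool = False   # kill-switch state
    audit_log: list = field(default_factory=list)

    def authorize(self, action, amount=0.0):
        """Decide on one requested action and record the decision, whatever it is."""
        if self.halted:
            decision = "blocked:kill_switch"
        elif action not in self.allowed_actions:
            decision = "blocked:outside_capability_boundary"
        elif amount > self.max_amount:
            decision = "escalate:human_authorization_required"
        else:
            decision = "allowed"
        # Every request is logged, including blocked ones, so the trail is complete.
        self.audit_log.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "amount": amount,
            "decision": decision,
        })
        return decision

guardrail = Guardrail(allowed_actions={"read_invoice", "issue_refund"}, max_amount=100.0)
print(guardrail.authorize("issue_refund", 50.0))
print(guardrail.authorize("issue_refund", 500.0))
print(guardrail.authorize("delete_records"))
guardrail.halted = True
print(guardrail.authorize("read_invoice"))
```

The check runs at the agent's speed, so it does not depend on a human being present, while the log gives reviewers a record to examine afterward.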
Hybrid Terms the Discipline Needs
The governance vocabulary gap is not simply a matter of adding AI terms to GRC dictionaries. It requires a small set of genuinely new hybrid concepts that bridge the semantic gap between AI technical reality and governance institutional logic. Three are emerging from regulatory, academic, and practitioner communities as particularly load-bearing.
Algorithmic Fiduciary Duty
Traditional fiduciary duty applies to human actors in positions of trust — directors, trustees, investment advisors — who are obligated to act in the interests of those they serve. As AI systems are deployed to make decisions in fiduciary contexts (investment management, benefits eligibility, medical triage, lending), the question of whether the deploying organization assumes a form of algorithmic fiduciary duty — a heightened obligation of care calibrated to the specific capabilities and limitations of the AI system — is becoming practically significant. The Yale Journal on Regulation has been a primary venue for developing this concept, and several EU AI Act provisions imply a version of it for high-risk system deployers.
Model Audit Trail
A model audit trail is not an audit of the model's internal logic (which is not accessible for black-box systems) but a documented record of model behavior over time: training data provenance, evaluation results at deployment, monitoring metrics over the production period, incidents and near-misses, model updates and their effects on evaluation performance, and human decisions made in response to model outputs. A model audit trail is the governance artifact that makes post-hoc accountability possible in the absence of traditional process auditability. Several regulatory frameworks (EU AI Act, FDA AI/ML guidance for medical devices) are converging on requirements that amount to mandatory model audit trails for high-risk deployments.
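As a data structure, a model audit trail could be as simple as an append-only sequence of typed events. The sketch below maps the record categories listed above onto event types; the class name, field names, model ID, and example content are hypothetical, and no regulatory framework is known to prescribe this particular schema.

```python
import json
from dataclasses import dataclass, asdict

# Event types mirror the record categories named in the paragraph above.
EVENT_TYPES = {
    "training_data_provenance",
    "deployment_evaluation",
    "monitoring_metric",
    "incident",
    "model_update",
    "human_decision",
}

@dataclass(frozen=True)  # frozen: an event cannot be altered after it is recorded
class AuditEvent:
    model_id: str
    event_type: str
    recorded_on: str  # ISO date
    detail: str

    def __post_init__(self):
        if self.event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {self.event_type}")

def to_json_line(event):
    """Serialize one event as a single JSON line, suitable for an append-only log file."""
    return json.dumps(asdict(event), sort_keys=True)

event = AuditEvent(
    model_id="credit-scoring-v3",  # hypothetical model
    event_type="incident",
    recorded_on="2026-01-15",
    detail="Approval rate for one segment fell outside the monitored range.",
)
line = to_json_line(event)
print(line)
```

Restricting events to a fixed set of types keeps the trail queryable: an auditor can ask for every incident or every model update without reading free text.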
Human-in-Command
Distinct from human-in-the-loop (per-output review) and human-on-the-loop (monitoring with intervention capability), human-in-command is the governance principle that humans must retain ultimate authority over the consequences of AI system behavior — even when they cannot review each output in real time. Human-in-command means that the system environment encodes the boundaries of acceptable action, that escalation triggers activate when those boundaries are approached, and that humans receive actionable information about system behavior on a timescale where meaningful response is possible. This is the control topology that agentic AI governance requires. For career pathways in this emerging governance architecture, see the GRC role directory and the AI GRC governance roles guide.
Frequently Asked Questions
What is the difference between hallucination and confabulation in AI?
Both terms refer to the same failure mode: AI-generated output that is factually incorrect but presented with apparent confidence. Hallucination is the colloquial term in widespread use. Confabulation is the more technically precise term from neuropsychology. For GRC purposes, the governance implication is the same: AI output in high-stakes domains requires verification pipelines, not enhanced human review, because reviewers cannot reliably detect well-formed false output without independently verifying every claim.
What is RAG and why does it matter for AI risk management?
Retrieval-Augmented Generation (RAG) grounds AI outputs in retrieved documents before generation, substantially reducing — though not eliminating — hallucination risk. For GRC purposes, RAG introduces new risk dimensions: reliability of the retrieval source, currency of indexed documents, and attribution accuracy. Organizations deploying RAG-based AI in compliance or legal applications should include retrieval source governance in their AI risk frameworks.
What does 'agentic AI' mean for GRC professionals?
Agentic AI systems operate autonomously to complete multi-step tasks, making sequential decisions without human review of each step. For GRC professionals, this requires a completely different control topology: programmatic guardrails, capability boundaries, action audit trails, and kill-switch mechanisms — not human approval queues. The Four Blind Spots essay analyzes how the velocity blind spot applies specifically to agentic systems.
What is the difference between AI safety and AI governance?
AI safety is the research discipline of ensuring AI systems are fundamentally aligned with human values, including in advanced AI scenarios. AI governance is the organizational discipline of managing AI risk within business operations through policies, controls, oversight structures, and accountability mechanisms. They overlap — safety research informs governance design — but operate at different levels. GRC professionals practice AI governance; they inform themselves about AI safety to understand the risk landscape they are managing.
What is Constitutional AI and why is it relevant to compliance?
Constitutional AI (developed by Anthropic) trains models to evaluate their own outputs against stated behavioral principles. For compliance purposes, it is relevant as a vendor due diligence dimension: the alignment method used affects how stable behavioral constraints are across model updates. Organizations with specific behavioral requirements — content restrictions, refusal requirements, output format mandates — should understand their vendor's alignment approach as part of model risk assessment.
Where can GRC professionals find AI governance career resources?
GRCcareers.ai publishes ongoing analysis of the GRC-AI career landscape. For active role listings, the AI GRC governance roles guide covers emerging positions. The broader GRC role directory at ExecSearches maps the full range of governance and compliance positions in nonprofit and public-sector organizations.
About the Author
Stephan Pochet is the founder of GRCcareers.ai and ExecSearches.com. He has spent more than two decades placing senior executives across nonprofit and public-sector organizations and launched GRCcareers.ai to address the emerging intersection of AI governance and executive talent.