90 Days Gen AI Risk Trial -Start Now
Book a demo
Security

What is Prompt Injection?

An attack technique where malicious instructions are inserted into AI prompts to manipulate the model's behavior or extract sensitive information.

Prompt injection is a security vulnerability in AI language models where an attacker crafts input that causes the AI to ignore its original instructions and follow new, malicious ones instead. It is one of the most significant security risks in AI-powered applications.

There are two main types: direct prompt injection, where the attacker directly inputs malicious prompts to the AI, and indirect prompt injection, where malicious instructions are embedded in external data sources (websites, documents, emails) that the AI processes.

Examples include: instructing an AI chatbot to reveal its system prompt or confidential instructions, manipulating AI-powered email filters to ignore spam, tricking AI code assistants into generating vulnerable code, and extracting training data or private information through carefully crafted prompts.

Mitigation strategies include input validation and sanitization, output filtering, instruction hierarchy (system prompts with higher priority), sandboxing AI responses, human review of AI actions, and regular security testing of AI-powered applications.

Related Terms

Protect Your Organization from AI Risks

Aona AI provides automated Shadow AI discovery, real-time policy enforcement, and comprehensive AI governance for enterprises.

Empowering businesses with safe, secure, and responsible AI adoption through comprehensive monitoring, guardrails, and training solutions.

Socials

Contact

Level 1/477 Pitt St, Haymarket NSW 2000

contact@aona.ai

Copyright ©. Aona AI. All Rights Reserved