Compare commits
2 Commits
616b3139e0...cf2eb30a04
| Author | SHA1 | Date |
|---|---|---|
| | cf2eb30a04 | |
| | bb0ea229e4 | |
@@ -38,6 +38,8 @@ These files are a prompt injection surface. A malicious contributor, compromised
- Override safety behaviors the agent would normally follow
- Instruct the agent to ignore security warnings

IMPORTANT: A sandbox like Greywall does NOT prevent prompt injection. The agent will still read these files and follow the instructions. What a sandbox does is contain the blast radius: even if the agent is hijacked, it can't exfiltrate data (network blocked), can't read secrets (filesystem denied), can't run destructive commands (command blocked). Prompt injection findings should reflect this nuance: the risk is that the agent's behavior is manipulated, and without a sandbox the manipulated agent has unrestricted access.

If agent instruction files exist in this repo, this is a SIGNIFICANT finding. The more instruction files present, the larger the attack surface.

SEVERITY CONTEXT:
@@ -584,6 +584,11 @@ export default function GamePage() {
  <span className={`text-[10px] font-sans font-medium uppercase tracking-wider ${severityColor(finding.severity)}`}>
    {finding.severity}
  </span>
  {finding.type === 'prompt_injection' && (
    <span className="text-[10px] font-sans text-muted-foreground/50 uppercase tracking-wider">
      · sandbox limits damage
    </span>
  )}
</div>
<h3 className="text-sm font-sans font-medium text-foreground mb-1">
  {finding.title}