building a vm inside chatgpt previews jailbreak design

see also: LLMs · Model Behavior

An experiment showed how to build a virtual machine inside ChatGPT using prompts alone (source). The project illustrates how users can craft structured constraints that both guide and jailbreak model behavior. I read it as a preview of the next wave of prompt engineering.

causal chain

Structured prompt → simulated VM → constrained actions, which matters because constraints become a tool for steering outputs. Constrained actions → predictable behavior → more powerful workflows, which increase user dependence on prompt scaffolding. Prompt scaffolding → safety bypass potential, which forces guardrails to evolve.
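The first step of that chain can be sketched in code. This is a hypothetical paraphrase of the scaffolding pattern the experiment relied on, not the original prompt: a numbered list of constraints that pushes the model into a VM-like role.

```python
# Hypothetical sketch of prompt scaffolding: a structured prompt that
# constrains the model to behave like a terminal. The wording below is
# a paraphrase of the pattern, not the experiment's actual prompt.

def build_vm_prompt() -> str:
    rules = [
        "Act as a Linux terminal.",
        "Reply with terminal output only, inside a single code block.",
        "Do not write explanations.",
        "Do not execute commands unless instructed to.",
        "If I need to say something in English, I will use {curly braces}.",
    ]
    # Numbered constraints make violations easy to spot, which is what
    # turns "constrained actions" into (somewhat) predictable behavior.
    return "I want you to act as a virtual machine.\n" + "\n".join(
        f"{i}. {rule}" for i, rule in enumerate(rules, start=1)
    )

prompt = build_vm_prompt()
print(prompt)
```

The fragility noted below follows directly from this shape: nothing enforces the rules except the model's current disposition to follow them.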

risk surface

  • Safety boundaries can be bypassed with clever scaffolding.
  • Users rely on fragile prompt constructs that break across model updates.
  • Teams ship systems that are hard to audit because logic lives in prompts, not code.

decision boundary

If model-native tooling can expose interpreters or safe execution layers, I will treat prompt-based VMs as a curiosity rather than a core pattern. Until then, they are a practical workaround that will keep showing up in the wild.
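What a model-native "safe execution layer" might look like can be sketched as follows. This is an illustrative assumption, not anything the source describes: instead of asking the model to pretend to be a VM, the host exposes a small audited interpreter and accepts only allowlisted commands (the names `ALLOWED` and `run_command` are invented for this sketch).

```python
# Hypothetical safe execution layer: logic lives in auditable code,
# and model output is treated as untrusted input against an allowlist.
# All names here are illustrative, not from the source.

ALLOWED = {
    "echo": lambda args: " ".join(args),
    "upper": lambda args: " ".join(args).upper(),
}

def run_command(line: str) -> str:
    """Parse one model-emitted command; reject anything off the allowlist."""
    cmd, *args = line.split()
    if cmd not in ALLOWED:
        return f"error: '{cmd}' is not permitted"
    return ALLOWED[cmd](args)

print(run_command("echo hello world"))  # hello world
print(run_command("rm -rf /"))          # error: 'rm' is not permitted
```

The design point is auditability: the boundary is enforced in code you can review and test, rather than in prompt text that may break across model updates.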

my take

This is clever, but it signals a deeper gap: people want programmable behavior, not just chat. We should build that directly.

linkage

linkage tree
  • tags
    • #ai
    • #security
    • #research
    • #2022
  • related
    • [[Building a VM Inside ChatGPT]]
    • [[chatgpt launch proves conversational ai is ready for consumers]]
    • [[gpt-3 release redefines ai api calculus]]

ending questions

What is the minimum structure needed to make AI outputs auditable and repeatable?