
Does Your AI Agent Know Who the User Is?

by Zhou Yu

The difference between general and user-situated agents is fundamental: optimizing for specific users changes how agents must be built and tested.


Why This Distinction Matters — Especially for Arklex and Agent Testing

User-situated agents fundamentally change how evaluation works. They cannot be evaluated with static prompts alone: they require synthetic users rather than single test cases, and behavioral distributions rather than golden answers. They also fail silently in ways general agents don't. A response can appear correct while subtly driving the wrong outcome for a specific user. This is exactly where generic LLM evaluation tools break down, and why Arklex exists.

The Core Distinction: General AI Agents vs User-Situated AI Agents

Most conversations about AI agents get stuck on the wrong axis. People debate chat vs non-chat, or copilots vs agents. But the real distinction is simpler—and far more important: does the agent's success depend on who the user is?

If the answer is no, you're looking at a general AI agent. If the answer is yes, you're dealing with a user-situated (context-aware) AI agent.

General AI agents: instruction-first, user-agnostic

A general AI agent produces roughly the same outcome for the same instruction, regardless of who issued it. The user is interchangeable. Think: "Summarize this document," "Write a SQL query," "Translate this paragraph," or "Generate Python code for quicksort." Even when these appear in a chat interface, the agent's behavior does not meaningfully depend on the user's background, history, or goals. The mental model is simple: given input X, produce output Y. Success is measured globally, by correctness, task completion, or similarity to a reference answer.
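As a minimal sketch of that mental model, the snippet below scores a toy general agent against a single reference answer. The `general_agent` function, its canned response, and the similarity metric are all illustrative assumptions, not part of any Arklex tooling. The point is only that the user never enters the computation.

```python
from difflib import SequenceMatcher


def general_agent(instruction: str) -> str:
    """Hypothetical stand-in for a general agent: output depends only on the instruction."""
    canned = {"Translate 'hello' to French": "bonjour"}
    return canned.get(instruction, "")


def score_against_reference(output: str, reference: str) -> float:
    """Similarity in [0, 1] between the agent output and one golden answer."""
    return SequenceMatcher(None, output.lower(), reference.lower()).ratio()


# Two different users issue the same instruction.
instruction = "Translate 'hello' to French"
outputs = {user: general_agent(instruction) for user in ("alice", "bob")}
scores = {user: score_against_reference(out, "Bonjour") for user, out in outputs.items()}
```

Because the user is interchangeable, one test case with one golden answer is enough to characterize the agent on this instruction.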

User-situated AI agents: people-first, context-driven

A user-situated agent behaves differently depending on which user is interacting with it and the situation they're in. Two users issuing the same instruction should reasonably get different outcomes. These agents condition their behavior on user goals and role, history and prior actions, constraints and risk tolerance, and the user-specific data and state they can access. The mental model shifts to: given this user, in this situation, what is the right action now? Success is no longer absolute. It is user-relative, often probabilistic, and sometimes only measurable over time.
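To make the shift concrete, here is a small sketch in which the agent's signature takes the user as well as the instruction. The `UserContext` fields, the insurance-flavored rules, and the response strings are hypothetical, chosen only to illustrate conditioning on user state. A real agent would reason over far richer context.

```python
from dataclasses import dataclass, field


@dataclass
class UserContext:
    """Hypothetical user state the agent conditions on."""
    goal: str
    risk_tolerance: str  # assumed values: "low" or "high"
    history: list[str] = field(default_factory=list)


def user_situated_agent(instruction: str, user: UserContext) -> str:
    """Toy agent: the same instruction maps to different actions per user."""
    if instruction != "What should I do next?":
        return "Could you tell me more about what you need?"
    # Prior actions take priority over general preferences.
    if "filed_claim" in user.history:
        return "Check the status of your open claim before changing your policy."
    if user.risk_tolerance == "low":
        return "Review your coverage limits before making any changes."
    return "Compare higher-deductible plans that could lower your premium."


alice = UserContext(goal="lower premium", risk_tolerance="high")
bob = UserContext(goal="stay covered", risk_tolerance="low", history=["filed_claim"])

reply_alice = user_situated_agent("What should I do next?", alice)
reply_bob = user_situated_agent("What should I do next?", bob)
```

Here `reply_alice` and `reply_bob` differ even though the instruction is identical, so no single reference answer can grade both.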

Why "chat-based" is the wrong framing

"Chat-based agent" is often used as shorthand for something deeper: an agent that reasons over a specific user's state and optimizes for that user's success. But chat is just the interface. You can have chat-based systems that are not user-aware (most LLM demos), and non-chat systems that are deeply user-aware (recommendation engines, fraud systems, personalization platforms). User-situatedness is an architectural choice, not a UI choice.

Same question, two agents: a concrete comparison

Same instruction: "What should I do next?"

Below, the same user question is shown twice: first how a general agent responds, then how a user-situated agent responds. The comparison makes the distinction clear.

[Figure: two panels. General agent: same advice for everyone. User-situated agent: context-aware guidance.]

In domains like insurance, finance, healthcare, or education, this distinction is everything. The "right" response depends on the person, not just the question.

The real challenge: defining success

This is where things get hard. General agents optimize for accuracy and task completion. User-situated agents optimize for outcomes: better decisions, reduced risk, and progress toward goals. That means static prompts aren't enough, "golden answers" don't exist, and evaluation requires synthetic users, behavioral distributions, and longitudinal thinking. Once you see this distinction, it becomes very hard to unsee, and it explains why so many agent systems struggle when they leave the demo and meet real users.
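One way to picture evaluation over synthetic users is the sketch below. It samples a population of hypothetical user profiles, runs a toy agent against each, and reports a success rate per user segment instead of a single pass/fail. The profile fields, the agent's deliberate blind spot, and the `is_appropriate` success rule are all assumptions made for illustration. This is not ArkSim's API, and it does not cover the longitudinal part of the problem.

```python
import random
from collections import defaultdict


def sample_synthetic_user(rng: random.Random) -> dict:
    """Draw one hypothetical synthetic user profile."""
    return {
        "risk_tolerance": rng.choice(["low", "high"]),
        "has_open_claim": rng.random() < 0.3,
    }


def agent_under_test(user: dict) -> str:
    """Toy agent with a deliberate blind spot: it ignores open claims for high-risk users."""
    if user["has_open_claim"] and user["risk_tolerance"] == "low":
        return "check_claim"
    return "review_coverage" if user["risk_tolerance"] == "low" else "compare_plans"


def is_appropriate(user: dict, action: str) -> bool:
    """Assumed success rule: users with an open claim should be pointed to it first."""
    if user["has_open_claim"]:
        return action == "check_claim"
    return action in {"review_coverage", "compare_plans"}


def success_rate_by_segment(n_users: int = 1000, seed: int = 0) -> dict:
    """Run the agent on sampled users and aggregate success per segment."""
    rng = random.Random(seed)
    totals, passes = defaultdict(int), defaultdict(int)
    for _ in range(n_users):
        user = sample_synthetic_user(rng)
        segment = (user["risk_tolerance"], user["has_open_claim"])
        totals[segment] += 1
        passes[segment] += is_appropriate(user, agent_under_test(user))
    return {segment: passes[segment] / totals[segment] for segment in totals}


rates = success_rate_by_segment()
```

In this toy setup the agent's output looks plausible for every user, yet the `("high", True)` segment fails every time. That kind of silent, segment-specific failure is invisible to a single static prompt and only shows up in the distribution.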