feat(agent): add max_turns and max_token_budget execution limits#2139
feat(agent): add max_turns and max_token_budget execution limits#2139rshriharripriya wants to merge 4 commits intostrands-agents:mainfrom
Conversation
|
Thank you for the MR. Looking forward to this as we are currently using hooks to configure the max turns. |
| self.agent_id = _identifier.validate(agent_id or _DEFAULT_AGENT_ID, _identifier.Identifier.AGENT) | ||
| self.name = name or _DEFAULT_AGENT_NAME | ||
| self.description = description | ||
| self.max_turns = max_turns |
There was a problem hiding this comment.
Add validation for invalid values. Negative values produce silent misbehavior — e.g. max_turns=-1 blocks all execution because the check 0 >= -1 is True on the first cycle. Same for max_token_budget.
if max_turns is not None and max_turns < 1:
raise ValueError("max_turns must be a positive integer")
if max_token_budget is not None and max_token_budget < 1:
raise ValueError("max_token_budget must be a positive integer")
self.max_turns = max_turns
self.max_token_budget = max_token_budget
This also resolves the max_token_budget=0 edge case where the >= check prevents any model call from ever executing (see comment on event_loop.py).
| merged_state = invocation_state | ||
|
|
||
| # Reset per-invocation execution limit counters. | ||
| self._invocation_turn_count: int = 0 |
There was a problem hiding this comment.
These instance variables don't exist on the agent until the first call to stream_async. The event_loop_cycle code works around this with getattr(agent, "_invocation_turn_count", 0), but every other agent attribute (_interrupt_state, _cancel_signal, messages, etc.) is initialized in init and accessed directly.
Please also initialize them in init alongside self.max_turns / self.max_token_budget:
self.max_turns = max_turns
self.max_token_budget = max_token_budget
self._invocation_turn_count: int = 0
self._invocation_token_count: int = 0
Keep the reset here in stream_async, but having them in init makes the getattr calls in event_loop.py unnecessary (they can become direct attribute access).
| with trace_api.use_span(cycle_span, end_on_exit=False): | ||
| try: | ||
| # Check max_turns execution limit before each cycle. | ||
| _max_turns = getattr(agent, "max_turns", None) |
There was a problem hiding this comment.
Every other agent attribute in this function is accessed directly (agent.messages, agent.event_loop_metrics, agent._interrupt_state).
Using getattr with defaults implies these attributes might not exist, which is inconsistent and makes the contract unclear.
If the counters are initialized in init (see comment on agent.py), these can be simplified to:
if agent.max_turns is not None and agent._invocation_turn_count >= agent.max_turns:
Same applies to the max_token_budget check and the accumulation line further down (getattr(agent, "_invocation_token_count", 0)).
| # Check max_token_budget execution limit before each cycle. | ||
| _max_token_budget = getattr(agent, "max_token_budget", None) | ||
| _invocation_tokens = getattr(agent, "_invocation_token_count", 0) | ||
| if _max_token_budget is not None and _invocation_tokens >= _max_token_budget: |
There was a problem hiding this comment.
The >= comparison creates an asymmetry with max_turns. With max_turns=1, the first cycle executes (0 >= 1 → False, counter increments, model runs). With max_token_budget=0, the first cycle is blocked (0 >= 0 → True, model never runs).
This is because the token counter starts at 0 and only accumulates after a model call, so the initial state 0 >= 0 fires immediately.
Two options:
- Use > instead of >= — guarantees at least one model call, matches max_turns semantics
- Validate max_token_budget >= 1 in init (my preference — see that comment)
|
|
||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_max_token_budget_stops_when_exceeded(): |
There was a problem hiding this comment.
The docstring says "MockedModelProvider attaches usage metadata" but it doesn't — MockedModelProvider never emits a {"metadata": ...} chunk, so message.get("metadata", {}).get("usage", {}) always returns {} and accumulated tokens are always 0. This test only passes because budget=0 triggers on the initial 0 >= 0 check.
There's no test that verifies a positive token budget actually stops the agent when real token counts accumulate. Consider either:
- Enhancing MockedModelProvider to emit usage metadata events, then testing with e.g. max_token_budget=100 where the mock reports 150 tokens
- Adding a unit test that sets agent._invocation_token_count = 500 directly and verifies the check fires with max_token_budget=500
Also, the inline comment says "First cycle executes" but then explains 0 >= 0 fires immediately — meaning the first cycle does not execute (no model call happens). The comment contradicts itself.
| mock._model_state = {} | ||
| mock.trace_attributes = {} | ||
| mock.retry_strategy = ModelRetryStrategy() | ||
| mock.max_turns = None |
There was a problem hiding this comment.
nit: These 4 lines are duplicated in 4 places across the event_loop test files (test_event_loop.py x2, test_event_loop_metadata.py, test_event_loop_structured_output.py). If the counters are initialized in Agent.init (see earlier comment), only mock agents that don't go through init need these. Consider extracting a shared fixture helper or adding these to an existing base fixture to reduce the boilerplate that would grow with future agent attributes.
c743c49 to
609fcbc
Compare
- Introduced `max_turns` and `max_token_budget` with validation and tracking. - Added exceptions for limit breaches: `MaxTurnsReachedException` and `MaxTokenBudgetReachedException`. - Enhanced tests for validation, runtime behavior, and async execution. - Created helper `apply_execution_limit_defaults` for streamlined test setup.
- Update docstrings for `max_turns` and `max_token_budget` with clearer descriptions and warnings for structured output retries. - Export `MaxTurnsReachedException` and `MaxTokenBudgetReachedException` for easier user access. - Add a comment to clarify ordering invariant for token accumulation in async generators. - Fix test mock to reference `_retry_strategy` correctly. - Introduce `test_max_token_budget_accumulates_from_message_metadata` to verify token accumulation logic.
…h `apply_execution_limit_defaults`
609fcbc to
0b34528
Compare
Summary
Closes #2124
max_turnsparameter toAgent.__init__— caps the number of event loop cycles per invocation; stops withstop_reason="max_turns"when reachedmax_token_budgetparameter toAgent.__init__— caps cumulative input+output tokens per invocation; stops withstop_reason="max_token_budget"when reachedNone(no limit), fully backwards compatible"max_turns"and"max_token_budget"to theStopReasonLiteral intypes/event_loop.pystream_async; in-flight tool calls always complete before a limit is enforcedTest plan
test_max_turns_stops_loop—max_turns=1stops before entering a second cycle triggered by a tool calltest_max_turns_allows_exactly_n_cycles—max_turns=2allows exactly 2 cyclestest_max_turns_none_is_unbounded—max_turns=None(default) runs to natural completiontest_max_turns_resets_between_invocations— counter resets on each new invocationtest_max_token_budget_stops_when_exceeded— budget of 0 triggers immediately after first model calltest_max_token_budget_none_is_unbounded—max_token_budget=None(default) runs to natural completiontest_max_token_budget_resets_between_invocations— token counter resets on each new invocationruff checkandruff format --checkpass on all changed filesmypyreports no issues on all changed source files