
fix(memory): add maxMessages limit to ConversationSummaryBufferMemory#6172

Closed
octo-patch wants to merge 1 commit into FlowiseAI:main from octo-patch:fix/issue-5873-summary-buffer-memory-max-messages

Conversation

@octo-patch

Fixes #5873

Problem

ConversationSummaryBufferMemory fetches all messages from the database on every request. As a conversation grows, the token-counting and summarization steps operate on an ever-larger dataset, causing increasing latency, cost, and eventual OOM/context-exceeded errors.

The root cause is that movingSummaryBuffer is an in-memory instance variable reset on each request, so every call re-processes the entire message history from scratch.

Solution

Add an optional maxMessages parameter (additional param, no default) that caps how many messages are loaded from the database before token-based pruning is applied.

  • When set, only the most recent N messages are loaded from the database, bounding token-count and summarization work regardless of conversation length.
  • When unset, behaviour is identical to before (all messages are loaded), so the change is fully backward-compatible.
  • Consistent with the existing k parameter in BufferWindowMemory.

The node version is bumped from 1.0 to 2.0 to reflect the new input parameter.

Testing

  • Existing chatflows using this node without maxMessages set are unaffected (parameter is optional with no default).
  • Setting maxMessages to 20 limits DB fetch to the last 20 messages; token pruning and summarization then operate only on that window.
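
The windowing behaviour described above can be sketched as follows. The helper name `applyMaxMessages` and the plain-array history are illustrative stand-ins; the real node applies the same `slice` logic to rows fetched from the database.

```typescript
// Hypothetical stand-in for a fetched chat history; the real node works on DB rows.
interface ChatMessage {
    role: 'userMessage' | 'apiMessage'
    content: string
}

// Mirrors the PR's logic: when maxMessages is set and positive, keep only
// the most recent N messages; otherwise return the history unchanged.
function applyMaxMessages(history: ChatMessage[], maxMessages?: number): ChatMessage[] {
    if (maxMessages && maxMessages > 0) {
        return history.slice(-maxMessages)
    }
    return history
}

const history: ChatMessage[] = Array.from({ length: 50 }, (_, i) => ({
    role: i % 2 === 0 ? 'userMessage' : 'apiMessage',
    content: `message ${i}`
}))

// maxMessages = 20 keeps only the last 20 messages; unset leaves all 50.
console.log(applyMaxMessages(history, 20).length) // 20
console.log(applyMaxMessages(history).length)     // 50
```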

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the ConversationSummaryBufferMemory node to version 2.0 and introduces a maxMessages parameter to limit the number of messages retrieved. While this helps manage the message count for processing, the current implementation performs the slicing in-memory after fetching the entire history from the database. Feedback suggests applying this limit at the database query level to prevent potential memory issues and improve performance.

Comment on lines +157 to +159
// Apply the optional maxMessages cap after fetching the full history
if (this.maxMessages && this.maxMessages > 0) {
    chatMessage = chatMessage.slice(-this.maxMessages)
}


Severity: high

The current implementation fetches the entire message history from the database into memory before applying the maxMessages limit via slice. This does not address the performance overhead or potential Out-Of-Memory (OOM) issues associated with loading a very large history, which contradicts the primary goal of this pull request. To effectively optimize the database load, the limit should be applied at the database level using TypeORM's take property. This would require changing the query order to DESC on createdDate, applying the limit, and then reversing the resulting array to maintain the chronological order expected by the subsequent logic.
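
The reviewer's DB-level approach can be sketched like this. TypeORM needs a live database, so the query is simulated below with a plain array standing in for the table; the `find` options shown in the comment follow TypeORM's real API shape, with `createdDate` taken from the review text and `chatflowid` assumed as the filter column.

```typescript
// In the actual node, the DB-level limit would look roughly like:
//   const rows = await repo.find({
//       where: { chatflowid },
//       order: { createdDate: 'DESC' },  // newest first
//       take: maxMessages                // limit applied by the database
//   })
//   rows.reverse()                       // restore chronological order
//
// Simulated here with an in-memory array standing in for the table.
interface Row { id: number; createdDate: number }

// Simulated "SELECT ... ORDER BY createdDate DESC LIMIT n", then reversed
// so the result is chronological, as the downstream logic expects.
function fetchLastN(table: Row[], n: number): Row[] {
    const newestFirst = [...table].sort((a, b) => b.createdDate - a.createdDate).slice(0, n)
    return newestFirst.reverse()
}

const table: Row[] = Array.from({ length: 100 }, (_, i) => ({ id: i, createdDate: i }))

// Same result as fetching everything and slicing in memory,
// but without loading the full history:
const viaLimit = fetchLastN(table, 20)
const viaSlice = table.slice(-20)
console.log(viaLimit.length, viaLimit[0].id === viaSlice[0].id) // 20 true
```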



Development

Successfully merging this pull request may close these issues.

Conversational Summary Buffer Memory gets all messages from db

2 participants