Conversation
Walkthrough

This PR implements connection caching for the PostgreSQL readers and writers to reuse a single connection across Lambda invocations instead of opening a new TCP connection per operation.

Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 3 passed | ❌ 2 failed (1 warning, 1 inconclusive)
🧹 Nitpick comments (2)
tests/unit/readers/test_reader_postgres.py (1)
385-405: Unused variable `pagination` flagged by static analysis.

The variable `pagination` on line 402 is unpacked but never used. Consider using an underscore prefix to indicate it's intentionally unused.

🔧 Suggested fix

```diff
- rows, pagination = reader.read_stats(limit=10)
+ rows, _pagination = reader.read_stats(limit=10)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/readers/test_reader_postgres.py` around lines 385 - 405, the test test_retries_on_operational_error unpacks reader.read_stats into rows, pagination but never uses pagination; rename the unused variable to _pagination (or simply use an underscore `_`) to satisfy static analysis and indicate intentional non-use. Update the line that calls reader.read_stats in test_retries_on_operational_error to unpack as rows, _pagination (or rows, _) so the test behavior (asserting connect call count and rows) remains unchanged while eliminating the unused-variable warning.

src/writers/writer_postgres.py (1)
60-60: Consider adding explicit connection cleanup for non-Lambda deployments.

The cached connection has no explicit `close()` mechanism. While this works well for Lambda (connections naturally close when the container terminates), long-running processes or integration tests may benefit from explicit cleanup. The `Writer` base class could be extended with an optional `close()` method.

This is not blocking for the current Lambda use case.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/writers/writer_postgres.py` at line 60, Add an explicit cleanup hook that closes the cached DB connection: extend the Writer base class with an optional close() method and implement it in the Postgres writer to call and null out self._connection.close() (or await if async) when a connection exists; update any connection-creating methods that set self._connection to ensure close() will be safe to call, and add a brief unit test or integration cleanup call to exercise Writer.close() in long-running tests or processes to avoid leaked connections.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6c04b6fc-8b9c-4a86-a7cd-246694e57bde
📒 Files selected for processing (7)
- .github/copilot-instructions.md
- .github/dependabot.yml
- src/readers/reader_postgres.py
- src/writers/writer_postgres.py
- tests/integration/test_connection_reuse.py
- tests/unit/readers/test_reader_postgres.py
- tests/unit/writers/test_writer_postgres.py
```python
self._secret_name = os.environ.get("POSTGRES_SECRET_NAME", "")
self._secret_region = os.environ.get("POSTGRES_SECRET_REGION", "")
self._db_config: dict[str, Any] | None = None
self._connection: Any | None = None
```
If we used mypy's strict mode, I am not sure this would be allowed without a warning. It's nice that you use types, but a type that is an optional `Any` does not say that much.
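One way to tighten the annotation is a small `Protocol` covering just the connection methods the code actually calls. A sketch (the protocol name and method list are assumptions, not the project's real API):

```python
from typing import Any, Optional, Protocol


class PGConnection(Protocol):
    """Just the driver-connection surface this class uses."""

    def cursor(self) -> Any: ...
    def rollback(self) -> None: ...
    def close(self) -> None: ...


class PostgresReader:
    def __init__(self) -> None:
        # `Optional[PGConnection]` survives mypy --strict;
        # `Any | None` collapses to Any and checks nothing.
        self._connection: Optional[PGConnection] = None
```

psycopg2 also ships a concrete `psycopg2.extensions.connection` class that could serve as the annotation directly.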
```python
        raise RuntimeError("Failed to load database configuration.")
    return config

def _get_connection(self) -> Any:
```
This is a pretty common pattern actually - there is `from functools import cached_property` for exactly these things: caching a property on a class. It's more native and intuitive. Pseudo-code:

```python
class DB:
    @cached_property
    def conn(self):
        return create_connection()
```

and it is also lazy by default, so the connection would be established on the first call, not eagerly
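A runnable version of that sketch, with a counter standing in for the real `psycopg2.connect` call, showing both the laziness and the caching:

```python
from functools import cached_property

CONNECT_CALLS = {"count": 0}


def create_connection():
    # Hypothetical stand-in for psycopg2.connect(...).
    CONNECT_CALLS["count"] += 1
    return object()


class DB:
    @cached_property
    def conn(self):
        # Runs on first access only; the result is stored in the
        # instance __dict__ and returned on every later access.
        return create_connection()


db = DB()
assert CONNECT_CALLS["count"] == 0  # lazy: nothing connected yet
first = db.conn
second = db.conn
```

A bonus for the retry path: `del db.conn` drops the cached value so the next access reconnects - the same effect the current code gets by setting `self._connection = None`.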
```python
user=db_config["user"],
password=db_config["password"],
port=db_config["port"],
options="-c statement_timeout=30000 -c default_transaction_read_only=on",
```
Consider extracting this 30s timeout into a constant somewhere.
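A minimal sketch of what that could look like (the constant and helper names are suggestions):

```python
STATEMENT_TIMEOUT_MS = 30_000  # 30 s server-side statement timeout


def connection_options(read_only: bool = True) -> str:
    """Build the libpq `options` string from a named constant
    instead of an inline magic number."""
    options = f"-c statement_timeout={STATEMENT_TIMEOUT_MS}"
    if read_only:
        options += " -c default_transaction_read_only=on"
    return options
```

The writer could then call `connection_options(read_only=False)` while the reader keeps the default.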
```python
    Raises:
        RuntimeError: On database connectivity or query errors.
    """
    db_config = self._load_db_config()
```
I saw this call earlier already, hmm. I might revisit the whole structure actually
But this does not belong here - DB config loading and validation, which is what the next few lines implement. Why not put this into a `get_conn` method or so?
```python
params: list[Any] = [ts_start, ts_end]
if cursor is not None:
    params.append(cursor)
query = psycopg2_sql.SQL(_RUNS_SQL_WITH_CURSOR)
```
I saw these queries. A couple of upgrade ideas, if you want; ordered from low to higher effort & practice:

- put them at least into triple double-quotes (`""" query here """`) - multi-line strings are perfect for this
- put them into a separate SQL file and load it. Combining SQL and Python is a bad practice. In simple projects like this it's okay, but as projects scale this is not maintainable (not to mention typical engineering practices - formatting, testing, discovery of these SQL files)
- put them into a separate Jinja2 file - with this, you can parametrize it from Python
- my most favourite option - aiosql: https://nackjicholson.github.io/aiosql/ - the idea here is also to have a separate file with the SQL that you load and work with
```python
    raw_rows = db_cursor.fetchall()
    connection.rollback()
    break
except OperationalError as exc:
```
this method does way too much. It loads the DB config, validates it, performs retries, manipulates the cursor, unpacks and post-processes the values. Split it please; it's hard to read, hard to test, hard to extend
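One possible split (all names hypothetical): keep `read_stats` as orchestration only, push config loading into `_get_connection`, retries into an `_execute_with_retry` helper, and make the post-processing a pure function. That last piece is then trivially testable without a database:

```python
# Hypothetical decomposition of read_stats:
#
#   read_stats()            - orchestration only
#     _get_connection()     - config loading/validation + cached connect
#     _execute_with_retry() - cursor handling + OperationalError retries
#     rows_to_dicts()       - pure post-processing, implemented below


def rows_to_dicts(raw_rows, col_names):
    """Turn driver row tuples into dicts keyed by column name."""
    return [dict(zip(col_names, row)) for row in raw_rows]
```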
```python
    """
    logger.debug("Sending to Postgres - %s.", table)
    query = psycopg2_sql.SQL("""
        INSERT INTO {}
```
yet another embedded SQL - I am discovering these more and more as I read the code. Seriously consider using AIOSQL :)
Cost / reason: imagine that someone wants to see all the ways we use the database - all the queries etc. With the current approach, where SQL statements are Python strings all over the codebase, that's very hard to do.
```python
with connection.cursor() as cursor:
    if topic_name == "public.cps.za.dlchange":
        self._postgres_edla_write(cursor, table_info["main"], message)
    elif topic_name == "public.cps.za.runs":
```
consider extracting these topic names into a common constant - maybe a frozen data class or so
```python
db_cursor.execute(query, params)
col_names = [desc[0] for desc in db_cursor.description]  # type: ignore[union-attr]
raw_rows = db_cursor.fetchall()
connection.rollback()
```
rollback on read? what am I missing here?
```python
    break
except OperationalError as exc:
    self._connection = None
    if attempt > 0:
```
What am I missing here - is the retry even working? Would it not fail after the first attempt?
or is it that the RuntimeError is non-retriable?
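For reference, a self-contained sketch of the retry shape under discussion: transient `OperationalError`s are retried, and `RuntimeError` is raised only once every attempt has failed, which makes it non-retriable by construction. The exception class here is a stand-in for `psycopg2.OperationalError`:

```python
class OperationalError(Exception):
    """Stand-in for psycopg2.OperationalError so the sketch runs anywhere."""


MAX_ATTEMPTS = 3


def query_with_retry(execute):
    last_exc = None
    for attempt in range(MAX_ATTEMPTS):
        try:
            return execute()
        except OperationalError as exc:
            # Real code would also drop the cached connection here.
            last_exc = exc
    raise RuntimeError("Query failed after retries.") from last_exc
```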
```python
    break
except OperationalError:
    self._connection = None
    if attempt > 0:
```
consider improving the logs, similar to how the Reader does it. In fact, maybe you can extract some of this into a common class handling connections and retries. But up to you - I know that the connection settings for reading and writing are slightly different (but that could be parametrized or so)
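A sketch of what that common class could look like, with the reader/writer difference reduced to a parametrized options string. All names here are suggestions, and `connect` stands in for `psycopg2.connect`:

```python
class PostgresBase:
    """Shared connection caching; subclasses only declare their options."""

    connect_options = ""

    def __init__(self, connect):
        self._connect = connect  # e.g. psycopg2.connect in real code
        self._connection = None

    def _get_connection(self):
        if self._connection is None:
            self._connection = self._connect(options=self.connect_options)
        return self._connection

    def _drop_connection(self):
        # Called from the retry path after an OperationalError.
        self._connection = None


class Reader(PostgresBase):
    connect_options = "-c default_transaction_read_only=on"


class Writer(PostgresBase):
    connect_options = ""
```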
I find the title to be a bit misleading considering the implementation. There is no connection pooling, just caching implemented manually. Either change the title and PR description to say that the connection is reshared / cached, or introduce actual pooling.
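For contrast, this is roughly what real pooling means - several live connections checked out and returned - versus the single cached connection this PR actually implements. A toy stdlib version (psycopg2's `psycopg2.pool.SimpleConnectionPool` exposes the same `getconn`/`putconn` shape):

```python
import queue


class TinyConnectionPool:
    """Illustrative only - not production code."""

    def __init__(self, size, connect):
        self._idle = queue.Queue()
        for _ in range(size):  # eagerly open `size` connections
            self._idle.put(connect())

    def getconn(self):
        return self._idle.get()  # blocks if all connections are checked out

    def putconn(self, conn):
        self._idle.put(conn)
```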
Overview
This pull request introduces connection caching and reuse for both PostgreSQL readers and writers, improving efficiency and reliability by maintaining a single connection per instance. It also adds robust reconnection logic, updates tests to reflect the new behavior, and enhances configuration for dependency updates.
Related
Closes #115