Hybrid 3-connector fuzz test + VM_HOST env var by rophy · Pull Request #19 · rophy/OpenLogReplicator

rophy · 2026-04-11T01:40:32Z

Summary

- Replace all hardcoded RAC VM IPs with the ${VM_HOST} env var (auto-detected by vm-env.sh via virsh)
- Implement hybrid 3-connector fuzz architecture: OLR (non-LOB) + LogMiner (LOB) on the "actual" side
- Switch to rophy/debezium-server:3.5.0-2a7978c0af, which includes the mergeLogsByPrecedence RAC fix and the debezium-config JAR fix

Test plan

- fuzz-test.sh up: all 3 connectors start with ${VM_HOST} env var substitution
- fuzz-test.sh run 10: 10-minute workload, 0 errors, all connectors survive log switches
- fuzz-test.sh validate: 55,228 events, 0 mismatches (PASS)
- fuzz-test.sh db-check: 22 LOB extras + 4 LOB diffs (known DBZ-9615, identical on both sides)

Summary by CodeRabbit

New Features
- Added dedicated LOB capture pipeline and Kafka topic for LOB events.
Chores
- Parameterized database/host values to use environment variables instead of hardcoded IPs across test configs.
- Updated Debezium server images and Docker Compose services to support the new connector.
- Enhanced test scripts and consumers to wait for and subscribe to the new LOB connector/topic.
Tests
- Improved validation, readiness checks, and environment handling for RAC test runs.

Replace all hardcoded 192.168.122.x IPs with the ${VM_HOST} env var. The VM IP is auto-detected by vm-env.sh via virsh, so no manual update is needed when the VM IP changes (e.g. new VM deployment).
- .properties files: use Quarkus ${VM_HOST} substitution
- docker-compose: use ${VM_HOST} in commands
- rac.sh: fail-fast with ${VM_HOST:?} instead of silent default
- db-check.py: require ORACLE_HOST, no fallback IP
- .env: remove DB_CONN (set by rac driver)
- vm-env.sh: remove stale IP mismatch validation

Switch to rophy/debezium-server:3.5.0-2a7978c0af, which includes:
- debezium-config JAR fix (missing in stock 3.5.0.Final)
- mergeLogsByPrecedence RAC fix (thread-aware dedup)

Implement hybrid architecture: OLR for non-LOB tables + LogMiner for LOB tables on the "actual" side.
- Add dbz-lob-logminer service (LOB-only LogMiner connector)
- OLR config: skip-lob-tables=1
- Consumer: subscribe to 3 topics, route olr-lob-events to OLR side
- Validator: remove KNOWN_LOB_TABLES exemption, LOB mismatches are real failures with hybrid setup
- fuzz-test.sh: 3-connector offset seeding, wait, logs
- Remove restart: unless-stopped, remove Beta1 JAR mounts
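The consumer-side routing in this commit can be pictured roughly as below. This is a minimal sketch, not the code in kafka-consumer.py: only the `olr-lob-events` default is taken from this PR, while the `lm-events` and `olr-events` defaults and the `classify_topic` helper are hypothetical names chosen for illustration.

```python
import os

# Only the OLR_LOB_TOPIC default ("olr-lob-events") comes from this PR.
# The other two defaults are hypothetical placeholders.
LM_TOPIC = os.environ.get("LM_TOPIC", "lm-events")      # hypothetical default
OLR_TOPIC = os.environ.get("OLR_TOPIC", "olr-events")   # hypothetical default
OLR_LOB_TOPIC = os.environ.get("OLR_LOB_TOPIC", "olr-lob-events")

ALL_TOPICS = [LM_TOPIC, OLR_TOPIC, OLR_LOB_TOPIC]


def classify_topic(topic: str) -> str:
    """Map a Kafka topic to the side of the comparison it belongs to.

    Events from the LOB-only LogMiner connector land on the 'olr' side,
    so the "actual" side is OLR (non-LOB) plus LogMiner (LOB).
    """
    if topic == LM_TOPIC:
        return "logminer"
    if topic in (OLR_TOPIC, OLR_LOB_TOPIC):
        return "olr"
    raise ValueError(f"unexpected topic: {topic!r}")
```

Raising on an unknown topic, rather than defaulting to one side, keeps a misconfigured topic name from silently skewing the comparison.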

coderabbitai · 2026-04-11T01:40:46Z

Walkthrough

Added a dedicated Debezium LogMiner connector for LOB tables and integrated it into the RAC fuzz-test stack; replaced hardcoded IPs with ${VM_HOST} across configs and scripts; tightened VM_HOST requirement; shifted validator behavior to treat LOB mismatches as real failures.

Changes

Cohort / File(s)	Summary
New LOB Debezium config `tests/dbz-twin/rac/config/application-lob-logminer-kafka.properties`	New Debezium LogMiner → Kafka config scoped to LOB table(s), routes to `olr-lob-events`, uses Kafka offset storage and JSON format without schemas.
Debezium / OLR configs `tests/dbz-twin/rac/config/application-logminer-kafka.properties`, `tests/dbz-twin/rac/config/application-logminer.properties`, `tests/dbz-twin/rac/config/application-olr-kafka.properties`, `tests/dbz-twin/rac/config/application-olr.properties`	Replaced hardcoded Oracle host IP with `${VM_HOST}` placeholder; one file added new LogMiner LOB config.
OLR JSON config `tests/dbz-twin/rac/config/olr-config.json`	Added `skip-lob-tables: 1` in Debezium/format settings.
Docker / compose / runtime `tests/dbz-twin/rac/docker-compose-fuzz.yaml`, `tests/dbz-twin/rac/fuzz-test.sh`, `tests/environments/rac/.env`, `tests/environments/rac/vm-env.sh`, `tests/sql/scripts/drivers/rac.sh`	Added `dbz-lob-logminer` service and named volume; updated Debezium image references; require `VM_HOST` via env; changed fuzz orchestration and env handling; removed VM_HOST consistency checks.
Consumer / topics `tests/dbz-twin/rac/kafka-consumer.py`, `tests/dbz-twin/rac/fuzz-test.sh`	Added `OLR_LOB_TOPIC` env/default `olr-lob-events`; consumer now waits for three topics and classifies `OLR_LOB_TOPIC` as `'olr'`; seeding and readiness logic extended for new connector.
Validator logic `tests/dbz-twin/rac/validator.py`	Renamed/changed LOB allowlist handling to suffix-based `LOB_TABLES`; removed known-LOB counter; treat LOB existence/value diffs as mismatches with new `[LOB_EXISTENCE]` and `[LOB_VALUE_DIFF]` logs.
Misc perf / prometheus `tests/dbz-twin/rac/perf/config/application-logminer.properties`, `tests/dbz-twin/rac/perf/config/application-olr.properties`, `tests/dbz-twin/rac/perf/config/prometheus.yml`, `tests/dbz-twin/rac/perf/docker-compose.yaml`	Replaced fixed IPs with `${VM_HOST}` placeholders and updated generated Prometheus header; adjusted Docker Compose Swingbench host param.
DB check script `tests/dbz-twin/rac/db-check.py`	Removed default host fallback; require `ORACLE_HOST`/`ORACLE_DSN` or exit with error.
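The db-check.py row above (no default host, require `ORACLE_HOST` or `ORACLE_DSN`) might look something like the following. This is a sketch under assumptions: the `resolve_dsn` helper is hypothetical, and the `:1521/ORCLPDB` suffix is borrowed from the JDBC URL quoted elsewhere in this review rather than from db-check.py itself.

```python
import os
from typing import Mapping


def resolve_dsn(env: Mapping[str, str] = os.environ) -> str:
    """Return the Oracle DSN, exiting if neither variable is set.

    Hypothetical helper: an explicit ORACLE_DSN wins; otherwise a DSN is
    built from ORACLE_HOST. There is deliberately no fallback IP.
    """
    dsn = env.get("ORACLE_DSN")
    if dsn:
        return dsn
    host = env.get("ORACLE_HOST")
    if not host:
        # SystemExit with a string prints it to stderr and exits with status 1.
        raise SystemExit("ERROR: set ORACLE_HOST or ORACLE_DSN (no default host)")
    # Port and service name are assumptions taken from the perf compose URL.
    return f"{host}:1521/ORCLPDB"
```

Failing at startup is the point of the change: a stale default IP would otherwise connect to the wrong VM, or to nothing, and produce misleading check results.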

Sequence Diagram(s)

sequenceDiagram
    participant Oracle as Oracle DB (LogMiner)
    participant DebeziumLOB as Debezium LogMiner (lob-logminer)
    participant Kafka as Kafka
    participant Consumer as kafka-consumer.py
    participant Validator as validator.py

    Oracle->>DebeziumLOB: Stream LOB redo records
    DebeziumLOB->>Kafka: Publish to `olr-lob-events` topic
    Kafka->>Consumer: Deliver messages from topics (lm, olr, olr-lob)
    Consumer->>Validator: Forward events grouped by adapter ('logminer'/'olr')
    Validator->>Validator: Compare LM vs OLR events, treat LOB diffs as mismatches
    Validator-->>Consumer: Emit validation results / metrics

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

LOB phantom workaround: skip-lob-tables + clean test environment #18: Adds skip-lob-tables behavior and touches validator, DBZ configs, and fuzz-test infra—strong overlap with LOB handling changes.
fix: LOB phantom undo suppression + fuzz test improvements #16: Modifies LOB-related test tooling and validator patterns similar to this PR.
feat: fuzz test framework with Kafka + SQLite streaming validation #13: Earlier expansion of the fuzz-test framework that this PR builds upon (docker-compose, consumer/validator orchestration).

Poem

🐰 I hop through logs where LOBs once hid,
VM_HOST now guides where IPs were slid.
Kafka sings topics, three streams in a row,
Validators watch closely — no more soft “we know.”
A carrot, a nod — the test garden grows. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title concisely and accurately summarizes the two main changes: implementing a hybrid 3-connector fuzz test architecture and replacing hardcoded IPs with ${VM_HOST} environment variable.

coderabbitai

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/dbz-twin/rac/fuzz-test.sh`:
- Around line 431-445: Update the script header printed by action_help() to
include the new lob-logminer component and to reflect the correct workflow
ordering and RAC cleanup instruction: locate the action_help() function (or the
top-of-file help comment block it prints) and add "lob-logminer" to the
Components list, reorder the example workflow to show "down" before "up", and
add the note "Always run ./fuzz-test.sh down before ./fuzz-test.sh up in RAC
fuzz tests to ensure a clean environment" so the output of ./fuzz-test.sh help
matches the usage text updated in the logs case.

In `@tests/dbz-twin/rac/kafka-consumer.py`:
- Around line 147-159: The loop that checks Kafka topics should wait for all
entries in all_topics (LM_TOPIC, OLR_TOPIC, OLR_LOB_TOPIC) before proceeding;
currently it breaks as soon as LM_TOPIC or OLR_TOPIC exists which can mask an
implicitly created OLR_LOB_TOPIC on subscribe. Change the condition in the
polling loop that examines consumer.topics() (the block using topics =
consumer.topics()) to require every topic in all_topics to be present (e.g.,
check that set(all_topics).issubset(topics) or iterate and confirm each), only
then break and call consumer.subscribe(all_topics) and consumer.poll().
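A minimal sketch of the "wait for every topic" check suggested above. It is written against a generic `list_topics` callable so it runs without a broker; with kafka-python, `consumer.topics` (which returns a set of topic names) could be passed in. The function name, timeout, and poll interval are assumptions, not values from kafka-consumer.py.

```python
import time
from typing import Callable, Iterable, Set


def wait_for_topics(
    list_topics: Callable[[], Set[str]],
    required: Iterable[str],
    timeout_s: float = 60.0,
    poll_interval_s: float = 1.0,
) -> None:
    """Block until every required topic is listed, or raise TimeoutError."""
    required_set = set(required)
    deadline = time.monotonic() + timeout_s
    while True:
        # Set difference yields exactly the topics that do not exist yet.
        missing = required_set - set(list_topics())
        if not missing:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"topics still missing: {sorted(missing)}")
        time.sleep(poll_interval_s)
```

In the consumer this would be called as `wait_for_topics(consumer.topics, all_topics)` before `consumer.subscribe(all_topics)`, so a connector that never created its topic surfaces as an error instead of as an implicitly created, empty topic.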

In `@tests/dbz-twin/rac/perf/docker-compose.yaml`:
- Line 47: The JDBC URL string "//${VM_HOST}:1521/ORCLPDB" silently becomes
malformed when VM_HOST is unset; update the interpolation to the required
expansion form so Compose fails fast—replace "${VM_HOST}" with a required
expansion like "${VM_HOST:?VM_HOST is not set}" in that JDBC URL entry to force
an error if VM_HOST is missing.

In `@tests/dbz-twin/rac/validator.py`:
- Line 248: The current LOB detection uses direct membership check ("is_lob =
event_table in LOB_TABLES") which fails for schema-qualified or case-variant
names; update the check to normalize event_table by stripping any schema
qualifier and surrounding quotes/backticks and then compare in a normalized case
(e.g., upper() or lower()) against a similarly normalized LOB_TABLES set. Modify
the logic around the is_lob assignment in validator.py so it derives a plain
table name (split on '.' and remove quotes) and performs case-normalized
membership against LOB_TABLES to correctly detect LOB tables regardless of
schema qualification or case differences.
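The normalization described above could be sketched as follows. The table name in `LOB_TABLES` and both helper names are hypothetical; the real set lives in validator.py. The sketch assumes identifiers do not contain dots inside quotes.

```python
# Hypothetical LOB table set; the real names are defined in validator.py.
LOB_TABLES = {"FUZZ_LOB"}


def normalize_table_name(event_table: str) -> str:
    """Reduce 'SCHEMA.TABLE', '"schema"."table"' or `table` to 'TABLE'."""
    # Keep only the last dot-separated component (drops any schema qualifier).
    bare = event_table.rsplit(".", 1)[-1]
    # Remove surrounding whitespace, then quotes/backticks, then fold case.
    return bare.strip().strip("\"`'").upper()


def is_lob_table(event_table: str, lob_tables=LOB_TABLES) -> bool:
    """Case-insensitive, schema-agnostic membership test."""
    normalized = {normalize_table_name(t) for t in lob_tables}
    return normalize_table_name(event_table) in normalized
```

Normalizing both sides of the comparison means the check still works if `LOB_TABLES` itself is later written with a schema prefix or in lower case.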

In `@tests/environments/rac/.env`:
- Around line 1-2: Update the comment for VM_HOST in the rac environment file to
explicitly state that VM_HOST must be initialized by sourcing the vm-env.sh
script (i.e., callers must run or source vm-env.sh before invoking RAC flows),
and mention that rac.sh expects VM_HOST to be set or it will exit (see rac.sh
behavior around its VM_HOST check). Keep the note next to the existing DB_CONN
comment and use the exact symbol VM_HOST and filenames vm-env.sh and rac.sh so
callers know which script to run and which script relies on the variable.

In `@tests/sql/scripts/drivers/rac.sh`:
- Line 39: The strict fail-fast assignment VM_HOST="${VM_HOST:?VM_HOST is
required — source tests/environments/rac/vm-env.sh}" causes the RAC driver to
hard-fail because generate.sh doesn't source tests/environments/rac/vm-env.sh
first; change the driver to not use the "?" expansion (e.g.,
VM_HOST="${VM_HOST:-}") and instead add a guarded attempt to source vm-env.sh
(if present) or bail gracefully only when RAC execution truly requires VM_HOST
later; update the VM_HOST handling in tests/sql/scripts/drivers/rac.sh so it
either sources vm-env.sh when VM_HOST is empty or defers the fatal error to a
later, explicit check that runs inside the RAC execution path.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a4006b05-3c83-423c-8e5f-cf9f55add7d2

📥 Commits

Reviewing files that changed from the base of the PR and between 7087cc6 and 97e5c90.

📒 Files selected for processing (18)

tests/dbz-twin/rac/config/application-lob-logminer-kafka.properties
tests/dbz-twin/rac/config/application-logminer-kafka.properties
tests/dbz-twin/rac/config/application-logminer.properties
tests/dbz-twin/rac/config/application-olr-kafka.properties
tests/dbz-twin/rac/config/application-olr.properties
tests/dbz-twin/rac/config/olr-config.json
tests/dbz-twin/rac/db-check.py
tests/dbz-twin/rac/docker-compose-fuzz.yaml
tests/dbz-twin/rac/fuzz-test.sh
tests/dbz-twin/rac/kafka-consumer.py
tests/dbz-twin/rac/perf/config/application-logminer.properties
tests/dbz-twin/rac/perf/config/application-olr.properties
tests/dbz-twin/rac/perf/config/prometheus.yml
tests/dbz-twin/rac/perf/docker-compose.yaml
tests/dbz-twin/rac/validator.py
tests/environments/rac/.env
tests/environments/rac/vm-env.sh
tests/sql/scripts/drivers/rac.sh

💤 Files with no reviewable changes (1)

tests/environments/rac/vm-env.sh

- fuzz-test.sh: add lob-logminer to help text, show down-before-up workflow
- kafka-consumer.py: wait for all 3 topics before subscribing, fail on missing
- perf/docker-compose.yaml: use ${VM_HOST:?} for fail-fast on missing env var
- validator.py: normalize table name (strip schema, uppercase) for LOB detection
- .env: clarify VM_HOST must be sourced from vm-env.sh
- rac.sh: auto-source vm-env.sh when VM_HOST is not set

coderabbitai

🧹 Nitpick comments (1)

tests/dbz-twin/rac/fuzz-test.sh (1)

216-225: Bound Debezium readiness log scans to avoid expensive full-log polling.

Lines 218-220 read the entire container logs on every poll iteration. As logs grow, this gets slower and noisier. Use --tail (or --since) for bounded checks.

♻️ Suggested refinement

-        docker logs fuzz-dbz-logminer 2>&1 | grep -q "Starting streaming" && LM_OK=true
-        docker logs fuzz-dbz-olr 2>&1 | grep -q "streaming client started\|Starting streaming" && OLR_OK=true
-        docker logs fuzz-dbz-lob-logminer 2>&1 | grep -q "Starting streaming" && LOB_LM_OK=true
+        docker logs --tail 200 fuzz-dbz-logminer 2>&1 | grep -q "Starting streaming" && LM_OK=true
+        docker logs --tail 200 fuzz-dbz-olr 2>&1 | grep -Eq "streaming client started|Starting streaming" && OLR_OK=true
+        docker logs --tail 200 fuzz-dbz-lob-logminer 2>&1 | grep -q "Starting streaming" && LOB_LM_OK=true

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/dbz-twin/rac/fuzz-test.sh` around lines 216 - 225, The readiness loop
repeatedly calls docker logs for each container which rereads the full logs on
every iteration; modify the three docker logs invocations that set LM_OK,
OLR_OK, and LOB_LM_OK to include a bounded tail (e.g., --tail N) or since (e.g.,
--since "10s") flag so each grep only scans recent output instead of the entire
log; update the commands that check "Starting streaming" / "streaming client
started" in the loop (the lines that set LM_OK, OLR_OK, LOB_LM_OK) to use --tail
or --since to limit work and improve performance.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1e972a6d-df7b-44f1-9413-68d2b3a0c9fa

📥 Commits

Reviewing files that changed from the base of the PR and between 97e5c90 and 591504c.

📒 Files selected for processing (6)

tests/dbz-twin/rac/fuzz-test.sh
tests/dbz-twin/rac/kafka-consumer.py
tests/dbz-twin/rac/perf/docker-compose.yaml
tests/dbz-twin/rac/validator.py
tests/environments/rac/.env
tests/sql/scripts/drivers/rac.sh

✅ Files skipped from review due to trivial changes (1)

tests/environments/rac/.env

🚧 Files skipped from review as they are similar to previous changes (4)

tests/sql/scripts/drivers/rac.sh
tests/dbz-twin/rac/perf/docker-compose.yaml
tests/dbz-twin/rac/kafka-consumer.py
tests/dbz-twin/rac/validator.py

rophy added 2 commits April 11, 2026 01:38

coderabbitai Bot reviewed Apr 11, 2026

rophy merged commit b9f7e48 into master Apr 11, 2026
2 checks passed

rophy deleted the fuzz-hybrid-3connector branch April 11, 2026 02:14

rophy merged 3 commits into master from fuzz-hybrid-3connector
