Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Walkthrough
Adds a resource-aware cleanup helper that inspects CloudFormation DELETE_FAILED events and performs targeted AWS API removals; stack deletion now retries with cleanup attempts, and the script returns a nonzero exit code if any stack remains DELETE_FAILED.
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Script as Script (Deprovision)
    participant CF as CloudFormation
    participant AWS as AWS_API
    Script->>CF: delete-stack(stack_name)
    Script->>CF: wait stack-delete-complete(stack_name)
    CF-->>Script: StackStatus (DELETE_COMPLETE / DELETE_FAILED)
    alt StackStatus == DELETE_FAILED
        Script->>CF: describe-stack-events(stack_name)
        CF-->>Script: events (ResourceStatus==DELETE_FAILED)
        Script->>AWS: delete_failed_resources(resource-type, physical-id)
        AWS-->>Script: responses (errors suppressed)
        Script->>CF: delete-stack(stack_name) (retry)
        Script->>CF: wait stack-delete-complete(stack_name)
        CF-->>Script: StackStatus (rechecked)
        alt Still DELETE_FAILED
            Script->>AWS: delete_failed_resources(...) (second attempt)
            Script->>CF: delete-stack(stack_name) (final retry)
            Script->>CF: wait stack-delete-complete(stack_name)
            CF-->>Script: StackStatus (final)
        end
    end
    Script-->>Script: aggregate rc per-stack
    Script-->>Script: exit ${rc:-0}
```
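The retry flow above can be sketched in shell. This is a minimal illustration rather than the script's actual code; the helper names `cf_delete_and_wait` and `cleanup_failed_resources` are hypothetical stand-ins for the real delete/wait and event-driven cleanup logic:

```shell
#!/bin/bash
# Sketch of the delete -> cleanup -> retry loop shown in the diagram.
# cf_delete_and_wait and cleanup_failed_resources are illustrative names.

cf_delete_and_wait() {
  # Issue the delete and block until CloudFormation reaches a terminal state;
  # the waiter exits nonzero when the stack ends up DELETE_FAILED.
  aws cloudformation delete-stack --stack-name "$1" &&
    aws cloudformation wait stack-delete-complete --stack-name "$1"
}

cleanup_failed_resources() {
  # Placeholder: describe-stack-events, keep ResourceStatus==DELETE_FAILED,
  # and delete each physical resource via the matching service API.
  :
}

delete_stack_with_cleanup() {
  local stack=$1 attempt
  for attempt in 1 2 3; do
    if cf_delete_and_wait "$stack"; then
      return 0    # DELETE_COMPLETE (or stack already gone)
    fi
    cleanup_failed_resources "$stack"
  done
  return 1        # still DELETE_FAILED after all attempts
}
```

Each stack's result would then be OR-ed into a shared `rc`, matching the per-stack aggregation and `exit ${rc:-0}` at the end of the diagram.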
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 11 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (11 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment
Actionable comments posted: 1
🧹 Nitpick comments (2)
ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh (2)
19-22: Consider quoting $REGION for shell best practices.
Shellcheck flags unquoted $REGION throughout the file (SC2086). While AWS region names don't contain spaces, quoting prevents potential issues with globbing or word splitting. This applies to all AWS CLI calls in the file (lines 19, 40, 46, 50, 53, 55, 58, 62, 66, 69, 72, 74, 78, 82, 86, 91, 95, 97, 100, 104, 119, 121, 125, 141, 143, 146, 162, 164, 167).
Example fix pattern
```diff
-failed_resources=$(aws --region $REGION cloudformation describe-stack-events \
+failed_resources=$(aws --region "$REGION" cloudformation describe-stack-events \
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh` around lines 19 - 22, Shellcheck warns about unquoted $REGION in the aws CLI invocations; update every aws command in this script that uses REGION (e.g., the call inside the failed_resources assignment and all other aws --region $REGION occurrences) to quote the variable as "$REGION" to prevent word-splitting/globbing (replace --region $REGION with --region "$REGION" everywhere).
24-27: Return value semantics are counterintuitive (and unused).
Returning 1 when no failed resources are found suggests an error, but the callers at lines 138 and 159 don't check the return value. Consider either removing the return statement or returning 0, since "nothing to clean up" isn't necessarily a failure condition.
Suggested fix
```diff
 if [[ -z "${failed_resources}" ]]; then
   echo "No failed resources found in stack events"
-  return 1
+  return 0
 fi
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh` around lines 24 - 27, The check for empty failed_resources currently does "return 1", which signals an error although callers at lines 138 and 159 don't inspect the return code; change this to a non-error outcome by either removing the return or changing it to "return 0" so "no failed resources" is treated as success/neutral; update the branch that tests [[ -z "${failed_resources}" ]] accordingly (look for the if block that echoes "No failed resources found in stack events") and ensure subsequent callers behave correctly with the non-error return.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh`:
- Around line 71-76: The NAT gateway case currently calls aws ec2
delete-nat-gateway then waits with the wrong waiter `nat-gateway-available`;
update the waiter to `nat-gateway-deleted` so the script waits for deletion
completion (in the AWS::EC2::NatGateway case where `aws --region $REGION ec2
delete-nat-gateway --nat-gateway-id "${physical_id}"` is called, replace the
`aws --region $REGION ec2 wait nat-gateway-available --nat-gateway-ids
"${physical_id}"` invocation with the `nat-gateway-deleted` waiter).
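As a hedged sketch of the fix described above, the delete call can be paired with the `nat-gateway-deleted` waiter; the function name and IDs here are illustrative, not taken from the script:

```shell
# Delete a NAT gateway and wait until it actually reaches the "deleted" state.
# nat-gateway-available would return while the gateway still exists, which is
# why the review asks for nat-gateway-deleted instead.
delete_nat_gateway_and_wait() {
  local region=$1 nat_id=$2
  aws --region "$region" ec2 delete-nat-gateway --nat-gateway-id "$nat_id"
  aws --region "$region" ec2 wait nat-gateway-deleted --nat-gateway-ids "$nat_id"
}
```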
---
Nitpick comments:
In
`@ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh`:
- Around line 19-22: Shellcheck warns about unquoted $REGION in the aws CLI
invocations; update every aws command in this script that uses REGION (e.g., the
call inside the failed_resources assignment and all other aws --region $REGION
occurrences) to quote the variable as "$REGION" to prevent
word-splitting/globbing (replace --region $REGION with --region "$REGION"
everywhere).
- Around line 24-27: The check for empty failed_resources currently does "return
1", which signals an error although callers at lines 138 and 159 don't inspect
the return code; change this to a non-error outcome by either removing the
return or changing it to "return 0" so "no failed resources" is treated as
success/neutral; update the branch that tests [[ -z "${failed_resources}" ]]
accordingly (look for the if block that echoes "No failed resources found in
stack events") and ensure subsequent callers behave correctly with the non-error
return.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: c04380de-2d75-4ba3-af82-83be28892dbb
📒 Files selected for processing (1)
ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh
7da44c2 to
a3c9c7b
Compare
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector

@mcornea: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
🧹 Nitpick comments (1)
ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh (1)
185-198: Consider quoting variables to satisfy shellcheck and improve robustness.
The unquoted variables work correctly in practice since paths don't contain special characters, but quoting them is a defensive best practice.
Suggested fix
```diff
   echo "Deleting stacks:"
-  cat ${stack_list}
+  cat "${stack_list}"
   export AWS_SHARED_CREDENTIALS_FILE="${CLUSTER_PROFILE_DIR}/.awscred"
-  delete_stacks ${stack_list} || rc=1
+  delete_stacks "${stack_list}" || rc=1
 fi
 stack_list="${SHARED_DIR}/to_be_removed_cf_stack_list_shared_account"
 if [ -e "${stack_list}" ]; then
   echo "Deleting stacks in shared account:"
-  cat ${stack_list}
+  cat "${stack_list}"
   export AWS_SHARED_CREDENTIALS_FILE="${CLUSTER_PROFILE_DIR}/.awscred_shared_account"
-  delete_stacks ${stack_list} || rc=1
+  delete_stacks "${stack_list}" || rc=1
 fi
-exit ${rc:-0}
+exit "${rc:-0}"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh` around lines 185 - 198, Quote all shell variable expansions in this block to satisfy shellcheck and avoid word-splitting: wrap ${stack_list}, ${CLUSTER_PROFILE_DIR}, ${stack_list} (in cat and delete_stacks calls), and ${SHARED_DIR} where used; specifically update occurrences that reference stack_list, AWS_SHARED_CREDENTIALS_FILE assignments (using ${CLUSTER_PROFILE_DIR}/.awscred and ${CLUSTER_PROFILE_DIR}/.awscred_shared_account), cat ${stack_list}, and delete_stacks ${stack_list} to use quoted forms (e.g., "${stack_list}", "${CLUSTER_PROFILE_DIR}/.awscred", "${SHARED_DIR}/to_be_removed_cf_stack_list_shared_account") so delete_stacks, cat, and the env var assignments are robust.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh`:
- Around line 185-198: Quote all shell variable expansions in this block to
satisfy shellcheck and avoid word-splitting: wrap ${stack_list},
${CLUSTER_PROFILE_DIR}, ${stack_list} (in cat and delete_stacks calls), and
${SHARED_DIR} where used; specifically update occurrences that reference
stack_list, AWS_SHARED_CREDENTIALS_FILE assignments (using
${CLUSTER_PROFILE_DIR}/.awscred and
${CLUSTER_PROFILE_DIR}/.awscred_shared_account), cat ${stack_list}, and
delete_stacks ${stack_list} to use quoted forms (e.g., "${stack_list}",
"${CLUSTER_PROFILE_DIR}/.awscred",
"${SHARED_DIR}/to_be_removed_cf_stack_list_shared_account") so delete_stacks,
cat, and the env var assignments are robust.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 0c903372-411e-434c-94c1-dbb63a467075
📒 Files selected for processing (1)
ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh
@mcornea: job(s): periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector either don't exist or were not found to be affected, and cannot be rehearsed
a3c9c7b to
b16090a
Compare
🧹 Nitpick comments (1)
ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh (1)
185-198: Quote variables to prevent word splitting.
Per shellcheck SC2086, the variables should be quoted to prevent unexpected word splitting or globbing, even if the paths are typically well-formed in this context.
🔧 Proposed fix
```diff
   echo "Deleting stacks:"
-  cat ${stack_list}
+  cat "${stack_list}"
   export AWS_SHARED_CREDENTIALS_FILE="${CLUSTER_PROFILE_DIR}/.awscred"
-  delete_stacks ${stack_list} || rc=1
+  delete_stacks "${stack_list}" || rc=1
 fi
 stack_list="${SHARED_DIR}/to_be_removed_cf_stack_list_shared_account"
 if [ -e "${stack_list}" ]; then
   echo "Deleting stacks in shared account:"
-  cat ${stack_list}
+  cat "${stack_list}"
   export AWS_SHARED_CREDENTIALS_FILE="${CLUSTER_PROFILE_DIR}/.awscred_shared_account"
-  delete_stacks ${stack_list} || rc=1
+  delete_stacks "${stack_list}" || rc=1
 fi
-exit ${rc:-0}
+exit "${rc:-0}"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh` around lines 185 - 198, The shell uses unquoted variables which can trigger word-splitting/globbing; update occurrences to quote variables such as stack_list, CLUSTER_PROFILE_DIR, SHARED_DIR and rc: wrap ${stack_list} as "${stack_list}" when passing to cat and delete_stacks, wrap ${CLUSTER_PROFILE_DIR} and ${SHARED_DIR} when building AWS_SHARED_CREDENTIALS_FILE and stack_list assignments, and quote the final exit ${rc:-0} as "${rc:-0}" to prevent unexpected word splitting; ensure functions/commands referenced (delete_stacks, cat, export AWS_SHARED_CREDENTIALS_FILE) use the quoted forms throughout the aws-deprovision-stacks-commands.sh snippet.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh`:
- Around line 185-198: The shell uses unquoted variables which can trigger
word-splitting/globbing; update occurrences to quote variables such as
stack_list, CLUSTER_PROFILE_DIR, SHARED_DIR and rc: wrap ${stack_list} as
"${stack_list}" when passing to cat and delete_stacks, wrap
${CLUSTER_PROFILE_DIR} and ${SHARED_DIR} when building
AWS_SHARED_CREDENTIALS_FILE and stack_list assignments, and quote the final exit
${rc:-0} as "${rc:-0}" to prevent unexpected word splitting; ensure
functions/commands referenced (delete_stacks, cat, export
AWS_SHARED_CREDENTIALS_FILE) use the quoted forms throughout the
aws-deprovision-stacks-commands.sh snippet.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 6b81cf8c-6ed8-4222-a592-02ee300c563f
📒 Files selected for processing (2)
ci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__rosa_hcp-4.22-nightly-x86.yamlci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector

@mcornea: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@mcornea: job(s): periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector either don't exist or were not found to be affected, and cannot be rehearsed
b16090a to
808aa41
Compare
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector

@mcornea: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
ci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-periodics.yaml (1)
6893-6904: ⚠️ Potential issue | 🟡 Minor
Keep reporter_config on the renamed periodic.
Line 6893 now flows straight into spec, so this job no longer has a reporter_config block. That drops result/Slack reporting for this perfscale periodic, unlike the neighboring jobs in this file, which makes failures much easier to miss.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-periodics.yaml` around lines 6893 - 6904, The periodic job block for periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector lost its reporter_config section when renamed; restore a reporter_config block (matching the neighboring perfscale periodics) above the spec for this job so result/Slack reporting is enabled again by adding the same reporter_config keys used by adjacent jobs (e.g., slack webhook/channel and reporters) to the job block that contains the given name and spec.
🧹 Nitpick comments (1)
ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh (1)
187-187: Quote variable expansions in changed call/exit paths.
ShellCheck SC2086 is valid on the modified lines. Quote these expansions to avoid word splitting/globbing in edge cases.
Suggested fix
```diff
-delete_stacks ${stack_list} || rc=1
+delete_stacks "${stack_list}" || rc=1
 ...
-delete_stacks ${stack_list} || rc=1
+delete_stacks "${stack_list}" || rc=1
 ...
-exit ${rc:-0}
+exit "${rc:-0}"
```
Also applies to: 195-195, 198-198
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh` at line 187, The call to delete_stacks is using unquoted variable expansion (stack_list) which risks word-splitting/globbing; update the invocation(s) of delete_stacks to use quoted expansions (e.g. delete_stacks "${stack_list}") and likewise quote any other similar usages of stack_list in this file (including the other delete_stacks calls around the same block), keeping the existing failure handling (|| rc=1) unchanged so rc is still set on failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh`:
- Around line 132-135: The current check treats any status that is not
DELETE_FAILED as success; change the logic so only DELETE_COMPLETE (or a missing
stack detected by catching the AWS CLI ValidationError from the describe-stacks
call) is treated as success. Specifically, in the block that inspects
stack_status and stack_name, replace the condition [[ "${stack_status}" !=
"DELETE_FAILED" ]]/continue with an explicit check: if [[ "${stack_status}" ==
"DELETE_COMPLETE" ]]; then echo success and continue; else if the
describe-stacks call returns a ValidationError (no stack), treat that as success
and continue; otherwise do not continue (handle retry/failure path or exit
non-zero) so stacks in states like DELETE_IN_PROGRESS or CREATE_COMPLETE are not
silently skipped. Ensure you update the AWS CLI describe-stacks error handling
to detect and treat the ValidationError as success.
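A minimal sketch of the status check described above, assuming the surrounding loop provides the region and stack name and accumulates an `rc` value; the function name is illustrative, not the script's:

```shell
# Classify a stack's deletion state without conflating API errors with success:
# returns 0 when the stack is gone or DELETE_COMPLETE, 2 for DELETE_FAILED
# (so the caller can run cleanup and retry), 1 for anything else.
check_stack_deleted() {
  local region=$1 stack=$2 out
  if ! out=$(aws --region "$region" cloudformation describe-stacks \
        --stack-name "$stack" \
        --query 'Stacks[0].StackStatus' --output text 2>&1); then
    case "$out" in
      *ValidationError*) return 0 ;;   # "stack does not exist" => deleted
      *) echo "describe-stacks failed: $out" >&2; return 1 ;;
    esac
  fi
  case "$out" in
    DELETE_COMPLETE) return 0 ;;
    DELETE_FAILED)   return 2 ;;
    *) echo "stack $stack still in state $out" >&2; return 1 ;;
  esac
}
```

The key design point is that only an explicit ValidationError or DELETE_COMPLETE counts as success; transient failures and in-progress states propagate as errors instead of being silently skipped.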
---
Outside diff comments:
In
`@ci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-periodics.yaml`:
- Around line 6893-6904: The periodic job block for
periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector
lost its reporter_config section when renamed; restore a reporter_config block
(matching the neighboring perfscale periodics) above the spec for this job so
result/Slack reporting is enabled again by adding the same reporter_config keys
used by adjacent jobs (e.g., slack webhook/channel and reporters) to the job
block that contains the given name and spec.
---
Nitpick comments:
In
`@ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh`:
- Line 187: The call to delete_stacks is using unquoted variable expansion
(stack_list) which risks word-splitting/globbing; update the invocation(s) of
delete_stacks to use quoted expansions (e.g. delete_stacks "${stack_list}") and
likewise quote any other similar usages of stack_list in this file (including
the other delete_stacks calls around the same block), keeping the existing
failure handling (|| rc=1) unchanged so rc is still set on failure.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 65fab76a-11f7-413e-9816-b7e69f480192
📒 Files selected for processing (4)
ci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__rosa_hcp-4.22-nightly-x86.yamlci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-periodics.yamlci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-presubmits.yamlci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh
✅ Files skipped from review due to trivial changes (1)
- ci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__rosa_hcp-4.22-nightly-x86.yaml
808aa41 to
0effacc
Compare
Actionable comments posted: 1
♻️ Duplicate comments (1)
ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh (1)
225-235: ⚠️ Potential issue | 🔴 Critical
Only treat a real "stack not found" as delete success.
These blocks still conflate any describe-stacks failure with successful deletion, and the != "DELETE_FAILED" check also treats states like DELETE_IN_PROGRESS or CREATE_COMPLETE as success. That can leave undeleted stacks behind while this step exits 0.
Possible fix
```diff
-stack_status=$(aws --region "$REGION" cloudformation describe-stacks \
-  --stack-name "${stack_name}" \
-  --query 'Stacks[0].StackStatus' --output text 2>/dev/null) || {
-  echo "Stack ${stack_name} deleted successfully"
-  continue
-}
-
-if [[ "${stack_status}" != "DELETE_FAILED" ]]; then
-  echo "Stack ${stack_name} status: ${stack_status}"
-  continue
-fi
+local describe_output
+describe_output=$(aws --region "$REGION" cloudformation describe-stacks \
+  --stack-name "${stack_name}" \
+  --query 'Stacks[0].StackStatus' --output text 2>&1)
+if [[ $? -ne 0 ]]; then
+  if grep -q 'ValidationError' <<<"${describe_output}"; then
+    echo "Stack ${stack_name} deleted successfully"
+    continue
+  fi
+  echo "ERROR: failed to describe stack ${stack_name}: ${describe_output}" >&2
+  rc=1
+  continue
+fi
+stack_status="${describe_output}"
+
+if [[ "${stack_status}" == "DELETE_COMPLETE" ]]; then
+  echo "Stack ${stack_name} deleted successfully"
+  continue
+fi
+if [[ "${stack_status}" != "DELETE_FAILED" ]]; then
+  echo "ERROR: Stack ${stack_name} still exists with status ${stack_status}" >&2
+  rc=1
+  continue
+fi
```
Please apply the same handling to the retry and final-retry blocks too.
Also applies to: 246-256, 267-272
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh` around lines 225 - 235, The current logic treats any failure of aws cloudformation describe-stacks as a successful deletion and treats any non-DELETE_FAILED status as success; change the block that sets stack_status (and replicate the same fix in the retry and final-retry blocks) to explicitly detect the “stack not found” error and treat only that as success or treat an explicit DELETE_COMPLETE status as success — otherwise fail/continue for other errors or non-terminal statuses. Concretely, capture the aws describe-stacks stderr output when the call fails (the variable around stack_status / the describe-stacks invocation), check for the specific “does not exist”/ValidationError message and only echo "Stack ... deleted successfully" for that case (or when stack_status == "DELETE_COMPLETE"); for other failures (network/errors) preserve the error and do not assume deletion, and keep the existing special-case for DELETE_FAILED to trigger further handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh`:
- Around line 142-152: The NAT gateway deletion loop collects nat_ids, issues
delete-nat-gateway calls, then sleeps a fixed 30s which can be too short;
replace the fixed sleep with an explicit wait that polls the same resource IDs
until they are fully removed (or timeout), e.g. after the delete-nat-gateway
calls poll aws ec2 describe-nat-gateways for the collected nat_ids (checking
NatGatewayId/state or empty results) in a retry loop with backoff and a total
timeout and only continue once none remain; update the block that sets nat_ids,
iterates delete-nat-gateway, and the subsequent wait logic to use this polling
approach to prevent VPC deletion racing with NAT gateways still in "deleting"
state.
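The polling approach described above could look roughly like this; the 10-minute timeout, backoff values, and function name are assumptions for illustration, not values from the script:

```shell
# Poll describe-nat-gateways until none of the given IDs remain in a
# non-deleted state, with backoff and an overall timeout, instead of a
# fixed 30s sleep before VPC deletion.
wait_nat_gateways_deleted() {
  local region=$1; shift
  local deadline=$((SECONDS + 600)) interval=10 remaining
  while [ "$SECONDS" -lt "$deadline" ]; do
    # An API error (e.g. the gateways no longer exist) is treated as "done".
    remaining=$(aws --region "$region" ec2 describe-nat-gateways \
      --nat-gateway-ids "$@" \
      --query "NatGateways[?State!='deleted'].NatGatewayId" \
      --output text 2>/dev/null) || remaining=""
    if [ -z "$remaining" ]; then
      return 0    # all gateways deleted; safe to proceed with the VPC
    fi
    sleep "$interval"
    [ "$interval" -lt 60 ] && interval=$((interval * 2))
  done
  echo "timed out waiting for NAT gateways to delete: $*" >&2
  return 1
}
```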
---
Duplicate comments:
In
`@ci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh`:
- Around line 225-235: The current logic treats any failure of aws
cloudformation describe-stacks as a successful deletion and treats any
non-DELETE_FAILED status as success; change the block that sets stack_status
(and replicate the same fix in the retry and final-retry blocks) to explicitly
detect the “stack not found” error and treat only that as success or treat an
explicit DELETE_COMPLETE status as success — otherwise fail/continue for other
errors or non-terminal statuses. Concretely, capture the aws describe-stacks
stderr output when the call fails (the variable around stack_status / the
describe-stacks invocation), check for the specific “does not
exist”/ValidationError message and only echo "Stack ... deleted successfully"
for that case (or when stack_status == "DELETE_COMPLETE"); for other failures
(network/errors) preserve the error and do not assume deletion, and keep the
existing special-case for DELETE_FAILED to trigger further handling.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: cf097086-c2f0-48c2-bcdc-2595c4b46f3d
📒 Files selected for processing (4)
ci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__rosa_hcp-4.22-nightly-x86.yamlci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-periodics.yamlci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-presubmits.yamlci-operator/step-registry/aws/deprovision/stacks/aws-deprovision-stacks-commands.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- ci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__rosa_hcp-4.22-nightly-x86.yaml
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector

@mcornea: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@mcornea, If the problem persists, please contact Test Platform.
0effacc to
66a3be8
Compare
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector

@mcornea: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector

@mcornea: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
66a3be8 to
5640689
Compare
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector

@mcornea: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector

@mcornea: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector

@mcornea: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
bb87149 to
40e05e0
Compare
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector

@mcornea: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@coderabbitai review

✅ Actions performed
Review triggered.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@ci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-periodics.yaml`:
- Around line 6893-6894: The periodic job
"periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector"
is missing the reporter_config block so Slack reporting is not configured;
update this job's spec to include a reporter_config identical to the neighboring
perfscale periodics (same channel `#ocp-qe-scale-ci-results`, reporter type and
credentials/secrets) so failures/successes are sent to Slack, ensuring the
reporter_config key and its nested fields match the other perfscale jobs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 50083565-ad26-4d07-b7a9-13595426e98f
📒 Files selected for processing (3)
ci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__rosa_hcp-4.22-nightly-x86.yamlci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-periodics.yamlci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-presubmits.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
- ci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__rosa_hcp-4.22-nightly-x86.yaml
- ci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-presubmits.yaml
name: periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector
spec:
Restore Slack reporting for this renamed periodic.
This block no longer defines reporter_config, unlike the neighboring perfscale periodics in this file, so failures/successes for this job will stop reaching #ocp-qe-scale-ci-results.
🔔 Suggested fix
name: periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector
+ reporter_config:
+ slack:
+ channel: '#ocp-qe-scale-ci-results'
+ job_states_to_report:
+ - success
+ - failure
+ - error
+ report_template: '{{if eq .Status.State "success"}} :white_check_mark: Job *{{.Spec.Job}}*
+ ended with *{{.Status.State}}*. <{{.Status.URL}}|View logs> :white_check_mark:
+ {{else}} :warning: Job *{{.Spec.Job}}* ended with *{{.Status.State}}*. <{{.Status.URL}}|View
+ logs> :warning: {{end}}'
spec:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@ci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-periodics.yaml`
around lines 6893 - 6894, The periodic job
"periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector"
is missing the reporter_config block so Slack reporting is not configured;
update this job's spec to include a reporter_config identical to the neighboring
perfscale periodics (same channel `#ocp-qe-scale-ci-results`, reporter type and
credentials/secrets) so failures/successes are sent to Slack, ensuring the
reporter_config key and its nested fields match the other perfscale jobs.
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-24nodes-onperfsector

@mcornea: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

This covers scenarios where the VPC fails to delete due to related dependent resources:
Signed-off-by: Marius Cornea <mcornea@redhat.com>
40e05e0 to
a030642
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: mcornea
The full list of commands accepted by this bot can be found here.
Details
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing

[REHEARSALNOTIFIER]
A total of 2347 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs. A full list of affected jobs can be found here
Interacting with pj-rehearse
Comment: Once you are satisfied with the results of the rehearsals, comment:

@mcornea: all tests passed! Full PR test history. Your PR dashboard.
Details
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Summary by CodeRabbit
Bug Fixes
Chores