ROX-33431: Bump resources again to prevent OOM kills #20902

Merged
msugakov merged 3 commits into master from misha/ROX-33431-second-round on Jun 5, 2026

msugakov commented Jun 1, 2026

Description

Follows up on #20655 with findings from the OOM dashboard. See https://redhat-internal.slack.com/archives/C081BAM590Q/p1780296820165749

Note that I had to manually pull some info from the pods to determine which branch they ran for. That's how I narrowed down the set of failures to look at.

It's best to review this PR commit by commit, because each commit message describes the findings behind it.

User-facing documentation

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

No change.

How I validated my change

Only CI. Will run a few times.

@msugakov msugakov added the konflux-build label ("Run Konflux in PR. Push commit to trigger it.") Jun 1, 2026
@msugakov msugakov requested review from a team and rhacs-bot as code owners June 1, 2026 10:48
@rhacs-bot rhacs-bot requested a review from a team June 1, 2026 10:48
@msugakov msugakov changed the title from "Fix CPU for build-source-image/step-build" to "ROX-33431: Bump resources again to prevent OOM kills" Jun 1, 2026
msugakov added 3 commits June 1, 2026 14:39
There were only 3 failures in the last 7 days, and I think they may
be more likely caused by unbounded CPU than by insufficient memory.

As a first step, I am limiting the CPU.

Here's what I found for OOM-killed containers on the cluster:

```
      name: step-build
      resources:
        limits:
          memory: 3Gi
        requests:
          cpu: 250m
          memory: 3Gi
```
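
The spec above sets a memory limit but no CPU limit. A minimal Python sketch of how such containers could be flagged across many steps follows. The function name `containers_without_cpu_limit` and the second sample container are hypothetical, and this is not necessarily how the data in this PR was gathered.

```python
def containers_without_cpu_limit(containers):
    """Return the names of containers whose resources define no CPU limit."""
    unbounded = []
    for container in containers:
        limits = (container.get("resources") or {}).get("limits") or {}
        if "cpu" not in limits:
            unbounded.append(container["name"])
    return unbounded


# Mirrors the step-build spec above: memory limit only, CPU request only.
step_build = {
    "name": "step-build",
    "resources": {
        "limits": {"memory": "3Gi"},
        "requests": {"cpu": "250m", "memory": "3Gi"},
    },
}

# Hypothetical container with both CPU and memory limits, for contrast.
step_bounded = {
    "name": "step-bounded",
    "resources": {
        "limits": {"cpu": "1", "memory": "5Gi"},
        "requests": {"cpu": "1", "memory": "5Gi"},
    },
}

print(containers_without_cpu_limit([step_build, step_bounded]))
# prints ['step-build']
```

The input is assumed to be the `containers` list of a pod spec already parsed into Python dicts.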

Because I still saw an OOM kill on master.
In `main-on-push-x5lxh-prefetch-dependencies-pod`:

```
      name: step-prefetch-dependencies
      resources:
        limits:
          cpu: "1"
          memory: 5Gi
        requests:
          cpu: "1"
          memory: 5Gi
```

```
      name: step-prefetch-dependencies
      ready: false
      restartCount: 0
      started: false
      state:
        terminated:
          containerID: cri-o://3a3d5195556288c2d164f1fe7d03aa2da4df6f25ba79ba55f911ac0e2c5cde40
          exitCode: 1
          finishedAt: "2026-05-29T16:28:49Z"
          message: '[{"key":"StartedAt","value":"2026-05-29T16:26:49.307Z","type":3}]'
          reason: OOMKilled
          startedAt: "2026-05-29T16:26:39Z"
```
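
Checking the `reason` field of each pod by hand gets tedious, so here is a hedged Python sketch of how it could be automated. It assumes the pod has been dumped with something like `kubectl get pod <name> -o json` and parsed into a dict. The function name `oom_killed_containers` and the second container in the sample are hypothetical.

```python
def oom_killed_containers(pod):
    """Return (name, exit_code) for each container terminated as OOMKilled."""
    status = pod.get("status") or {}
    statuses = (status.get("initContainerStatuses") or []) + (
        status.get("containerStatuses") or []
    )
    killed = []
    for entry in statuses:
        terminated = (entry.get("state") or {}).get("terminated")
        if terminated and terminated.get("reason") == "OOMKilled":
            killed.append((entry["name"], terminated.get("exitCode")))
    return killed


# Sample shaped like the status excerpt above, trimmed to the relevant fields.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "step-prefetch-dependencies",
                "state": {"terminated": {"exitCode": 1, "reason": "OOMKilled"}},
            },
            {
                # Hypothetical container that finished normally, for contrast.
                "name": "step-finished-ok",
                "state": {"terminated": {"exitCode": 0, "reason": "Completed"}},
            },
        ]
    }
}

print(oom_killed_containers(sample_pod))
# prints [('step-prefetch-dependencies', 1)]
```

Only the current `state` is inspected here; a container that was restarted after an OOM kill would carry the record under `lastState` instead.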

I don't know if just 1 GB more will be enough, but I found only
3 examples in the last 7 days, so I hope it will.

- main-on-push-x5lxh-prefetch-dependencies-pod
- operator-on-push-9gpwt-prefetch-dependencies-pod
- roxctl-on-push-jpppc-prefetch-dependencies-pod

I still saw failures on master with the 5G setting.
For example, in `main-on-push-qtwtc-sast-unicode-check-pod`:

```
      name: step-use-trusted-artifact
      resources:
        limits:
          cpu: "2"
          memory: 5Gi
        requests:
          cpu: "2"
          memory: 5Gi
```

```
      name: step-use-trusted-artifact
      ready: false
      restartCount: 0
      started: false
      state:
        terminated:
          containerID: cri-o://c70b7cd68bb9498fc70b65477dbb96da3af650775bad0a9632f833000e951620
          exitCode: 2
          finishedAt: "2026-05-27T11:37:10Z"
          message: '[{"key":"StartedAt","value":"2026-05-27T11:17:39.524Z","type":3}]'
          reason: OOMKilled
          startedAt: "2026-05-27T11:17:32Z"
```

These, for example, were affected in the last 7 days:
- main-on-push-qtwtc-sast-unicode-check-pod
- operator-on-push-9wlxl-sast-snyk-check-pod
- operator-on-push-vbn5p-build-source-image-pod
- roxctl-on-push-9477x-build-images-3-pod

There are about a dozen others, so I am bumping memory by 2G.
@msugakov msugakov force-pushed the misha/ROX-33431-second-round branch from 81ed172 to a313848 Compare June 1, 2026 12:39
github-actions Bot commented Jun 1, 2026

/konflux-retest operator-on-push

github-actions Bot commented Jun 1, 2026

/konflux-retest operator-bundle-on-push

github-actions Bot commented Jun 1, 2026

/konflux-retest roxctl-on-push

github-actions Bot commented Jun 1, 2026

/konflux-retest central-db-on-push

github-actions Bot commented Jun 1, 2026

/konflux-retest main-on-push

github-actions Bot commented Jun 1, 2026

/konflux-retest scanner-v4-db-on-push

github-actions Bot commented Jun 1, 2026

/konflux-retest operator-index-on-push

github-actions Bot commented Jun 1, 2026

/konflux-retest scanner-v4-on-push

github-actions Bot commented Jun 1, 2026

/konflux-retest central-db-on-push

github-actions Bot commented Jun 1, 2026

/konflux-retest roxctl-on-push

github-actions Bot commented Jun 1, 2026

/konflux-retest scanner-v4-db-on-push

github-actions Bot commented Jun 1, 2026

/konflux-retest scanner-v4-on-push

@msugakov msugakov added the disable-konflux-auto-retest label ("Disable automatic Konflux pipeline re-runs") Jun 1, 2026
msugakov commented Jun 1, 2026

/konflux-retest checks

msugakov commented Jun 2, 2026

/retest

github-actions Bot commented Jun 2, 2026

🚀 Build Images Ready

Images are ready for commit 40f6647. To use with deploy scripts:

export MAIN_IMAGE_TAG=4.12.x-96-g40f6647d5f

msugakov commented Jun 4, 2026

/retest

msugakov commented Jun 4, 2026

/konflux-retest central-db-on-push

msugakov commented Jun 4, 2026

/konflux-retest main-on-push

msugakov commented Jun 4, 2026

/konflux-retest operator-bundle-on-push

msugakov commented Jun 4, 2026

/konflux-retest operator-index-on-push

msugakov commented Jun 4, 2026

/konflux-retest operator-on-push

msugakov commented Jun 4, 2026

/konflux-retest scanner-v4-db-on-push

msugakov commented Jun 4, 2026

/konflux-retest scanner-v4-on-push

msugakov commented Jun 4, 2026

/konflux-retest roxctl-on-push

msugakov commented Jun 4, 2026

/konflux-retest retag-collector

msugakov commented Jun 4, 2026

/konflux-retest retag-fact

msugakov commented Jun 4, 2026

/konflux-retest retag-scanner

msugakov commented Jun 4, 2026

/konflux-retest retag-scanner-db

msugakov commented Jun 4, 2026

/konflux-retest retag-scanner-db-slim

msugakov commented Jun 4, 2026

/konflux-retest retag-scanner-slim

msugakov commented Jun 5, 2026

/konflux-retest operator-bundle-on-push

msugakov commented Jun 5, 2026

/konflux-retest create-custom-snapshot

msugakov commented Jun 5, 2026

/konflux-retest operator-index-on-push

@msugakov msugakov merged commit 40f6647 into master Jun 5, 2026
226 of 283 checks passed
@msugakov msugakov deleted the misha/ROX-33431-second-round branch June 5, 2026 10:45