-
Notifications
You must be signed in to change notification settings - Fork 25
Configs, utils and common functions for containerized IIB #1236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Reviewer's GuideThis PR modularizes index.db artifact handling by introducing config-driven parsing, templating, and cache operations in ORAS utilities; implements containerized helper functions for artifact pulls, build metadata, and failure cleanup; extends the configuration schema and validation with new index.db registry and template options; enhances Git and Konflux task utilities for commit handling and pipeline polling; and adds a robust working‐directory context manager, all backed by comprehensive tests. Sequence diagram for index.db artifact pull with ImageStream cache logicsequenceDiagram
participant Worker
participant ContainerizedUtils
participant OrasUtils
participant Registry
participant ImageStream
Worker->>ContainerizedUtils: pull_index_db_artifact(from_index, temp_dir)
ContainerizedUtils->>OrasUtils: verify_indexdb_cache_for_image(from_index)
alt Cache is synced
ContainerizedUtils->>OrasUtils: get_imagestream_artifact_pullspec(from_index)
ContainerizedUtils->>OrasUtils: get_oras_artifact(imagestream_ref, temp_dir)
OrasUtils->>ImageStream: Pull index.db artifact
else Cache not synced
ContainerizedUtils->>OrasUtils: refresh_indexdb_cache_for_image(from_index)
ContainerizedUtils->>OrasUtils: get_indexdb_artifact_pullspec(from_index)
ContainerizedUtils->>OrasUtils: get_oras_artifact(artifact_ref, temp_dir)
OrasUtils->>Registry: Pull index.db artifact
end
Sequence diagram for cleanup on failure in containerized_utilssequenceDiagram
participant Worker
participant ContainerizedUtils
participant GitUtils
participant OrasUtils
Worker->>ContainerizedUtils: cleanup_on_failure(...)
alt Merge request exists
ContainerizedUtils->>GitUtils: close_mr(mr_details, index_git_repo)
else Overwrite from index and last commit exists
ContainerizedUtils->>GitUtils: revert_last_commit(...)
else No MR or commit
ContainerizedUtils->>Worker: Log no cleanup needed
end
alt original_index_db_digest exists
ContainerizedUtils->>OrasUtils: get_indexdb_artifact_pullspec(from_index)
ContainerizedUtils->>OrasUtils: run_cmd(['oras', 'copy', source_ref, v4x_artifact_ref])
end
ER diagram for new and updated configuration options for index.db artifact handlingerDiagram
CONFIG {
string iib_index_db_imagestream_registry
string iib_index_db_artifact_registry
string iib_index_db_artifact_tag_template
string iib_index_db_artifact_template
}
DEVELOPMENT_CONFIG {
string iib_index_db_artifact_registry
}
TESTING_CONFIG {
string iib_index_db_artifact_registry
string iib_index_db_imagestream_registry
}
CONFIG ||--|{ DEVELOPMENT_CONFIG : extends
DEVELOPMENT_CONFIG ||--|{ TESTING_CONFIG : extends
Class diagram for new and updated utility functions (containerized_utils, oras_utils, git_utils, utils)classDiagram
class ContainerizedUtils {
+pull_index_db_artifact(from_index: str, temp_dir: str) str
+write_build_metadata(local_repo_path: str, opm_version: str, ocp_version: str, distribution_scope: str, binary_image: str, request_id: int) None
+cleanup_on_failure(mr_details: Optional[Dict[str, str]], last_commit_sha: Optional[str], index_git_repo: Optional[str], overwrite_from_index: bool, request_id: int, from_index: str, index_repo_map: Dict[str, str], original_index_db_digest: Optional[str], reason: str) None
}
class OrasUtils {
+_get_name_and_tag_from_pullspec(image_pullspec: str) Tuple[str, str]
+_get_artifact_combined_tag(image_name: str, tag: str) str
+get_indexdb_artifact_pullspec(from_index: str) str
+get_imagestream_artifact_pullspec(from_index: str) str
+verify_indexdb_cache_for_image(index_image_pullspec: str) str
+refresh_indexdb_cache_for_image(index_image_pullspec: str) str
}
class GitUtils {
+get_last_commit_sha(local_repo_path: str) str
}
class Utils {
+change_dir(new_dir: str) contextmanager
}
ContainerizedUtils ..> OrasUtils : uses
ContainerizedUtils ..> GitUtils : uses
OrasUtils ..> Utils : uses
OrasUtils ..> get_worker_config : uses
ContainerizedUtils ..> get_worker_config : uses
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- Centralize the pullspec parsing and combined‐tag generation helpers into a single well-documented API to avoid duplication and ensure consistent behavior across modules.
- In change_dir, wrap the chdir back to the original directory in its own try/except so that errors during restoration don’t mask the original exception or leave you in the wrong directory.
- The early return in commit_and_push (when no changes) and the updated return value in wait_for_pipeline_completion may break callers expecting the previous behavior—please document and adjust downstream code as needed.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Centralize the pullspec parsing and combined‐tag generation helpers into a single well-documented API to avoid duplication and ensure consistent behavior across modules.
- In change_dir, wrap the chdir back to the original directory in its own try/except so that errors during restoration don’t mask the original exception or leave you in the wrong directory.
- The early return in commit_and_push (when no changes) and the updated return value in wait_for_pipeline_completion may break callers expecting the previous behavior—please document and adjust downstream code as needed.
## Individual Comments
### Comment 1
<location> `iib/workers/tasks/opm_operations.py:1124` </location>
<code_context>
+ index_git_repo: Optional[str],
+ overwrite_from_index: bool,
+ request_id: int,
+ from_index: str,
+ index_repo_map: Dict[str, str],
+ original_index_db_digest: Optional[str] = None,
</code_context>
<issue_to_address>
**issue:** Using 'str | None' syntax may break compatibility with Python <3.10.
If older Python versions are supported, use 'Optional[str]' from 'typing' instead of 'str | None'.
</issue_to_address>
### Comment 2
<location> `iib/workers/tasks/opm_operations.py:1150` </location>
<code_context>
- with set_registry_token(overwrite_from_index_token, from_index, append=True):
- index_db_path = get_hidden_index_database(from_index=from_index, base_dir=base_dir)
+ # When index_db_path is not provided, extract the index db from the given index image
+ if not index_db_path or not os.path.exists(index_db_path) and from_index:
+ # check if operator packages exists in hidden index.db
+ # we are not checking /config dir since it contains FBC opted-in operators and to remove those
</code_context>
<issue_to_address>
**issue:** Operator precedence in the conditional may lead to unexpected behavior.
Add parentheses to clarify the intended logic and avoid precedence issues.
</issue_to_address>
### Comment 3
<location> `iib/workers/tasks/git_utils.py:185` </location>
<code_context>
f"IIB: Update for request id {request_id} (overwrite_from_index)"
)
+
+ log.info("Commiting changes to local Git repository.")
+ run_cmd(["git", "-C", local_repo_path, "add", "."], exc_msg="Error staging changes to git")
+ git_status = run_cmd(
</code_context>
<issue_to_address>
**nitpick (typo):** Typo in log message: 'Commiting' should be 'Committing'.
Please update the log message to use the correct spelling: 'Committing'.
```suggestion
log.info("Committing changes to local Git repository.")
```
</issue_to_address>
### Comment 4
<location> `iib/workers/tasks/utils.py:1306-1307` </location>
<code_context>
return bundle_metadata
+
+
+@contextmanager
+def change_dir(new_dir: str) -> Generator[None, Any, None]:
+ """
+ Context manager for temporarily changing the current working directory.
</code_context>
<issue_to_address>
**suggestion:** The context manager does not handle exceptions during os.chdir back to prev_dir.
Catch and log exceptions from os.chdir(prev_dir) in the finally block to prevent masking the original error.
Suggested implementation:
```python
import logging
@contextmanager
def change_dir(new_dir: str) -> Generator[None, Any, None]:
"""
Context manager for temporarily changing the current working directory.
This context manager allows temporary switching to a new directory during
the execution of a block of code. Once the block is exited, it ensures the
working directory is reverted to its original state, even in cases where
an error occurs within the block.
:param str new_dir: new directory to switch to
:raises OSError: If changing to the new directory or reverting to the
original directory fails.
"""
```
```python
import os
prev_dir = os.getcwd()
try:
os.chdir(new_dir)
yield
finally:
try:
os.chdir(prev_dir)
except Exception as exc:
logging.exception(f"Failed to change directory back to {prev_dir} from {new_dir}")
# Do not mask the original exception if one occurred in the context block
```
</issue_to_address>
### Comment 5
<location> `iib/workers/tasks/opm_operations.py:1146` </location>
<code_context>
index_name = from_index if from_index else "database"
</code_context>
<issue_to_address>
**suggestion (code-quality):** Replace if-expression with `or` ([`or-if-exp-identity`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/or-if-exp-identity))
```suggestion
index_name = from_index or "database"
```
<br/><details><summary>Explanation</summary>Here we find ourselves setting a value if it evaluates to `True`, and otherwise
using a default.
The 'After' case is a bit easier to read and avoids the duplication of
`input_currency`.
It works because the left-hand side is evaluated first. If it evaluates to
true then `currency` will be set to this and the right-hand side will not be
evaluated. If it evaluates to false the right-hand side will be evaluated and
`currency` will be set to `DEFAULT_CURRENCY`.
</details>
</issue_to_address>
### Comment 6
<location> `iib/workers/tasks/containerized_utils.py:38-39` </location>
<code_context>
def pull_index_db_artifact(from_index: str, temp_dir: str) -> str:
"""
Pull index.db artifact from registry, using ImageStream cache if available.
This function determines whether to use OpenShift ImageStream cache or pull directly
from the registry based on the iib_use_imagestream_cache configuration.
:param str from_index: The from_index pullspec
:param str temp_dir: Temporary directory where the artifact will be extracted
:return: Path to the directory containing the extracted artifact
:rtype: str
:raises IIBError: If the pull operation fails
"""
from iib.workers.tasks.oras_utils import (
get_indexdb_artifact_pullspec,
get_imagestream_artifact_pullspec,
get_oras_artifact,
refresh_indexdb_cache_for_image,
verify_indexdb_cache_for_image,
)
conf = get_worker_config()
if conf.get('iib_use_imagestream_cache', False):
# Verify index.db cache is synced. Refresh if not.
log.info('ImageStream cache is enabled. Checking cache sync status.')
cache_synced = verify_indexdb_cache_for_image(from_index)
if cache_synced:
log.info('Index.db cache is synced. Pulling from ImageStream.')
# Pull from ImageStream when digests match
imagestream_ref = get_imagestream_artifact_pullspec(from_index)
artifact_dir = get_oras_artifact(
imagestream_ref,
temp_dir,
)
else:
log.info('Index.db cache is not synced. Refreshing and pulling from Quay.')
refresh_indexdb_cache_for_image(from_index)
# Pull directly from Quay after triggering refresh
artifact_ref = get_indexdb_artifact_pullspec(from_index)
artifact_dir = get_oras_artifact(
artifact_ref,
temp_dir,
)
else:
# Pull directly from Quay without ImageStream cache
log.info('ImageStream cache is disabled. Pulling index.db artifact directly from registry.')
artifact_ref = get_indexdb_artifact_pullspec(from_index)
artifact_dir = get_oras_artifact(
artifact_ref,
temp_dir,
)
return artifact_dir
</code_context>
<issue_to_address>
**suggestion (code-quality):** Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
```suggestion
if cache_synced := verify_indexdb_cache_for_image(from_index):
```
</issue_to_address>
### Comment 7
<location> `iib/workers/tasks/containerized_utils.py:165` </location>
<code_context>
def cleanup_on_failure(
mr_details: Optional[Dict[str, str]],
last_commit_sha: Optional[str],
index_git_repo: Optional[str],
overwrite_from_index: bool,
request_id: int,
from_index: str,
index_repo_map: Dict[str, str],
original_index_db_digest: Optional[str] = None,
reason: str = "error",
) -> None:
"""
Clean up Git changes and index.db artifacts on failure.
If a merge request was created, it will be closed (since the commit is only in a
feature branch). If changes were pushed directly to the main branch, the commit
will be reverted. If the index.db artifact was pushed to the v4.x tag, it will be
restored to the original digest.
:param Optional[Dict[str, str]] mr_details: Details of the merge request if one was created
:param Optional[str] last_commit_sha: The SHA of the last commit
:param Optional[str] index_git_repo: URL of the Git repository
:param bool overwrite_from_index: Whether to overwrite the from_index
:param int request_id: The IIB request ID
:param str from_index: The from_index pullspec
:param Dict[str, str] index_repo_map: Mapping of index images to Git repositories
:param Optional[str] original_index_db_digest: Original digest of index.db before overwrite
:param str reason: Reason for the cleanup (used in log messages)
"""
if mr_details and index_git_repo:
# If we created an MR, just close it (commit is only in feature branch)
log.info("Closing merge request due to %s", reason)
try:
from iib.workers.tasks.git_utils import close_mr
close_mr(mr_details, index_git_repo)
log.info("Closed merge request: %s", mr_details.get('mr_url'))
except Exception as close_error:
log.warning("Failed to close merge request: %s", close_error)
elif overwrite_from_index and last_commit_sha:
# If we pushed directly, revert the commit
log.error("Reverting commit due to %s", reason)
try:
from iib.workers.tasks.git_utils import revert_last_commit
revert_last_commit(
request_id=request_id,
from_index=from_index,
index_repo_map=index_repo_map,
)
except Exception as revert_error:
log.error("Failed to revert commit: %s", revert_error)
else:
log.error("Neither MR nor commit to revert. No cleanup needed for %s", reason)
# Restore index.db artifact to original digest if it was overwritten
if original_index_db_digest:
log.info("Restoring index.db artifact to original digest due to %s", reason)
try:
from iib.workers.tasks.oras_utils import get_indexdb_artifact_pullspec
from iib.workers.tasks.utils import run_cmd
# Get the v4.x artifact reference
v4x_artifact_ref = get_indexdb_artifact_pullspec(from_index)
# Extract registry and repository from the pullspec
# Format: quay.io/namespace/repo:tag -> we need quay.io/namespace/repo
artifact_name = v4x_artifact_ref.rsplit(':', 1)[0]
# Use oras copy to restore the old digest to v4.x tag
# This is a registry-to-registry copy, no download needed
source_ref = f'{artifact_name}@{original_index_db_digest}'
log.info("Restoring %s from %s", v4x_artifact_ref, source_ref)
run_cmd(
['oras', 'copy', source_ref, v4x_artifact_ref],
exc_msg=f'Failed to restore index.db artifact from {source_ref} to {v4x_artifact_ref}',
)
log.info("Successfully restored index.db artifact to original digest")
except Exception as restore_error:
log.error("Failed to restore index.db artifact: %s", restore_error)
</code_context>
<issue_to_address>
**issue (code-quality):** Extract code out into function ([`extract-method`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/extract-method/))
</issue_to_address>
### Comment 8
<location> `iib/workers/tasks/oras_utils.py:50-53` </location>
<code_context>
def _get_name_and_tag_from_pullspec(image_pullspec: str) -> Tuple[str, str]:
"""
Parses a container image pullspec (URL) to extract the index name and tag.
:param str image_pullspec: The full image pullspec string (e.g., registry/path/name:tag[@digest]).
:returns Tuple[str, str]: The extracted index name and tag (e.g., 'iib-pub-pending', 'v4.17').
:raises IIBError: If the pullspec is missing the required tag delimiter (':')
or if the name:tag structure cannot be parsed.
"""
# Regex to capture the image name and tag, ignoring an optional digest.
# r'/([^/:]+):([^@]+)(@.*)?$'
# Group 1: ([^/:]+) -> Image Name (the last path segment before the colon)
# Group 2: ([^@]+) -> Tag (the part after the colon, before '@' or end of string)
# Group 3: (@.*)? -> Optional digest part, which is ignored
regex = re.compile(r'/([^/:]+):([^@]+)(@.*)?$')
match = regex.search(image_pullspec)
if not match:
# Check for the most common error: missing the tag delimiter (:)
if ':' not in image_pullspec:
raise IIBError(
f"Invalid pullspec format: '{image_pullspec}'. Missing tag (':') delimiter."
)
# Raise a general error if the regex failed for other reasons
raise IIBError(
f"Invalid pullspec format: '{image_pullspec}'. Could not parse name:tag structure."
)
# Group 1: Image Name (e.g., 'iib-pub-pending')
index_name = match.group(1)
# Group 2: Tag (e.g., 'v4.17')
tag = match.group(2)
# Final check to ensure the tag isn't empty (e.g., 'image:')
if not tag:
raise IIBError(f"Invalid pullspec format: '{image_pullspec}'. Tag is present but empty.")
return index_name, tag
</code_context>
<issue_to_address>
**issue (code-quality):** Replace m.group(x) with m[x] for re.Match objects [×2] ([`use-getitem-for-re-match-groups`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/use-getitem-for-re-match-groups/))
</issue_to_address>
### Comment 9
<location> `iib/workers/tasks/oras_utils.py:323-325` </location>
<code_context>
def get_imagestream_artifact_pullspec(from_index: str) -> str:
"""
Get the ImageStream pullspec for the index.db artifact.
This function constructs the internal OpenShift ImageStream pullspec that can be used
to pull the index.db artifact from the cached ImageStream instead of directly from Quay.
:param str from_index: The from_index pullspec
:return: ImageStream pullspec for the artifact
:rtype: str
"""
conf = get_worker_config()
image_name, tag = _get_name_and_tag_from_pullspec(from_index)
combined_tag = _get_artifact_combined_tag(image_name, tag)
# ImageStream pullspec format: image-registry.openshift-image-registry.svc:5000/{namespace}/index-db:{combined_tag}
imagestream_pullspec = conf['iib_index_db_artifact_template'].format(
registry=conf['iib_index_db_imagestream_registry'], tag=combined_tag
)
return imagestream_pullspec
</code_context>
<issue_to_address>
**issue (code-quality):** Inline variable that is immediately returned ([`inline-immediately-returned-variable`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/inline-immediately-returned-variable/))
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
|
||
|
|
||
| def verify_operators_exists( | ||
| from_index: str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: Using 'str | None' syntax may break compatibility with Python <3.10.
If older Python versions are supported, use 'Optional[str]' from 'typing' instead of 'str | None'.
| with set_registry_token(overwrite_from_index_token, from_index, append=True): | ||
| index_db_path = get_hidden_index_database(from_index=from_index, base_dir=base_dir) | ||
| # When index_db_path is not provided, extract the index db from the given index image | ||
| if not index_db_path or not os.path.exists(index_db_path) and from_index: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: Operator precedence in the conditional may lead to unexpected behavior.
Add parentheses to clarify the intended logic and avoid precedence issues.
| @contextmanager | ||
| def change_dir(new_dir: str) -> Generator[None, Any, None]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: The context manager does not handle exceptions during os.chdir back to prev_dir.
Catch and log exceptions from os.chdir(prev_dir) in the finally block to prevent masking the original error.
Suggested implementation:
import logging
@contextmanager
def change_dir(new_dir: str) -> Generator[None, Any, None]:
"""
Context manager for temporarily changing the current working directory.
This context manager allows temporary switching to a new directory during
the execution of a block of code. Once the block is exited, it ensures the
working directory is reverted to its original state, even in cases where
an error occurs within the block.
:param str new_dir: new directory to switch to
:raises OSError: If changing to the new directory or reverting to the
original directory fails.
""" import os
prev_dir = os.getcwd()
try:
os.chdir(new_dir)
yield
finally:
try:
os.chdir(prev_dir)
except Exception as exc:
logging.exception(f"Failed to change directory back to {prev_dir} from {new_dir}")
# Do not mask the original exception if one occurred in the context block| packages_in_index: Set[str] = set() | ||
|
|
||
| log.info("Verifying if operator packages %s exists in index %s", operator_packages, from_index) | ||
| index_name = from_index if from_index else "database" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): Replace if-expression with or (or-if-exp-identity)
| index_name = from_index if from_index else "database" | |
| index_name = from_index or "database" |
Explanation
Here we find ourselves setting a value if it evaluates toTrue, and otherwiseusing a default.
The 'After' case is a bit easier to read and avoids the duplication of
input_currency.
It works because the left-hand side is evaluated first. If it evaluates to
true then currency will be set to this and the right-hand side will not be
evaluated. If it evaluates to false the right-hand side will be evaluated and
currency will be set to DEFAULT_CURRENCY.
| cache_synced = verify_indexdb_cache_for_image(from_index) | ||
| if cache_synced: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): Use named expression to simplify assignment and conditional (use-named-expression)
| cache_synced = verify_indexdb_cache_for_image(from_index) | |
| if cache_synced: | |
| if cache_synced := verify_indexdb_cache_for_image(from_index): |
| if original_index_db_digest: | ||
| log.info("Restoring index.db artifact to original digest due to %s", reason) | ||
| try: | ||
| from iib.workers.tasks.oras_utils import get_indexdb_artifact_pullspec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (code-quality): Extract code out into function (extract-method)
| index_name = match.group(1) | ||
|
|
||
| # Group 2: Tag (e.g., 'v4.17') | ||
| tag = match.group(2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (code-quality): Replace m.group(x) with m[x] for re.Match objects [×2] (use-getitem-for-re-match-groups)
| imagestream_pullspec = conf['iib_index_db_artifact_template'].format( | ||
| registry=conf['iib_index_db_imagestream_registry'], tag=combined_tag | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (code-quality): Inline variable that is immediately returned (inline-immediately-returned-variable)
Assisted by: Gemini, Claude, Cursor [CLOUDDST-28644] [CLOUDDST-28865]
2068d24 to
bc97cfa
Compare
Adding common functionality and configs for containerized IIB.
Assisted by: Gemini, Claude, Cursor, ChatGPT
[CLOUDDST-28644]
[CLOUDDST-28865]
Summary by Sourcery
Enable containerized IIB workflows by adding utilities for OCI artifact handling and index.db cache management, introducing relevant config options and templates, and expanding tests and documentation.
New Features:
Enhancements:
Documentation:
Tests:
Chores: