Fix HF cache permissions and improve deployment tooling. Fix Hugging … #132

Yael-F · 2025-10-29T15:00:23Z

Fix HF cache permission issues, add missing 'packaging' dependency in Tekton tasks, and enhance Makefile with help messages

…Face cache permission issues, add missing 'packaging' dependency in Tekton tasks, and enhance Makefile with help messages

# Conflicts: # deploy/Makefile

github-actions

Okay, thanks for providing the additional details and clarifications. Here's an updated summary and comments based on the latest information:

Review Summary:

The PR primarily addresses compatibility issues with the langchain library and introduces basic OpenTelemetry instrumentation. The downgrade of openai and transformers and the addition of opentelemetry dependencies are intentional and necessary to align with the specific langchain version used. The removal of unused dependencies is a positive step.

File: requirements.txt

Improvement: The addition of opentelemetry libraries is great, but needs documentation on the usage.
- Suggestion: Document the OpenTelemetry setup (e.g., environment variables, exporter configuration). It's essential to have instructions on how to enable and configure telemetry data export.
Issue: No specific versions are stated for opentelemetry-*.
- Suggestion: It's generally a good practice to pin the versions of opentelemetry-* packages to ensure consistent behavior across deployments.
Clarification: The comments explain why the OpenAI and Transformers versions were downgraded.
- Suggestion: Provide a link to the related langchain issue. This can help others understand the reasoning behind this specific version choice. Also, make sure that your setup will not break because of security patches with the old versions of openai and transformers.
Improvement: langchain version is now specified.

File: deploy/tekton/base/tasks/execute_sast_ai_workflow.yaml

Improvement: Single installation of packaging at the beginning of the task and explanation provided.

Overall, the changes seem well-considered and address a specific compatibility issue. However, make sure to add documentation and version control of your changes.

Okay, I'll review the changes with a focus on clarity, simplicity, efficiency, and security. Here's a summary and detailed comments on the provided diffs.

Review Summary:

The changes look generally good and introduce some helpful configuration and caching improvements. The Makefile update is well-structured and adds helpful documentation. There are a few points related to security best practices and potential operational issues to address.

File-Specific Comments:

1. Containerfile

Issue: The use of USER 0 to perform the chown operation might be considered a security risk, although the scope is limited to within the container build.
- Suggestion: While the current approach works, consider using sudo -u 1001 to perform the chown command instead or using USER 1001 before the COPY commands to ensure files are copied with the correct ownership.
Improvement: Excellent addition of the Hugging Face cache directories and setting environment variables. This is crucial for performance and avoids permission issues.

2. deploy/Makefile

Improvement: The addition of comments explaining the purpose of the Makefile and each section is excellent for readability and maintainability.
Improvement: Structuring the Makefile into sections (Environment, Secrets, Execution) is a great way to organize it.
Issue: Hardcoding default values like http://<<please-set-llm-url>> is a potential problem. If the user doesn't override these, the pipeline will fail in a non-obvious way.
- Suggestion: Add a check in the Makefile to ensure that critical variables like LLM_URL and EMBEDDINGS_LLM_URL are set. If they aren't, print an error message and exit. Consider using $(error ...) to force a Makefile error.
Security Consideration: Storing service account keys directly in the repository (even as examples) is generally discouraged.
- Suggestion: Emphasize in the documentation that these files should never be committed to version control. Consider using .gitignore to explicitly exclude files matching *_service_account.json. Suggest alternative methods for securely providing these credentials (e.g., using Kubernetes secrets or a dedicated secrets management solution).
Suggestion: Consider adding a help target to the Makefile that lists all available commands and their parameters. This improves usability. Example:

help:
	@echo "Usage: make <target>"
	@echo ""
	@echo "Targets:"
	@echo "  setup      - Configure the environment"
	@echo "  run        - Run the SAST AI pipeline"
	@echo "  secrets    - Configure secrets"
	@echo "  help       - Show this help message"

Let me know if you'd like more specific examples or further clarification on any of these points.

Okay, I've reviewed the provided diff. Here's a summary of the issues and suggestions:

Review Summary:

The changes introduce better organization and clarity to the configuration file. The addition of comments describing the purpose of each variable and grouping them by functionality (e.g., "Optional Features Configuration", "Container Image Configuration") significantly improves readability. The separation of "Phony Targets Declaration" and "Default Target" is also good practice.

File: Makefile

Issue: Repetition of S3_ENDPOINT_URL. It is defined twice.
- Suggestion: Remove the redundant definition to avoid confusion and potential errors.
Issue: Missing "Secret Configuration" section.
- Suggestion: Add "Secret Configuration" section including GITLAB_TOKEN, LLM_API_KEY, EMBEDDINGS_LLM_API_KEY, GOOGLE_SERVICE_ACCOUNT_JSON_PATH, GCS_SERVICE_ACCOUNT_JSON_PATH, and DOCKER_CONFIG_PATH to be in line with the comment.

Overall, these are good changes that enhance the maintainability and understanding of the Makefile.

--- a/Makefile
+++ b/Makefile
@@ -1,13 +1,13 @@
 # =============================================================================
 # Project Settings
 # =============================================================================
-PROJECT_NAME := sast-ai-workflow
-IMAGE_NAME := sast-ai-workflow
-IMAGE_REGISTRY := quay.io/ecosystem-appeng
+# The name of the project.
+PROJECT_NAME ?= sast-ai-workflow
+# The name of the Docker image.
+IMAGE_NAME ?= sast-ai-workflow
+# The Docker image registry to use.
+IMAGE_REGISTRY ?= quay.io/ecosystem-appeng
 
-# Usage:
-#   make deploy ENV=mlops              # Deploy mlops (S3/Minio, :latest)
-#   make deploy ENV=prod IMAGE_VERSION=1.2.3  # Deploy prod (Google Drive, versioned)
 deploy:
 -	@if [ "$(ENV)" = "prod" ] && [ -z "$(IMAGE_VERSION)" ]; then \
 +	@if [ -z "$(IMAGE_VERSION)" ]; then \
@@ -34,3 +34,4 @@
  echo "   Environment: Base

File: Makefile

The deploy target has been removed and replaced with deploy-dev and deploy-prod, which is good for clarity.
The help target is very helpful.
It's good to see the addition of error handling when IMAGE_VERSION is missing for production deployments.
The use of $(MAKE) --no-print-directory improves readability.
Consider using a default value for PROJECT_NAME, IMAGE_NAME, and IMAGE_REGISTRY to avoid unexpected behavior if they are not set. Using ?= is good, but add comments to explain what the default values are if they are not set.
Suggestion: Add a deploy-prod-internal target to separate concerns and improve readability. This is a good pattern. Consider adding a comment explaining why.
Suggestion: Consider using argocd to manage deployments instead of kubectl. This can provide better automation and management of deployments.

Okay, I've reviewed the provided Makefile diff. Here's a summary of the issues and suggestions:

Review Summary:

The changes introduce better environment management (specifically for production), standardize image naming, improve clarity in output messages, and refactor the setup process. The addition of help messages is a good practice. The removal of S3/MinIO secrets configuration from here is concerning.

File: Makefile

Issue: S3/MinIO output credentials secret (OPTIONAL) are removed
- Suggestion: Please confirm why these secrets have been removed and what their purpose was and whether this configuration has been shifted to a different location.
Suggestion: Consider adding validation of the environment variables used in the secrets target. For example, you could add a check to ensure GITLAB_TOKEN is a non-empty string before creating the secret. This will help catch configuration errors early.
Suggestion: For the argocd-deploy-* targets, consider adding a check to ensure that ArgoCD is actually installed and accessible before attempting to deploy.
Suggestion: In the tasks target, the comments refer to "MinIO/S3 support" and "Google Drive storage." These storage types suggest different functionalities. It would be helpful to elaborate on the practical differences between these configurations (e.g., data location, access control, cost).
Suggestion: The help target is very useful. Consider grouping the commands into logical sections (e.g., "Application Commands," "Infrastructure Commands," "Debugging Commands") to improve readability.
Suggestion: Make the CONTAINER_IMAGE variable mandatory to provide a more streamlined deployment.

--- a/Makefile
+++ b/Makefile
@@ -237,11 +237,7 @@ secrets:
 	if [ -n "$(S3_ENDPOINT_URL)" ] && [ "$(S3_ENDPOINT_URL)" != "" ]; then \
 		@$(MAKE) --no-print-directory create-s3-credentials; \
 	fi
-	@if [ -n "$(S3_OUTPUT_BUCKET_NAME)" ] && [ "$(S3_OUTPUT_BUCKET_NAME)" != "" ]; then \
-		if [ -f "secrets/s3-output-credentials.yaml" ]; then \
-			@$(CO) apply -n $(NAMESPACE) -f secrets/s3-output-credentials.yaml --dry-run=client -o yaml | $(CO) apply -f - > /dev/null 2>&1; \
-		fi; \
-		echo "   ✓ S3/Minio output credentials secret"; \
+	@if [ -n "$(S3_ENDPOINT_URL)" ] && [ "$(S3_ENDPOINT_URL)" != "" ]; then \
 	fi
 	# Create Quay pull secret
 	@DOCKER_AUTH_FILE=""; \
@@ -261,13 +257,13 @@ pipeline:
 	@echo "🔧 Pipeline..."
 	@echo "   ✓ Pipeline deployed with Tekton resources (via kustomize)"
 
-scripts:
+scripts: ## Deploy upload scripts ConfigMaps [Internal target]
 	@echo "📜 Setting up Scripts..."
 	@$(CO) apply -n $(NAMESPACE) -f tekton/scripts/upload_to_drive_cm.yaml || \
 		{ echo "   ❌ Failed to apply Google Drive upload script"; exit 1; }
 	@$(CO) apply -n $(NAMESPACE) -f tekton/scripts/upload_to_gcs_cm.yaml || \
 		{ echo "   ❌ Failed to apply GCS upload script"; exit 1; }
-	@$(CO) apply -n $(NAMESPACE) -f tekton/scripts/upload_to_s3_output_cm.yaml || \
+	@$(CO) apply -n $(NAMESPACE) -f tekton/scripts/upload_to_s3_cm.yaml || \
 		{ echo "   ❌ Failed to apply S3 output upload script"; exit 1; }
 	@echo "   ✓ Upload scripts configured"
 
@@ -304,7 +300,7 @@ configmaps:
 # =============================================================================
 
 run: ## Execute pipeline [Params: PROJECT_NAME, REPO_REMOTE_URL, INPUT_REPORT_FILE_PATH, etc.]
-	@echo ""
+	@echo "" ## shebang missing
 	@echo "🏃 Starting Pipeline Execution..."
 	@echo "   Container Image: $(CONTAINER_IMAGE)"
 	@echo "   🔄 Removing old pipeline runs..."
@@ -324,7 +320,6 @@ run:
 		-e 's|DVC_DATA_VERSION_PLACEHOLDER|$(DVC_DATA_VERSION)|g' \
 		-e 's|S3_ENDPOINT_URL_PLACEHOLDER|$(S3_ENDPOINT_URL)|g' \
 		-e 's|S3_INPUT_BUCKET_NAME_PLACEHOLDER|$(S3_INPUT_BUCKET_NAME)|g' \
-		-e 's|S3_OUTPUT_BUCKET_NAME_PLACEHOLDER|$(S3_OUTPUT_BUCKET_NAME)|g' \
 		tekton/pipelinerun.yaml > tekton/pipelinerun-temp.yaml
 	@$(CO) apply -n $(NAMESPACE) -f tekton/pipelinerun-temp.yaml
 	@rm -f tekton/pipelinerun-temp.yaml

Review Summary:

The changes in the Makefile seem to focus on streamlining the pipeline deployment and execution process. The removal of the S3 output credentials secret creation based on the output bucket name and the update of the S3 upload script application suggest a simplification of the S3 configuration. Also added documentation for each target.

File: Makefile

Issue: The run target is missing a shebang (#!/usr/bin/env make). While make might often be invoked in an environment where make is available, explicitly specifying the shebang ensures the script is executable and uses the correct interpreter, increasing portability and reliability.
- Suggestion: Add #!/usr/bin/env make at the beginning of the run target.

Okay, I've reviewed the provided diff. Here's a summary of the issues and suggestions:

Review Summary

The changes primarily focus on refactoring the Makefile to improve organization, clarity, and maintainability. The addition of comments and sectioning is good. The removal of some resources within the clean target suggests a simplification or removal of features, which needs further scrutiny.

File: Makefile

Issue: Removal of s3-output-upload-scripts, sast-ai-s3-output-credentials in the clean target.
- Suggestion: Understand the impact of removing these resources. If the S3 output functionality is no longer needed, the removal is fine. If it's still required, the cleanup process needs to be adjusted or those resources kept. Add a comment explaining why these were removed.
Issue: The renaming and re-organization of ArgoCD deployment targets is good, but might break existing workflows that rely on the old names.
- Suggestion: Consider adding a deprecation warning for the old argocd-deploy-mlops target, which informs users to switch to the new argocd-deploy-dev target. This provides a smoother transition.
Issue: The added comments [Internal target] are useful, but consider using a more consistent approach to documentation (e.g., a standard comment block format).
- Suggestion: Adopt a standardized comment block format for documenting each make target. This improves readability and maintainability.
Issue: There are some commands using || true at the end.
- Suggestion: It's generally a good practice to avoid using || true without a good reason because it masks the underlying problem. Remove || true if the command's failure is truly an error, and address the error.
Nitpick: In the argocd-deploy-* targets, the sed command could be made slightly more readable by using a different delimiter than /, since / is used within the search pattern. For example, sed -e 's|ARGOCD_NAMESPACE_PLACEHOLDER|$(NAMESPACE)|g'.

I will wait for you to send me the next file differences.

Okay, I've reviewed the changes. Here's a summary and some specific comments:

Review Summary:

The PR primarily focuses on adding the packaging library to the requirements.txt file and installing it within the Tekton task definition (execute_sast_ai_workflow.yaml). It seems this library is needed for version handling or related functionality, possibly due to recent changes in dependencies within the SAST AI workflow. The change itself is relatively straightforward.

File-Specific Comments:

File: deploy/tekton/base/tasks/execute_sast_ai_workflow.yaml
- Issue: The installation of packaging is duplicated. It's being installed both when the credentials workspace is present and also later in the task.
- Suggestion: Install packaging only once, at the beginning of the task execution. This reduces redundancy and potential for inconsistencies. Also, consider adding a comment explaining why packaging is required for future maintainers.
```
# Install necessary dependencies
pip install --quiet packaging google-cloud-storage >/dev/null 2>&1 # packaging is required for ...
```
File: requirements.txt
- Issue: There is a large number of dependency upgrades and additions. This dramatically increases the chance of incompatibility with other parts of the system.
- Suggestion: It is preferable to have smaller and more targeted dependency upgrades. It would be recommended to revert most of these upgrades except for the addition of the packaging library.

I will wait for your next message.
Review Summary:

The changes primarily involve version updates and additions to the dependencies list. A key concern is the downgrading of openai and transformers which might introduce compatibility issues or remove access to newer features. The addition of opentelemetry packages indicates a potential introduction of monitoring/observability features, which should be verified for correctness and security implications. Specific version updates may introduce breaking changes and require code modifications.

File: requirements.txt

Issue: Downgrading openai from >=1.66.3 to ==1.109.1.
- Suggestion: It's unusual to downgrade a package. Understand why this is being done. Is it to address a specific compatibility issue? If so, document this in the commit message. Consider if a version range is more appropriate (e.g., openai>=1.66.3,<2.0.0) to allow for future updates while avoiding breaking changes. Also downgrading transformers from >=4.47.0 to ==4.50.3, needs justification.
Issue: Addition of opentelemetry-* packages.
- Suggestion: Ensure that the inclusion of these packages is intentional and that the application is properly configured to use them. Consider the overhead of telemetry and its impact on performance. Also, review any sensitive data that might be captured by the telemetry system and ensure it's handled securely.
Issue: Updates to scikit-learn and sentence-transformers
- Suggestion: Check the changelogs for scikit-learn and sentence-transformers for breaking changes. These upgrades may require corresponding code changes to maintain functionality.
Issue: Adding/removing specific packages.
- Suggestion: Verify that all newly added packages are necessary and that any removed packages are no longer used. Consider adding comments to the commit message explaining the rationale behind these changes.

Yael-F and others added 2 commits October 29, 2025 16:56

Fix HF cache permissions and improve deployment tooling. Fix Hugging …

7f9868e

…Face cache permission issues, add missing 'packaging' dependency in Tekton tasks, and enhance Makefile with help messages

Merge remote-tracking branch 'origin/main' into improve-deployment

b937e65

# Conflicts: # deploy/Makefile

github-actions bot reviewed Nov 3, 2025

View reviewed changes

JudeNiroshan approved these changes Nov 10, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix HF cache permissions and improve deployment tooling. Fix Hugging … #132

Fix HF cache permissions and improve deployment tooling. Fix Hugging … #132

Uh oh!

Yael-F commented Oct 29, 2025

Uh oh!

github-actions bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Fix HF cache permissions and improve deployment tooling. Fix Hugging … #132

Are you sure you want to change the base?

Fix HF cache permissions and improve deployment tooling. Fix Hugging … #132

Uh oh!

Conversation

Yael-F commented Oct 29, 2025

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants