Merged
127 changes: 94 additions & 33 deletions .github/workflows/test-github-action.yml
@@ -16,16 +16,26 @@ on:
type: string
required: false
description: dbt's version to test with
generate-data:
type: boolean
required: false
default: false
description: Whether to generate new data

env:
DBT_PKG_INTEG_TESTS_DIR: ${{ github.workspace }}/dbt-data-reliability/integration_tests/deprecated_tests
BRANCH_NAME: ${{ github.head_ref || github.ref_name }}
ELEMENTARY_DBT_PACKAGE_PATH: ${{ github.workspace }}/dbt-data-reliability
E2E_DBT_PROJECT_DIR: ${{ github.workspace }}/elementary/tests/e2e_dbt_project

jobs:
test:
runs-on: ubuntu-latest
defaults:
run:
working-directory: elementary
concurrency:
group: test_action_snowflake_dbt_${{ inputs.dbt-version }}_${{ github.head_ref || github.ref_name }}
cancel-in-progress: true
steps:
- name: Checkout Elementary
uses: actions/checkout@v4
@@ -40,29 +50,39 @@ jobs:
path: dbt-data-reliability
ref: ${{ inputs.dbt-data-reliability-ref }}

- name: Write dbt profiles
id: profiles
env:
PROFILES_YML: ${{ secrets.TEST_GITHUB_ACTION_PROFILES_YML }}
run: |
mkdir -p ~/.dbt
echo "$PROFILES_YML" > ~/.dbt/profiles.yml

- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: "3.10"

- name: Install dbt
run: pip install --pre
run: >
pip install
"dbt-core${{ inputs.dbt-version && format('=={0}', inputs.dbt-version) }}"
"dbt-snowflake${{ inputs.dbt-version && format('<={0}', inputs.dbt-version) }}"
"dbt-snowflake${{ inputs.dbt-version && format('~={0}', inputs.dbt-version) }}"

- name: Install Elementary
run: |
pip install -r dev-requirements.txt
pip install ".[snowflake]"

- name: Write dbt profiles
env:
CI_WAREHOUSE_SECRETS: ${{ secrets.CI_WAREHOUSE_SECRETS || '' }}
run: |
CONCURRENCY_GROUP="test_action_snowflake_dbt_${{ inputs.dbt-version }}_${BRANCH_NAME}"
SHORT_HASH=$(echo -n "$CONCURRENCY_GROUP" | sha256sum | head -c 8)
SAFE_BRANCH=$(echo "${BRANCH_NAME}" | awk '{print tolower($0)}' | sed "s/[^a-z0-9]/_/g; s/__*/_/g" | head -c 19)
DATE_STAMP=$(date -u +%y%m%d_%H%M%S)
SCHEMA_NAME="ga_${DATE_STAMP}_${SAFE_BRANCH}_${SHORT_HASH}"

echo "Schema name: $SCHEMA_NAME (branch='${BRANCH_NAME}', timestamp=${DATE_STAMP}, hash of concurrency group)"

python "${{ github.workspace }}/elementary/tests/profiles/generate_profiles.py" \
--template "${{ github.workspace }}/elementary/tests/profiles/profiles.yml.j2" \
--output ~/.dbt/profiles.yml \
--schema-name "$SCHEMA_NAME"
Comment on lines +71 to +84

⚠️ Potential issue | 🟡 Minor

Fail fast if warehouse secrets are missing.

Line 72 defaults CI_WAREHOUSE_SECRETS to an empty string, which can turn credential misconfigurations into harder-to-diagnose downstream failures.

Suggested change
       - name: Write dbt profiles
         env:
           CI_WAREHOUSE_SECRETS: ${{ secrets.CI_WAREHOUSE_SECRETS || '' }}
         run: |
+          if [ -z "$CI_WAREHOUSE_SECRETS" ]; then
+            echo "::error::CI_WAREHOUSE_SECRETS is not set"
+            exit 1
+          fi
+
           CONCURRENCY_GROUP="test_action_snowflake_dbt_${{ inputs.dbt-version }}_${BRANCH_NAME}"
           SHORT_HASH=$(echo -n "$CONCURRENCY_GROUP" | sha256sum | head -c 8)
           SAFE_BRANCH=$(echo "${BRANCH_NAME}" | awk '{print tolower($0)}' | sed "s/[^a-z0-9]/_/g; s/__*/_/g" | head -c 19)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/test-github-action.yml around lines 72 - 85, the workflow
currently defaults CI_WAREHOUSE_SECRETS to an empty string, which can mask
missing credentials; update the run block (the step that defines
CI_WAREHOUSE_SECRETS and builds CONCURRENCY_GROUP/SCHEMA_NAME) to validate
CI_WAREHOUSE_SECRETS immediately and exit non-zero with a clear error message
if it is empty or undefined, before computing
CONCURRENCY_GROUP/SHORT_HASH/SAFE_BRANCH/SCHEMA_NAME and before invoking python
"${{ github.workspace }}/elementary/tests/profiles/generate_profiles.py"; use
the environment variable name CI_WAREHOUSE_SECRETS and the run block where
CONCURRENCY_GROUP and SCHEMA_NAME are built to locate where to add the check.
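The fail-fast check aside, the schema-name construction in the "Write dbt profiles" step can be exercised outside the workflow. A minimal sketch, assuming GNU coreutils (`sha256sum`, `date`) and using a made-up branch name and dbt version:

```shell
# Standalone sketch of the schema-name scheme from the step above.
# BRANCH_NAME and the dbt version are illustrative placeholders.
BRANCH_NAME="feature/My-Branch"
CONCURRENCY_GROUP="test_action_snowflake_dbt_1.8.0_${BRANCH_NAME}"
SHORT_HASH=$(echo -n "$CONCURRENCY_GROUP" | sha256sum | head -c 8)
SAFE_BRANCH=$(echo "$BRANCH_NAME" | awk '{print tolower($0)}' | sed "s/[^a-z0-9]/_/g; s/__*/_/g" | head -c 19)
DATE_STAMP=$(date -u +%y%m%d_%H%M%S)
echo "ga_${DATE_STAMP}_${SAFE_BRANCH}_${SHORT_HASH}"
# e.g. ga_240101_120000_feature_my_branch_ab12cd34
```

Truncating the branch to 19 lowercase characters keeps the schema name short, while the 8-character hash of the concurrency group keeps it distinct per branch and dbt version.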


- name: Install dbt package
run: |
ELEMENTARY_PKG_LOCATION=$(pip show elementary-data | grep -i location | awk '{print $2}')
@@ -72,21 +92,70 @@ jobs:
rm -rf "$DBT_PKGS_PATH/elementary"
ln -vs "$GITHUB_WORKSPACE/dbt-data-reliability" "$DBT_PKGS_PATH/elementary"

- name: Run dbt package integration tests
working-directory: ${{ env.DBT_PKG_INTEG_TESTS_DIR }}
- name: Run deps for E2E dbt project
working-directory: ${{ env.E2E_DBT_PROJECT_DIR }}
env:
ELEMENTARY_DBT_PACKAGE_PATH: ${{ env.ELEMENTARY_DBT_PACKAGE_PATH }}
run: |
dbt deps
python run_e2e_tests.py -t "snowflake" --clear-tests "True" -e "regular"

- name: Seed e2e dbt project
working-directory: ${{ env.E2E_DBT_PROJECT_DIR }}
if: inputs.generate-data
run: |
python generate_data.py
dbt seed -f --target snowflake

Comment on lines +102 to +108

⚠️ Potential issue | 🟠 Major

generate-data currently skips seeding entirely.

Because Line 105 gates the whole step, dbt seed is skipped when generate-data is false, so runs depend on pre-existing Snowflake state.

Suggested change
       - name: Seed e2e dbt project
         working-directory: ${{ env.E2E_DBT_PROJECT_DIR }}
-        if: inputs.generate-data
         run: |
-          python generate_data.py
+          if [ "${{ inputs.generate-data }}" = "true" ]; then
+            python generate_data.py
+          fi
           dbt seed -f --target snowflake
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/test-github-action.yml around lines 103 - 109, the
current GitHub Actions step "Seed e2e dbt project" uses the step-level
condition if: inputs.generate-data, which skips the entire step (including
`dbt seed -f --target snowflake`) when generate-data is false; change the step
so `dbt seed` always runs but `python generate_data.py` runs only when
inputs.generate-data is true: either split into two steps (one conditional
step that runs `python generate_data.py` with if: inputs.generate-data, and an
unconditional step that runs `dbt seed -f --target snowflake`), or remove the
step-level if and make the run script conditional for the generator (e.g. a
shell if around `python generate_data.py`) while leaving `dbt seed`
unconditional.

- name: Run e2e dbt project
working-directory: ${{ env.E2E_DBT_PROJECT_DIR }}
run: |
dbt run --target snowflake || true

# Validate run_results.json: only error_model should be non-success
if jq -e '
[.results[] | select(.status != "success") | .unique_id]
| length == 1 and .[0] == "model.elementary_integration_tests.error_model"
' target/run_results.json > /dev/null; then
echo "Validation passed: only error_model failed."
else
echo "Validation failed. Unexpected failures:"
jq '[.results[] | select(.status != "success") | .unique_id] | join(", ")' target/run_results.json
exit 1
fi
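The run_results.json validation above can be tried against a hand-written sample file. A minimal sketch, assuming `jq` is available; the file contents below are made up for illustration:

```shell
# Exercise the jq validation against a made-up run_results.json:
# one passing model and the expected error_model failure.
cat > /tmp/run_results.json <<'EOF'
{"results": [
  {"unique_id": "model.elementary_integration_tests.ok_model", "status": "success"},
  {"unique_id": "model.elementary_integration_tests.error_model", "status": "error"}
]}
EOF
if jq -e '
  [.results[] | select(.status != "success") | .unique_id]
  | length == 1 and .[0] == "model.elementary_integration_tests.error_model"
' /tmp/run_results.json > /dev/null; then
  RESULT="only error_model failed"
else
  RESULT="unexpected failures"
fi
echo "$RESULT"
```

Any second non-success result, or a success from error_model itself, makes the filter return false, so `jq -e` exits non-zero and the workflow step fails.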

- name: Test e2e dbt project
working-directory: ${{ env.E2E_DBT_PROJECT_DIR }}
continue-on-error: true
run: |
dbt test --target snowflake

- name: Read generated profiles
id: profiles
run: |
# Mask credentials so they don't appear in logs
while IFS= read -r line; do
echo "::add-mask::$line"
done < ~/.dbt/profiles.yml
{
echo "profiles_yml<<EOFPROFILES"
cat ~/.dbt/profiles.yml
echo "EOFPROFILES"
} >> "$GITHUB_OUTPUT"
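As a standalone illustration, the per-line masking loop behaves like this against a made-up profiles file (the `::add-mask::` workflow commands only take effect inside a real Actions run):

```shell
# Sketch of the per-line masking loop from the step above, run
# against a fabricated profiles file instead of ~/.dbt/profiles.yml.
cat > /tmp/fake_profiles.yml <<'EOF'
elementary:
  outputs:
    snowflake:
      password: s3cr3t
EOF
MASKED=$(while IFS= read -r line; do
  echo "::add-mask::$line"
done < /tmp/fake_profiles.yml)
echo "$MASKED"
```

Masking line by line is what makes the subsequent multiline `$GITHUB_OUTPUT` value safe to pass to the action: each individual line of the profiles file is redacted wherever it later appears in the logs.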
coderabbitai[bot] marked this conversation as resolved.

- name: Run Elementary
uses: elementary-data/run-elementary-action@v1.8
uses: elementary-data/run-elementary-action@v1.13
with:
warehouse-type: snowflake
profiles-yml: ${{ secrets.TEST_GITHUB_ACTION_PROFILES_YML }}
edr-command:
edr monitor -t "snowflake" --slack-token "${{ secrets.CI_SLACK_TOKEN }}" --slack-channel-name data-ops
profile-target: snowflake
profiles-yml: ${{ steps.profiles.outputs.profiles_yml }}
edr-command: >
edr monitor
-t snowflake
--slack-token "${{ secrets.CI_SLACK_TOKEN }}"
--slack-channel-name data-ops
&&
edr report
edr monitor report
-t snowflake
--file-path "report.html"
coderabbitai[bot] marked this conversation as resolved.

- name: Upload report
@@ -102,17 +171,9 @@ jobs:
name: edr.log
path: edr_target/edr.log

notify_failures:
name: Notify Slack
needs: test
if: |
always() &&
! cancelled() &&
! contains(needs.test.result, 'success') &&
! contains(needs.test.result, 'cancelled') &&
${{ github.event_name == 'schedule' }}
uses: ./.github/workflows/notify_slack.yml
with:
result: "failure"
run_id: ${{ github.run_id }}
workflow_name: "Test Elementary GitHub action"
- name: Drop test schemas
if: always()
working-directory: ${{ env.E2E_DBT_PROJECT_DIR }}
continue-on-error: true
run: |
dbt run-operation elementary_integration_tests.drop_test_schemas --target snowflake