[SDCI-2079] Document automatic job retries (Preview) - hybrid approach #36147
base: master
Changes from all commits
54f028f
a02aa0f
eafc81a
7148ff0
@@ -0,0 +1,84 @@
---
title: Automatic Job Retries
further_reading:
- link: "/continuous_integration/pipelines"
  tag: "Documentation"
  text: "Explore Pipeline Execution Results and Performance"
- link: "/continuous_integration/pipelines/github/"
  tag: "Documentation"
  text: "Set up CI Visibility for GitHub Actions"
- link: "/continuous_integration/pipelines/gitlab/"
  tag: "Documentation"
  text: "Set up CI Visibility for GitLab"
- link: "/continuous_integration/troubleshooting/"
  tag: "Documentation"
  text: "Troubleshooting CI Visibility"
---

<div class="alert alert-info">Automatic job retries are in Preview. To request access, contact your Datadog account team.</div>

## Overview

Automatic job retries save developer time by re-running failures that are likely transient, such as network timeouts, infrastructure failures, or flaky tests. Genuine code defects are not retried. Datadog runs each failed job through an AI-powered error classifier. When the failure is identified as retriable, Datadog triggers a retry through the CI provider's API without manual intervention.

Automatic retries reduce the number of pipelines that developers re-run by hand, shorten feedback loops, and keep pipeline success metrics focused on non-transient failures.
## How it works

1. A CI job fails in your pipeline.
2. Datadog's AI error classifier inspects the job's logs and error context to determine whether the failure is transient.
3. If the failure is classified as retriable, Datadog requests a retry through the provider's API.
4. Datadog retries each job a limited number of times (during the Preview, at most once per job) to prevent infinite retry loops.
5. Datadog records the retry outcome on the original pipeline in CI Visibility.
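The decision in steps 2–4 can be sketched as a small function. This is an illustrative assumption of the flow described above, not Datadog's implementation; the classifier labels, function name, and single-retry cap are all placeholders.

```python
# Illustrative sketch of the retry decision flow described above.
# Classifier labels ("transient", "compilation_error") and the
# max_attempts cap are assumptions, not Datadog's actual implementation.

def should_retry(classification: str, attempts_so_far: int, max_attempts: int = 1) -> bool:
    """Retry only failures classified as transient, capped to avoid retry loops."""
    return classification == "transient" and attempts_so_far < max_attempts

print(should_retry("compilation_error", 0))  # False: genuine defect, never retried
print(should_retry("transient", 0))          # True: transient failure, first attempt
print(should_retry("transient", 1))          # False: retry cap already reached
```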
## Requirements

- CI Visibility enabled for your [GitHub Actions][1] or [GitLab][2] integration.
- [Datadog Source Code Integration][3] configured for the repositories where you want automatic retries.
- Indexed CI job logs for those repositories (see [Collect job logs for GitHub Actions][4] or [Collect job logs for GitLab][5]).
- Automatic job retries enabled for your organization (see the banner above for how to request access).

Automatic retries rely on the same AI error classifier used by [CI jobs failure analysis][6], which reads indexed job logs to decide whether a failure is transient.
## Provider-specific behavior

{{< tabs >}}
{{% tab "GitLab" %}}

Datadog performs **smart retries** on GitLab: only the specific job classified as retriable is re-run. Other failed jobs (those not classified as retriable) and passing jobs are not affected.

- Retries are triggered per job, as soon as the job fails.
- Smart retries work with GitLab.com (SaaS) and self-hosted GitLab instances reachable by the Datadog Source Code Integration.
- There is no additional CI cost beyond the retried job.
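A per-job retry on GitLab maps to the GitLab REST endpoint `POST /projects/:id/jobs/:job_id/retry`, which re-runs a single job without touching the rest of the pipeline. A minimal sketch of building that request URL; the base URL, project ID, and job ID below are placeholder values:

```python
# Sketch: build the GitLab single-job retry request URL.
# GitLab REST API: POST /projects/:id/jobs/:job_id/retry
# The base URL and IDs used here are placeholder values.

def gitlab_retry_job_url(base_url: str, project_id: int, job_id: int) -> str:
    """Return the endpoint that retries exactly one job in a pipeline."""
    return f"{base_url}/api/v4/projects/{project_id}/jobs/{job_id}/retry"

print(gitlab_retry_job_url("https://gitlab.com", 278964, 1842))
# https://gitlab.com/api/v4/projects/278964/jobs/1842/retry
```

Because the endpoint takes a single job ID, a self-hosted instance works the same way as GitLab.com, provided the instance is reachable.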
{{% /tab %}}
{{% tab "GitHub Actions" %}}

GitHub Actions imposes two provider-level limitations that shape how retries work:

- **Retries happen after the workflow finishes.** The GitHub API does not allow retrying an individual job while the rest of the workflow is still running. Datadog waits for the workflow to reach a final state before issuing retries.
- **All failed jobs are retried together.** The GitHub API does not support retrying a single job when other jobs in the workflow have also failed. Datadog reruns every failed job in the workflow through a single GitHub API call. This may increase your GitHub Actions compute usage.
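The all-failed-jobs behavior corresponds to GitHub's "re-run failed jobs" REST endpoint, which targets the whole workflow run rather than one job. A minimal sketch of building that request URL; the owner, repository, and run ID below are placeholder values:

```python
# Sketch: build GitHub's "re-run failed jobs" request URL. The endpoint
# operates on a whole workflow run, so every failed job is retried together.
# GitHub REST API: POST /repos/{owner}/{repo}/actions/runs/{run_id}/rerun-failed-jobs
# The owner, repo, and run_id used here are placeholder values.

def github_rerun_failed_jobs_url(owner: str, repo: str, run_id: int) -> str:
    """Return the endpoint that re-runs all failed jobs in a workflow run."""
    return (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/runs/{run_id}/rerun-failed-jobs"
    )

print(github_rerun_failed_jobs_url("octocat", "hello-world", 42))
# https://api.github.com/repos/octocat/hello-world/actions/runs/42/rerun-failed-jobs
```

Because the unit of retry is the run, a single retriable failure pulls every other failed job in that run along with it, which is why compute usage can increase.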
### Protected branches

The Datadog GitHub App's default permissions do not allow retries on protected branches. To enable automatic retries on a protected branch (for example, your default branch), grant the app Maintainer-level access. Review your organization's policies before expanding permissions.
> **Contributor:** Question on a detail: is Maintainer-level access an org-wide or repo-wide setting?
>
> **Author:** This is per org, AFAIK.
{{% /tab %}}
{{< /tabs >}}

## Limitations

- Each job (identified by its job ID) is retried at most once.
> **Contributor:** It's a bit confusing to me what "logical job" stands for here; maybe you mean something else? (A suggested change was attached.)
>
> **Author:** This means the same job ID. Maybe we should mention that, or remove this line altogether to avoid confusion. WDYT?
>
> **Contributor:** Yeah, we can add some explanation of the "logical job" concept if we want to keep it, but it's not clear to me why we need to specify retries by job ID; it's a very technical concept. Maybe we should instead explain at the beginning of the doc how the maximum number of retries is counted, so readers know what to consider for that setting, like "retries are counted for the pipelines/jobs executed in the same commit."
- Jobs classified as non-retriable (for example, compilation errors or asserted test failures) are never retried.
- If a job has already been retried manually or by provider-native retry rules, Datadog does not issue an additional retry.

## Further reading

{{< partial name="whats-next/whats-next.html" >}}
[1]: /continuous_integration/pipelines/github/
[2]: /continuous_integration/pipelines/gitlab/
[3]: /integrations/guide/source-code-integration/
[4]: /continuous_integration/pipelines/github/#collect-job-logs
[5]: /continuous_integration/pipelines/gitlab/#collect-job-logs
[6]: /continuous_integration/guides/use_ci_jobs_failure_analysis/
> **Contributor:** Why do we say "up to a max number" here, but later specify "at most once"?
>
> **Author:** At most once for the same job ID, but the same job fingerprint can be retried up to N times. We don't have the settings yet; once we have them, we should update this doc.