| 1 | +--- |
| 2 | +status: proposed |
| 3 | +title: Tekton Results Third Party Logging Integration |
| 4 | +creation-date: '2023-11-30' |
| 5 | +last-updated: '2024-04-19' |
| 6 | +authors: |
| 7 | +- '@khrm' |
| 8 | +collaborators: [] |
| 9 | +--- |
| 10 | + |
| 11 | +# TEP-0159: Results Third Party Logging Integration |
| 12 | + |
| 13 | +<!-- toc --> |
| 14 | +- [Summary](#summary) |
| 15 | +- [Motivation](#motivation) |
| 16 | + - [Goals](#goals) |
| 17 | + - [Non-Goals](#non-goals) |
| 18 | + - [Use Cases](#use-cases) |
| 19 | + - [Requirements](#requirements) |
| 20 | +- [Proposal](#proposal) |
| 21 | + - [Notes and Caveats](#notes-and-caveats) |
| 22 | +- [Design Details](#design-details) |
| 23 | +- [Design Evaluation](#design-evaluation) |
| 24 | + - [Reusability](#reusability) |
| 25 | + - [Simplicity](#simplicity) |
| 26 | + - [Flexibility](#flexibility) |
| 27 | + - [User Experience](#user-experience) |
| 28 | + - [Performance](#performance) |
| 29 | + - [Risks and Mitigations](#risks-and-mitigations) |
| 30 | + - [Drawbacks](#drawbacks) |
| 31 | +- [Alternatives](#alternatives) |
| 32 | +- [Implementation Plan](#implementation-plan) |
| 33 | + - [Test Plan](#test-plan) |
| 34 | + - [Infrastructure Needed](#infrastructure-needed) |
| 35 | + - [Upgrade and Migration Strategy](#upgrade-and-migration-strategy) |
| 36 | + - [Implementation Pull Requests](#implementation-pull-requests) |
| 37 | +- [References](#references) |
| 38 | +<!-- /toc --> |
| 39 | + |
| 40 | +# TEP-0159: Tekton Results: Integration with Third Party Logging APIs |
| 41 | + |
| 42 | +## Summary |
| 43 | + |
| 44 | +This TEP proposes integrating Tekton Results with third-party logging APIs. This lets users query their logs through the Results API server, more efficiently and cost-effectively, from a third-party logging provider such as Loki, Google Cloud Logging, or Splunk, to which the logs were forwarded by a forwarding system such as Vector or Fluentd. |
| 45 | + |
| 46 | +## Motivation |
| 47 | + |
| 48 | +The current implementation of Tekton Results focuses on storing and retrieving logs in a JSON format. The Results API server stores the logs for `pipelinerun` and `taskrun` resources, using client-go via the tkn CLI library, in a storage backend such as GCS, S3, or a PVC. |
| 49 | +We have found that storing logs this way is inefficient and does not scale. gRPC does not scale along with the kube API server, and the approach puts pressure on both etcd and the kube API server. |
| 50 | + |
| 51 | +### Goals |
| 52 | + |
| 53 | +1. Add a new API endpoint in the Results API server to query logs from a third-party logging provider. |
| 54 | +2. Use gRPC to implement the above API endpoint. |
| 55 | +3. Add a proxy REST API in the Results API server that queries logs from a third-party logging provider without going through gRPC. |
| 56 | + |
| 57 | +### Non-Goals |
| 58 | + |
| 59 | +1. This TEP does not aim to support every third-party logging provider. It aims to support a subset of popular logging providers that are compatible with forwarding systems such as Vector and Fluentd. |
| 60 | + |
| 61 | +### Use Cases |
| 62 | + |
| 63 | +1. As a Tekton user, I want to query my logs through the Results API server, more efficiently and cost-effectively, from a third-party logging provider such as Loki, Google Cloud Logging, or Splunk, to which they were forwarded by a forwarding system such as Vector or Fluentd. |
| 64 | +2. As a Tekton user, I want to store my logs in a third-party logging provider such as Loki, Google Cloud Logging, or Splunk, with a forwarding system such as Vector or Fluentd forwarding them there. |
| 65 | + |
| 66 | +## Proposal |
| 67 | + |
| 68 | +We will add a new API endpoint in the Results API server to query logs from a third-party logging provider, and another API endpoint to store logs in a third-party logging provider. Both endpoints will be implemented with gRPC. |
| 69 | +The Tekton Pipeline Controller should store the PipelineRun and TaskRun UIDs in the labels of the PipelineRun and TaskRun resources. This will enable us to query the logs for a PipelineRun or TaskRun from the third-party logging provider. |
| 70 | + |
| 71 | + |
| 72 | +## Design Details |
| 73 | + |
| 74 | +### Query API |
| 75 | + |
| 76 | +We will add a new API endpoint under v1alpha3 for GetLOG in the Results API server to query logs from a third-party logging provider, and another API endpoint to store logs in a third-party logging provider. Both endpoints will be implemented with gRPC. |
| 77 | + |
| 78 | +This API will take the following configuration options: |
| 79 | +1. LOGGING_PLUGIN_API_URL - URL of the third-party API. |
| 80 | +2. LOGGING_PLUGIN_NAMESPACE_KEY - Key of the namespace label that the forwarder and the third-party storage use to identify the namespace. |
| 81 | +3. LOGGING_PLUGIN_STATIC_LABELS - Static labels are keys added while forwarding logs. They can be used for filtering, for example to determine whether logs come from the Tekton controllers or which cluster they come from. |
| 82 | +4. LOGGING_PLUGIN_TOKEN_PATH - Path of the JWT token used for the Authorization header. |
| 83 | +5. LOGGING_PLUGIN_PROXY_PATH - If the third-party API sits behind a proxy for authorization, this is used to determine the path. |
| 84 | +6. LOGGING_PLUGIN_CA_CERT - CA certificate used for TLS by the third-party API. It is not needed if the certificate is issued by a public CA that is trusted by default. |
| 85 | +7. LOGGING_PLUGIN_TLS_VERIFICATION_DISABLE - Whether to disable TLS verification. |
| 86 | + |
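The options above can be pictured as a small typed configuration object. The following is only a sketch, not the Results implementation: it assumes the options arrive as environment-style key/value pairs, that `LOGGING_PLUGIN_STATIC_LABELS` is encoded as comma-separated `key=value` pairs, and that the boolean option is spelled `true`. None of these encodings is fixed by this TEP, and `LoggingPluginConfig`, `parse_static_labels`, and `load_config` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Mapping, Optional


@dataclass
class LoggingPluginConfig:
    """Hypothetical container for the LOGGING_PLUGIN_* options."""
    api_url: str
    namespace_key: str
    static_labels: Dict[str, str] = field(default_factory=dict)
    token_path: Optional[str] = None
    proxy_path: str = ""
    # Raw value; whether this is a file path or PEM text is not specified.
    ca_cert: Optional[str] = None
    tls_verification_disable: bool = False


def parse_static_labels(raw: str) -> Dict[str, str]:
    """Parse 'k1=v1,k2=v2' (an assumed encoding) into a dict."""
    labels: Dict[str, str] = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"malformed static label: {pair!r}")
        labels[key.strip()] = value.strip()
    return labels


def load_config(env: Mapping[str, str]) -> LoggingPluginConfig:
    """Build the config from a mapping such as os.environ."""
    api_url = env.get("LOGGING_PLUGIN_API_URL", "")
    if not api_url:
        raise ValueError("LOGGING_PLUGIN_API_URL is required")
    disable = env.get("LOGGING_PLUGIN_TLS_VERIFICATION_DISABLE", "")
    return LoggingPluginConfig(
        api_url=api_url.rstrip("/"),
        namespace_key=env.get("LOGGING_PLUGIN_NAMESPACE_KEY", ""),
        static_labels=parse_static_labels(
            env.get("LOGGING_PLUGIN_STATIC_LABELS", "")
        ),
        token_path=env.get("LOGGING_PLUGIN_TOKEN_PATH") or None,
        proxy_path=env.get("LOGGING_PLUGIN_PROXY_PATH", ""),
        ca_cert=env.get("LOGGING_PLUGIN_CA_CERT") or None,
        tls_verification_disable=disable.strip().lower() == "true",
    )
```

Calling `load_config(os.environ)` at startup would fail fast when the API URL is missing, which is the only option treated as mandatory in this sketch.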
| 87 | +### Proxy API |
| 88 | + |
| 89 | +A proxy REST API will be added that generates the query on the fly and talks directly to the third-party backend system. |
| 90 | + |
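To make the proxy idea concrete, here is a rough sketch of how a request to the backend could be assembled from the configuration values. It assumes a Loki backend and uses Loki's `/loki/api/v1/query_range` HTTP endpoint with its `query` parameter; other providers would need a different path and parameters. The placement of the proxy path before the backend path is also an assumption, and `build_log_request` and `build_tls_context` are hypothetical helpers, not part of Results.

```python
import ssl
import urllib.parse
import urllib.request
from typing import Optional


def build_log_request(
    api_url: str,
    proxy_path: str,
    query: str,
    token: Optional[str] = None,
) -> urllib.request.Request:
    """Assemble (but do not send) a GET request for a Loki-style backend."""
    base = api_url.rstrip("/")
    trimmed = proxy_path.strip("/")
    prefix = f"/{trimmed}" if trimmed else ""
    params = urllib.parse.urlencode({"query": query})
    # Assumption: the proxy path is prepended to the Loki query endpoint.
    url = f"{base}{prefix}/loki/api/v1/query_range?{params}"
    request = urllib.request.Request(url, method="GET")
    if token:
        # Token read from LOGGING_PLUGIN_TOKEN_PATH by the caller.
        request.add_header("Authorization", f"Bearer {token}")
    return request


def build_tls_context(
    ca_cert_path: Optional[str] = None,
    verification_disabled: bool = False,
) -> ssl.SSLContext:
    """TLS settings: optional custom CA file, optional disabled verification."""
    context = ssl.create_default_context(cafile=ca_cert_path)
    if verification_disabled:
        # check_hostname must be turned off before verify_mode is relaxed.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

The request would then be sent with `urllib.request.urlopen(request, context=build_tls_context(...))`, and the response body streamed back to the caller.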
| 91 | +### Annotations of Logs |
| 92 | +The forwarder should attach the TaskRun UID and PipelineRun UID labels, which are added by the Tekton Pipelines controller, to the logs it sends to the third-party storage backend. |
| 93 | +These labels will be used to construct the query on the Results side. |
| 94 | + |
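As an illustration of how the forwarded labels could drive query construction, the sketch below builds a Loki-style stream selector from the namespace key, the static labels, and a run UID. The label key names in the usage example are placeholders: the actual keys depend on how the forwarder names labels in the backend, and `build_selector` is a hypothetical helper.

```python
from typing import Mapping


def escape_label_value(value: str) -> str:
    """Escape backslashes and double quotes for a quoted matcher value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_selector(
    namespace_key: str,
    namespace: str,
    static_labels: Mapping[str, str],
    uid_key: str,
    uid: str,
) -> str:
    """Combine static labels, namespace, and run UID into one selector."""
    labels = dict(static_labels)
    labels[namespace_key] = namespace
    labels[uid_key] = uid
    # Sort keys so the same inputs always yield the same query string.
    matchers = ", ".join(
        f'{key}="{escape_label_value(value)}"'
        for key, value in sorted(labels.items())
    )
    return "{" + matchers + "}"


# Placeholder label keys; real keys depend on the forwarder configuration.
selector = build_selector(
    namespace_key="kubernetes_namespace_name",
    namespace="ci",
    static_labels={"log_type": "application"},
    uid_key="kubernetes_labels_tekton_dev_taskRunUID",
    uid="1234-abcd",
)
```

Because every matcher is an equality on a label, the backend can select the matching log streams without Results having to scan or store the log content itself.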
| 95 | +### Existing Behavior |
| 96 | + |
| 97 | +The existing behavior will be kept for some future releases and then removed. |
| 98 | + |
| 99 | +## Performance |
| 100 | + |
| 101 | +This improves the performance of the Results Watcher, as it no longer needs to stream logs. Storage by forwarders is also efficient, as they read pod logs directly from the node. |
| 102 | + |
| 103 | +## Implementation Plan |
| 104 | + |
| 105 | + |
| 106 | +### Pull Requests |
| 107 | + |
| 108 | +https://github.com/tektoncd/results/pull/782 |