
HOSA fails to scrape its own metrics when identity is set in config.yaml #176

@ljuaneda

Description


Hi,

This is a follow-up to issue #150.

I'm using the OpenShift master-proxy certs from a secret to gather metrics from Jolokia endpoints.
This was working with commit 8600302 on OSCP v3.5.5.15.

I'm now trying the new hawkular/hawkular-openshift-agent Docker image, which pulled version 1.4.2,
but HOSA fails to scrape its own metrics on :8443:

I1025 08:06:11.320386       1 prometheus_metrics_collector.go:97] DEBUG: Told to collect all Prometheus metrics from [https://10.130.5.68:8443/metrics]
2017/10/25 08:06:11 http: TLS handshake error from 10.130.5.68:42456: read tcp 10.130.5.68:8443->10.130.5.68:42456: read: connection reset by peer
W1025 08:06:11.324820       1 metrics_collector_manager.go:186] Failed to collect metrics from [default|hawkular-openshift-agent-8ffjs|prometheus|https://10.130.5.68:8443/metrics] at [Wed, 25 Oct 2017 08:06:11 +0000]. err=Failed to collect Prometheus metrics from [https://10.130.5.68:8443/metrics]. err=Cannot scrape Prometheus URL [https://10.130.5.68:8443/metrics]: err=Get https://10.130.5.68:8443/metrics: x509: cannot validate certificate for 10.130.5.68 because it doesn't contain any IP SANs

My guess is that HOSA is not expecting unsecured connections; the endpoint itself answers fine when certificate verification is skipped (curl -k):

$ oc exec hawkular-openshift-agent-8ffjs -- curl -vks https://10.130.5.68:8443/metrics
* About to connect() to 10.130.5.68 port 8443 (#0)
*   Trying 10.130.5.68...
* Connected to 10.130.5.68 (10.130.5.68) port 8443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* skipping SSL peer certificate verification
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
*       subject: CN=system:master-proxy
*       start date: Feb 01 16:54:03 2017 GMT
*       expire date: Feb 01 16:54:04 2019 GMT
*       common name: system:master-proxy
*       issuer: CN=openshift-signer@1485968044
> GET /metrics HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 10.130.5.68:8443
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Length: 6308
< Content-Type: text/plain; version=0.0.4
< Date: Wed, 25 Oct 2017 08:15:28 GMT
<
{ [data not shown]
* Connection #0 to host 10.130.5.68 left intact
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0.00022610500000000002
go_gc_duration_seconds{quantile="0.25"} 0.00024556400000000004
go_gc_duration_seconds{quantile="0.5"} 0.000258036
go_gc_duration_seconds{quantile="0.75"} 0.000269366
go_gc_duration_seconds{quantile="1"} 0.000530231
go_gc_duration_seconds_sum 0.004801416
go_gc_duration_seconds_count 17
...

My current configuration:

$ cat hawkular-openshift-agent-configuration.cm-new.yaml
apiVersion: v1
kind: List
metadata: {}
items:
- apiVersion: v1
  kind: ConfigMap
  metadata:
    labels:
      metrics-infra: agent
    name: hawkular-openshift-agent-configuration
    namespace: default
  data:
    config.yaml: |
      kubernetes:
        tenant: ${POD:namespace_name}
      hawkular_server:
        url: https://hawkular-metrics.openshift-infra.svc.cluster.local
        credentials:
          username: secret:openshift-infra/hawkular-metrics-account/hawkular-metrics.username
          password: secret:openshift-infra/hawkular-metrics-account/hawkular-metrics.password
        ca_cert_file: secret:openshift-infra/hawkular-metrics-certificate/hawkular-metrics-ca.certificate
      emitter:
        status_enabled: true
        metrics_enabled: true
        health_enabled: true
      identity:
        cert_file: /master-proxy/master.proxy-client.crt
        private_key_file: /master-proxy/master.proxy-client.key
      collector:
        max_metrics_per_pod: 500
        minimum_collection_interval: 10s
        default_collection_interval: 30s
        metric_id_prefix: pod/${POD:uid}/custom/
        pod_label_tags_prefix: _empty_
        tags:
          metric_name: ${METRIC:name}
          description: ${METRIC:description}
          units: ${METRIC:units}
          namespace_id: ${POD:namespace_uid}
          namespace_name: ${POD:namespace_name}
          node_name: ${POD:node_name}
          pod_id: ${POD:uid}
          pod_ip: ${POD:ip}
          pod_name: ${POD:name}
          pod_namespace: ${POD:namespace_name}
          hostname: ${POD:hostname}
          host_ip: ${POD:host_ip}
          labels: ${POD:labels}
          cluster_name: ${POD:cluster_name}
          resource_version: ${POD:resource_version}
          type: pod
          collector: hawkular_openshift_agent
          custom_metric: true
    hawkular-openshift-agent: |
      endpoints:
      - type: prometheus
        protocol: "https"
        port: 8443
        path: /metrics
        collection_interval: 30s
- apiVersion: extensions/v1beta1
  kind: DaemonSet
  metadata:
    creationTimestamp: null
    labels:
      metrics-infra: agent
      name: hawkular-openshift-agent
    name: hawkular-openshift-agent
  spec:
    selector:
      matchLabels:
        name: hawkular-openshift-agent
    template:
      metadata:
        creationTimestamp: null
        labels:
          metrics-infra: agent
          name: hawkular-openshift-agent
      spec:
        containers:
        - command:
          - /opt/hawkular/hawkular-openshift-agent
          - -config
          - /hawkular-openshift-agent-configuration/config.yaml
          - -v
          - "4"
          env:
          - name: K8S_POD_NAMESPACE
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
          - name: K8S_POD_NAME
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.name
          - name: EMITTER_STATUS_CREDENTIALS_USERNAME
            valueFrom:
              secretKeyRef:
                key: username
                name: hawkular-openshift-agent-status
          - name: EMITTER_STATUS_CREDENTIALS_PASSWORD
            valueFrom:
              secretKeyRef:
                key: password
                name: hawkular-openshift-agent-status
          image: hawkular/hawkular-openshift-agent:1.4.2
          imagePullPolicy: Always
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /health
              port: 8443
              scheme: HTTPS
            initialDelaySeconds: 30
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 1
          name: hawkular-openshift-agent
          resources: {}
          terminationMessagePath: /dev/termination-log
          volumeMounts:
          - mountPath: /hawkular-openshift-agent-configuration
            name: hawkular-openshift-agent-configuration
          - mountPath: /master-proxy
            name: master-proxy
        dnsPolicy: ClusterFirst
        nodeSelector:
          hawkular-openshift-agent: "true"
        restartPolicy: Always
        securityContext: {}
        serviceAccount: hawkular-openshift-agent
        serviceAccountName: hawkular-openshift-agent
        terminationGracePeriodSeconds: 30
        volumes:
        - configMap:
            defaultMode: 420
            name: hawkular-openshift-agent-configuration
          name: hawkular-openshift-agent-configuration
        - configMap:
            defaultMode: 420
            name: hawkular-openshift-agent-configuration
          name: hawkular-openshift-agent
        - name: master-proxy
          secret:
            defaultMode: 420
            secretName: master-proxy
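For context on why this config triggers the error: the identity block makes HOSA serve the master-proxy client cert on :8443, and the agent then scrapes itself by pod IP. Verification would only pass if that (dynamic) pod IP appeared as an IP SAN in the cert, which a pre-issued secret cannot provide. A counterpart sketch to the failure case, again standard library only and with a hypothetical IP, showing that verification succeeds once the dialed IP is present as a SAN:

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// certWithIPSAN issues a throwaway self-signed certificate that
// carries the given IP address as a subject alternative name.
func certWithIPSAN(ip string) *x509.Certificate {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "system:master-proxy"},
		NotBefore:    time.Now(),
		NotAfter:     time.Now().Add(time.Hour),
		IPAddresses:  []net.IP{net.ParseIP(ip)}, // the scraped pod IP as an IP SAN
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		panic(err)
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		panic(err)
	}
	return cert
}

func main() {
	cert := certWithIPSAN("10.130.5.68")
	// Verification against the same IP now succeeds:
	fmt.Println(cert.VerifyHostname("10.130.5.68")) // <nil>
}
```

Since pod IPs change on every reschedule, per-pod certs like this are impractical for a DaemonSet, which is why some way for the agent to skip or relax validation of its own endpoint seems needed.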

Regards,

Ludovic
