Find solution for clusters with DNS when all pods are recreated. #2311

@johscheuer

What happened?

In our e2e test cases we observed issues when all pods of a cluster are recreated (and are therefore assigned new IP addresses). Triggering the issue probably doesn't require every pod to receive a new IP address, only the coordinator pods. The operator is able to recreate the pods, but it then gets stuck when connecting to the FDB cluster with the following error: FoundationDB error code 1512 (Unable to bind to network). We should debug this issue on the operator side and see if we can change the behaviour in the fdb bindings (probably the client).
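
To help narrow down whether a cluster is affected, here is a minimal Go sketch (not the operator's code) that checks which coordinators in a connection string are addressed by DNS name and which by IP. It assumes the usual `description:id@host:port[:tls],...` connection string format. The helper names `coordinatorHosts` and `usesDNS` and the example connection string are hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// coordinatorHosts extracts the host part of every coordinator from a
// connection string of the assumed form "description:id@host:port[:tls],...".
// This is an illustrative helper, not part of the operator or the fdb bindings.
func coordinatorHosts(connectionString string) ([]string, error) {
	_, addrs, found := strings.Cut(connectionString, "@")
	if !found {
		return nil, fmt.Errorf("missing '@' in connection string %q", connectionString)
	}

	var hosts []string
	for _, addr := range strings.Split(addrs, ",") {
		// Drop the optional ":tls" suffix so net.SplitHostPort sees host:port only.
		addr = strings.TrimSuffix(strings.TrimSpace(addr), ":tls")
		host, _, err := net.SplitHostPort(addr)
		if err != nil {
			return nil, fmt.Errorf("invalid coordinator %q: %w", addr, err)
		}
		hosts = append(hosts, host)
	}
	return hosts, nil
}

// usesDNS reports whether a coordinator host is a DNS name rather than an
// IP literal. IP-based entries go stale when the pod is recreated.
func usesDNS(host string) bool {
	return net.ParseIP(host) == nil
}

func main() {
	// Hypothetical connection string mixing a DNS name and a pod IP.
	hosts, err := coordinatorHosts("test:abc123@storage-1.fdb.svc.cluster.local:4501,10.1.2.3:4501:tls")
	if err != nil {
		panic(err)
	}
	for _, h := range hosts {
		fmt.Printf("%s dns=%t\n", h, usesDNS(h))
	}
}
```

This only classifies the entries. It does not resolve the names, so it says nothing about whether the client picks up the new IP addresses after the pods are recreated.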

What did you expect to happen?

The operator should be able to connect to the cluster and continue with the operations.

How can we reproduce it (as minimally and precisely as possible)?

Run the following e2e test case: https://github.com/FoundationDB/fdb-kubernetes-operator/blob/v2.8.0/e2e/test_operator/operator_test.go#L2037-L2109

Anything else we need to know?

It might be related to the FDB_NETWORK_OPTION_DISABLE_LOCAL_CLIENT network option that we recently added.

FDB Kubernetes operator

$ kubectl fdb version
# paste output here

Kubernetes version

$ kubectl version
# paste output here

Cloud provider

Metadata

Labels

bug (Something isn't working)
