21 changes: 21 additions & 0 deletions README.md
@@ -11,6 +11,24 @@ End-to-end tests for Deckhouse storage components.
5. Write your test in `tests/<your-test-name>/<your-test-name>_test.go` (Section marked `---=== TESTS START HERE ===---`)
6. Run the test: `go test -timeout=240m -v ./tests/<your-test-name> -count=1`

### Run using an existing cluster (no VM creation)

Use this mode to run tests against a cluster that is already running (faster iterations, no virtualization/VM setup).

1. Set the cluster creation mode to use an existing cluster:
```bash
export TEST_CLUSTER_CREATE_MODE=alwaysUseExisting
```
2. Point SSH to the **test cluster** (the Kubernetes API master you want to run tests on):
- **Direct access:** `SSH_HOST` = IP/hostname of the cluster master, `SSH_USER` = user that can run `sudo cat /etc/kubernetes/admin.conf` on that host.
- **Via jump host:** set `SSH_JUMP_HOST` and, optionally, `SSH_JUMP_USER` and `SSH_JUMP_KEY_PATH`; `SSH_HOST`/`SSH_USER` still refer to the target cluster master.
3. Source the rest of your test env (e.g. `source tests/<your-test-name>/test_exports`), then run:
```bash
go test -timeout=240m -v ./tests/<your-test-name> -count=1
```

The kubeconfig is written to `temp/<test-name>/` (e.g. `temp/sds_node_configurator_test/kubeconfig-<master-ip>.yml`). The framework acquires a cluster lock so that only one test run uses the cluster at a time. If a previous run left a stale lock behind (crash, Ctrl+C), set `TEST_CLUSTER_FORCE_LOCK_RELEASE=true` for the next run. Do not set it if another test might still be using the cluster.

The `-count=1` flag prevents Go from using cached test results.
The `240m` timeout is a global timeout for the entire testkit. Adjust it to your needs.
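
The environment setup from steps 1 and 2 of this section can be checked before a long test run starts. The following is a minimal sketch, not framework code: `missingExistingClusterVars` is a hypothetical helper, and it assumes that both `SSH_HOST` and `SSH_USER` must be set explicitly (the framework may apply defaults that this sketch does not know about).

```go
package main

import (
	"fmt"
	"os"
)

// missingExistingClusterVars is a hypothetical preflight helper (not part of the
// framework). It returns one message per setting that the existing-cluster mode
// needs but that lookup does not provide.
func missingExistingClusterVars(lookup func(string) string) []string {
	var problems []string
	if mode := lookup("TEST_CLUSTER_CREATE_MODE"); mode != "alwaysUseExisting" {
		problems = append(problems,
			fmt.Sprintf("TEST_CLUSTER_CREATE_MODE=%q, want %q", mode, "alwaysUseExisting"))
	}
	// Assumption: both variables are mandatory in this mode.
	for _, name := range []string{"SSH_HOST", "SSH_USER"} {
		if lookup(name) == "" {
			problems = append(problems, name+" is not set")
		}
	}
	return problems
}

func main() {
	for _, p := range missingExistingClusterVars(os.Getenv) {
		fmt.Fprintln(os.Stderr, "preflight:", p)
	}
}
```

Passing the lookup function as a parameter keeps the check independent of the process environment, so it can be exercised with a plain map.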

@@ -71,6 +89,7 @@ See [pkg/FUNCTIONS_GLOSSARY.md](pkg/FUNCTIONS_GLOSSARY.md) for a full list of al
- `SSH_PUBLIC_KEY` -- Path to SSH public key file, or plain-text key content. Default: `~/.ssh/id_rsa.pub`
- `SSH_PASSPHRASE` -- Passphrase for the SSH private key. Required for non-interactive mode with encrypted keys
- `SSH_VM_USER` -- SSH user for connecting to VMs deployed inside the test cluster. Default: `cloud`
- `SSH_VM_PASSWORD` -- Password for SSH connections to VMs (e.g. `cloud`), used when connecting from the jump host for `lsblk` checks. If set, `sshpass` is used; leave empty for key-based auth. Required when VMs accept only password auth.
- `SSH_JUMP_HOST` -- Jump host address for connecting to clusters behind a bastion
- `SSH_JUMP_USER` -- Jump host SSH user. Defaults to `SSH_USER` if jump host is set
- `SSH_JUMP_KEY_PATH` -- Jump host SSH key path. Defaults to `SSH_PRIVATE_KEY` if jump host is set
@@ -79,8 +98,10 @@ See [pkg/FUNCTIONS_GLOSSARY.md](pkg/FUNCTIONS_GLOSSARY.md) for a full list of al

- `YAML_CONFIG_FILENAME` -- Filename of the cluster definition YAML. Default: `cluster_config.yml`
- `TEST_CLUSTER_CLEANUP` -- Set to `true` to remove the test cluster after tests complete. Default: `false`
- `TEST_CLUSTER_RESUME` -- Set to `true` to continue from a previous failed run (only for `alwaysCreateNew`). If the test failed in the middle of cluster creation, re-run with `TEST_CLUSTER_RESUME=true`; the framework will load saved state from `temp/<test-name>/cluster-state.json` (written after step 6), restore VM hostnames, and run the remaining steps (connect to first master, add nodes, enable modules). Requires that step 6 (VMs created, VM info gathered) completed before the failure.
- `TEST_CLUSTER_NAMESPACE` -- Namespace for DKP cluster deployment. Default: `e2e-test-cluster`
- `KUBE_CONFIG_PATH` -- Path to a kubeconfig file. Used as fallback if SSH-based kubeconfig retrieval fails
- `KUBE_INSECURE_SKIP_TLS_VERIFY` -- Set to `true` to skip TLS certificate verification for the Kubernetes API (e.g. self-signed certs or tunnel to 127.0.0.1). Default: not set (verify TLS)
- `IMAGE_PULL_POLICY` -- Image pull policy for ClusterVirtualImages: `Always` or `IfNotExists`. Default: `IfNotExists`
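
Before setting `TEST_CLUSTER_RESUME=true`, it can help to confirm that the saved state file exists. The sketch below is illustrative only: `resumeStatePath` is a hypothetical helper, and it assumes that the presence of `temp/<test-name>/cluster-state.json` under the repository root is enough to indicate that step 6 completed. The framework's own check may differ.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resumeStatePath is a hypothetical helper (not part of the framework). It builds
// <repoRoot>/temp/<testName>/cluster-state.json and reports whether a regular
// file exists at that location.
func resumeStatePath(repoRoot, testName string) (string, bool) {
	p := filepath.Join(repoRoot, "temp", testName, "cluster-state.json")
	info, err := os.Stat(p)
	return p, err == nil && !info.IsDir()
}

func main() {
	// The test name is the example used in the README text.
	p, ok := resumeStatePath(".", "sds_node_configurator_test")
	fmt.Printf("state file %s present: %v\n", p, ok)
}
```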

### Logging
36 changes: 19 additions & 17 deletions internal/cluster/cluster.go
@@ -183,7 +183,8 @@ func expandPath(path string) (string, error) {
// and returns a rest.Config that can be used with Kubernetes clients, along with the path to the kubeconfig file.
// If sshClient is provided, it will be used instead of creating a new connection.
// If sshClient is nil, a new connection will be created and closed automatically.
func GetKubeconfig(ctx context.Context, masterIP, user, keyPath string, sshClient ssh.SSHClient) (*rest.Config, string, error) {
// If kubeconfigOutputDir is non-empty, the kubeconfig file is written there; otherwise temp/<caller-file-name>/ is used.
func GetKubeconfig(ctx context.Context, masterIP, user, keyPath string, sshClient ssh.SSHClient, kubeconfigOutputDir string) (*rest.Config, string, error) {
// Create SSH client if not provided
shouldClose := false
if sshClient == nil {
@@ -198,23 +199,24 @@ func GetKubeconfig(ctx context.Context, masterIP, user, keyPath string, sshClien
defer sshClient.Close()
}

// Get the test file name from the caller
_, callerFile, _, ok := runtime.Caller(1)
if !ok {
return nil, "", fmt.Errorf("failed to get caller file information")
}
testFileName := strings.TrimSuffix(filepath.Base(callerFile), filepath.Ext(callerFile))

// Determine the temp directory path in the repo root
// callerFile is in tests/{test-dir}/, so we go up two levels to reach repo root
callerDir := filepath.Dir(callerFile)
repoRootPath := filepath.Join(callerDir, "..", "..")
// Resolve the .. parts to get absolute path
repoRoot, err := filepath.Abs(repoRootPath)
if err != nil {
return nil, "", fmt.Errorf("failed to resolve repo root path: %w", err)
var tempDir string
if kubeconfigOutputDir != "" {
tempDir = kubeconfigOutputDir
} else {
// Get the test file name from the caller (creates temp/cluster when called from pkg/cluster)
_, callerFile, _, ok := runtime.Caller(1)
if !ok {
return nil, "", fmt.Errorf("failed to get caller file information")
}
testFileName := strings.TrimSuffix(filepath.Base(callerFile), filepath.Ext(callerFile))
callerDir := filepath.Dir(callerFile)
repoRootPath := filepath.Join(callerDir, "..", "..")
repoRoot, err := filepath.Abs(repoRootPath)
if err != nil {
return nil, "", fmt.Errorf("failed to resolve repo root path: %w", err)
}
tempDir = filepath.Join(repoRoot, "temp", testFileName)
}
tempDir := filepath.Join(repoRoot, "temp", testFileName)

// Create temp directory if it doesn't exist
if err := os.MkdirAll(tempDir, 0755); err != nil {
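
The directory selection introduced in this hunk can be read in isolation. The following is a sketch under stated assumptions, not the code from the change: `resolveKubeconfigDir` is a hypothetical stand-alone function that mirrors the branch above, with an explicit output directory taking precedence over the caller-derived `temp/<caller-file-name>/` path.

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
	"strings"
)

// resolveKubeconfigDir is a hypothetical extraction of the branch above.
// A non-empty outputDir wins; otherwise the directory is derived from the
// caller's source file as <two levels above the caller>/temp/<caller-file-name>.
func resolveKubeconfigDir(outputDir string) (string, error) {
	if outputDir != "" {
		return outputDir, nil
	}
	// Caller(1) is the function that called resolveKubeconfigDir.
	_, callerFile, _, ok := runtime.Caller(1)
	if !ok {
		return "", fmt.Errorf("failed to get caller file information")
	}
	name := strings.TrimSuffix(filepath.Base(callerFile), filepath.Ext(callerFile))
	repoRoot, err := filepath.Abs(filepath.Join(filepath.Dir(callerFile), "..", ".."))
	if err != nil {
		return "", fmt.Errorf("failed to resolve repo root path: %w", err)
	}
	return filepath.Join(repoRoot, "temp", name), nil
}

func main() {
	explicit, _ := resolveKubeconfigDir("/tmp/kubeconfigs")
	derived, err := resolveKubeconfigDir("")
	fmt.Println(explicit, derived, err)
}
```

One design point to keep in mind: `runtime.Caller` counts stack frames, so moving this logic into a helper that is itself called from `GetKubeconfig` would require a skip value of 2 to keep resolving the original test file.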
2 changes: 1 addition & 1 deletion internal/config/config.go
@@ -53,7 +53,7 @@ const (
// Kubernetes operations
ModuleCheckTimeout = 10 * time.Second // Timeout for checking module status
NamespaceTimeout = 30 * time.Second // Timeout for creating namespace
NodeGroupTimeout = 3 * time.Second // Timeout for creating NodeGroup
NodeGroupTimeout = 2 * time.Minute // Timeout for creating NodeGroup (API can be slow right after bootstrap)
SecretsWaitTimeout = 2 * time.Minute // Timeout for waiting for bootstrap secrets to appear
ClusterHealthTimeout = 15 * time.Minute // Timeout for cluster health check
ModuleDeployTimeout = 15 * time.Minute // Timeout for waiting for ONE module to be ready
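
A timeout constant like `NodeGroupTimeout` is typically applied through a context deadline. The sketch below assumes that usage pattern, which is not shown in this diff: `withNodeGroupTimeout`, `checkDeadline`, and `run` are hypothetical names, and `checkDeadline` stands in for the real NodeGroup creation call.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// NodeGroupTimeout mirrors the new value from the diff above.
const NodeGroupTimeout = 2 * time.Minute

// withNodeGroupTimeout is a hypothetical wrapper: it runs create under a context
// that is cancelled once NodeGroupTimeout has elapsed.
func withNodeGroupTimeout(parent context.Context, create func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(parent, NodeGroupTimeout)
	defer cancel()
	return create(ctx)
}

// checkDeadline stands in for the real creation call. It only verifies that a
// deadline no later than NodeGroupTimeout from now is attached to ctx.
func checkDeadline(ctx context.Context) error {
	deadline, ok := ctx.Deadline()
	if !ok {
		return errors.New("context has no deadline")
	}
	if time.Until(deadline) > NodeGroupTimeout {
		return errors.New("deadline is later than NodeGroupTimeout")
	}
	return nil
}

// run wires the two pieces together with a background parent context.
func run() error {
	return withNodeGroupTimeout(context.Background(), checkDeadline)
}

func main() {
	fmt.Println("error:", run())
}
```

With the previous value of 3 seconds, any creation call slower than that would have been cancelled by the same mechanism.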
16 changes: 16 additions & 0 deletions internal/config/env.go
@@ -69,12 +69,18 @@ var (
// SSH credentials to deploy to VM
VMSSHUser = os.Getenv("SSH_VM_USER")
VMSSHUserDefaultValue = "cloud"
// VMSSHPassword when set is used to SSH from jump host to VMs (cloud@vmIP) via sshpass. Leave empty for key-based auth.
VMSSHPassword = os.Getenv("SSH_VM_PASSWORD")

// KubeConfigPath is the path to a kubeconfig file. If SSH retrieval fails (e.g., sudo requires password),
// this path will be used as a fallback. If not set and SSH fails, the user will be notified to download
	// the kubeconfig manually and set this environment variable, and the test will fail.
KubeConfigPath = os.Getenv("KUBE_CONFIG_PATH")

// KubeInsecureSkipTLSVerify when set to "true" disables TLS certificate verification for the Kubernetes API
// (e.g. when using self-signed certs or connecting via a tunnel to 127.0.0.1). Default: not set (TLS is verified).
KubeInsecureSkipTLSVerify = os.Getenv("KUBE_INSECURE_SKIP_TLS_VERIFY")

// TestClusterCreateMode specifies the cluster creation mode. Must be set to either "alwaysUseExisting" or "alwaysCreateNew". If not set, test will fail.
TestClusterCreateMode = os.Getenv("TEST_CLUSTER_CREATE_MODE")

@@ -87,6 +93,16 @@ var (
TestClusterNamespace = os.Getenv("TEST_CLUSTER_NAMESPACE")
TestClusterNamespaceDefaultValue = "e2e-test-cluster"

// TestClusterForceLockRelease when set to "true" or "True" (only for alwaysUseExisting) forces release of an
// existing cluster lock before acquiring. Use when a previous run left the lock (e.g. crash, Ctrl+C).
// Do not use if another test might be using the cluster.
TestClusterForceLockRelease = os.Getenv("TEST_CLUSTER_FORCE_LOCK_RELEASE")

// TestClusterResume when set to "true" or "True" (only for alwaysCreateNew) tries to continue from a previous
// failed run: if state was saved after step 6 (VMs created, IPs gathered), connects to the first master and
// runs remaining steps (add nodes, enable modules). Set to "true" and re-run the test after a mid-deploy failure.
TestClusterResume = os.Getenv("TEST_CLUSTER_RESUME")

// TestClusterStorageClass specifies the storage class for DKP cluster deployment
TestClusterStorageClass = os.Getenv("TEST_CLUSTER_STORAGE_CLASS")
//TestClusterStorageClassDefaultValue = "rsc-test-r2-local"
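
The comments above tie each new flag to one creation mode. A minimal sketch of that gating follows. `isTrue` and `modeFlags` are hypothetical helpers, and treating only `"true"` and `"True"` as enabling values is an assumption taken from the comments, not from the framework's parsing code.

```go
package main

import (
	"fmt"
	"os"
)

// isTrue reports whether an environment value enables a flag.
// Assumption: only "true" and "True" count, as the comments above suggest.
func isTrue(v string) bool {
	return v == "true" || v == "True"
}

// modeFlags is a hypothetical helper. Force lock release only applies in
// alwaysUseExisting mode, and resume only applies in alwaysCreateNew mode.
func modeFlags(mode, forceLockRelease, resume string) (forceRelease, resumeRun bool) {
	forceRelease = mode == "alwaysUseExisting" && isTrue(forceLockRelease)
	resumeRun = mode == "alwaysCreateNew" && isTrue(resume)
	return forceRelease, resumeRun
}

func main() {
	force, resume := modeFlags(
		os.Getenv("TEST_CLUSTER_CREATE_MODE"),
		os.Getenv("TEST_CLUSTER_FORCE_LOCK_RELEASE"),
		os.Getenv("TEST_CLUSTER_RESUME"),
	)
	fmt.Printf("force lock release: %v, resume: %v\n", force, resume)
}
```

Because the two modes are mutually exclusive, at most one of the returned values can be true for any input.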