# Google Batch: Report actual zone where tasks execute in trace records

## New feature

The Google Batch executor currently reports the configured **region** (e.g., `europe-west2`) in trace records rather than the actual **zone** where tasks execute (e.g., `europe-west2-a`, `europe-west2-b`, `europe-west2-c`). This prevents accurate cost estimation for spot instances, as spot pricing varies by zone rather than region.

**Current behavior:**
- `CloudMachineInfo.zone` is populated with `config.location` (region-level setting)
- Trace records contain only the region: `cloudZone = "europe-west2"`

**Desired behavior:**
- `CloudMachineInfo.zone` should contain the actual zone where Google Batch allocated the task
- Trace records should contain the specific zone: `cloudZone = "europe-west2-a"`

## Use case

**Cost estimation for Google Batch workloads:**

When running workflows with spot instances on Google Batch, accurate cost tracking requires knowing the specific zone where each task executed. Cloud pricing databases (like those used by Seqera Platform) store spot prices per zone (e.g., `europe-west2-a`, `europe-west2-b`, `europe-west2-c`) rather than per region.

**Current limitation:**
1. Spot price lookup fails because trace records contain the region (`europe-west2`) but the price database is keyed by zone (`europe-west2-a`)
2. Cost estimates cannot be calculated for Google Batch workflows
3. Users lack visibility into actual resource costs
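
The lookup failure can be reproduced with a small sketch. The price table, its values, and the `lookupSpotPrice` closure are made up for illustration and do not reflect any real pricing database:

```groovy
// Illustrative spot prices keyed by zone; the numbers are invented for this example.
def spotPrices = [
    'europe-west2-a': 0.0121,
    'europe-west2-b': 0.0134,
    'europe-west2-c': 0.0118
]

// Returns the price for the given location, or null when the table has no entry.
def lookupSpotPrice = { Map prices, String cloudZone ->
    prices.get(cloudZone)
}

// A region-level value, as currently written to the trace record, finds nothing.
assert lookupSpotPrice(spotPrices, 'europe-west2') == null
// A zone-level value matches an entry.
assert lookupSpotPrice(spotPrices, 'europe-west2-a') == 0.0121
```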

**Deployment scenarios:**
- Seqera Platform integration for cost tracking and billing
- Custom monitoring solutions that track per-task resource costs
- Audit and compliance reporting requiring accurate zone information
- Multi-zone resource optimization analysis

## Suggested implementation

**1. Retrieve zone information from Google Batch API:**

After task completion, query the Google Batch API to retrieve the actual zone where the task executed:

```groovy
// In GoogleBatchTaskHandler.groovy (sketch; extractZoneFromTaskStatus is a proposed helper, not an existing method)
String getActualExecutionZone() {
    final job = client.getJob(jobId)
    final jobStatus = job.getStatus()
    // Extract the actual zone from the task allocation metadata
    return extractZoneFromTaskStatus(jobStatus)
}
```

**2. Update CloudMachineInfo with actual zone:**

Modify `GoogleBatchTaskHandler` to populate the zone field with the actual execution zone rather than the configured region:

```groovy
// In GoogleBatchTaskHandler.groovy, line ~351
machineInfo = new CloudMachineInfo(
    type: machineType.type,
    zone: getActualExecutionZone(),  // Instead of machineType.location
    priceModel: machineType.priceModel
)
```
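
If the actual zone cannot be determined (for example, the lookup fails or the task never started), falling back to the configured location would keep the field populated as it is today. A minimal sketch, where `resolveZone` is a hypothetical helper and not part of the existing handler:

```groovy
// Picks the value to record in the trace: the actual zone when known,
// otherwise the configured location (current behavior).
String resolveZone(String actualZone, String configuredLocation) {
    // The Elvis operator treats both null and the empty string as "missing"
    return actualZone ?: configuredLocation
}

assert resolveZone('europe-west2-a', 'europe-west2') == 'europe-west2-a'
assert resolveZone(null, 'europe-west2') == 'europe-west2'
assert resolveZone('', 'europe-west2') == 'europe-west2'
```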

**3. Google Batch API reference:**

The zone information is expected to be available from one of the following sources (to be confirmed against the Batch API):
- Job status metadata after task allocation
- Instance policy or allocation policy fields
- Task group status details
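
Google Cloud commonly expresses locations as path segments such as `zones/europe-west2-a` or `regions/europe-west2`. Assuming one of the sources above exposes a string of that shape (an assumption, not verified against the Batch API), the zone could be extracted along these lines. `zoneFromPath` and `regionOfZone` are hypothetical helper names:

```groovy
// Extracts the zone name from a string containing a "zones/<zone>" segment.
// Returns null when no such segment is present (for example, a region path).
String zoneFromPath(String path) {
    def m = (path =~ '(?:^|/)zones/([a-z0-9-]+)')
    return m.find() ? m.group(1) : null
}

// Derives the region from a zone name by dropping the trailing "-<letter>" suffix.
String regionOfZone(String zone) {
    return zone.replaceFirst('-[a-z]$', '')
}

assert zoneFromPath('zones/europe-west2-a') == 'europe-west2-a'
assert zoneFromPath('regions/europe-west2') == null
assert regionOfZone('europe-west2-a') == 'europe-west2'
```

Deriving the region from the zone also means region-level information is not lost when the trace record switches to zone names.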

**Alternative approach:**

If retrieving zone information adds too much API overhead, consider:
- Lazy retrieval: Only fetch zone when trace records are generated
- Cache zone information per job to minimize API calls
- Make it optional via configuration flag
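
The lazy retrieval and per-job caching options could be combined roughly as follows. `ZoneCache` is a hypothetical class, and the `fetchZone` closure stands in for whatever Batch API call ends up providing the zone:

```groovy
import java.util.concurrent.ConcurrentHashMap

// Hypothetical cache: resolves the zone for a job at most once, on first request.
class ZoneCache {
    private final Map<String, String> zones = new ConcurrentHashMap<>()
    private final Closure<String> fetchZone   // stand-in for the real API lookup

    ZoneCache(Closure<String> fetchZone) {
        this.fetchZone = fetchZone
    }

    String zoneFor(String jobId) {
        // computeIfAbsent invokes the fetcher only when the job has no cached zone
        return zones.computeIfAbsent(jobId) { String id -> fetchZone.call(id) }
    }
}

// Usage with a fake fetcher that counts how often it is called.
int calls = 0
def cache = new ZoneCache({ String id -> calls++; 'europe-west2-a' })
assert cache.zoneFor('job-1') == 'europe-west2-a'
assert cache.zoneFor('job-1') == 'europe-west2-a'
assert calls == 1
```

Because nothing is fetched until `zoneFor` is first called, jobs whose trace records are never generated would incur no extra API call.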

**Related components:**
- `plugins/nf-google/src/main/nextflow/cloud/google/batch/GoogleBatchTaskHandler.groovy` (lines 351-356, 656-658)
- `plugins/nf-google/src/main/nextflow/cloud/google/batch/client/BatchClient.groovy`
- `modules/nextflow/src/main/groovy/nextflow/cloud/types/CloudMachineInfo.groovy`
- `modules/nextflow/src/main/groovy/nextflow/trace/TraceRecord.groovy`

**Backwards compatibility:**

This change should be backwards compatible as it improves the accuracy of existing data without changing the field structure or API contracts.


Google Batch: Report actual zone where tasks execute in trace records #6646
