Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
84 changes: 84 additions & 0 deletions examples/cases/open_source_scada/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,84 @@
# open_source_scada — Twin Groves Wind Farm

This example case packages one year (2010-09-01 to 2011-08-31, hourly) of
per-turbine SCADA power and wind-speed observations from the **Twin
Groves Wind Farm**, owned and operated by Invenergy in McLean County,
Illinois, USA. The farm has approximately 200 Vestas V82 1.65 MW
turbines (82 m rotor, 80 m hub height).

The packaged data is suitable for benchmarking flow models against
operational observations: it provides a public reference case with a
known turbine type, a publicly available manufacturer power curve, and
permissive licensing.

## Files

| path | content |
|--|--|
| `plant_wind_farm/FLOW_toy_study_wind_farm.yaml` | Turbine x/y coordinates (UTM zone 16N, EPSG:32616). |
| `plant_energy_turbine/turbine.yaml` | Vestas V82 1.65 MW power and thrust curves (OEM values, also at https://github.com/NatLabRockies/turbine-models). |
| `plant_energy_turbine/VestasV82_1.65MW_82.csv` | Same OEM curve in CSV form (WS, P, Cp, Ct). |
| `plant_energy_resource/` | ERA5 reanalysis wind timeseries at hub height + Weibull resource + utility scripts. |
| `plant_energy_site/FLOW_toy_study_energy_site.yaml` | Site polygon and energy-resource include. |
| `wind_energy_system/system.yaml` | Top-level WindIO definition tying farm, site, and analysis together. |
| `wind_energy_system/analysis.yaml` | Example Bastankhah Gaussian wake-model analysis configuration. |
| `outputs/observedPower.nc` | **Observed** per-turbine power (W) and wind speed (m/s) per hour, shape `(time=8760, turbine=200)`. |

## Units and conventions

- `observedPower.nc.power`: per-turbine electrical power output in **watts (W)**.
Maximum across all (time, turbine) entries is the rated power
$1.65\times10^{6}$ W, consistent with the V82 OEM curve packaged in
`plant_energy_turbine/turbine.yaml`.
- `observedPower.nc.rotor_effective_velocity`: per-turbine wind speed in
**m/s**. Currently the raw nacelle-anemometer reading; this is **not** a
freestream value (no Nacelle Transfer Function correction has been
applied, and no upstream met mast is provided).
- Time axis is hourly UTC. Approximately 20% of (time, turbine) entries
are NaN due to operational gaps, maintenance, and turbine outages — this
is real operational SCADA, not a synthetic dataset.

## Historical note on the power scaling

Prior to the cleanup PR that accompanies this README, the
`observedPower.nc.power` values were stored with a **multiplicative factor
of 1.65 applied** (i.e., the rated-power numerical value in MW). Maximum
values were near 2.72 MW, which is inconsistent with the V82 1.65 MW
rated power and made the file unusable for any benchmark relying on the
packaged OEM power curve.

If you have a copy of the older file, you can recover the corrected
values by dividing `power` by `1.65`:

```python
import xarray as xr
ds = xr.open_dataset("outputs/observedPower.nc")
ds["power"] = ds["power"] / 1.65 # only if you have the pre-PR file
```

The current file in this repository has the correction applied. A
self-consistency check is included in `tests/sanity_open_source_scada.py`
(see PR description).

## Suggested usage

```python
import xarray as xr
ds = xr.open_dataset("outputs/observedPower.nc")
print(ds) # (time: 8760, turbine: 200)
print(ds.power.max().item()) # ~1.65e6 W
print(ds.power.attrs) # units, source
```

## License

MIT (inherited from the WIFA repository).

## Provenance and citation

If you use this dataset in published work, please cite the WIFA
repository and acknowledge Twin Groves Wind Farm / Invenergy as the
source of the SCADA observations. The Vestas V82 1.65 MW OEM power and
thrust curves are reproduced from public manufacturer data (also
available at the NatLabRockies turbine-models repository:
https://github.com/NatLabRockies/turbine-models/blob/main/turbine_models/data/Onshore/VestasV82_1.65MW_82.csv).
Binary file modified examples/cases/open_source_scada/outputs/observedPower.nc
Binary file not shown.
81 changes: 81 additions & 0 deletions examples/cases/open_source_scada/tests/sanity_open_source_scada.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,81 @@
"""Sanity check for examples/cases/open_source_scada/outputs/observedPower.nc.

Verifies that the cleaned per-turbine power timeseries is consistent with
the Vestas V82 1.65 MW OEM power curve packaged in the same case
directory. Catches regressions of the historical "multiply by rated MW"
bug fixed in the cleanup PR.

Run from the WIFA repository root:

python examples/cases/open_source_scada/tests/sanity_open_source_scada.py
"""
from __future__ import annotations

from pathlib import Path

import numpy as np
import xarray as xr
import yaml


CASE_DIR = Path(__file__).resolve().parents[1]
OBSERVED = CASE_DIR / "outputs" / "observedPower.nc"
TURBINE = CASE_DIR / "plant_energy_turbine" / "turbine.yaml"

RATED_W = 1.65e6
TOL_RATED = 0.02 * RATED_W # 2% above rated tolerated (overshoot)
TOL_BIN_FRAC = 0.20 # 20% bin-median tolerance vs OEM
WS_TEST_BINS = list(range(5, 14)) # exclude very-low (start-up scatter)
# and very-high (storm shutdown scatter)


def load_oem_pc(turbine_yaml: Path):
d = yaml.safe_load(turbine_yaml.read_text())
pc = d["performance"]["power_curve"]
ws = np.array(pc["power_wind_speeds"], dtype=float)
p_kw = np.array(pc["power_values"], dtype=float)
return ws, p_kw * 1000.0 # convert to W


def main():
print(f"Checking {OBSERVED} ...")
ds = xr.open_dataset(OBSERVED)
P = ds["power"].values # (time, turbine), W
WS = ds["rotor_effective_velocity"].values
oem_ws, oem_p = load_oem_pc(TURBINE)

valid = np.isfinite(P) & np.isfinite(WS)
p_max = float(np.nanmax(P))
print(f" max(power) = {p_max:.1f} W (rated = {RATED_W:.1f} W)")
assert p_max <= RATED_W + TOL_RATED, (
f"max(power)={p_max} exceeds rated+tol={RATED_W + TOL_RATED}; "
"is the historical 'multiply by rated MW' scaling re-introduced?"
)

fails = []
for lo in WS_TEST_BINS:
m = valid & (WS >= lo) & (WS < lo + 1)
if m.sum() < 30:
continue
oem_at = float(np.interp(lo + 0.5, oem_ws, oem_p))
med = float(np.median(P[m]))
rel = abs(med - oem_at) / max(oem_at, 1.0)
flag = "OK" if rel <= TOL_BIN_FRAC else "FAIL"
print(f" WS={lo:2d}-{lo+1:2d}: n={m.sum():6d} "
f"OEM={oem_at:7.0f} W median={med:7.0f} W rel-err={rel:.2f} [{flag}]")
if rel > TOL_BIN_FRAC:
fails.append((lo, oem_at, med, rel))

if fails:
msg = ", ".join(f"WS={lo}: rel-err={r:.2f}" for lo, _, _, r in fails)
raise AssertionError(
f"Power-curve agreement worse than {TOL_BIN_FRAC:.0%} on bins: {msg}"
)

nan_frac = float(np.isnan(P).mean())
print(f" NaN fraction (operational gaps): {nan_frac:.2f} (expect ~0.20)")
print("Sanity check passed.")


if __name__ == "__main__":
main()
Loading