Complete the Quiz shown below. It’s a set of 6 multiple-choice questions to test your understanding of workflow orchestration, Kestra, and ETL pipelines for data lakes and warehouses.
1 - Within the execution for Yellow Taxi data for the year 2020 and month 12: what is the uncompressed file size (i.e. the output file yellow_tripdata_2020-12.csv of the extract task)?
- 128.3 MB ✅
- 134.5 MB
- 364.7 MB
- 692.6 MB
"yellow_tripdata_2020-12.csv": "128.3 MiB"
2 - What is the rendered value of the variable file when the inputs taxi is set to green, year is set to 2020, and month is set to 04 during execution?
- {{inputs.taxi}}tripdata{{inputs.year}}-{{inputs.month}}.csv
- green_tripdata_2020-04.csv ✅
- green_tripdata_04_2020.csv
- green_tripdata_2020.csv
file: "{{inputs.taxi}}_tripdata_{{inputs.year}}-{{inputs.month}}.csv"green_tripdata_2020-04.csv
- 13,537,299
- 24,648,499 ✅
- 18,324,219
- 29,430,127
-SELECT COUNT(*) AS total_rows
FROM yellow_tripdata
WHERE EXTRACT(YEAR FROM tpep_pickup_datetime) = 2020;
- 5,327,301
- 936,199
- 1,734,051 ✅
- 1,342,034
SELECT COUNT(*) AS total_rows
FROM green_tripdata
WHERE EXTRACT(YEAR FROM lpep_pickup_datetime) = 2020;
- 1,428,092
- 706,911
- 1,925,152 ✅
- 2,561,031
SELECT COUNT(*) AS total_rows
FROM yellow_tripdata
WHERE EXTRACT(YEAR FROM tpep_pickup_datetime) = 2021
AND EXTRACT(MONTH FROM tpep_pickup_datetime) = 3;
- Add a timezone property set to EST in the Schedule trigger configuration
- Add a timezone property set to America/New_York in the Schedule trigger configuration ✅
- Add a timezone property set to UTC-5 in the Schedule trigger configuration
- Add a location property set to New_York in the Schedule trigger configuration
triggers:
- id: schedule_nyc
type: io.kestra.plugin.core.trigger.Schedule
cron: "0 9 * * *"
timezone: America/New_York

