This project implements an ETL (Extract, Transform, Load) pipeline using Apache Airflow to:
- Extract weather data from the OpenWeather API.
- Transform the raw data by cleaning and preprocessing it.
- Load the processed data into a PostgreSQL database.
The project follows the Astro CLI structure (astro dev init).
.
├── dags/ # Airflow DAGs (Pipeline definitions)
│ ├── etl_pipeline.py # Main ETL workflow
│ ├── utils.py # Utility functions (API calls, transformations)
│
├── include/ # Additional files (SQL scripts, configurations)
│
├── plugins/ # Custom Airflow plugins (if needed)
│
├── tests/ # Unit tests for DAGs
│
├── Dockerfile # Airflow Docker setup
├── requirements.txt # Python dependencies
├── airflow_settings.yaml # Airflow configurations
├── README.md # Project documentation
└── .astro/ # Astro CLI configurations
- Apache Airflow: a workflow orchestration tool for scheduling and managing data pipelines. ETL workflows are defined as DAGs (Directed Acyclic Graphs).
- PostgreSQL: an open-source relational database system, used here to store the transformed weather data.
- OpenWeather API: provides real-time and historical weather data. Free access is available after registering for an API key.
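As a rough illustration of what a request to OpenWeather looks like, the sketch below builds the URL for the current-weather endpoint. The function name `build_weather_url` and the default `units` value are assumptions made for this example; the project's own `utils.py` may build the request differently.

```python
from urllib.parse import urlencode

# Current-weather endpoint of the OpenWeather API.
BASE_URL = "https://api.openweathermap.org/data/2.5/weather"


def build_weather_url(city: str, api_key: str, units: str = "metric") -> str:
    """Return the request URL for the current weather in `city`.

    Hypothetical helper for illustration; not taken from this repository.
    """
    # urlencode escapes spaces and special characters in the city name.
    query = urlencode({"q": city, "appid": api_key, "units": units})
    return f"{BASE_URL}?{query}"
```

The resulting URL can be fetched with any HTTP client (for example `requests.get(url, timeout=10)`), and the JSON body is what the Transform step receives.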
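- Docker and Docker Compose installed
- Astro CLI installed (for example, brew install astro on macOS; see the Astronomer documentation for other platforms)
- PostgreSQL database (running locally or in the cloud)
- OpenWeather API key (available from OpenWeather after registration)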
git clone https://github.com/your-username/ETL-pipeline-airflow-to-postgres.git
cd ETL-pipeline-airflow-to-postgres
astro dev start
- Open Airflow UI (http://localhost:8080)
- Go to Admin → Connections
- Create a Postgres Connection with:
  - Conn ID: postgres_db
  - Conn Type: Postgres
  - Host: localhost (if the database runs on the host machine while Airflow runs in the Astro CLI containers, host.docker.internal is usually needed instead on Docker Desktop)
  - Schema: your_database
  - Login: your_user
  - Password: your_password
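The same connection can also be supplied without the UI: Airflow reads connections from environment variables named AIRFLOW_CONN_<CONN_ID> whose value is a connection URI. The sketch below builds such a URI from the fields listed above. The helper names are hypothetical, and port 5432 is assumed as the PostgreSQL default.

```python
from urllib.parse import quote


def conn_env_var(conn_id: str) -> str:
    """Name of the environment variable Airflow checks for `conn_id`."""
    return f"AIRFLOW_CONN_{conn_id.upper()}"


def build_postgres_conn_uri(login: str, password: str, host: str,
                            schema: str, port: int = 5432) -> str:
    """Build a Postgres connection URI (hypothetical helper).

    Login, password, and schema are percent-encoded so that characters
    such as '@' or '/' do not break the URI.
    """
    user = quote(login, safe="")
    pwd = quote(password, safe="")
    db = quote(schema, safe="")
    return f"postgres://{user}:{pwd}@{host}:{port}/{db}"
```

For example, exporting the value of `build_postgres_conn_uri("your_user", "your_password", "localhost", "your_database")` under the name returned by `conn_env_var("postgres_db")` should make the connection available to the DAG without any manual step in the UI.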
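- Store your API key in Airflow Variables (with the Astro CLI, the command can be run as astro dev run variables set ...):
airflow variables set openweather_api_key "YOUR_API_KEY"
- Alternatively, set the key directly in the code (avoid committing a real key):
API_KEY = "YOUR_API_KEY" # Replace with your API Key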
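Reading the key at runtime keeps it out of the source code. The sketch below shows one possible way to do that: it first asks Airflow for the Variable and, if Airflow is unavailable (for example in a local unit test), falls back to the AIRFLOW_VAR_<NAME> environment variable that Airflow itself also recognizes. The function name `get_api_key` is an assumption for this example.

```python
import os


def get_api_key(name: str = "openweather_api_key") -> str:
    """Return the API key stored under the Airflow Variable `name`.

    Hypothetical helper; falls back to the matching environment
    variable when Airflow cannot be imported or the Variable is missing.
    """
    try:
        # Imported lazily so the module still loads without Airflow.
        from airflow.models import Variable
        return Variable.get(name)
    except Exception:
        # Airflow exposes Variables through AIRFLOW_VAR_<NAME> as well.
        key = os.environ.get(f"AIRFLOW_VAR_{name.upper()}")
        if not key:
            raise RuntimeError(f"No API key configured for '{name}'")
        return key
```

The broad `except` is deliberate here so that both a missing Airflow installation and a missing Variable lead to the same fallback; a production version might distinguish the two cases.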
- Open Airflow UI (http://localhost:8080)
- Enable and run the DAG etl_pipeline
- DAG (etl_pipeline.py): Defines the ETL workflow.
- Task 1 - Extract: Fetches weather data from the OpenWeather API.
- Task 2 - Transform: Cleans and processes the data.
- Task 3 - Load: Inserts the data into PostgreSQL.
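The Transform and Load steps listed above can be sketched as two plain functions. This is only an illustration under stated assumptions: the selected fields, the table name `weather_data`, and the function names are hypothetical and may not match the actual `etl_pipeline.py`. The input shape follows the JSON returned by OpenWeather's current-weather endpoint.

```python
from datetime import datetime, timezone
from typing import Tuple


def transform(raw: dict) -> dict:
    """Flatten an OpenWeather current-weather response into one row."""
    return {
        "city": raw["name"],
        "temperature": raw["main"]["temp"],
        "humidity": raw["main"]["humidity"],
        "description": raw["weather"][0]["description"],
        # `dt` is a Unix timestamp in UTC.
        "observed_at": datetime.fromtimestamp(raw["dt"], tz=timezone.utc),
    }


def build_insert(row: dict, table: str = "weather_data") -> Tuple[str, tuple]:
    """Return a parameterized INSERT statement and its values.

    The table and column names are interpolated directly, so they must
    come from trusted code; the values are passed as parameters.
    """
    columns = list(row)
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, tuple(row[c] for c in columns)
```

Inside a Load task, the statement and parameters could then be passed to a Postgres client, for example `PostgresHook(postgres_conn_id="postgres_db").run(sql, parameters=params)`, assuming the Postgres provider package is installed.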
MIT License
For questions or contributions, reach out via GitHub Issues.

