Data Engineer · Real‑Time Analytics Fanatic · Scaling Data Infra
Data Engineer passionate about real-time analytics and building scalable data infrastructure. Always eager to explore new tools, automate pipelines, and turn data into actionable insights.
- Kafka → ClickHouse: Real-Time Data Pipeline – Lightweight streaming pipeline (Kafka · Python · ClickHouse · Docker) → Streams mock user data from a Python Kafka producer into ClickHouse using Kafka engine tables and materialized views, fully containerized for easy setup.
- ClickHouse Metrics Extractor – Hourly metrics pipeline (Airflow · Python · ClickHouse · Docker) → Automates system metrics extraction from ClickHouse using Airflow, saves hourly snapshots to daily CSVs, and runs fully containerized in Docker.
- COVID-19 Data Pipeline – Daily ETL pipeline (Airflow · Python · PostgreSQL · Docker) → Ingests global COVID-19 stats from an API on a daily Airflow schedule and stores them in PostgreSQL.
- TrendLite – Live retail insights dashboard (Python · Streamlit · ClickHouse) → Real-time KPIs, top products, and trend breakdowns powered by Altair and optimized SQL queries.
CS Grad · Data Builder · Hackathon Hustler · Living on Coffee & Curiosity ☕