This project explores and analyzes the NYC Airbnb Open Data (2019) dataset to uncover insights into pricing, location trends, room types, and availability, using visual and statistical methods.
- Source: Kaggle, NYC Airbnb Open Data (2019)
- Size: 48,000+ rows, 16 columns
- Format: CSV
- Features: Borough, Neighbourhood, Room Type, Price, Availability, Reviews, etc.
- Python 3.x
- Pandas
- Matplotlib
- Seaborn
- Google Colab (Notebook Environment)
- Data Loading & Initial Exploration
- Cleaning:
  - Dropped unused columns
  - Filled missing values
  - Removed outliers (IQR method)
- Univariate & Bivariate Analysis
- Visualizations:
  - Countplots
  - Boxplots
  - Heatmaps
  - Geospatial Scatterplots
- Insights & Summary
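
The outlier-removal step in the workflow above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper name `remove_iqr_outliers`, the `price` column name, and the multiplier `k = 1.5` (the conventional IQR fence) are all assumptions, and the sample values are made up.

```python
import pandas as pd


def remove_iqr_outliers(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    """Keep rows whose `column` value lies within [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper; k = 1.5 is the conventional fence, assumed here.
    """
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # between() is inclusive on both ends by default
    return df[df[column].between(lower, upper)]


# Made-up prices for illustration only; not taken from the dataset.
sample = pd.DataFrame({"price": [50, 60, 70, 80, 90, 100, 5000]})
cleaned = remove_iqr_outliers(sample, "price")
```

On the real data, the same helper would be applied to the loaded DataFrame before plotting, so that a handful of extreme prices does not stretch the boxplot and histogram axes.
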
- Manhattan has the most listings and highest prices.
- Private rooms are the most numerous room type, while entire homes/apartments command the highest prices.
- Most listings are concentrated in tourist-heavy areas.
- Price shows no strong correlation with availability or number of reviews.
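
Findings like those above are typically checked with grouped summaries and a correlation matrix. The sketch below shows one way to do that in pandas. The column names (`neighbourhood_group`, `room_type`, `price`, `availability_365`, `number_of_reviews`) are assumed, and the rows are made up for illustration, so the numbers it produces say nothing about the real dataset.

```python
import pandas as pd

# Made-up rows for illustration only; column names are assumptions.
listings = pd.DataFrame({
    "neighbourhood_group": ["Manhattan", "Manhattan", "Brooklyn", "Brooklyn", "Queens"],
    "room_type": ["Entire home/apt", "Private room", "Entire home/apt",
                  "Private room", "Private room"],
    "price": [200, 100, 150, 70, 60],
    "availability_365": [100, 300, 50, 365, 10],
    "number_of_reviews": [10, 5, 40, 2, 8],
})

# Number of listings per borough
listing_counts = listings["neighbourhood_group"].value_counts()

# Median price per borough and per room type (median is less outlier-sensitive than mean)
median_price_by_borough = listings.groupby("neighbourhood_group")["price"].median()
median_price_by_room = listings.groupby("room_type")["price"].median()

# Pairwise Pearson correlations between the numeric columns of interest
correlations = listings[["price", "availability_365", "number_of_reviews"]].corr()
```

The `correlations` frame can be passed directly to `seaborn.heatmap` to produce the kind of heatmap listed in the workflow.
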
Sonal Shukla
Aspiring Data Analyst | Python Enthusiast