📊 InsightFlow-AI

Ask a question. Get a dashboard. No SQL. No BI tools. Just insights.

InsightFlow-AI is a conversational AI data analytics tool. It converts natural language questions into interactive dashboards by automatically generating SQL queries, charts, and business insights from any uploaded CSV dataset.

Upload your data → Ask a question → Instantly get visual analytics.

Built for hackathons and rapid data exploration using Google Gemini, Streamlit, Pandas, SQLite, and Plotly.


🎬 Demo

Example interaction:

User:
Show total views by category

System:
→ Generates SQL
→ Executes query on dataset
→ Chooses best chart type
→ Builds dashboard
→ Generates AI insights

Follow-up queries work conversationally:

User: Show views by category
User: Only for 2024

System resolves to:
Show views by category for 2024

No manual filtering required.
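
One way to picture the rewrite step: the resolver hands the earlier questions and the new follow-up to the model and asks for a single standalone question. The sketch below only builds such a prompt. The function name `build_followup_prompt` and the prompt wording are assumptions for illustration, not the contents of prompts/followup_prompt.txt.

```python
def build_followup_prompt(history: list[str], followup: str) -> str:
    """Build a prompt asking an LLM to merge a follow-up into a standalone question.

    Hypothetical helper: the wording below is illustrative only.
    """
    previous = "\n".join(f"- {q}" for q in history) or "- (none)"
    return (
        "Rewrite the follow-up as one standalone question.\n"
        f"Previous questions:\n{previous}\n"
        f"Follow-up: {followup}\n"
        "Standalone question:"
    )


prompt = build_followup_prompt(["Show views by category"], "Only for 2024")
print(prompt)
```

The model's reply to a prompt like this would then be passed on to SQL generation in place of the raw follow-up.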


✨ Features

Feature | Description
--- | ---
💬 Natural language queries | Ask questions about your dataset in plain English
🤖 AI-generated SQL | Gemini converts questions into SQLite queries
📊 Smart chart selection | Automatically chooses bar, line, pie, or scatter
⚡ Interactive Plotly charts | Zoom, hover, inspect, export
🧠 AI business insights | Gemini generates 3–5 bullet insights from results
🔄 Conversational follow-ups | Context-aware query rewriting
📁 Upload any CSV | Works with arbitrary datasets
💡 Dynamic example queries | AI suggests relevant questions for each dataset
🧠 Schema-aware reasoning | Automatically extracts numeric/categorical columns

🏗 Architecture

User Query
     │
     ▼
┌──────────────────────┐
│ Followup Resolver    │
│ followup_resolver.py │
│ (Gemini)             │
└──────────┬───────────┘
           │ Standalone query
           ▼
┌──────────────────────┐
│ SQL Generator        │
│ sql_generator.py     │
│ (Gemini)             │
└──────────┬───────────┘
           │ SQL Query
           ▼
┌──────────────────────┐
│ Query Executor       │
│ query_executor.py    │
│ SQLite + Pandas      │
└──────────┬───────────┘
           │ DataFrame
           ▼
┌──────────────────────┐
│ Chart Selector       │
│ chart_selector.py    │
│ (Gemini reasoning)   │
└──────────┬───────────┘
           │ Chart type
           ▼
┌──────────────────────┐
│ Chart Renderer       │
│ chart_renderer.py    │
│ Plotly builder       │
└──────────┬───────────┘
           │
           ├─────────────► AI Insights Generator
           │               insights_generator.py
           │               (Gemini)
           ▼
      Streamlit UI
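
The flow in the diagram can be sketched as a chain of plain functions. This is a minimal illustration, not the project's actual code: `DashboardResult`, `run_pipeline`, and every stage signature are assumed names, the stages are passed in as callables so stubs can replace the Gemini and SQLite calls, and chart rendering is left out.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class DashboardResult:
    question: str        # standalone question after follow-up resolution
    sql: str             # generated SQL
    rows: Any            # query result (a DataFrame in the real app)
    chart_type: str      # e.g. "bar", "line", "pie", "scatter"
    insights: list[str]  # bullet insights


def run_pipeline(
    user_query: str,
    history: list[str],
    resolve: Callable[[str, list[str]], str],
    generate_sql: Callable[[str], str],
    execute: Callable[[str], Any],
    select_chart: Callable[[Any], str],
    generate_insights: Callable[[Any], list[str]],
) -> DashboardResult:
    """Chain the stages in the order shown in the diagram."""
    question = resolve(user_query, history)
    sql = generate_sql(question)
    rows = execute(sql)
    chart_type = select_chart(rows)
    insights = generate_insights(rows)
    return DashboardResult(question, sql, rows, chart_type, insights)


# Stub stages stand in for the Gemini- and SQLite-backed modules.
result = run_pipeline(
    "Only for 2024",
    ["Show views by category"],
    resolve=lambda q, h: f"{h[-1]} ({q})" if h else q,
    generate_sql=lambda q: "SELECT category, SUM(views) FROM data GROUP BY category",
    execute=lambda sql: [("Gaming", 120), ("Music", 80)],
    select_chart=lambda rows: "bar",
    generate_insights=lambda rows: [f"{rows[0][0]} leads with {rows[0][1]} views"],
)
print(result.chart_type, result.insights)
```

Passing the stages in as arguments keeps each module testable on its own, without an API key.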

📁 Project Structure

InsightFlow-AI/
│
├── backend/
│   ├── gemini_client.py
│   ├── sql_generator.py
│   ├── query_executor.py
│   ├── chart_selector.py
│   ├── chart_renderer.py
│   ├── insights_generator.py
│   ├── followup_resolver.py
│   └── example_generator.py
│
├── frontend/
│   ├── app.py
│   └── style.css
│
├── utils/
│   ├── schema_loader.py
│   └── css_loader.py
│
├── prompts/
│   ├── sql_prompt.txt
│   ├── chart_prompt.txt
│   ├── insight_prompt.txt
│   ├── followup_prompt.txt
│   └── examples_prompt.txt
│
├── requirements.txt
├── .env
└── README.md

🚀 Getting Started

Prerequisites

  • Python 3.10+
  • Google Gemini API key

Get your key here:

https://aistudio.google.com/app/api-keys


Installation

git clone https://github.com/openvaibhav/InsightFlow-AI.git

cd InsightFlow-AI

python -m venv venv
# Bash:
source venv/bin/activate
# Fish:
source venv/bin/activate.fish
# Windows:
# venv\Scripts\activate

pip install -r requirements.txt

Configuration

Create a .env file in the project root:

GEMINI_API_KEY=your_api_key_here
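
As a rough sketch of how the key could be read at startup, the helper below looks up `GEMINI_API_KEY` in the environment and fails fast when it is missing or still set to the placeholder. The function name `get_gemini_api_key` is an assumption, and the sketch expects the .env file to have been loaded into the environment already (for example with python-dotenv's `load_dotenv()`).

```python
import os


def get_gemini_api_key(env=None) -> str:
    """Return GEMINI_API_KEY from the environment, failing fast if it is missing.

    Hypothetical helper; `env` defaults to os.environ and exists only to ease testing.
    """
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key or key == "your_api_key_here":
        raise RuntimeError("GEMINI_API_KEY is not set; add it to .env or export it")
    return key
```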

Run the app

streamlit run frontend/app.py

Then open:

http://localhost:8501

🧪 Usage

1️⃣ Upload a CSV dataset
2️⃣ Ask a question about the data
3️⃣ The system automatically:

  • Generates SQL
  • Runs query
  • Picks chart
  • Renders dashboard
  • Produces insights

4️⃣ Ask follow-up questions conversationally


💡 Example Queries

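The queries below assume a dataset with matching columns (views, category, revenue, and so on); adapt the column names to your own data.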

Examples:

Show total views by category
Which category has the highest engagement
Top 10 videos by views
Compare views and likes
Show monthly views trend
Which region has the highest revenue
Show average rating by product category
Compare revenue and profit

Dynamic examples are generated based on detected:

  • numeric columns
  • categorical columns
  • date columns

🧠 Key Components

Schema Loader

Automatically extracts:

  • column names
  • data types
  • numeric columns
  • categorical columns

Used to guide LLM SQL generation.
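
A minimal sketch of this kind of schema extraction is shown below. It uses only the standard library so it runs anywhere; the real schema_loader.py presumably works from the Pandas DataFrame, and the name `summarize_schema` and the returned dictionary layout are assumptions.

```python
import csv
import io


def _is_number(value: str) -> bool:
    """True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def summarize_schema(csv_text: str) -> dict[str, list[str]]:
    """Split CSV columns into numeric and categorical by inspecting their values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = list(reader.fieldnames or [])
    numeric = []
    for col in columns:
        # Ignore empty or missing cells when deciding the column type.
        values = [r[col] for r in rows if r[col] not in ("", None)]
        if values and all(_is_number(v) for v in values):
            numeric.append(col)
    categorical = [c for c in columns if c not in numeric]
    return {"columns": columns, "numeric": numeric, "categorical": categorical}


sample = "category,views,likes\nGaming,120,30\nMusic,80,12\n"
schema = summarize_schema(sample)
print(schema)
```

A summary like this can be pasted into the SQL prompt so the model only references columns that exist.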


SQL Generator

Gemini converts natural language into SQLite-compatible SQL queries.

Safety features:

  • prevents write operations
  • enforces schema usage
  • validates query output
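
One possible shape for the write-operation check is a conservative filter applied to the generated SQL before it runs. This is an illustrative sketch, not the checks sql_generator.py actually performs; `is_read_only` and the keyword list are assumptions.

```python
import re

# Keywords that modify data or schema; a read-only query should contain none of them.
_FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|CREATE|REPLACE|ATTACH|DETACH|PRAGMA|VACUUM)\b",
    re.IGNORECASE,
)


def is_read_only(sql: str) -> bool:
    """Return True if `sql` looks like a single SELECT (or WITH ... SELECT) statement."""
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        return False  # empty, or more than one statement
    if not re.match(r"(?i)(SELECT|WITH)\b", statement):
        return False
    return _FORBIDDEN.search(statement) is None


print(is_read_only("SELECT category, SUM(views) FROM data GROUP BY category;"))
print(is_read_only("SELECT 1; DELETE FROM data"))
```

The filter errs on the side of rejection: a harmless query that mentions a forbidden word inside a string literal, or calls SQLite's `replace()` function, is also refused.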

Query Executor

Runs SQL against a temporary in-memory SQLite database built from the uploaded DataFrame.
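
The idea can be sketched with the standard-library `sqlite3` module alone. The function name `run_query` and the table name `data` are assumptions; in a Pandas-based version, `DataFrame.to_sql` and `pandas.read_sql_query` would play the roles of the insert and fetch steps here.

```python
import sqlite3


def run_query(columns: list[str], rows: list[tuple], sql: str, table: str = "data"):
    """Load rows into an in-memory SQLite table and run `sql` against it."""
    conn = sqlite3.connect(":memory:")  # database vanishes when the connection closes
    try:
        col_defs = ", ".join(f'"{c}"' for c in columns)
        placeholders = ", ".join("?" for _ in columns)
        conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
        cursor = conn.execute(sql)
        header = [d[0] for d in cursor.description]
        return header, cursor.fetchall()
    finally:
        conn.close()


header, result = run_query(
    ["category", "views"],
    [("Gaming", 100), ("Music", 80), ("Gaming", 20)],
    "SELECT category, SUM(views) AS total_views FROM data "
    "GROUP BY category ORDER BY total_views DESC",
)
print(header, result)
```

Because the database lives only in memory, each uploaded dataset gets a fresh table and nothing is written to disk.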


Chart Selector

AI decides which visualization fits the result:

Data Pattern | Chart
--- | ---
Category vs value | Bar
Time series | Line
Parts of whole | Pie
Numeric correlation | Scatter
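
The mapping above can also be approximated with fixed rules, for instance as a fallback when the model's answer is unusable. The sketch below is such a rule-of-thumb stand-in, not the Gemini-based selector; `pick_chart` and the column-name hints it looks for are assumptions.

```python
def pick_chart(columns: list[str], rows: list[tuple]) -> str:
    """Rule-of-thumb chart choice from the first two result columns."""
    if not rows or len(columns) < 2:
        return "bar"
    first, second = rows[0][0], rows[0][1]
    first_is_num = isinstance(first, (int, float))
    second_is_num = isinstance(second, (int, float))
    if any(k in columns[0].lower() for k in ("date", "month", "year", "time")):
        return "line"      # time series
    if first_is_num and second_is_num:
        return "scatter"   # numeric correlation
    if any(k in columns[1].lower() for k in ("share", "percent", "pct")):
        return "pie"       # parts of a whole
    return "bar"           # category vs value


print(pick_chart(["category", "total_views"], [("Gaming", 120)]))
print(pick_chart(["month", "views"], [("2024-01", 10)]))
```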

Insights Generator

Gemini analyzes result tables and produces concise business insights.

Example:

• Gaming category generates the highest total views
• Engagement peaks during June and July
• Sentiment scores show strong positive correlation with likes

🛠 Tech Stack

Layer | Technology
--- | ---
UI | Streamlit
LLM | Google Gemini
Charts | Plotly
Data processing | Pandas
Query engine | SQLite (in-memory)
Environment | Python 3.10+

📦 Requirements

streamlit
pandas
plotly
google-generativeai
python-dotenv
numpy
statsmodels
pyarrow

⚠ Known Limitations

  • SQL generation depends on schema clarity
  • Very large datasets (>500k rows) may slow execution
  • Gemini rate limits apply on free tier
  • Some ambiguous queries may generate suboptimal charts

🚀 Future Improvements

Potential upgrades:

  • Multi-dataset joins
  • Dashboard export
  • Chart editing
  • LLM self-correction loops
  • SQL execution sandboxing
  • Live database connections

🤝 Contributing

Pull requests welcome.

Steps:

1. Fork repo
2. Create branch
3. Implement feature
4. Submit PR

📄 License

MIT License


🧑‍💻 Author

Built for a hackathon with questionable sleep and too much caffeine ☕.


InsightFlow-AI

Ask your data anything.
