RaphaelMouravieff/deep_sql

🔧 Installation

📌 Reproducing results? Just copy-paste this to get started with deepsql.

# Clone the repo
git clone https://github.com/RaphaelMouravieff/deep_sql.git deepsql
cd deepsql

# Set up the environment
conda create -n deepsql python=3.11.11 -y
conda activate deepsql

# Install dependencies
pip install -r requirements.txt

🧪 Perturbations (ROBUT)

ROBUT tests Table QA robustness with 10 human-crafted perturbations:

  • Header: synonym, abbreviation
  • Content: row/col shuffle, extension, masking, adding
  • Question: word/sentence paraphrase
  • Mixed: combined perturbations
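To make the "Content: row/col shuffle" category concrete, here is a minimal sketch of what such a perturbation does to a table. ROBUT's perturbations are human-crafted, so this is only an illustration, not ROBUT's or this repository's implementation; the function names `shuffle_rows` and `shuffle_columns` and the list-of-lists table layout are assumptions made for the example.

```python
import random


def shuffle_rows(header, rows, seed=0):
    """Return the table with its data rows in a random order (header unchanged)."""
    rng = random.Random(seed)
    shuffled = [list(row) for row in rows]  # copy so the input table is not mutated
    rng.shuffle(shuffled)
    return list(header), shuffled


def shuffle_columns(header, rows, seed=0):
    """Return the table with columns permuted; header and cells move together."""
    rng = random.Random(seed)
    order = list(range(len(header)))
    rng.shuffle(order)
    new_header = [header[i] for i in order]
    new_rows = [[row[i] for i in order] for row in rows]
    return new_header, new_rows


if __name__ == "__main__":
    header = ["name", "year", "city"]
    rows = [["a", 1, "x"], ["b", 2, "y"], ["c", 3, "z"]]
    print(shuffle_rows(header, rows, seed=1))
    print(shuffle_columns(header, rows, seed=1))
```

Both functions keep the table's content intact and change only its layout, so a robust Table QA model should return the same answer before and after.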

Connect to the GPU Node

ssh gpu

Request a GPU with Slurm

srun -p hard --gpus-per-node=1 --constraint=A6000 --pty bash 

Activate the Conda Environment

conda activate [env] 

Starting Ollama

ollama serve & 
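Because `ollama serve &` returns immediately, the next step can start before the server is accepting requests. A minimal sketch of a readiness check follows, assuming Ollama listens on its default address `http://localhost:11434`. The helper names `server_is_up` and `wait_for_server` are hypothetical and not part of this repository.

```python
import http.client
import time
import urllib.request


def server_is_up(url, timeout=2.0):
    """Return True if an HTTP GET on `url` answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, timeouts and refused connections are all OSError subclasses.
        return False


def wait_for_server(url="http://localhost:11434", retries=30, delay=1.0, probe=server_is_up):
    """Poll `url` up to `retries` times; return True as soon as the probe succeeds."""
    for _ in range(retries):
        if probe(url):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    # Assumed default Ollama address; adjust if OLLAMA_HOST is set differently.
    print("Ollama ready:", wait_for_server())
```

The `probe` parameter exists only so the polling loop can be exercised without a running server.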

Run the code agents: generate the data

cd deep_sql/scripts 
bash Agents/run.sh 

Clean and expand the generated dataset: clean the generated data and merge the files

cd deep_sql/scripts 
bash Datasets/prepare.sh 

Pre-train on the clean dataset: pre-train the model on the cleaned dataset

cd deep_sql/scripts 
bash Train/ptrain.sh 

Fine-tune the model: fine-tune on WikiTableQuestions

cd deep_sql/scripts 
bash Train/fine_tuned.sh 
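The four stages above (generate, prepare, pre-train, fine-tune) are run one after another from `deep_sql/scripts`. A minimal sketch of a driver that runs them in that order and stops at the first failure is shown below. It assumes each script takes no arguments, as in the commands above; `run_pipeline` and its `runner` parameter are hypothetical names, not part of the repository.

```python
import subprocess

# Script paths as listed in this README, relative to the scripts directory.
STAGES = [
    "Agents/run.sh",        # generate the data
    "Datasets/prepare.sh",  # clean and merge the generated data
    "Train/ptrain.sh",      # pre-train on the cleaned dataset
    "Train/fine_tuned.sh",  # fine-tune on WikiTableQuestions
]


def run_pipeline(scripts_dir="deep_sql/scripts", stages=STAGES, runner=subprocess.run):
    """Run each stage with bash in order; raise on the first non-zero exit code."""
    completed = []
    for script in stages:
        result = runner(["bash", script], cwd=scripts_dir)
        if result.returncode != 0:
            raise RuntimeError(f"{script} failed with exit code {result.returncode}")
        completed.append(script)
    return completed


if __name__ == "__main__":
    print("Completed stages:", run_pipeline())
```

Stopping at the first failure avoids training on a dataset that was only partially generated or cleaned.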

Unit test (library): check that multi-chunk saving of the library works, producing one shared vector store and multiple .json files.

cd deep_sql
python -m uni_test.library_multi_chunk

Unit test (answer check): run vectore_store_content to check the vector store contents (step1 = 96766)

cd deep_sql
python -m uni_test.vectore_store_content --vector_store_path data/library/vector_store_step_copy

Unit test (likelihood): compute the likelihood threshold

cd deep_sql
python -m uni_test.find_likelihood_threeshold 
