This repository is an implementation of the Question Duplicacy detection model developed for Extramarks Ltd. Demonstration video here.
To set up the model, follow these steps:
- Clone this repo from GitHub:
  `git clone https://github.com/VenkteshV/Question_duplicate_detection`
- Navigate to the cloned folder:
  `cd Question_duplicate_detection`
- Create a new virtual environment:
  `python3 -m venv venv_new`
- Activate the virtual environment just created:
  `source venv_new/bin/activate`
- Install the required packages:
  `pip install -r requirements.txt`
- Download the data folder ("Data-cache") required to run the model from here.
- Move the "Data-cache" folder inside `./src/`
- Download the full Stanford CoreNLP Tagger version 3.8.0 and rename the folder to "stanford".
- Move the "stanford" folder to `./src/Kw_generation/`
- Navigate to the "stanford" folder:
  `cd src/Kw_generation/stanford`
- Start the CoreNLP server in the background:
  `java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -preload tokenize,ssplit,pos -status_port 9000 -port 9000 -timeout 15000 &`
- Navigate to the Unsupervised-keyphrase-extraction source folder and run the app:
  `cd src/Kw_generation/Unsupervised-keyphrase-extraction/src`
  `python3 app.py`
- Download the tagging API folder from here, unzip it, rename it to "taxonomy_predictor_api", and move it to `./src/Syllabus_Tagging/`
- Open a new terminal and navigate to the "taxonomy_predictor_api" folder:
  `cd src/Syllabus_Tagging/taxonomy_predictor_api`
- Install the required libraries:
  `pip install -r requirements.txt`
- Start the tagging API:
  `uvicorn app.main:app`
- Open a new terminal and navigate to the "src" folder:
  `cd src`
- Run the UI:
  `python3 ui.py`

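
Because the setup spans several folders and two background services, a quick sanity check before launching `ui.py` can help. The script below is only a sketch and is not part of this repository: the function names (`missing_dirs`, `port_open`, `report`) are hypothetical, and port 8000 is an assumption based on uvicorn's default when no `--port` is given. The port used by `app.py` is not stated in these steps, so it is not checked.

```python
import socket
from pathlib import Path

# Folders the setup steps ask you to create, relative to the repo root.
REQUIRED_DIRS = [
    "src/Data-cache",
    "src/Kw_generation/stanford",
    "src/Syllabus_Tagging/taxonomy_predictor_api",
]

# 9000 comes from the CoreNLP command in the steps; 8000 is an assumption
# (uvicorn's default port when --port is not passed).
SERVICES = {
    "Stanford CoreNLP server": 9000,
    "taxonomy predictor API (uvicorn)": 8000,
}


def missing_dirs(repo_root="."):
    """Return the required folders that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in REQUIRED_DIRS if not (root / d).is_dir()]


def port_open(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(repo_root="."):
    """Build a human-readable list of status lines."""
    lines = [f"missing folder: {d}" for d in missing_dirs(repo_root)]
    for name, port in SERVICES.items():
        state = "up" if port_open(port) else "not reachable"
        lines.append(f"{name} on port {port}: {state}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Run it from the repository root. It only reports what it finds and changes nothing, so it is safe to run at any point during setup.
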
| Field | Details |
|---|---|
| Period of development | 15 May 2022 - 22 August 2022 |
| Developed by | Maksimjeet Chowdhary, Sanyam Goyal, Venktesh V |
| Under the guidance of | Dr. Mukesh Mohania, Dr. Vikram Goyal, Mr. Deep Dwivedi, Mr. Gaurav Sharma |