Project 8. Question similarity.

This repository implements the duplicate question detection model developed for Extramarks Ltd. Demonstration video here.

To set up the model, follow these steps:

Clone this repo from GitHub using git clone https://github.com/VenkteshV/Question_duplicate_detection
Navigate to the cloned folder: cd Question_duplicate_detection
Create a new virtual environment: python3 -m venv venv_new
Activate the virtual environment just created: source venv_new/bin/activate
Install the required packages: pip install -r requirements.txt
Download the folder of data ("Data-cache") required to run the model from here
Move the "Data-cache" folder inside ./src/
Download the full Stanford CoreNLP Tagger (version 3.8.0) and rename the folder to "stanford"
Move the "stanford" folder to ./src/Kw_generation/
Navigate to the "stanford" folder: cd src/Kw_generation/stanford
Run java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -preload tokenize,ssplit,pos -status_port 9000 -port 9000 -timeout 15000 &
Navigate to src/Kw_generation/Unsupervised-keyphrase-extraction/src (relative to the repository root) and run python3 app.py
Download the tagging API folder from here, unzip it, rename to "taxonomy_predictor_api" and move to ./src/Syllabus_Tagging/
Open a new terminal and navigate to "taxonomy_predictor_api" folder: cd src/Syllabus_Tagging/taxonomy_predictor_api
Install the required libraries: pip install -r requirements.txt
Run uvicorn app.main:app
Open a new terminal and navigate to "src" folder: cd src
Run python3 ui.py
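
Before launching ui.py, it can help to confirm that the folders from the steps above are in place and that the background services are reachable. The following is a minimal sketch, not part of the repository: the helper names (missing_dirs, port_open, report) are hypothetical, port 9000 is taken from the CoreNLP command above, and port 8000 is assumed because uvicorn listens there by default when no --port flag is given. The port used by app.py is not stated in this README, so it is not checked.

```python
import socket
from pathlib import Path

# Folders that the setup steps ask you to create, relative to the repo root.
EXPECTED_DIRS = [
    "src/Data-cache",
    "src/Kw_generation/stanford",
    "src/Syllabus_Tagging/taxonomy_predictor_api",
]

# Services started during setup. 9000 comes from the CoreNLP command;
# 8000 is an assumption based on uvicorn's default port.
EXPECTED_PORTS = {
    "Stanford CoreNLP server": 9000,
    "taxonomy_predictor_api (uvicorn)": 8000,
}


def missing_dirs(repo_root, expected=EXPECTED_DIRS):
    """Return the expected folders that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in expected if not (root / d).is_dir()]


def port_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(repo_root="."):
    """Print a short status summary of folders and services."""
    for d in missing_dirs(repo_root):
        print(f"missing folder: {d}")
    for name, port in EXPECTED_PORTS.items():
        state = "up" if port_open(port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

Calling report() from the repository root prints one line per missing folder and one line per service. A "not reachable" line usually means the corresponding terminal from the steps above was closed or the server is still starting.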

Item	Details
Period of development	15 May 2022 - 22 August 2022
Developed by	Maksimjeet Chowdhary, Sanyam Goyal, Venktesh V
Under the guidance of	Dr. Mukesh Mohania, Dr. Vikram Goyal, Mr. Deep Dwivedi, Mr. Gaurav Sharma
