Commit 3aa801b

Char-level embeddings enabled instructions and results
1 parent d3839f3 commit 3aa801b

1 file changed, 6 additions and 6 deletions

README.md
@@ -10,7 +10,7 @@ The report describes two versions of R-NET:
 
 The current best single-model on SQuAD leaderboard has a higher score, which means R-NET development continued after March 2017. Ensemble models reach higher scores.
 
-This repository contains an implementation of the first version, but we cannot yet reproduce the reported results. The best performance we got so far was EM=54.21% and F1=65.26% on the dev set. We are aware of a few differences between our implementation and the network described in the paper:
+This repository contains an implementation of the first version, but we cannot yet reproduce the reported results. The best performance we got so far was EM=56.82% and F1=66.68% on the dev set. We are aware of a few differences between our implementation and the network described in the paper:
 
 1. We do not use character-level embedding at the input.
 2. The first formula in (11) of the [report](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/05/r-net.pdf) contains a strange summand W_v^Q V_r^Q. Both tensors are trainable and are not used anywhere else in the network. We have replaced this product with a single trainable vector.
@@ -28,17 +28,17 @@ We are not sure whether we applied dropout correctly. Also there is nothing abou
 
 2. Preprocess the data
 ```sh
-python preprocessing.py data/train_parsed.json --outfile data/train_data.pkl
-python preprocessing.py data/valid_parsed.json --outfile data/valid_data.pkl
-python preprocessing.py data/dev_parsed.json --outfile data/dev_data.pkl
+python preprocessing.py data/train_parsed.json --outfile data/train_data_str.pkl --include_str
+python preprocessing.py data/valid_parsed.json --outfile data/valid_data_str.pkl --include_str
+python preprocessing.py data/dev_parsed.json --outfile data/dev_data_str.pkl --include_str
 ```
 
 3. Train the model
 ```sh
-python train.py --hdim 40 --batch_size 70 --nb_epochs 50 --optimizer adam --dropout 0.2
+python train.py --hdim 45 --batch_size 50 --nb_epochs 50 --optimizer adadelta --lr 1 --dropout 0.2 --char_level_embeddings --train_data data/train_data_str.pkl --valid_data data/valid_data_str.pkl
 ```
 
 4. Predict on dev/test set samples
 ```sh
-python predict.py model/your-model prediction.json
+python predict.py --batch_size 100 --dev_data data/dev_data_str.pkl models/31-t3.05458271443-v3.27696280528.model prediction.json
 ```
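
The EM and F1 figures quoted in the README change are the two standard SQuAD answer metrics. For readers unfamiliar with them, the following is a minimal sketch of how they are commonly computed for SQuAD v1.1; it is not taken from this repository, and the function names (`normalize_answer`, `exact_match`, `f1_score`, `best_over_references`) are illustrative assumptions.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction, truth):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    true_tokens = normalize_answer(truth).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_references(metric, prediction, references):
    """Dev-set questions have several reference answers; keep the best score."""
    return max(metric(prediction, ref) for ref in references)
```

Under this definition, a reported score such as EM=56.82% would be the mean of the per-question best `exact_match` value over the dev set, multiplied by 100, and likewise for F1.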
