WIP: Add timit recipe #96
luomingshuang wants to merge 10 commits into k2-fsa:master from luomingshuang:add-timit-recipe
Conversation
#
# - $dl_dir/lm
# This directory contains the language model (LM) downloaded from
# https://huggingface.co/luomingshuang/timit_lm, and the LM is based
Could you please describe how lm_tgmed.arpa is obtained?
Is it possible to train it inside icefall?
Em... the lm_tgmed.arpa is obtained by this train_lms.sh, which follows Kaldi. About training the LM inside icefall, I think it is a good idea. I have wondered before whether we can train the LM in Python; there are some methods for it using KenLM. Maybe I can have a look.
ok, train_lms.sh uses https://github.com/danpovey/kaldi_lm.git
I will wrap it in Python with pybind11 when I have time.
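For reference, a minimal sketch of what KenLM-based training could look like from Python, assuming KenLM's `lmplz` binary is installed and on PATH; the file names `train_text.txt` and `lm_tgmed.arpa` are placeholders, not this recipe's actual paths:

```python
# Hedged sketch: train a trigram ARPA LM with KenLM's lmplz from Python.
# Assumes `lmplz` is on PATH; file names are placeholders.
import subprocess

with open("train_text.txt") as fin, open("lm_tgmed.arpa", "w") as fout:
    # -o 3 requests a 3-gram model, matching the "tg" in lm_tgmed.
    subprocess.run(["lmplz", "-o", "3"], stdin=fin, stdout=fout, check=True)
```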
2021-10-28 13:20:42,952 INFO [decode.py:360] Wrote detailed error stats to tdnn_lstm_ctc/exp/errs-TEST-lm_scale_2.0.txt
2021-10-28 13:20:42,986 INFO [decode.py:374]
For TEST, PER of different settings are:
lm_scale_0.1	20.82	best for TEST
Could you try smaller lm scale values? The best one (0.1) is at the edge of the searched range, so the true optimum may lie below it.
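A hedged sketch of extending the searched range, assuming the decode script builds a list of scales (the variable name `lm_scale_list` follows other icefall decode scripts but is an assumption for this recipe):

```python
# Hedged sketch: include scales below 0.1 so the best value is not at
# the boundary of the range. round() keeps the result keys readable.
lm_scale_list = [0.01, 0.02, 0.05] + [round(0.1 * i, 1) for i in range(1, 21)]
```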
    recordings=m["recordings"],
    supervisions=m["supervisions"],
)
if "train" in partition:
Please note that in librispeech, the names of the training datasets begin with train (lowercase).
In TIMIT, I find that it is TRAIN (uppercase), see line 52 in this file, so this if statement is never executed.
Please change train to TRAIN and re-run your experiments.
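A case-insensitive check would also avoid this class of mismatch; a minimal sketch (the helper name is hypothetical):

```python
def is_train_partition(partition: str) -> bool:
    # Matches both "train-clean-100" (LibriSpeech) and "TRAIN" (TIMIT).
    return "train" in partition.lower()
```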
Oh... will do it.
load_dicts = json.load(load_f)
for load_dict in load_dicts:
    text = load_dict["text"]
    phones_list = list(filter(None, text.split(" ")))
Could it be changed to
phones_list = text.split()?
It's simpler and easier to understand.
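A quick check that the two forms agree on space-separated transcripts (the sample string is a hypothetical phone sequence):

```python
# str.split() with no argument already discards the empty strings that
# repeated spaces would otherwise produce.
text = "sil dh ax  k ae t sil"  # hypothetical phone transcript
assert list(filter(None, text.split(" "))) == text.split()
```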
phones_list = list(filter(None, text.split(" ")))

for phone in phones_list:
    if phone not in phones:
Could you use a set to represent phones, not a list?
A set is more efficient for lookups.
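A minimal sketch of the suggested change, reusing the names from the excerpt above:

```python
# With a set, duplicates are handled automatically and membership tests
# are O(1) on average, so the explicit "not in" check can be dropped.
phones_list = ["sil", "ax", "sil"]  # hypothetical; comes from the loop above
phones = set()                      # was: a list
phones.update(phones_list)          # replaces the per-phone loop and check
```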
with open(lexicon, "w") as f:
    for phone in sorted(phones):
        f.write(str(phone) + " " + str(phone))
phone is already of type str, so can we remove the str() calls here?
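The simplification might look like this, mirroring the excerpt above; the trailing newline is an assumption, since the excerpt does not show how lines are terminated:

```python
with open(lexicon, "w") as f:       # lexicon and phones as in the excerpt
    for phone in sorted(phones):
        # phone is already a str; the trailing "\n" is an assumption.
        f.write(f"{phone} {phone}\n")
```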
# We assume that you have installed the git-lfs, if not, you could install it
# using: `sudo apt-get install git-lfs && git-lfs install`
[ ! -e $dl_dir/lm ] && mkdir -p $dl_dir/lm
git clone https://huggingface.co/luomingshuang/timit_lm $dl_dir/lm
Please add a check that lm_tgmed.arpa is downloaded correctly.
Some users may forget to run git lfs install.
You can add an extra statement:
( cd $dl_dir/lm && git lfs pull )
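Beyond `git lfs pull`, a hedged Python sketch of an explicit check: when git-lfs is not installed, the cloned "file" is a small pointer stub with a fixed signature line (the path assumes `$dl_dir` is `download`):

```python
from pathlib import Path

lm = Path("download/lm/lm_tgmed.arpa")  # assumes $dl_dir == download
head = lm.read_bytes()[:64] if lm.exists() else b""
# Git LFS pointer stubs begin with this version line.
if head.startswith(b"version https://git-lfs.github.com/spec"):
    raise SystemExit("lm_tgmed.arpa is an LFS pointer; run: git lfs pull")
```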
if [ $stage -le 6 ] && [ $stop_stage -ge 6 ]; then
  log "Stage 6: Prepare G"
  # We assume you have install kaldilm, if not, please install
typo: install -> installed
--read-symbol-table="data/lang_phone/words.txt" \
--disambig-symbol='#0' \
--max-order=4 \
$dl_dir/lm/lm_tgmed.arpa > data/lm/G_4_gram.fst.txt
tgmed means this arpa is a trigram of medium size, I think.
Please use a 4-gram arpa to generate G_4_gram.fst.txt if you need one for decoding/rescoring.
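Otherwise, a hedged sketch of a consistent trigram invocation, driving the same kaldilm CLI from Python; `--max-order=3` and the `G_3_gram` output name match a trigram arpa, and `download/lm` is an assumption for `$dl_dir/lm`:

```python
import subprocess

cmd = [
    "python3", "-m", "kaldilm",
    "--read-symbol-table=data/lang_phone/words.txt",
    "--disambig-symbol=#0",
    "--max-order=3",              # lm_tgmed.arpa is a trigram
    "download/lm/lm_tgmed.arpa",  # assumes $dl_dir == download
]
with open("data/lm/G_3_gram.fst.txt", "w") as out:
    subprocess.run(cmd, stdout=out, check=True)
```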
@@ -0,0 +1,97 @@
#!/usr/bin/env bash
This file is shared across various recipes.
Could you make it a symlink, like what we are doing in the librispeech recipe?
@@ -0,0 +1,400 @@
FADG0_SI1279 TEST/DR4/FADG0/SI1279.WAV
Can this file be generated by some scripts? If so, we don't need to check it in.
Em... about the {train, dev, test} split files, I haven't found any scripts that generate them. In Kaldi, they are kept in list files. In SpeechBrain, they are placed in timit_prepare.py, which lists the speakers in a list. One option for us is to follow SpeechBrain and use a list of speaker names during data preparation, as sketched below. I will add it to Lhotse.
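A hedged sketch of that SpeechBrain-style approach: keep the dev/test speaker IDs in Python collections and route each utterance by its speaker prefix. FADG0 comes from the test list excerpt above; the other IDs are placeholders, not the real TIMIT split:

```python
DEV_SPEAKERS = {"FAKS0"}            # placeholder ID
TEST_SPEAKERS = {"FADG0", "MJSW0"}  # FADG0 appears in the list above

def partition_of(utt_id: str) -> str:
    # TIMIT utterance IDs look like "FADG0_SI1279": speaker, then sentence.
    speaker = utt_id.split("_")[0]
    if speaker in DEV_SPEAKERS:
        return "dev"
    if speaker in TEST_SPEAKERS:
        return "test"
    return "train"

print(partition_of("FADG0_SI1279"))  # -> "test"
```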
Add timit recipe for icefall. This recipe uses phones as modeling units and aims to compute the PER; the target output is a list of phones. The {dev, test} split follows Kaldi ({kaldi-timit-dev, kaldi-timit-test}). At present, the recipe contains tdnn_lstm_ctc; I will add other models and methods (such as conformer, crdnn, and MMI) later.
In fact, I have done some experiments for TIMIT based on snowfall: k2-fsa/snowfall#247
The current result is not the best. I will continue to improve it.
log-train-2021-10-28-15-24-21.txt
https://tensorboard.dev/experiment/twUbZTxoTAK32bPCJsYF7Q/#scalars
TODOs: