Conversation

Great!!
Yes, the modeling units are 5000 tokens including "<blank>".

Thanks!
You may run into memory problems. Fangjun recently committed a code change
that can be used to work around a related issue, though.
In any case, we need to make sure our recipes can run at those kinds of sizes.
…On Tue, May 25, 2021 at 10:21 AM LIyong.Guo ***@***.***> wrote:
Great!!
I assume the modeling units are BPE pieces? I think a good step towards
resolving the difference would be to train
(i) a CTC model
(ii) a LF-MMI model
using those same BPE pieces.
Yes, the modeling units are 5000 tokens including "<blank>".
I will do the suggested experiments.
    b_to_a_map=b_to_a_map,
    sorted_match_a=True)
lm_path_lats = k2.top_sort(k2.connect(lm_path_lats.to('cpu'))).to(device)
lm_scores = lm_path_lats.get_tot_scores(True, True)
The 2nd arg to get_tot_scores() here, representing log_semiring, should be False. ARPA-type language models are constructed so that the backoff probability is already included in the direct arc, i.e. we would be double-counting if we summed the probabilities of the non-backoff and backoff arcs.
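A minimal sketch (pure Python, not the k2 API) of why the log semiring overcounts here; the probabilities are made up for illustration:

```python
# With an ARPA-style LM, the direct (non-backoff) arc already includes the
# backoff mass, so log-summing the direct and backoff paths double-counts.
import math

# Hypothetical log-probabilities for reaching the same word from a state:
direct_arc = math.log(0.30)          # P(word | history), backoff already folded in
backoff_path = math.log(0.1 * 0.5)   # backoff weight * P(word | shorter history)

# Tropical semiring (log_semiring=False): take the single best path.
tropical = max(direct_arc, backoff_path)

# Log semiring (log_semiring=True): sum probabilities over both paths,
# which double-counts the backoff mass.
log_sum = math.log(math.exp(direct_arc) + math.exp(backoff_path))

assert math.isclose(tropical, direct_arc)
assert log_sum > tropical  # the inflated (double-counted) score
```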
csukuangfj left a comment:
Please add more documentation to your code.
x -= self.mean

if norm_vars:
    x /= self.std
norm_means uses a requires_grad guard to choose whether to perform an in-place update. Is there a reason not to do the same here?
The original implementation
https://github.com/espnet/espnet/blob/08feae5bb93fa8f6dcba66760c8617a4b5e39d70/espnet/nets/pytorch_backend/frontends/feature_transform.py#L135
uses self.scale to do a multiplication, which is more efficient than dividing by self.std.
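A sketch of the suggested optimization: precompute the reciprocal once and multiply, instead of dividing on every call. The names (std, scale) follow the discussion; plain Python floats stand in for tensors:

```python
# Precompute scale = 1 / std once (e.g. in __init__); multiplying by it per
# batch is cheaper than dividing by std each time, with the same result.
std = [2.0, 4.0, 8.0]
scale = [1.0 / s for s in std]   # computed once

x = [10.0, 20.0, 40.0]
divided = [xi / si for xi, si in zip(x, std)]
multiplied = [xi * sc for xi, sc in zip(x, scale)]

assert divided == multiplied  # identical values here; multiplication is cheaper
```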
def encode(
        self, speech: torch.Tensor,
        speech_lengths: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
Would you mind adding documentation describing the shapes of the various tensors?
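A hedged sketch of the kind of docstring being requested; the dimension names (N = batch size, T = padded frame count, C = feature dimension) are assumptions, not taken from the PR:

```python
# Illustrative only: the shapes below are assumed, and the `torch` annotations
# are quoted strings so this sketch runs without torch installed.
def encode(self, speech: "torch.Tensor", speech_lengths: "torch.Tensor"):
    """Encode padded acoustic features.

    Args:
        speech: A 3-D float tensor of shape (N, T, C), where N is the batch
            size, T the padded number of frames, and C the feature dimension.
        speech_lengths: A 1-D int tensor of shape (N,) giving the number of
            valid frames per utterance (each entry <= T).

    Returns:
        A pair (encoder_out, encoder_out_lens):
            encoder_out: shape (N, T', D) after subsampling, D the encoder dim.
            encoder_out_lens: shape (N,), valid output frames per utterance.
    """
    raise NotImplementedError  # documentation sketch only
```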
    return nnet_output

@classmethod
def build_model(cls, asr_train_config, asr_model_file, device):
cls is never used.
I would suggest changing @classmethod to @staticmethod and removing cls.
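A sketch of the suggested change; the body is a placeholder, not the PR's actual loading logic:

```python
# Since cls is never used, build_model can be a @staticmethod with cls removed.
class ASRModel:
    @staticmethod
    def build_model(asr_train_config, asr_model_file, device):
        # ... load config and checkpoint here (placeholder body) ...
        return (asr_train_config, asr_model_file, device)

# Callable on the class (or an instance) with no implicit first argument:
out = ASRModel.build_model("conf.yaml", "model.pt", "cpu")
assert out == ("conf.yaml", "model.pt", "cpu")
```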
| """ | ||
| model = TransformerLM(**config) | ||
|
|
||
| assert model_file is not None, f"model file doesn't exist" |
f"{model_file} doesn't exist"
if model_type == 'espnet':
    return load_espnet_model(config, model_file)
elif model_type == 'snowfall':
    raise NotImplementedError(f'Snowfall model to be suppported')
No need to use f-string here.
self.unk_idx = self.token2idx['<unk>']

@dataclass
Do we really need to use dataclass here?
Also, could you remove the class NumericalizerMixin?
The extra level of inheritance makes the code hard to read.
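A hedged sketch of the suggestion: fold the mixin's methods directly into the class so there is one class to read. The class name and method body below are illustrative, not copied from the PR:

```python
# Flattened version: what was a NumericalizerMixin method becomes an ordinary
# method on the class itself, removing one level of inheritance.
class Numericalizer:
    def __init__(self, tokens):
        self.token2idx = {t: i for i, t in enumerate(tokens)}
        self.unk_idx = self.token2idx['<unk>']

    # Previously on the mixin; now defined directly here.
    def ids(self, pieces):
        return [self.token2idx.get(p, self.unk_idx) for p in pieces]

num = Numericalizer(['<blank>', '<unk>', 'he', 'llo'])
assert num.ids(['he', 'llo', 'xyz']) == [2, 3, 1]  # unknown piece -> unk_idx
```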
# The original link of these models is:
# https://zenodo.org/record/4604066#.YKtNrqgzZPY
# which is accessible by espnet utils
# They are ported to the following link for users who don't have espnet dependencies.
if [ ! -d snowfall_model_zoo ]; then
  echo "About to download pretrained models."
  git clone https://huggingface.co/GuoLiyong/snowfall_model_zoo
I would suggest using git clone --depth 1. It improves the clone speed.
blank_bias = -1.0
nnet_output[:, :, 0] += blank_bias

supervision_segments = torch.tensor([[0, 0, nnet_output.shape[1]]],
Is the batch size always 1? A larger batch size can improve decoding speed.
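A sketch of what batching implies for supervision_segments: one row [utterance_index, start_frame, num_frames] per utterance rather than a single row for batch size 1. Plain lists stand in for the torch tensor, and the helper name is hypothetical:

```python
# k2 additionally expects the rows sorted by decreasing num_frames, so the
# utterance indices are pre-sorted by length here.
def make_supervision_segments(num_frames_per_utt):
    order = sorted(range(len(num_frames_per_utt)),
                   key=lambda i: -num_frames_per_utt[i])
    # Each utterance starts at frame 0 of its own row in the padded batch.
    return [[i, 0, num_frames_per_utt[i]] for i in order]

segments = make_supervision_segments([97, 153, 120])
assert segments == [[1, 0, 153], [2, 0, 120], [0, 0, 97]]
```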
ref = batch['supervisions']['text']
for i in range(len(ref)):
    hyp_words = text.split(' ')
What's the format of text?
Does text depend on i? If not, you can split it outside of the for loop.
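A sketch of the two readings raised above, with hypothetical data (not from the PR): if text is one utterance's transcript, it must be re-derived per i; if it is genuinely loop-invariant, split it once, outside the loop.

```python
ref = ["hello world", "good morning"]

# Per-utterance reading: split inside the loop, indexed by i.
ref_words_per_utt = []
for i in range(len(ref)):
    ref_words_per_utt.append(ref[i].split(' '))

# Loop-invariant reading: hoist the split out of the loop.
text = "hello world"
hyp_words = text.split(' ')

assert ref_words_per_utt == [["hello", "world"], ["good", "morning"]]
assert hyp_words == ["hello", "world"]
```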
WER results of this PR (with models loaded from the espnet model zoo):
This PR implements the following procedure with models from the espnet model zoo:

An added benefit: an espnet-trained conformer encoder can be loaded into the equivalent snowfall model definition.
Also, the loaded espnet transformer LM could be used as a baseline for snowfall LM training tasks.