Conversation

@pwolnows (Contributor)

Enable validation of Whisper models with:

  • WhisperPipeline from openvino_genai
  • AutomaticSpeechRecognitionPipeline from transformers

@AlexKoff88 (Contributor)

@eaidova, please take a look and trigger the CI.

return [], outputs


class GenAI_WhisperPipeline(WhisperPipeline):
Contributor

I would rename the classes for consistency, e.g. HFWhisperPipeline, OptimumWhisperPipeline, GenAIWhisperPipeline.

Contributor Author

Agree, the suggested names are self-descriptive.
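A minimal sketch of the agreed naming scheme; the docstrings and empty bodies are placeholders rather than the PR's actual implementation, and the real base classes (e.g. `openvino_genai.WhisperPipeline` for the GenAI variant, as in the snippet above) are omitted so the sketch stays self-contained:

```python
# Placeholder skeletons for the three renamed wrapper classes.
class HFWhisperPipeline:
    """Validation via transformers' AutomaticSpeechRecognitionPipeline."""

class OptimumWhisperPipeline:
    """Validation via the optimum OpenVINO export path."""

class GenAIWhisperPipeline:
    """Validation via openvino_genai.WhisperPipeline."""
```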


Comment on lines 19 to 22
import openvino_genai as ov_genai
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor
from transformers.pipelines.automatic_speech_recognition import \
AutomaticSpeechRecognitionPipeline
Contributor

Please make these packages optional, like inflect below.

Contributor Author

Agree. I had them all in try/except in the initial version, but then I thought these packages were so common that there was no point in guarding the imports. Indeed, there are checks that fail to import them, though.
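A sketch of the optional-import pattern the review asks for (mirroring how `inflect` is handled elsewhere in the file): swallow the ImportError at import time and raise a clear message only when the dependency is actually needed. The helper name is illustrative, not from the PR:

```python
# Guarded import: the module stays importable even when openvino_genai
# is not installed; the error is deferred until the feature is used.
try:
    import openvino_genai as ov_genai
except ImportError as err:
    ov_genai = None
    _ov_genai_import_error = err


def require_ov_genai():
    """Return the openvino_genai module, or explain why it is unavailable."""
    if ov_genai is None:
        raise ImportError(
            "openvino_genai is required for GenAI Whisper validation; "
            "install it with `pip install openvino-genai`"
        ) from _ov_genai_import_error
    return ov_genai
```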

@AlexKoff88 (Contributor)

It looks good overall. It would be great to get some sanity tests on a dummy model to make sure that all three classes work.

E.g. with the yujiepan/whisper-v3-tiny-random model from the Hub.
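A hedged sketch of such a sanity test against the tiny random model, assuming the standard transformers ASR pipeline API; the import guard keeps the test module loadable when transformers is absent (under pytest you would use `pytest.importorskip` instead of the flag):

```python
# Smoke test: a random tiny Whisper model will transcribe garbage, but the
# call must succeed end to end and return a text field.
TINY_MODEL_ID = "yujiepan/whisper-v3-tiny-random"

try:
    from transformers import pipeline as hf_pipeline
    HAVE_TRANSFORMERS = True
except ImportError:
    HAVE_TRANSFORMERS = False


def test_hf_whisper_pipeline_smoke():
    if not HAVE_TRANSFORMERS:
        return  # with pytest, call pytest.skip("transformers missing") instead
    import numpy as np

    asr = hf_pipeline("automatic-speech-recognition", model=TINY_MODEL_ID)
    audio = np.zeros(16000, dtype=np.float32)  # one second of silence at 16 kHz
    result = asr(audio)
    assert isinstance(result["text"], str)
```

Equivalent smoke tests for the Optimum and GenAI wrappers would follow the same shape against a converted copy of the same model.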

input_data = [sample["audio"]["array"]]
input_meta = [{"sample_rate": sample["audio"]["sampling_rate"]}]
identifiers = [sample["id"]]
# print(ground_truth)
Contributor

Please clean up the code a bit and remove the print. Also, you need to remove the directory after the test suite finishes; you can define a teardown_module() function for that.

Contributor

Also, it looks like you need to install datasets in the test requirements:

 tools/accuracy_checker/tests/test_whisper_evaluator.py:23: in <module>
    from datasets import load_dataset
E   ModuleNotFoundError: No module named 'datasets'

Contributor

@pwolnows I believe you have enough permissions to open the GitHub Actions status, right? Some dependencies are still missing:
https://github.com/openvinotoolkit/open_model_zoo/actions/runs/12442951517/job/34781160595?pr=3990

@pwolnows merged commit 1be1a30 into openvinotoolkit:master on Jan 7, 2025 (13 checks passed).
@pwolnows deleted the custom-whisper-evaluator branch on January 9, 2025.