feat: add option to disable timing metadata in ASR transcription #2622

akanshajain231999 · 2025-11-13T07:10:54Z

Feature: Option to disable timing metadata during ASR transcription

This PR adds a new feature that allows users to disable the printing of timing metadata in ASR (Automatic Speech Recognition) transcription output. By default, timing information like [time: 0.0-2.5] is included in the transcribed text, but users can now opt out of this behavior.

Changes Made:

Core Implementation:

- Added an `include_time_metadata: bool = True` field to the `InlineAsrOptions` class in `docling/datamodel/pipeline_options_asr_model.py`
- Modified the `_ConversationItem.to_string()` method to accept an `include_time_metadata` parameter
- Updated both `_NativeWhisperModel` and `_MlxWhisperModel` to respect the new setting
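
As a rough illustration of the `to_string()` change, the sketch below uses a simplified stand-in for `_ConversationItem`. The class name, the field names (`text`, `start_time`, `end_time`) and the exact output format are assumptions made for illustration, not the class as it exists in Docling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConversationItemSketch:
    """Hypothetical, simplified stand-in for _ConversationItem."""

    text: str
    start_time: Optional[float] = None
    end_time: Optional[float] = None

    def to_string(self, include_time_metadata: bool = True) -> str:
        """Serialize the item, optionally prefixed with a [time: start-end] tag."""
        parts = []
        has_times = self.start_time is not None and self.end_time is not None
        if include_time_metadata and has_times:
            parts.append(f"[time: {self.start_time}-{self.end_time}]")
        parts.append(self.text.strip())
        return " ".join(parts)


item = ConversationItemSketch(text="Hello world", start_time=0.0, end_time=2.5)
print(item.to_string())                             # timing tag included (default)
print(item.to_string(include_time_metadata=False))  # text only
```

Keeping the default at `True` is what preserves the existing output for callers that do not pass the new argument.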

CLI Integration:

- Added an `--asr-no-timing` flag to disable timing metadata via the CLI
- The flag is documented automatically through the CLI's auto-generated documentation

Tests:

- Added `test_asr_pipeline_without_time_metadata()` - verifies timing metadata can be disabled
- Added `test_asr_pipeline_with_time_metadata_default()` - verifies timing metadata is enabled by default
- Added `test_conversation_item_to_string_with_and_without_time()` - unit tests for the `to_string()` method

Backward Compatibility:

- Default behavior unchanged: timing metadata is included by default
- No breaking changes to existing APIs

Usage Examples

Programmatic:

from docling.datamodel import asr_model_specs
from docling.datamodel.pipeline_options import AsrPipelineOptions

pipeline_options = AsrPipelineOptions()
pipeline_options.asr_options = asr_model_specs.WHISPER_TINY.model_copy(deep=True)
pipeline_options.asr_options.include_time_metadata = False  # Disable timing

CLI:

docling audio.mp3 --asr-no-timing

Issue resolved by this Pull Request: Resolves #2564

Checklist:

- [x] Documentation has been updated
  - CLI documentation auto-generates from code (includes new `--asr-no-timing` flag)
  - Code includes comprehensive docstrings explaining the feature
  
- [x] Examples have been added
  - The feature is straightforward and covered by tests
  - Usage is documented in code comments and test cases
  
- [x] Tests have been added
  - ✅ `test_asr_pipeline_without_time_metadata()` - Integration test for disabled timing
  - ✅ `test_asr_pipeline_with_time_metadata_default()` - Verifies default behavior
  - ✅ `test_conversation_item_to_string_with_and_without_time()` - Unit tests
  - ✅ All tests properly isolated (using `model_copy(deep=True)`)
  - ✅ No compilation/lint errors

github-actions · 2025-11-13T07:11:04Z

✅ DCO Check Passed

Thanks @akanshajain231999, all your commits are properly signed off. 🎉

dosubot · 2025-11-13T07:11:10Z

Related Documentation

Checked 3 published document(s) in 1 knowledge base(s). No updates required.

mergify · 2025-11-13T07:11:29Z

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:

akanshajain231999 · 2025-11-13T07:13:23Z

Hey @ceberam , Can you please review this PR?

codecov · 2025-11-13T07:23:42Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

ceberam

@akanshajain231999 Thanks a lot for your contribution and detailed documentation and testing of this PR.

In terms of design:

This PR is intended to address issue #2564, "Would be nice to have an argument to disable printing the timing metadata." It looks to me like a reasonable request to customize the text serialization of an audio file parsed as a DoclingDocument: one may want to extract the text from the DoclingDocument with or without the time metadata.

Note that your implementation does not address a serialization option, but rather the parsing of the raw file into a DoclingDocument. If a user sets the pipeline option include_time_metadata = False, the time metadata is lost and the converted DoclingDocument will not hold it. I think it would be more flexible for the DoclingDocument to keep the information, so that the user can then decide whether to extract/export all the information or just the text (without time metadata). Some alternatives could be:
- Keep everything in TextItem.orig (untreated representation) and only the text in text (sanitized representation). Note that text export formats like markdown use text to serialize. This option would remove the hassle of dealing with pipeline options.
- We are currently working on a new data model for audio provenance items in docling-core. The metadata would be stored there, which would help manage metadata from ASR and WebVTT files, as well as unlock time-dependent chunking. Please see my final remark later on.

I have the impression that issue #2564 was about any type of metadata in ASR, i.e., both the time and the speaker metadata. In this PR, the option include_time_metadata = False would still keep the speaker annotation, which may also be annoying for those who just want to process text.
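
To make the first alternative concrete, here is a minimal sketch of keeping the raw line in an `orig` field and a sanitized version in `text`. The class `TextItemSketch`, the helper `make_item`, and the bracketed tag syntax (`[time: ...]`, `[speaker: ...]`) are assumptions for illustration only; this is not the docling-core `TextItem` implementation.

```python
import re
from dataclasses import dataclass

# Assumed tag syntax: leading bracketed tags such as "[time: 0.0-2.5]" or "[speaker: A]".
_LEADING_TAGS = re.compile(r"^(?:\[(?:time|speaker):[^\]]*\]\s*)+")


@dataclass
class TextItemSketch:
    """Hypothetical stand-in for a text item with raw and sanitized fields."""

    orig: str  # untreated representation, metadata tags preserved
    text: str  # sanitized representation used by text exports


def make_item(raw: str) -> TextItemSketch:
    """Keep the raw line in `orig` and strip leading metadata tags for `text`."""
    return TextItemSketch(orig=raw, text=_LEADING_TAGS.sub("", raw).strip())


item = make_item("[time: 0.0-2.5] [speaker: A] Hello world")
print(item.orig)  # raw line, tags intact
print(item.text)  # tags stripped
```

With this layout nothing is discarded at parse time: an export that reads `text` gets plain text, while the metadata remains recoverable from `orig`.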

More technically:

- Name of the CLI option: I find --asr-no-timing a bit confusing, because of the double negation in the explicit False option, --no-asr-no-timing. I would have simply used --asr-metadata and --no-asr-metadata (and included the speaker annotation, as explained above).
- In test files, please avoid converting the same file multiple times within the same test module (e.g., as in test_asr_pipeline_with_time_metadata_default), so that the test suite does not become unnecessarily long. You can use fixtures with module scope.
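
The naming point can be illustrated with a paired positive/negative flag. The sketch below uses the standard library's `argparse` as a stand-in, not the CLI framework Docling itself uses; the program name and help text are made up, and `--asr-metadata` is only the name suggested in this review, not an existing Docling option.

```python
import argparse

parser = argparse.ArgumentParser(prog="cli-sketch")
# BooleanOptionalAction (Python 3.9+) generates both --asr-metadata and --no-asr-metadata.
parser.add_argument(
    "--asr-metadata",
    action=argparse.BooleanOptionalAction,
    default=True,
    help="Include time and speaker metadata in ASR output.",
)

print(parser.parse_args([]).asr_metadata)                      # default value
print(parser.parse_args(["--no-asr-metadata"]).asr_metadata)   # explicitly disabled
```

Naming the option after the positive behavior means its negated form reads as a single negation, which avoids a spelling like --no-asr-no-timing.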

Since we do not want to introduce features that may be deprecated in the short term, please allow us a few days (expected next week) and we will get back to you with further suggestions on this PR.

akanshajain231999 · 2025-11-14T00:42:39Z

@ceberam Thanks for the detailed review. I will wait for your response until next week.
Meanwhile, do you have any other issues which I can work on?

ceberam · 2025-11-14T09:45:24Z

> @ceberam Thanks for the detailed review. I will wait for your response until next week. Meanwhile, do you have any other issues which I can work on?

@akanshajain231999 you are very welcome to contribute to Docling, this is an open-source collaborative project 🙂
Feel free to pick up an issue and when you are ready to actively work on it, you can set yourself in the Assignees list.
Here is a list of issues that I think could be easy wins. Some may be outdated, so please always consider the latest Docling release. 2626, 2515, 2465, 2487, 2476, 2367, 2298, 2351 (this one more ambitious)

feat(asr): add option to include/exclude timing metadata in ASR trans…

b0f6e50

…cription

DCO Remediation Commit for akanshajain231999 <akanshajain231998@gmail…

328dca3

….com> I, akanshajain231999 <[email protected]>, hereby add my Signed-off-by to this commit: b0f6e50 Signed-off-by: akanshajain231999 <[email protected]>

ceberam self-requested a review November 13, 2025 07:15

ceberam reviewed Nov 13, 2025
