Feature innertube captions #240

alive4ever · 2025-06-18T02:15:12Z

Add the ability to obtain captions from the get_transcript innertube API instead of using the caption baseUrl from the player response.

The feature is behind a new setting, use_innertube_for_captions, which is set to False by default.

The protobuf-encoded params for get_transcript are crafted using the blackboxprotobuf module because it is lightweight and easy to use.
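For readers unfamiliar with what "protobuf-encoded params" means in practice, here is a minimal sketch of hand-encoding a small protobuf message and base64-encoding it. It is not the code in this PR: the field numbers and the two-string layout in `build_params` are hypothetical and do not reflect the real get_transcript schema, and whether the endpoint expects URL-safe base64 is also an assumption.

```python
import base64


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_bytes_field(field_number: int, data: bytes) -> bytes:
    """Encode a length-delimited field (wire type 2): key, length, payload."""
    key = (field_number << 3) | 2
    return encode_varint(key) + encode_varint(len(data)) + data


def build_params(video_id: str, lang: str) -> str:
    # Hypothetical layout for illustration only:
    # field 1 = video id, field 2 = language code.
    message = (
        encode_bytes_field(1, video_id.encode("utf-8"))
        + encode_bytes_field(2, lang.encode("utf-8"))
    )
    return base64.urlsafe_b64encode(message).decode("ascii")
```

A library such as blackboxprotobuf does the same byte-level work from a message dict plus a type definition, which avoids maintaining encoders like these by hand.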

Currently only manual and auto-generated captions are supported. Translated captions are not supported, so a request for translated captions returns the caption in its original language.

This will hopefully fix #239

user234683

Sorry for the late review. Have a few comments

user234683 requested changes Aug 22, 2025

youtube/watch.py (resolved)

youtube/innertube_caption.py (outdated, resolved)

youtube/innertube_caption.py (outdated, resolved)

alive4ever added 10 commits August 28, 2025 01:26

settings: add use_innertube_for_caption

74b9e2c

Add a setting to get captions from the get_transcript innertube API. Disabled by default.

requirements: add bbpb

b50bc3c

Add bbpb (i.e. blackboxprotobuf) as a requirement to encode protobuf for innertube captions.

innertube_caption: new submodule

3784a7d

Add the innertube_caption submodule to enable fetching captions via the innertube get_transcript API. No support for translated captions for now.

watch: add support for innertube captions

af6926c

Add support to get captions via innertube api.

innertube_caption: Add continuation retry

0bcb6c2

Add test for vtt_body and retry using footer continuation.

Revert "requirements: add bbpb"

c1612c2

Will use the built-in proto module.

innertube_caption: use built in proto submodule

904fab8

Use built-in proto submodule instead of bbpb to generate innertube caption request params.

innertube_caption: fix quoting of params

fadc931

Avoid URL-quoting the params twice.
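To illustrate the class of bug this commit addresses, the sketch below shows what happens when an already percent-encoded value is quoted again. The `params` value is a made-up base64 string, not a real request, and the code is not taken from this PR.

```python
from urllib.parse import quote, urlencode

# Made-up base64 value; the trailing "=" padding must be percent-encoded.
params = "CgNhYmM="

quoted_once = quote(params, safe="")        # "=" becomes "%3D"
quoted_twice = quote(quoted_once, safe="")  # "%" itself becomes "%25"

# urlencode() quotes values on its own, so passing a pre-quoted value
# produces the double-encoded form.
correct_query = urlencode({"params": params})
double_query = urlencode({"params": quoted_once})
```

Here `quoted_once` is `CgNhYmM%3D`, while `quoted_twice` is `CgNhYmM%253D`, which the server would decode to a different string than the one intended.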

settings: enable innertube captions by default

05b5c31

Set use_innertube_for_captions to True to get captions working.

innertube_caption: use deep_get

9a10217

Use deep_get from yt_data_extract to access nested dict items safely. Also rewrite the VTT part construction to use f-strings.
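As an illustration of the safe-access pattern, here is a stand-in for a deep_get-style helper. It is a hypothetical implementation written for this example; the actual signature and behavior of deep_get in yt_data_extract may differ, and the response shape below is invented.

```python
def deep_get(data, *keys, default=None):
    """Walk nested dicts/lists by key or index, returning default on any miss."""
    current = data
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            # Missing key, out-of-range index, or a non-subscriptable value
            # (for example None) anywhere along the path.
            return default
    return current


# Invented response shape, for illustration only.
response = {"actions": [{"panel": {"segments": [{"text": "hello"}]}}]}

text = deep_get(response, "actions", 0, "panel", "segments", 0, "text")
missing = deep_get(response, "actions", 3, "panel", default="")
```

Compared with chained `response["actions"][0]["panel"]...` lookups, a single unexpected shape change in the API response yields the default instead of an exception.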

alive4ever force-pushed the feature-innertube-captions branch from 3ccde56 to 9a10217 Compare August 28, 2025 02:12

alive4ever added 4 commits August 29, 2025 14:24

watch: fix mimetype for caption

ae4d51d

Set text/vtt as mimetype for get_caption()

innertube_caption: fix missing newline

4a39b06

Fix missing newline at the end of vtt chunk running text.
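The sketch below shows why the trailing newline matters: in WebVTT, each cue's text must end with a line break and cues are separated by a blank line, so a missing newline runs one cue's text into the next cue's timing line. The function names and the (start_ms, end_ms, text) tuple layout are assumptions made for this example, not the PR's code.

```python
def format_timestamp(ms: int) -> str:
    """Format milliseconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def format_cue(start_ms: int, end_ms: int, text: str) -> str:
    """Build one cue: timing line, text line, then a blank separator line."""
    timing = f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}"
    return f"{timing}\n{text}\n\n"


def build_vtt(cues) -> str:
    """Join cues under the required WEBVTT header."""
    return "WEBVTT\n\n" + "".join(format_cue(*cue) for cue in cues)


vtt = build_vtt([(0, 1500, "hi"), (1500, 3000, "there")])
```

Keeping the newline handling inside `format_cue` means every chunk is self-terminating, so concatenating chunks cannot drop the separator.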

watch: get non innertube captions working

6d16e84

Add a workaround to get non-innertube captions working, i.e. when use_innertube_for_captions is disabled.

watch: fix /watch/transcript endpoint

d3988d7

Return 302 redirect to /api/timedtext when accessing /watch/transcript/ endpoint.
