Improve Python's tokenizer for numeric literals #5084

lucach · 2025-11-05T08:29:30Z

This improves Python's tokenizer for numeric literals in several respects:

- Support underscores between digits and after prefixes (fixes #4745, "[Bug] Python syntax highlighting does not support underscores in numeric literals")
- Support octal and binary literals
- Support case-insensitive prefixes for hex/octal/binary literals
- Recognize a possible leading minus sign as a separate token, instead of mistakenly treating it as part of the numeric literal

Reference: https://docs.python.org/3/reference/lexical_analysis.html#numeric-literals
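
For illustration, the cases listed above can be checked against CPython's own lexer via the standard-library `tokenize` module. This is only a sketch of the reference behavior that the highlighter is meant to match, not the tokenizer code changed in this PR; the helper name `lex` and the sample list are made up for the example.

```python
import io
import tokenize


def lex(source: str) -> list[tuple[str, str]]:
    """Return (token type name, token text) pairs, keeping only NUMBER and OP tokens."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokens
        if tok.type in (tokenize.NUMBER, tokenize.OP)
    ]


# Hypothetical samples: underscores, octal/binary literals,
# upper-case prefixes, and a leading minus sign.
samples = ["1_000_000", "0x_FF", "0XFF", "0o17", "0O17", "0b1010", "0B_1010", "-42"]
for sample in samples:
    print(f"{sample!r:12} -> {lex(sample)}")
```

Every sample except the last should come back as a single `NUMBER` token; `-42` should come back as an `OP` token for the minus sign followed by a `NUMBER` token for `42`.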

This PR also adds tests covering several of the above cases and their combinations.

P.S.: I ran the tests against v0.52.0, since the tests have been (mistakenly? temporarily?) removed in e56ad4b.

lucach · 2025-11-05T08:31:23Z

@microsoft-github-policy-service agree
