feat: add Anthropic-compatible serving endpoints#4538

Open
lvhan028 wants to merge 3 commits into InternLM:main from lvhan028:feat/anthropic-compatible-endpoints

Conversation

@lvhan028
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings April 19, 2026 12:53
@lvhan028 lvhan028 marked this pull request as draft April 19, 2026 12:53
@lvhan028 lvhan028 added the enhancement New feature or request label Apr 19, 2026
Introduce Anthropic-style messages, count_tokens, and model-list endpoints with dedicated per-endpoint handlers so LMDeploy can interoperate with Anthropic-oriented clients while keeping OpenAI routes unchanged.

Made-with: Cursor
@lvhan028 lvhan028 force-pushed the feat/anthropic-compatible-endpoints branch from 296c675 to 48395a0 on April 27, 2026 04:32
@lvhan028 lvhan028 marked this pull request as ready for review April 27, 2026 06:10
@lvhan028 lvhan028 requested review from Copilot and lzhangzz April 27, 2026 06:10
Contributor

Copilot AI left a comment

Pull request overview

Adds an Anthropic-compatible API surface to LMDeploy’s serving stack, including message generation, token counting, and Anthropic-scoped model listing.

Changes:

  • Introduces lmdeploy.serve.anthropic package with protocol models, adapters, endpoints, and SSE streaming utilities.
  • Wires the Anthropic router into the existing OpenAI FastAPI server.
  • Adds endpoint tests plus English/Chinese documentation pages and index links.
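For readers unfamiliar with the Anthropic Messages API shape these endpoints mirror, here is a minimal sketch of the kind of request body a client would POST to /v1/messages. The model name, field values, and server URL are illustrative assumptions, not taken from this diff:

```python
import json

# Hypothetical payload for POST /v1/messages on an LMDeploy api_server.
# The model name below is an assumption; use whatever model the server serves.
payload = {
    'model': 'internlm2_5-7b-chat',
    'max_tokens': 256,
    'messages': [
        {'role': 'user', 'content': 'Hello, what can you do?'},
    ],
    'stream': False,  # set True to receive Anthropic-style SSE events
}
body = json.dumps(payload)
print(body)

# To send it (assuming a local server):
#   curl -X POST http://localhost:23333/v1/messages \
#        -H 'Content-Type: application/json' -d "$BODY"
```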

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
lmdeploy/serve/openai/api_server.py Mounts the Anthropic router into the main FastAPI app.
lmdeploy/serve/anthropic/router.py Assembles Anthropic endpoint modules into a single router.
lmdeploy/serve/anthropic/endpoints/messages.py Implements POST /v1/messages (streaming + non-streaming).
lmdeploy/serve/anthropic/endpoints/messages_count_tokens.py Implements POST /v1/messages/count_tokens.
lmdeploy/serve/anthropic/endpoints/models.py Implements GET /anthropic/v1/models.
lmdeploy/serve/anthropic/streaming.py Converts LMDeploy generation streams into Anthropic-style SSE events.
lmdeploy/serve/anthropic/adapter.py Maps between Anthropic request/response shapes and LMDeploy/OpenAI internals.
lmdeploy/serve/anthropic/protocol.py Adds Pydantic models for Anthropic-compatible request/response payloads.
lmdeploy/serve/anthropic/errors.py Adds Anthropic-style error response helper.
tests/test_lmdeploy/serve/anthropic/test_endpoints.py Adds tests covering endpoint behavior and SSE shape.
docs/en/llm/api_server_anthropic.md Documents Anthropic-compatible endpoints (English).
docs/zh_cn/llm/api_server_anthropic.md Documents Anthropic-compatible endpoints (Chinese).
docs/en/llm/api_server.md Links to the new Anthropic docs page.
docs/en/index.rst / docs/zh_cn/index.rst Adds the new doc page to navigation.

app = FastAPI(docs_url='/', lifespan=lifespan)

app.include_router(router)
app.include_router(create_anthropic_router(VariableInterface))

Copilot AI Apr 27, 2026

The router is created with create_anthropic_router(VariableInterface) (the class) rather than the server_context = VariableInterface() instance used elsewhere in this file (e.g., check_request). Passing the instance improves consistency and avoids surprises if VariableInterface ever gains instance-level state.
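A toy sketch of why both spellings currently behave the same, and why the instance is the safer choice. `create_anthropic_router` and `VariableInterface` here are simplified stand-ins, not the real LMDeploy definitions:

```python
# Simplified stand-in: the router only reads attributes off the context
# object it is handed, so a class or an instance both work today.
def create_anthropic_router(ctx):
    return ctx.async_engine

class VariableInterface:
    async_engine = 'engine'  # class-level state only, for now

# Both resolve the same class-level attribute...
assert create_anthropic_router(VariableInterface) == 'engine'
server_context = VariableInterface()
assert create_anthropic_router(server_context) == 'engine'

# ...but only the instance keeps working if per-instance state is
# ever introduced, since instance attributes shadow class attributes.
server_context.async_engine = 'per-request engine'
assert create_anthropic_router(server_context) == 'per-request engine'
assert create_anthropic_router(VariableInterface) == 'engine'  # unchanged
```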

if request.tools and (parser_cls is None or parser_cls.tool_parser_cls is None):
    return create_error_response(
        HTTPStatus.BAD_REQUEST,
        'Please launch the api_server with --tool-call-parser if you want to use tool.')

Copilot AI Apr 27, 2026

User-facing error message is grammatically awkward: "if you want to use tool." Consider rewording to "...if you want to use tools." or "...if you want to use tool calling." to be clearer.

Suggested change
'Please launch the api_server with --tool-call-parser if you want to use tool.')
'Please launch the api_server with --tool-call-parser if you want to use tools.')

Comment on lines +93 to +96
if block is None:
    closing = _close_current_block()
    if closing:
        events.append(closing)

Copilot AI Apr 27, 2026

In _start_tool_block, switching to an already-created tool block does not close the currently-open content block (the _close_current_block() call only happens when block is None). If the stream ever interleaves text/thinking with additional tool deltas for the same tool index, this will produce invalid/missing content_block_stop events. Consider always closing the current block when changing current_block kind/index, even when reusing an existing tool block.

Suggested change
if block is None:
    closing = _close_current_block()
    if closing:
        events.append(closing)
target_block_index = block['block_index'] if block is not None else None
same_block = (
    current_block is not None and current_block.get('kind') == 'tool_use'
    and current_block.get('tool_index') == tool_index
    and current_block.get('block_index') == target_block_index)
if not same_block:
    closing = _close_current_block()
    if closing:
        events.append(closing)
if block is None:
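The invariant behind this suggestion can be shown with a minimal, self-contained tracker. The bookkeeping is deliberately simplified (every switch opens a fresh block rather than reusing an existing tool block), but the key rule is the same: emit a `content_block_stop` on any kind/index change, not only when a new block is created. Event names follow Anthropic's SSE scheme; everything else is hypothetical:

```python
# Simplified sketch of the block-closing invariant described above.
def make_tracker():
    state = {'current': None, 'next_index': 0, 'events': []}

    def switch_to(kind, tool_index=None):
        cur = state['current']
        same = cur is not None and cur == (kind, tool_index)
        if cur is not None and not same:
            # Close the open block on *any* kind/index change, even when
            # switching back to a tool index we have seen before.
            state['events'].append({'type': 'content_block_stop'})
            state['current'] = None
        if state['current'] is None:
            state['events'].append({'type': 'content_block_start',
                                    'index': state['next_index']})
            state['next_index'] += 1
            state['current'] = (kind, tool_index)

    return state, switch_to

state, switch_to = make_tracker()
switch_to('text')
switch_to('tool_use', 0)   # closes the text block, opens the tool block
switch_to('tool_use', 0)   # same block: no extra events
switch_to('text')          # closes the tool block before opening text again
types = [e['type'] for e in state['events']]
print(types)
```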

Comment on lines +112 to +117
'content_block': {
    'type': 'tool_use',
    'id': tool_delta.id,
    'name': '',
    'input': {},
},

Copilot AI Apr 27, 2026

content_block_start for tool_use is emitted with an empty name even though tool_delta.function.name is typically available. Clients may rely on the tool name being present on the initial tool_use block. Populate the tool name (and store it in tool_blocks) when starting the block, using the delta's function name when present.
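A small sketch of the fix, falling back to an empty name only when the first delta really carries none. The class names and helper below are simplified stand-ins for the delta objects in this diff, not the real definitions:

```python
# Simplified stand-ins for the streaming delta objects.
class FunctionDelta:
    def __init__(self, name=None):
        self.name = name

class ToolDelta:
    def __init__(self, id, function):
        self.id = id
        self.function = function

def start_tool_block_event(tool_delta, block_index):
    # Populate the tool name from the delta when available, so clients
    # that read the name off the initial tool_use block are not broken.
    name = getattr(tool_delta.function, 'name', None) or ''
    return {
        'type': 'content_block_start',
        'index': block_index,
        'content_block': {
            'type': 'tool_use',
            'id': tool_delta.id,
            'name': name,
            'input': {},
        },
    }

event = start_tool_block_event(ToolDelta('call_1', FunctionDelta('get_weather')), 1)
print(event['content_block']['name'])  # → get_weather
```

The same name would also be stored in `tool_blocks` when the block is opened, so later deltas for that index need not repeat it.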
