CVE-2025-62426

vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 until the fix in version 0.11.1, the /v1/chat/completions and /tokenize endpoints accept a chat_template_kwargs request parameter that is used in the code before it is validated against the chat template. With crafted chat_template_kwargs values, a request can block the API server's processing for long periods of time, delaying all other requests. This issue has been patched in version 0.11.1.
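The weakness follows a common pattern: request-supplied keyword arguments flow into Jinja2 chat-template rendering before any check on what the template actually accepts. The sketch below is a minimal illustration of one validation approach, not vLLM's code or its actual patch; validate_chat_template_kwargs and ALLOWED_KWARGS are hypothetical names, and the allowlist contents are assumptions.

    from jinja2 import Environment, meta

    # Hypothetical allowlist of kwargs accepted regardless of the template body.
    ALLOWED_KWARGS = {"add_generation_prompt"}

    def validate_chat_template_kwargs(template_source: str, kwargs: dict) -> dict:
        """Reject kwargs the chat template never references, before any rendering."""
        env = Environment()
        ast = env.parse(template_source)
        # Names the template reads but does not define itself (e.g. loop variables
        # are excluded), per jinja2.meta.find_undeclared_variables.
        declared = meta.find_undeclared_variables(ast)
        unknown = set(kwargs) - declared - ALLOWED_KWARGS
        if unknown:
            raise ValueError(f"unexpected chat_template_kwargs: {sorted(unknown)}")
        return kwargs

    # The template only reads `messages`, so an unrelated key is rejected up front.
    template = "{% for m in messages %}{{ m.role }}: {{ m.content }}\n{% endfor %}"
    validate_chat_template_kwargs(template, {"messages": []})   # accepted
    # validate_chat_template_kwargs(template, {"noise": 1})     # raises ValueError

Rejecting unexpected keys before any rendering work means a malicious request fails fast instead of tying up the server.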
Configurations

No configuration.

History

21 Nov 2025, 02:15

Type: New CVE

Information

Published: 2025-11-21 02:15

Updated: 2025-11-21 15:13

NVD link: CVE-2025-62426

MITRE link: CVE-2025-62426

CVE.ORG link: CVE-2025-62426

Products Affected

No product.

CWE
CWE-770: Allocation of Resources Without Limits or Throttling
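CWE-770 maps onto this CVE as template rendering whose cost is controlled by the requester. One generic mitigation for this weakness class, sketched below under the assumption of an asyncio-based server, is to run rendering off the event loop and bound it with a timeout. This is not vLLM's actual fix; render_with_budget and RENDER_TIMEOUT_S are hypothetical names.

    import asyncio
    from jinja2 import Template

    RENDER_TIMEOUT_S = 5.0  # hypothetical per-request rendering budget

    async def render_with_budget(template: Template, **kwargs) -> str:
        # to_thread keeps the event loop responsive; wait_for enforces the budget.
        return await asyncio.wait_for(
            asyncio.to_thread(template.render, **kwargs),
            timeout=RENDER_TIMEOUT_S,
        )

    async def main() -> None:
        tmpl = Template("{% for m in messages %}{{ m.role }}: {{ m.content }}\n{% endfor %}")
        print(await render_with_budget(tmpl, messages=[{"role": "user", "content": "hi"}]))

    asyncio.run(main())

A timeout alone is a partial defense: on expiry the awaiting coroutine is unblocked, but the worker thread keeps consuming CPU until it finishes, which is why validating chat_template_kwargs before rendering, as in the earlier sketch, is the stronger fix.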