Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://github.com/vllm-project/vllm

[Bug]: Error when deploying MiniCPM-V_2_6_awq_int4 with vllm 0.6.4

fengqiliang93 opened this issue 10 days ago
[Bugfix] Fix none seed sampling in rejection_sampler

TopIdiot opened this pull request 10 days ago
[Misc][LoRA] Ensure Lora Adapter requests return adapter name

Jeffwan opened this pull request 11 days ago
[Bug]: lora adapter request still return the base model name

Jeffwan opened this issue 11 days ago
[Misc]: Potential division by zero in csrc/cpu/attention.cpp

Xaenalt opened this issue 12 days ago
[RFC]: Adding support for Geospatial models

christian-pinto opened this issue 12 days ago
[Hardware][CPU] support cpu in v1 engine

yma11 opened this pull request 12 days ago
[V1][Bugfix] Always set enable_chunked_prefill = True for V1

WoosukKwon opened this pull request 12 days ago
Don't try to add special tokens to the matcher in XGrammar.

sjuxax opened this pull request 12 days ago
[torch.compile] add a flag to track batchsize statistics

youkaichao opened this pull request 12 days ago
[Bugfix] Fix value unpack error of simple connector for KVCache transfer.

ShangmingCai opened this pull request 12 days ago
[CI/Build] Increase VLLM_MAX_SIZE_MB to 300M

tolak opened this pull request 12 days ago
[Feature]: logging request_id instead of random uuid

cynial opened this issue 12 days ago
[CI] Expand OpenAI guided decoding tests

mgoin opened this pull request 12 days ago
[Bugfix] cuda error running llama 3.2

GeneDer opened this pull request 12 days ago
[Bugfix] Fix guided decoding with tokenizer mode mistral

wallashss opened this pull request 12 days ago
[Pixtral] Improve loading

patrickvonplaten opened this pull request 12 days ago
[Bugfix] Handle <|tool_call|> token in granite tool parser

tjohnson31415 opened this pull request 13 days ago
[Bugfix] Backport request id validation to v0

joerunde opened this pull request 13 days ago
Update README.md

dmoliveira opened this pull request 13 days ago
[Kernel] Triton Paged Attn Decode Kernel

rahulbatra85 opened this pull request 13 days ago
[V1] Use input_ids as input for text-only models

WoosukKwon opened this pull request 13 days ago
monitor metrics of tokens per step using cudagraph batchsizes

youkaichao opened this pull request 13 days ago
[Hardware][Gaudi] Add multiprocessing HPU executor

kzawora-intel opened this pull request 13 days ago
[Frontend] Add OpenAI API support for input_audio

kylehh opened this pull request 13 days ago
[Bugfix] Fix usage of `deprecated` decorator

DarkLight1337 opened this pull request 13 days ago
[Model] Add Llama-SwiftKV model

aurickq opened this pull request 13 days ago
[BUG] Remove token param #10921

flaviabeo opened this pull request 13 days ago
[V1] VLM preprocessor hashing

alexm-neuralmagic opened this pull request 13 days ago
Avoid mistakenly picking Gaudi/HPU if XPU is requested.

janimo opened this pull request 13 days ago
[Misc]: Has anyone tried to run Microsoft Graphrag with vllm?

SushmitaSingh96 opened this issue 13 days ago
[Neuron] Upgrade neuron to 2.20.2

xendo opened this pull request 13 days ago
[torch.compile] add dynamo time tracking

youkaichao opened this pull request 13 days ago
[Misc][LoRA] Add PEFTHelper for LoRA

jeejeelee opened this pull request 13 days ago
[Feature]: Support for Qwen2-VL on AWS Neuron

Chin-Vic opened this issue 13 days ago
[v1] fix use compile sizes

youkaichao opened this pull request 13 days ago
[misc] clean up and unify logging

youkaichao opened this pull request 13 days ago
[Doc][V1] Add V1 support column for multimodal models

ywang96 opened this pull request 13 days ago
[V1] Fix Detokenizer loading in `AsyncLLM`

ywang96 opened this pull request 13 days ago
[core] clean up cudagraph batchsize padding logic

youkaichao opened this pull request 13 days ago
[Kernel]: Cutlass 2:4 Sparsity + FP8/Int8 Quant Support

dsikka opened this pull request 14 days ago
[Usage]: Qwen/Qwen2-VL-7B-Instruct

mahmoudelnazer opened this issue 14 days ago
[torch.compile][misc] fix comments

youkaichao opened this pull request 14 days ago
[Model] PP support for Mamba-like models

mzusman opened this pull request 14 days ago
[CI/Build] Check transformers v4.47

DarkLight1337 opened this pull request 14 days ago
[V1] Further reduce CPU overheads in flash-attn

WoosukKwon opened this pull request 14 days ago
[V1][VLM] Add V1-rearch image inference support for Qwen2-VL

ywang96 opened this pull request 14 days ago
[Bug]: Qwen2VL doesn't work with TPU backend

carlesoctav opened this issue 14 days ago
[core][distributed] initialization from StatelessProcessGroup

youkaichao opened this pull request 14 days ago
[Doc] Update README.md

habaohaba opened this pull request 14 days ago
[torch.compile] allow candidate compile sizes

youkaichao opened this pull request 14 days ago
[Bug]: LLama 3.2 vision focuses only on first image

hrodruck opened this issue 15 days ago
Update benchmarking code

Faraz9877 opened this pull request 15 days ago
[Bugfix] Multiple fixes to tool streaming with hermes and mistral

cedonley opened this pull request 15 days ago
[Doc] Explicitly state that InternVL 2.5 is supported

DarkLight1337 opened this pull request 15 days ago
[Model] Implement merged input processor for Phi-3-Vision models

Isotr0py opened this pull request 15 days ago
[core][executor] simplify instance id

youkaichao opened this pull request 15 days ago
[Doc] Explicitly state that PP isn't compatible with speculative decoding yet

DarkLight1337 opened this pull request 15 days ago
[Usage]: How to run local model in docker with cpu

yuzifu opened this issue 15 days ago
[Bugfix] Fix test-pipeline.yaml

jeejeelee opened this pull request 15 days ago
[torch.compile] use depyf to dump torch.compile internals

youkaichao opened this pull request 15 days ago
[Bug]: embedding model not supported

cosmic-chichu opened this issue 15 days ago
[Frontend] Use request id from header

joerunde opened this pull request 15 days ago
[Usage]: Unable to serve embedding model e5-mistral-7b-instruct

SushmitaSingh96 opened this issue 16 days ago
[Core] Add support for loading weight that has already done TP sharding

HollowMan6 opened this pull request 16 days ago
[New Model]: Add support for Llama3.3

jorgeantonio21 opened this issue 16 days ago
[Bug]: Can't load/compile Mixtral-8x7B-Instruct-v0.1 on TPU

hosseinsarshar opened this issue 16 days ago
[V1] Input Batch Relocation

varun-sundar-rabindranath opened this pull request 16 days ago
[Core] Cleanup startup logging a bit

russellb opened this pull request 16 days ago
[misc] fix typo

youkaichao opened this pull request 16 days ago
[V1] Run mypy on

WoosukKwon opened this pull request 16 days ago
[V1] LoRA Support

varun-sundar-rabindranath opened this pull request 16 days ago
[ci] fix broken tests

youkaichao opened this pull request 16 days ago
[Misc][LoRA] Abstract PunicaWrapper

jeejeelee opened this pull request 16 days ago