Ecosyste.ms: Open Collective
An open API service for software projects hosted on Open Collective.

vLLM
vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective: https://opencollective.com/vllm
Host: opensource
Code: https://github.com/vllm-project/vllm
[Bug]: Mismatch of tqdm when n > 1
github.com/vllm-project/vllm - MiDonkey opened this issue 15 days ago
[Feature]: Add support for torchao quantization model
github.com/vllm-project/vllm - mapxin opened this issue 15 days ago
[Model] Composite weight loading for multimodal Qwen2
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 15 days ago
Build tpu image in release pipeline
github.com/vllm-project/vllm - richardsliu opened this pull request 16 days ago
[CI] Add test case with JSON schema using references + use xgrammar by default with OpenAI parse
github.com/vllm-project/vllm - mgoin opened this pull request 16 days ago
[Usage]: Persistent Errors with vllm serve on Neuron Device: Model architectures ['LlamaForCausalLM'] failed to be inspected.
github.com/vllm-project/vllm - xiao11lam opened this issue 16 days ago
[BugFix][Kernel]: fix illegal memory access in causal_conv1d when conv_states is None
github.com/vllm-project/vllm - xffxff opened this pull request 16 days ago
Update deploying_with_k8s.rst
github.com/vllm-project/vllm - AlexHe99 opened this pull request 16 days ago
[core][misc] remove use_dummy driver for _run_workers
github.com/vllm-project/vllm - youkaichao opened this pull request 16 days ago
[Bug]: illegal memory access in `causal_conv1d_fn` with input length 1026
github.com/vllm-project/vllm - xffxff opened this issue 17 days ago
[Doc]: How to make Multi-Node Inference
github.com/vllm-project/vllm - pygongnlp opened this issue 17 days ago
[Core] Support disaggregated prefill with Mooncake Transfer Engine
github.com/vllm-project/vllm - ShangmingCai opened this pull request 18 days ago
[Feature]: Publish vllm-tpu image to dockerhub
github.com/vllm-project/vllm - jjk-g opened this issue 18 days ago
[Core] Support offloading KV cache to CPU
github.com/vllm-project/vllm - ApostaC opened this pull request 18 days ago
[torch.compile] Add torch inductor pass for fusing silu_and_mul with subsequent scaled_fp8_quant operations
github.com/vllm-project/vllm - SageMoore opened this pull request 18 days ago
[Bugfix] Only require XGrammar on x86
github.com/vllm-project/vllm - mgoin opened this pull request 18 days ago
[CI] Turn on basic correctness tests for V1
github.com/vllm-project/vllm - tlrmchlsmth opened this pull request 18 days ago
[Bugfix] Fix spec decoding when seed is none in a batch
github.com/vllm-project/vllm - wallashss opened this pull request 18 days ago
[Usage]: Different Context Free Grammars (or regex) per request
github.com/vllm-project/vllm - AlbertoCastelo opened this issue 18 days ago
[Bug]: Error When Running gguf with vllm for Ultra-Long Context
github.com/vllm-project/vllm - anrgct opened this issue 18 days ago
[Model] Add support for embedding model JambaClassification
github.com/vllm-project/vllm - yecohn opened this pull request 18 days ago
[MISC][XPU] quick fix for XPU CI
github.com/vllm-project/vllm - yma11 opened this pull request 18 days ago
[Bug]: Docker deployment returns zmq.error.ZMQError: Operation not supported
github.com/vllm-project/vllm - aqx95 opened this issue 18 days ago
[Bug]: RuntimeError: HIP Error on vLLM ROCm Image in Kubernetes Cluster with AMD GPUs
github.com/vllm-project/vllm - taddeusb90 opened this issue 18 days ago
Update sampling_params.py
github.com/vllm-project/vllm - o2363286 opened this pull request 18 days ago
[Frontend] correctly record prefill and decode time metrics
github.com/vllm-project/vllm - tomeras91 opened this pull request 18 days ago
[Usage]: Sampling several sequences from OpenAI compatible server.
github.com/vllm-project/vllm - Ignoramus0817 opened this issue 18 days ago
Regional compilation support
github.com/vllm-project/vllm - Kacper-Pietkun opened this pull request 18 days ago
[Speculative Decoding] Move indices to device before filtering output
github.com/vllm-project/vllm - zhengy001 opened this pull request 18 days ago
[Feature]: add DoRA support
github.com/vllm-project/vllm - cmhungsteve opened this issue 18 days ago
[Bug]: GPTQ llama2-7b infer server failed!!!
github.com/vllm-project/vllm - tensorflowt opened this issue 19 days ago
[Bug]: benchmark random input-len inconsistent
github.com/vllm-project/vllm - ltm920716 opened this issue 19 days ago
[CORE] No Request No Scheduler: auto-increment of multi-step
github.com/vllm-project/vllm - DriverSong opened this pull request 19 days ago
Tmp whl
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 19 days ago
[Bugfix] Fix QKVParallelLinearWithShardedLora bias bug
github.com/vllm-project/vllm - jeejeelee opened this pull request 19 days ago
[core][distributed] add pynccl broadcast
github.com/vllm-project/vllm - youkaichao opened this pull request 19 days ago
[Model] support bitsandbytes quantization with minicpm model
github.com/vllm-project/vllm - zixuanzhang226 opened this pull request 19 days ago
[torch.compile] remove compilation_context and simplify code
github.com/vllm-project/vllm - youkaichao opened this pull request 19 days ago
[Doc] add KubeAI to serving integrations
github.com/vllm-project/vllm - samos123 opened this pull request 19 days ago
[WIP] Xgrammar init in engine
github.com/vllm-project/vllm - mgoin opened this pull request 19 days ago
[Model] Add TP and BNB quantization support to LlavaMultiModalProjector
github.com/vllm-project/vllm - Isotr0py opened this pull request 19 days ago
[Bug]: ERROR hermes_tool_parser.py:108] Error in extracting tool call from response.
github.com/vllm-project/vllm - Sala8888 opened this issue 19 days ago
[Misc][LoRA] Move the implementation of lora bias to punica.py
github.com/vllm-project/vllm - jeejeelee opened this pull request 19 days ago
[Doc] Create a new "Usage" section
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 19 days ago
[Bug]: mistral tool choice error
github.com/vllm-project/vllm - warlockedward opened this issue 19 days ago
[Misc] Split up pooling tasks
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 19 days ago
[Bug]: The model output is abnormal when I use 2:4 sparsity
github.com/vllm-project/vllm - jiangjiadi opened this issue 20 days ago
[RFC]: Disaggregated prefilling and KV cache transfer roadmap
github.com/vllm-project/vllm - KuntaiDu opened this issue 20 days ago
[Misc] Remove deprecated names
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 20 days ago
[Model] Add support for embedding model GritLM
github.com/vllm-project/vllm - pooyadavoodi opened this pull request 20 days ago
[Usage]: v0.5.0, why are process_model_inputs_async and engine_step processed simultaneously? They are separate coroutines, and use await.
github.com/vllm-project/vllm - Deeperfinder opened this issue 20 days ago
[misc] remove xverse modeling file
github.com/vllm-project/vllm - youkaichao opened this pull request 20 days ago
[Usage]: How to use llava-hf/llava-1.5-7b-hf with bitsandbytes quantization in vllm serve?
github.com/vllm-project/vllm - Wxy-24 opened this issue 20 days ago
[Bug]: Engine process (pid 76) died
github.com/vllm-project/vllm - 0xymoro opened this issue 20 days ago
[Kernel] Use `out` in flash_attn_varlen_func
github.com/vllm-project/vllm - WoosukKwon opened this pull request 20 days ago
[Core]: Support destroying all KV cache during runtime
github.com/vllm-project/vllm - HollowMan6 opened this pull request 20 days ago
[core] Avoid metrics log noise when idle - include speculative decodi…
github.com/vllm-project/vllm - cduk opened this pull request 20 days ago
[Bug]: vllm stream generate error
github.com/vllm-project/vllm - Wbxxx opened this issue 20 days ago
[Bug]: The new vllm version is slow in inference
github.com/vllm-project/vllm - imrankh46 opened this issue 20 days ago
[Bug]: Failed to abort requests when killing client process.
github.com/vllm-project/vllm - lixinye-nju opened this issue 20 days ago
[doc] add warning about comparing hf and vllm outputs
github.com/vllm-project/vllm - youkaichao opened this pull request 20 days ago
[Misc] Adding `MMMU-Pro` vision dataset to serving benchmark
github.com/vllm-project/vllm - ywang96 opened this pull request 21 days ago
[Core] add xgrammar as guided generation provider
github.com/vllm-project/vllm - joennlae opened this pull request 21 days ago
[Bugfix] fix race condition that leads to wrong order of token returned
github.com/vllm-project/vllm - joennlae opened this pull request 21 days ago
[Misc] Rename embedding classes to pooling
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 21 days ago
[Bug]: Fail to use CUDA with multiprocessing (llama_3_8b)
github.com/vllm-project/vllm - yliu2702 opened this issue 21 days ago
Fill TorchSDPAAttentionMetadata seq_lens_field for prefill
github.com/vllm-project/vllm - maxdebayser opened this pull request 21 days ago
[Bug]: LoRa adapter responses not matching peft/transformers response
github.com/vllm-project/vllm - RonanKMcGovern opened this issue 21 days ago
[Usage]: cannot load llama 3.2 3b on a 16gb gpu when gpu_memory_utilisation=1
github.com/vllm-project/vllm - TheKidThatCodes opened this issue 21 days ago
[LoRA] Change lora_tokenizers capacity
github.com/vllm-project/vllm - xyang16 opened this pull request 21 days ago
[Model] Add BNB support to Llava and Pixtral-HF
github.com/vllm-project/vllm - Isotr0py opened this pull request 21 days ago
[Bug]: Improve Error Messaging for Unsupported Tasks in vLLM (e.g., embedding with Llama Models)
github.com/vllm-project/vllm - laura-dietz opened this issue 21 days ago
[Usage]: Question on max_model_len
github.com/vllm-project/vllm - mces89 opened this issue 21 days ago
[Core][Performance] Add XGrammar support for guided decoding and set it as default
github.com/vllm-project/vllm - aarnphm opened this pull request 22 days ago
[Usage]: Dynamically loaded LoRas do not appear on the /models endpoint
github.com/vllm-project/vllm - RonanKMcGovern opened this issue 22 days ago
[New Model]: nvidia/Hymba-1.5B-Base
github.com/vllm-project/vllm - hutm opened this issue 22 days ago
[Bugfix] Multiple fixes to tool streaming with hermes and mistral parsers
github.com/vllm-project/vllm - cedonley opened this pull request 22 days ago
[Bug]: Streaming w/ tool choice auto often truncates the final delta in the streamed arguments
github.com/vllm-project/vllm - cedonley opened this issue 22 days ago
[Bugfix] Fix OpenVino/Neuron `driver_worker` init
github.com/vllm-project/vllm - NickLucche opened this pull request 22 days ago
[Bugfix] Fix Idefics3 bug
github.com/vllm-project/vllm - jeejeelee opened this pull request 22 days ago
[Usage]: how to skip samples that have error and process the rest when using llm.generate(prompts, sampling_params)?
github.com/vllm-project/vllm - yxchng opened this issue 22 days ago
Prepare sin/cos buffers for rope outside model forward
github.com/vllm-project/vllm - tzielinski-habana opened this pull request 22 days ago
[Bug]: [OpenVINO] inference on CPU failed with "init_device" error
github.com/vllm-project/vllm - Orion-zhen opened this issue 22 days ago
[Feature]: Unblock LLM while handling long sequences / Handling multiple prefills at the same time
github.com/vllm-project/vllm - schoennenbeck opened this issue 22 days ago
[Bug]: AttributeError: 'Qwen2Model' object has no attribute 'rotary_emb'
github.com/vllm-project/vllm - Alex-DeepL opened this issue 22 days ago
[Bug]: OpenAI compatible server with HuggingFaceTB/SmolVLM-Instruct resulting in 500 Internal Server error
github.com/vllm-project/vllm - Sriramkk123 opened this issue 22 days ago
[Model] Refactor Molmo weights loading to use AutoWeightsLoader
github.com/vllm-project/vllm - Isotr0py opened this pull request 22 days ago
[Model]: add some tests for aria model
github.com/vllm-project/vllm - xffxff opened this pull request 22 days ago
[Model] Replace embedding models with pooling adapter
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 22 days ago
[Platform] Move `async output` check to platform
github.com/vllm-project/vllm - wangxiyuan opened this pull request 22 days ago
Drop ROCm load format check
github.com/vllm-project/vllm - wangxiyuan opened this pull request 22 days ago
[Usage]: Removal of vllm.openai.rpc folder in vLLM 0.6.2 release
github.com/vllm-project/vllm - utkshukla opened this issue 22 days ago
[Feature]: upstream quark format to vllm
github.com/vllm-project/vllm - kewang-xlnx opened this pull request 22 days ago