Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

vLLM

vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective - Host: opensource - https://opencollective.com/vllm - Code: https://github.com/vllm-project/vllm

[WIP] Deepseek V2 MLA

github.com/vllm-project/vllm - simon-mo opened this pull request about 2 months ago
Update deploying_with_k8s.rst

github.com/vllm-project/vllm - AlexHe99 opened this pull request about 2 months ago
[core][misc] remove use_dummy driver for _run_workers

github.com/vllm-project/vllm - youkaichao opened this pull request about 2 months ago
[Feature]: Provide pre-built CPU docker image

github.com/vllm-project/vllm - fzyzcjy opened this issue about 2 months ago
Add Bamba Model

github.com/vllm-project/vllm - fabianlim opened this pull request about 2 months ago
[Misc]: FP8/INT8 for AQLM ?

github.com/vllm-project/vllm - Duncan1115 opened this issue about 2 months ago
[Doc]: How to make Multi-Node Inference

github.com/vllm-project/vllm - pygongnlp opened this issue about 2 months ago
[Core] Support offloading KV cache to CPU

github.com/vllm-project/vllm - ApostaC opened this pull request about 2 months ago
[Bugfix] Only require XGrammar on x86

github.com/vllm-project/vllm - mgoin opened this pull request about 2 months ago
[CI] Turn on basic correctness tests for V1

github.com/vllm-project/vllm - tlrmchlsmth opened this pull request about 2 months ago
[Bugfix] Fix spec decoding when seed is none in a batch

github.com/vllm-project/vllm - wallashss opened this pull request about 2 months ago
[Model] Add JambaForSequenceClassification model

github.com/vllm-project/vllm - yecohn opened this pull request about 2 months ago
[MISC][XPU] quick fix for XPU CI

github.com/vllm-project/vllm - yma11 opened this pull request about 2 months ago
Add jamba classification

github.com/vllm-project/vllm - yecohn opened this pull request about 2 months ago
Update sampling_params.py

github.com/vllm-project/vllm - o2363286 opened this pull request about 2 months ago
Regional compilation support

github.com/vllm-project/vllm - Kacper-Pietkun opened this pull request about 2 months ago
[Feature]: add DoRA support

github.com/vllm-project/vllm - cmhungsteve opened this issue about 2 months ago
[Bug]: GPTQ llama2-7b infer server failed!!!

github.com/vllm-project/vllm - tensorflowt opened this issue about 2 months ago
[Bug]: benchmark random input-len inconsistent

github.com/vllm-project/vllm - ltm920716 opened this issue about 2 months ago
[CORE] No Request No Scheduler: auto-increment of multi-step

github.com/vllm-project/vllm - DriverSong opened this pull request about 2 months ago
Tmp whl

github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request about 2 months ago
[Bugfix] Fix QKVParallelLinearWithShardedLora bias bug

github.com/vllm-project/vllm - jeejeelee opened this pull request about 2 months ago
[core][distributed] add pynccl broadcast

github.com/vllm-project/vllm - youkaichao opened this pull request about 2 months ago
[Model] support bitsandbytes quantization with minicpm model

github.com/vllm-project/vllm - zixuanzhang226 opened this pull request about 2 months ago
Lora scheduler

github.com/vllm-project/vllm - Scott-Hickmann opened this pull request about 2 months ago
[torch.compile] remove compilation_context and simplify code

github.com/vllm-project/vllm - youkaichao opened this pull request about 2 months ago
[Doc] add KubeAI to serving integrations

github.com/vllm-project/vllm - samos123 opened this pull request about 2 months ago
[WIP] Xgrammar init in engine

github.com/vllm-project/vllm - mgoin opened this pull request about 2 months ago
[Doc] Create a new "Usage" section

github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 2 months ago
[Bug]: mistral tool choice error

github.com/vllm-project/vllm - warlockedward opened this issue about 2 months ago
[Misc] Split up pooling tasks

github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 2 months ago
[Misc] Remove deprecated names

github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 2 months ago
[Model] Add support for embedding model GritLM

github.com/vllm-project/vllm - pooyadavoodi opened this pull request about 2 months ago
[misc] remove xverse modeling file

github.com/vllm-project/vllm - youkaichao opened this pull request about 2 months ago
[Bug]: Engine process (pid 76) died

github.com/vllm-project/vllm - 0xymoro opened this issue about 2 months ago
[Kernel] Use `out` in flash_attn_varlen_func

github.com/vllm-project/vllm - WoosukKwon opened this pull request about 2 months ago
[Core]: Support destroying all KV cache during runtime

github.com/vllm-project/vllm - HollowMan6 opened this pull request about 2 months ago
[Bug]: vllm stream generate error

github.com/vllm-project/vllm - Wbxxx opened this issue about 2 months ago
[Bug]: The new vllm version is slow in inference

github.com/vllm-project/vllm - imrankh46 opened this issue about 2 months ago
[doc] add warning about comparing hf and vllm outputs

github.com/vllm-project/vllm - youkaichao opened this pull request about 2 months ago
[Core] add xgrammar as guided generation provider

github.com/vllm-project/vllm - joennlae opened this pull request about 2 months ago
[Misc] Rename embedding classes to pooling

github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 2 months ago
Fill TorchSDPAAttentionMetadata seq_lens_field for prefill

github.com/vllm-project/vllm - maxdebayser opened this pull request about 2 months ago
[LoRA] Change lora_tokenizers capacity

github.com/vllm-project/vllm - xyang16 opened this pull request about 2 months ago
[Model] Add BNB support to Llava and Pixtral-HF

github.com/vllm-project/vllm - Isotr0py opened this pull request about 2 months ago
Fix openvino on GPU

github.com/vllm-project/vllm - janimo opened this pull request about 2 months ago
[Usage]: Question on max_model_len

github.com/vllm-project/vllm - mces89 opened this issue about 2 months ago
[New Model]: nvidia/Hymba-1.5B-Base

github.com/vllm-project/vllm - hutm opened this issue about 2 months ago
[Bugfix] Fix OpenVino/Neuron `driver_worker` init

github.com/vllm-project/vllm - NickLucche opened this pull request about 2 months ago
[Bugfix] Fix Idefics3 bug

github.com/vllm-project/vllm - jeejeelee opened this pull request about 2 months ago
Prepare sin/cos buffers for rope outside model forward

github.com/vllm-project/vllm - tzielinski-habana opened this pull request about 2 months ago
[Model]: add some tests for aria model

github.com/vllm-project/vllm - xffxff opened this pull request about 2 months ago
[Model] Replace embedding models with pooling adapter

github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 2 months ago
[Platform] Move `async output` check to platform

github.com/vllm-project/vllm - wangxiyuan opened this pull request about 2 months ago
Drop ROCm load format check

github.com/vllm-project/vllm - wangxiyuan opened this pull request about 2 months ago
[Misc][Quark] Upstream Quark format to VLLM

github.com/vllm-project/vllm - kewang-xlnx opened this pull request about 2 months ago
[Bug]: idefics3 doesn't stream

github.com/vllm-project/vllm - sjuxax opened this issue about 2 months ago