Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

vLLM

vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective - Host: opensource - https://opencollective.com/vllm - Code: https://github.com/vllm-project/vllm
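As a quick orientation before the activity feed below, here is a minimal sketch of trying vLLM locally via its documented OpenAI-compatible server. It assumes a CUDA-capable machine; the model name is illustrative, not taken from this listing:

```shell
# Install vLLM (the default build targets CUDA GPUs)
pip install vllm

# Launch the OpenAI-compatible API server; the model name is illustrative
vllm serve Qwen/Qwen2.5-0.5B-Instruct --port 8000

# Query it with the standard OpenAI chat-completions schema
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen/Qwen2.5-0.5B-Instruct",
       "messages": [{"role": "user", "content": "Hello"}]}'
```

Because the server speaks the OpenAI API, existing OpenAI client libraries can point at it by changing only the base URL.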

Rahul quant merged

github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request about 1 month ago
[Perf] Reduce peak memory usage of llama

github.com/vllm-project/vllm - andoorve opened this pull request about 1 month ago
[bugfix] Fix static asymmetric quantization case

github.com/vllm-project/vllm - ProExpertProg opened this pull request about 1 month ago
[Tool parsing] Improve / correct mistral tool parsing

github.com/vllm-project/vllm - patrickvonplaten opened this pull request about 1 month ago
Nir b2b latest

github.com/vllm-project/vllm - nirda7 opened this pull request about 1 month ago
[Docs] Publish meetup slides

github.com/vllm-project/vllm - WoosukKwon opened this pull request about 1 month ago
[Feature] enable host memory for kv cache

github.com/vllm-project/vllm - YZP17121579 opened this pull request about 1 month ago
Rs 24 sparse

github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request about 1 month ago
[Hardware][Cambricon MLU] Add Cambricon MLU inference backend (#9649)

github.com/vllm-project/vllm - zonghuaxiansheng opened this pull request about 1 month ago
[Bugfix] Fix unable to load some models

github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 1 month ago
[Model] Support telechat2

github.com/vllm-project/vllm - shunxing12345 opened this pull request about 1 month ago
[TPU] Implement prefix caching for TPUs

github.com/vllm-project/vllm - WoosukKwon opened this pull request about 1 month ago
[Model] Add Support for Multimodal Granite Models

github.com/vllm-project/vllm - alex-jw-brooks opened this pull request about 1 month ago
[Feature]: 2D TP & EP

github.com/vllm-project/vllm - WenhaoHe02 opened this issue about 1 month ago
[Misc] Update benchmark to support image_url file or http

github.com/vllm-project/vllm - kakao-steve-ai opened this pull request about 1 month ago
[CI/Build] Make shellcheck happy

github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 1 month ago
Bump to compressed-tensors v0.8.0

github.com/vllm-project/vllm - dsikka opened this pull request about 1 month ago
Bump to `compressed-tensors` v0.8.0

github.com/vllm-project/vllm - dsikka opened this pull request about 1 month ago
[Core][Frontend] Add faster-outlines as guided decoding backend

github.com/vllm-project/vllm - unaidedelf8777 opened this pull request about 1 month ago
[core][distributed] use tcp store directly

github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[help wanted]: add QwenModel to ci tests

github.com/vllm-project/vllm - youkaichao opened this issue about 1 month ago
[V1] Fix CI tests on V1 engine

github.com/vllm-project/vllm - WoosukKwon opened this pull request about 1 month ago
Revert "[ci][build] limit cmake version"

github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[doc] improve debugging doc

github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[V1] Enable Inductor when using piecewise CUDA graphs

github.com/vllm-project/vllm - WoosukKwon opened this pull request about 1 month ago
[Feature]: Support for NVIDIA Unified memory

github.com/vllm-project/vllm - khayamgondal opened this issue about 1 month ago
[doc] fix location of runllm widget

github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[TPU] Use numpy to compute slot mapping

github.com/vllm-project/vllm - WoosukKwon opened this pull request about 1 month ago
[Doc] Fix typo in arg_utils.py

github.com/vllm-project/vllm - xyang16 opened this pull request about 1 month ago
[Bug]: qwen cannot be quantized in vllm

github.com/vllm-project/vllm - yananchen1989 opened this issue about 1 month ago
[Bugfix] Fix QwenModel argument

github.com/vllm-project/vllm - DamonFool opened this pull request about 1 month ago
[Feature]: 2:4 sparsity + w4a16 support

github.com/vllm-project/vllm - arunpatala opened this issue about 1 month ago
[Usage]:Qwen2-VL not support Lora

github.com/vllm-project/vllm - menglrskr opened this issue about 1 month ago
[Misc]Fix Idefics3Model argument

github.com/vllm-project/vllm - jeejeelee opened this pull request about 1 month ago
[Bug]: Deepseek V2 coder 236B awq error!

github.com/vllm-project/vllm - tohnee opened this issue about 1 month ago
[misc] Layerwise profile updates

github.com/vllm-project/vllm - varun-sundar-rabindranath opened this pull request about 1 month ago
[V1] TPU Prototype

github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request about 1 month ago
[Hardware] [HPU]add `mark_step` for hpu

github.com/vllm-project/vllm - jikunshang opened this pull request about 1 month ago
[Core] Reduce TTFT with concurrent partial prefills

github.com/vllm-project/vllm - joerunde opened this pull request about 1 month ago
[Bugfix] Fix for Spec model TP + Chunked Prefill

github.com/vllm-project/vllm - andoorve opened this pull request about 1 month ago
Making vLLM compatible with Mistral fp8 weights.

github.com/vllm-project/vllm - akllm opened this pull request about 1 month ago
[V1] Enable custom ops with piecewise CUDA graphs

github.com/vllm-project/vllm - WoosukKwon opened this pull request about 1 month ago
[6/N] pass whole config to inner model

github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[Bugfix] bitsandbytes models fail to run pipeline parallel

github.com/vllm-project/vllm - HoangCongDuc opened this pull request about 1 month ago
[Frontend] Add per-request number of cached token stats

github.com/vllm-project/vllm - zifeitong opened this pull request about 1 month ago
[Feature]: BASE_URL environment variable

github.com/vllm-project/vllm - bjb19 opened this issue about 1 month ago
[Docs] Misc updates to TPU installation instructions

github.com/vllm-project/vllm - mikegre-google opened this pull request about 1 month ago
[Doc] Move PR template content to docs

github.com/vllm-project/vllm - russellb opened this pull request about 1 month ago
[Usage]: how can i get all logits of token?

github.com/vllm-project/vllm - joyyyhuang opened this issue about 1 month ago
[Bug]: Outlines w/ Mistral

github.com/vllm-project/vllm - matbee-eth opened this issue about 1 month ago
[Feature]: Support for predicted outputs

github.com/vllm-project/vllm - flozi00 opened this issue about 1 month ago
[V1] Add all_token_ids attribute to Request

github.com/vllm-project/vllm - WoosukKwon opened this pull request about 1 month ago
Rename vllm.logging to vllm.logging_utils

github.com/vllm-project/vllm - flozi00 opened this pull request about 1 month ago