Ecosyste.ms: Open Collective

An open API service for software projects hosted on Open Collective.

vLLM

vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective - Host: opensource - https://opencollective.com/vllm - Code: https://github.com/vllm-project/vllm

DeepSeek VL support

github.com/vllm-project/vllm - SinanAkkoyun opened this issue 11 months ago
inference with AWQ quantization

github.com/vllm-project/vllm - Kev1ntan opened this issue 11 months ago
Fixes #1556 double free

github.com/vllm-project/vllm - br3no opened this pull request 11 months ago
TCPStore is not available

github.com/vllm-project/vllm - Z-Diviner opened this issue 11 months ago
add aya-101 model

github.com/vllm-project/vllm - ahkarami opened this issue 11 months ago
What's up with Pipeline Parallelism?

github.com/vllm-project/vllm - duanzhaol opened this issue 11 months ago
Question regarding GPU memory allocation

github.com/vllm-project/vllm - wx971025 opened this issue 11 months ago
Error compiling kernels

github.com/vllm-project/vllm - declark1 opened this issue 11 months ago
lm-evaluation-harness broken on master

github.com/vllm-project/vllm - pcmoritz opened this issue 11 months ago
Enable scaled FP8 (e4m3fn) KV cache on ROCm (AMD GPU)

github.com/vllm-project/vllm - AdrianAbeyta opened this pull request 11 months ago
[FIX] Fix prefix test error on main

github.com/vllm-project/vllm - zhuohan123 opened this pull request 11 months ago
Mixtral 4x 4090 OOM

github.com/vllm-project/vllm - SinanAkkoyun opened this issue 11 months ago
Order of keys for guided JSON

github.com/vllm-project/vllm - ccdv-ai opened this issue 11 months ago
unload the model

github.com/vllm-project/vllm - osafaimal opened this issue 11 months ago
install from source failed using the latest code

github.com/vllm-project/vllm - sleepwalker2017 opened this issue 11 months ago
[FIX] Make `flash_attn` optional

github.com/vllm-project/vllm - WoosukKwon opened this pull request 11 months ago
[Minor fix] Include flash_attn in docker image

github.com/vllm-project/vllm - tdoublep opened this pull request 11 months ago
OpenAI Tools / function calling v2

github.com/vllm-project/vllm - FlorianJoncour opened this pull request 11 months ago
Prefix Caching with FP8 KV cache support

github.com/vllm-project/vllm - chenxu2048 opened this pull request 11 months ago
vllm load SqueezeLLM quantization model failed

github.com/vllm-project/vllm - zuosong-peng opened this issue 11 months ago
[WIP] Build FlashInfer

github.com/vllm-project/vllm - WoosukKwon opened this pull request 11 months ago
ExLlamaV2: exl2 support

github.com/vllm-project/vllm - pabl-o-ce opened this issue 11 months ago
Supporting embedding models

github.com/vllm-project/vllm - jc9123 opened this pull request 11 months ago
add doc about serving option on dstack

github.com/vllm-project/vllm - deep-diver opened this pull request 11 months ago
Fatal Python error: Segmentation fault

github.com/vllm-project/vllm - lmx760581375 opened this issue 11 months ago
Merge Gemma into Llama

github.com/vllm-project/vllm - WoosukKwon opened this pull request 11 months ago
[Feature] Add vision language model support.

github.com/vllm-project/vllm - xwjiang2010 opened this pull request 11 months ago
Support of AMD consumer GPUs

github.com/vllm-project/vllm - arno4000 opened this issue 11 months ago
lots of blanks before each running step

github.com/vllm-project/vllm - Eutenacity opened this issue 11 months ago
AWQ: Implement new kernels (64% faster decoding)

github.com/vllm-project/vllm - casper-hansen opened this issue 11 months ago
Unable to specify GPU usage in VLLM code

github.com/vllm-project/vllm - humza-sami opened this issue 11 months ago
Separate attention backends

github.com/vllm-project/vllm - WoosukKwon opened this pull request 11 months ago
some error happened when installing vllm

github.com/vllm-project/vllm - finylink opened this issue 11 months ago
AWQ Quantization Memory Usage

github.com/vllm-project/vllm - vcivan opened this issue 12 months ago
Multi-GPU Support Failures with AMD MI210

github.com/vllm-project/vllm - tom-papatheodore opened this issue 12 months ago
Fix empty output when temp is too low

github.com/vllm-project/vllm - CatherineSue opened this pull request 12 months ago
E5-mistral-7b-instruct embedding support

github.com/vllm-project/vllm - DavidPeleg6 opened this issue 12 months ago
Runtime exception [step must be nonzero]

github.com/vllm-project/vllm - DreamGenX opened this issue 12 months ago
vllm keeps hanging when using djl-deepspeed

github.com/vllm-project/vllm - ali-firstparty opened this issue 12 months ago
Add docker-compose.yml and corresponding .env

github.com/vllm-project/vllm - WolframRavenwolf opened this pull request 12 months ago
Allow model to be served under multiple names

github.com/vllm-project/vllm - hmellor opened this pull request 12 months ago
HQQ quantization support

github.com/vllm-project/vllm - max-wittig opened this issue 12 months ago
Missing prometheus metrics in `0.3.0`

github.com/vllm-project/vllm - SamComber opened this issue 12 months ago
Please add lora support for higher ranks and alpha values

github.com/vllm-project/vllm - parikshitsaikia1619 opened this issue 12 months ago
Add LoRA support for Mixtral

github.com/vllm-project/vllm - tterrysun opened this pull request 12 months ago
Add guided decoding for OpenAI API server

github.com/vllm-project/vllm - felixzhu555 opened this pull request 12 months ago
Adds support for gunicorn multiprocess process

github.com/vllm-project/vllm - jalotra opened this pull request 12 months ago