Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://github.com/vllm-project/vllm

[Bugfix] Fix lora loading for Compressed Tensors in #9120

fahadh4ilyas opened this pull request 3 months ago
[TPU] Fix memory profiling

WoosukKwon opened this pull request 3 months ago
[Bug]: quantization does not work with dummy weight format

youkaichao opened this issue 3 months ago
[Bug]: Error Running Qwen2.5-7B-Instruct on CPU

xiayouran opened this issue 3 months ago
[Model] Remap FP8 kv_scale in CommandR and DBRX

hliuca opened this pull request 3 months ago
Update link to KServe deployment guide

terrytangyuan opened this pull request 3 months ago
[Bug]: Port binding keep failing due to unnecessary code

James4Ever0 opened this issue 3 months ago
Add classifiers in setup.py

terrytangyuan opened this pull request 3 months ago
[Doc] Fix VLM prompt placeholder sample bug

ycool opened this pull request 3 months ago
[Misc] Improve validation errors around best_of and n

tjohnson31415 opened this pull request 3 months ago
[WIP] Prototyping re-arch

WoosukKwon opened this pull request 3 months ago
[ci][test] use load dummy for testing

youkaichao opened this pull request 3 months ago
[Feature]: Enabling MSS for larger number of sequences (>256)

kushanam opened this issue 3 months ago
mypy: check additional directories

russellb opened this pull request 3 months ago
Add `lm-eval` directly to requirements-test.txt

mgoin opened this pull request 3 months ago
[Bugfix] Optimize composite weight loading and fix EAGLE weight loading

DarkLight1337 opened this pull request 3 months ago
[Bugfix][Doc] Report neuron error in output

joerowell opened this pull request 3 months ago
[Misc]: How to set num-scheduler-steps

o1iv3r opened this issue 3 months ago
[Usage]: Multi-gpu inference takes too much memory + how to make uneven load

Ouna-the-Dataweaver opened this issue 3 months ago
[Doc] Update vlm.rst to include an example on videos

sayakpaul opened this pull request 3 months ago
[Frontend][Feature] Add jamba tool parser

tomeras91 opened this pull request 3 months ago
[Bug]: InternVL bounding box prediction does not work

MoritzLaurer opened this issue 3 months ago
[Bug]: Can not pip install vllm inside docker

fahadh4ilyas opened this issue 3 months ago
[Frontend] Add Early Validation For Chat Template / Tool Call Parser

alex-jw-brooks opened this pull request 3 months ago
[Misc]: Nobody reviews my PR

CharlesRiggins opened this issue 3 months ago
support bitsandbytes quantization with more models

chenqianfzh opened this pull request 3 months ago
[Neuron] Introduce paged attention support for neuron backend

liangfu opened this pull request 3 months ago
[Bugfix] Fix crashing for multimodal when image passed with height == 1

Pernekhan opened this pull request 3 months ago
[torch.compile] Fuse RMSNorm with quant

ProExpertProg opened this pull request 3 months ago
[Doc] Improve contributing and installation documentation

rafvasq opened this pull request 3 months ago
[Core][Frontend] Add Support for Inference Time mm_processor_kwargs

alex-jw-brooks opened this pull request 3 months ago
[CI/Build] Update Dockerfile install+deploy image to ubuntu 22.04

mgoin opened this pull request 3 months ago
[Usage]: Not getting the inference metrics in the api response

vverma01232 opened this issue 3 months ago
[New Model]: silma-ai/SILMA-9B-Instruct-v1.0

hassanraha opened this issue 3 months ago
[OpenVINO] Use torch 2.4.0 and newer optimum version

ilya-lavrenov opened this pull request 3 months ago
[Bug]: Installation from last commit (version wrong)

johnnynunez opened this issue 3 months ago
[Bug]: Issue Running VLLM Open AI using nonroot user in K8s

luhurfth opened this issue 3 months ago
[Frontend] API support for beam search for MQLLMEngine

LunrEclipse opened this pull request 3 months ago
[Bugfix][Hardware] Fix model input for decode

yma11 opened this pull request 3 months ago
[Usage]: How to run llama 3.2 with CPU only version

chanandrew96 opened this issue 3 months ago
[Feature]: Does vLLM support ONNX models?

LetianLee opened this issue 3 months ago
support jetson AGX Orin

johnnynunez opened this pull request 3 months ago
[Model] Explicit interface for vLLM models and support OOT embedding models

DarkLight1337 opened this pull request 3 months ago
[Usage]: chat API has problems, completion API works fine

cdhx opened this issue 3 months ago
[core] remove beam search from the core

youkaichao opened this pull request 3 months ago
[Misc] Remove user-facing error for removed VLM args

DarkLight1337 opened this pull request 3 months ago
[BugFix][Core] Fix BlockManagerV2 when Encoder Input is None

sroy745 opened this pull request 3 months ago
[torch.compile] register blocksparse attention

youkaichao opened this pull request 3 months ago
[RFC]: hide continuous batching complexity through forward context

youkaichao opened this issue 3 months ago
[core] use forward context for flash infer

youkaichao opened this pull request 3 months ago
[Bug]: vllm serve Exception in ASGI application

SpaceHunterInf opened this issue 3 months ago
[Model] Make llama3.2 support multiple and interleaved images

xiangxu-google opened this pull request 3 months ago
[Bugfix] limit lora init id greater than 0

Ssunbell opened this pull request 3 months ago
[Installation]: cannot install vllm with openvino backend

guanxiang opened this issue 3 months ago
[Bug]: Qwen2-VL model support

kulievvitaly opened this issue 3 months ago
[Model] PP support for embedding models and update docs

DarkLight1337 opened this pull request 3 months ago
[Doc] Update README.md with Ray summit slides

zhuohan123 opened this pull request 3 months ago
[Frontend] API support for beam search

LunrEclipse opened this pull request 3 months ago
[Bugfix] Try to handle older versions of pytorch

bnellnm opened this pull request 3 months ago
[Misc] Fix CI lint

comaniac opened this pull request 3 months ago
[Bugfix] use blockmanagerv1 for encoder-decoder

heheda12345 opened this pull request 3 months ago
[Bugfix] Deprecate registration of custom configs to huggingface

heheda12345 opened this pull request 3 months ago
[Bug]: vLLM MQLLMEngine Timeout - Json Schema

wrisigo opened this issue 3 months ago
[Misc] Add random seed for prefix cache benchmark

Imss27 opened this pull request 3 months ago
Yet another Prefill-Decode separation in vllm

chenqianfzh opened this pull request 3 months ago
[Misc] Improved prefix cache example

Imss27 opened this pull request 3 months ago
[Bug]: vllm overrides transformer's Autoconfig for mllama

lyuqin-scale opened this issue 3 months ago
Remove AMD Ray Summit Banner

simon-mo opened this pull request 3 months ago
[Misc]: Need to understand support for torch.compile in Q4 roadmap

amd-abhikulk opened this issue 3 months ago