Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://github.com/vllm-project/vllm

[Bug]: Tensor Parallelism performs poorly

DanielViglione opened this issue 4 months ago
[CI/Build] VLM Test Consolidation

alex-jw-brooks opened this pull request 4 months ago
[CI][Misc] Add tests for python-only development

cermeng opened this pull request 4 months ago
[Bug]: cannot run model when TP>1 (already run debug file)

jli943 opened this issue 4 months ago
[Feature]: support for prompt cache

wiluen opened this issue 4 months ago
[Bug]: 400 Bad Request

ErykCh opened this issue 4 months ago
[Bug]: Qwen2-VL-72B Inference on Multiple-GPUs

bhupendra1324 opened this issue 4 months ago
[Misc]: Im trying to host my finetuned Llama -3-8b instruct in Vllm

preethiisenthil opened this issue 4 months ago
[Bug]: Error running Molmo on API in v0.6.3

Inforeon opened this issue 4 months ago
[Bug]: guided_json fails on pixtral when using OpenAI API

ktrapeznikov opened this issue 4 months ago
[Bugfix]: Make chat content text allow type content

vrdn-23 opened this pull request 4 months ago
[BugFix] Fix chat API continuous usage stats

njhill opened this pull request 4 months ago
[Bug]: llama3.2-11B-Vision-Instruct not working

warlockedward opened this issue 4 months ago
bugfix on draft_tp value

qibaoyuan opened this pull request 4 months ago
[Bugfix][Core] Use torch.cuda.memory_stats() to profile peak memory usage

joerunde opened this pull request 4 months ago
[Bugfix] Update InternVL input mapper to support image embeds

hhzhang16 opened this pull request 4 months ago
[TPU] Fix TPU SMEM OOM by Pallas paged attention kernel

WoosukKwon opened this pull request 4 months ago
pass ignore_eos parameter to all benchmark_serving calls

gracehonv opened this pull request 4 months ago
[Doc] Fix code formatting in spec_decode.rst

mgoin opened this pull request 4 months ago
[Docs] Remove PDF build from Readtehdocs

simon-mo opened this pull request 4 months ago
[Usage]: Obtaining success / error rate % metrics

yqlu opened this issue 4 months ago
[Frontend] Clarify model_type error messages

stevegrubb opened this pull request 4 months ago
[Hardware][CPU] compressed-tensor INT8 W8A8 AZP support

bigPYJ1151 opened this pull request 4 months ago
[Bugfix] Clean up some cruft in mamba.py

tlrmchlsmth opened this pull request 4 months ago
[Bug]: LLAMA 3.2 11B Vision Instruct Model not Running in VLLM 0.6.2

saikatscalers opened this issue 4 months ago
[Installation]: Adding opentelemetry packages in container image

sanketsudake opened this issue 4 months ago
[Usage]: --cpu-offload-gb no use

Rane2021 opened this issue 4 months ago
[Hardware] [Intel GPU] Add multistep scheduler for xpu device

jikunshang opened this pull request 4 months ago
[Feature]: Allow max_tokens = 0

fgebhart opened this issue 4 months ago
[Bug]: missing 'Finished request xxxx' log

jinzhen-lin opened this issue 4 months ago
[Bug]: Gemma 27B Produces no Outputs (2B and 9B work fine)

RonanKMcGovern opened this issue 4 months ago
[Feature]: Support for rhymes-ai/Aria

engchina opened this issue 4 months ago
[CI] Fix merge conflict

LiuXiaoxuanPKU opened this pull request 4 months ago
[Bug]: KeyError during loading of Mixtral 8x22B in FP8

IowaSovereign opened this issue 4 months ago
[help wanted]: write tests for python-only development

youkaichao opened this issue 4 months ago
[Bugfix] Update grafana dashboard

zhan9san opened this pull request 4 months ago
[Bug]: vllm mistralai--Codestral-22B-v0.1 response is truncated

Fly-Pluche opened this issue 4 months ago
[Installation]: vllm installation error

leoneyar opened this issue 4 months ago
[Model] VLM2Vec, the first multimodal embedding model in vLLM

DarkLight1337 opened this pull request 4 months ago
[core] move parallel sampling out from vllm core

youkaichao opened this pull request 4 months ago
[Quantization][TPU] `compressed-tensors` integration for TPU

robertgshaw2-neuralmagic opened this pull request 4 months ago
[misc] Fine-grained CustomOp enabling mechanism

ProExpertProg opened this pull request 4 months ago
[Bugfix] Fix support for dimension like integers and ScalarType

bnellnm opened this pull request 4 months ago
[SpecDec] Remove Batch Expansion (2/3)

LiuXiaoxuanPKU opened this pull request 4 months ago
[CI/Build] Adds a test for multi step with TPUs

allenwang28 opened this pull request 4 months ago
[Frontend] merge beam search implementations

LunrEclipse opened this pull request 4 months ago
[bugfix] fix f-string for error

prashantgupta24 opened this pull request 4 months ago
[New Model]: meta-llama/Llama-Guard-3-1B

ayeganov opened this issue 4 months ago
[Misc] Add environment variables collection in collect_env.py tool

ycool opened this pull request 4 months ago
[Model] Support Mamba2 (Codestral Mamba)

tlrmchlsmth opened this pull request 4 months ago
[Feature] [Spec decode]: Combine chunked prefill with speculative decoding

NickLucche opened this pull request 4 months ago
Add `vllm_v1`

WoosukKwon opened this pull request 4 months ago
[Doc] Remove outdated comment to avoid misunderstanding

homeffjy opened this pull request 4 months ago
[Bugfix]Fix MiniCPM's LoRA bug

jeejeelee opened this pull request 4 months ago
[Bug]: MiniCPM3-4B is support lora by --enable-lora ?

ML-GCN opened this issue 4 months ago
`seed_everything` doesn't handle HPU

SanjuCSudhakaran opened this pull request 4 months ago
[Bug]: VLLM doesn't support LoRa with config `modules_to_save`

fahadh4ilyas opened this issue 4 months ago
[CI] add `ignore_eos` for `benchmark_serving.py`

jikunshang opened this pull request 4 months ago
[Bugfix] Fix priority in multiprocessing engine

schoennenbeck opened this pull request 4 months ago
[Misc][LoRA] Support loading LoRA weights for target_modules in reg format

jeejeelee opened this pull request 4 months ago
[Usage]: Manually Increasing inference time

Playerrrrr opened this issue 4 months ago
Max num seqs

seungrokj opened this pull request 4 months ago
[Usage]: blip2 inference code

zhaoxueqi6666 opened this issue 4 months ago
[RFC]: Make device agnostic for diverse hardware support

wangshuai09 opened this issue 4 months ago
[CI/Build] mypy: Resolve some errors from checking vllm/engine

russellb opened this pull request 4 months ago
Pytorch hete spec

jiqing-feng opened this pull request 4 months ago
[Feature]: Improve Logging For Embedding Models

robertgshaw2-neuralmagic opened this issue 4 months ago
[Frontend, Core] Adding stop and stop_token_ids for beam search.

nFunctor opened this pull request 4 months ago
[Bug]: AsyncLLMEngine stuck on a single too long request

rickyyx opened this issue 4 months ago