Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://github.com/vllm-project/vllm

[Feature]: Allow max_tokens = 0

fgebhart opened this issue 2 months ago
[Feature]: Support for rhymes-ai/Aria

engchina opened this issue 2 months ago
[CI] Fix merge conflict

LiuXiaoxuanPKU opened this pull request 2 months ago
[Bug]: KeyError during loading of Mixtral 8x22B in FP8

IowaSovereign opened this issue 2 months ago
[help wanted]: write tests for python-only development

youkaichao opened this issue 2 months ago
[Bugfix] Update grafana dashboard

zhan9san opened this pull request 2 months ago
[Bug]: vllm mistralai--Codestral-22B-v0.1 response is truncated

Fly-Pluche opened this issue 2 months ago
[Installation]: vllm installation error

leoneyar opened this issue 2 months ago
[Model] VLM2Vec, the first multimodal embedding model in vLLM

DarkLight1337 opened this pull request 2 months ago
[core] move parallel sampling out from vllm core

youkaichao opened this pull request 2 months ago
[Quantization][TPU] `compressed-tensors` integration for TPU

robertgshaw2-neuralmagic opened this pull request 2 months ago
[misc] Fine-grained CustomOp enabling mechanism

ProExpertProg opened this pull request 2 months ago
[Bugfix] Fix support for dimension like integers and ScalarType

bnellnm opened this pull request 2 months ago
[SpecDec] Remove Batch Expansion (2/3)

LiuXiaoxuanPKU opened this pull request 2 months ago
[CI/Build] Adds a test for multi step with TPUs

allenwang28 opened this pull request 2 months ago
[Frontend] merge beam search implementations

LunrEclipse opened this pull request 2 months ago
[bugfix] fix f-string for error

prashantgupta24 opened this pull request 2 months ago
[New Model]: meta-llama/Llama-Guard-3-1B

ayeganov opened this issue 2 months ago
[Misc] Add environment variables collection in collect_env.py tool

ycool opened this pull request 2 months ago
[Model] Support Mamba2 (Codestral Mamba)

tlrmchlsmth opened this pull request 2 months ago
[Feature] [Spec decode]: Combine chunked prefill with speculative decoding

NickLucche opened this pull request 2 months ago
Add `vllm_v1`

WoosukKwon opened this pull request 2 months ago
[Doc] Remove outdated comment to avoid misunderstanding

homeffjy opened this pull request 2 months ago
[Bugfix]Fix MiniCPM's LoRA bug

jeejeelee opened this pull request 2 months ago
[Bug]: MiniCPM3-4B is support lora by --enable-lora ?

ML-GCN opened this issue 2 months ago
`seed_everything` doesn't handle HPU

SanjuCSudhakaran opened this pull request 2 months ago
[Bug]: VLLM doesn't support LoRa with config `modules_to_save`

fahadh4ilyas opened this issue 2 months ago
[CI] add `ignore_eos` for `benchmark_serving.py`

jikunshang opened this pull request 2 months ago
[Bugfix] Fix priority in multiprocessing engine

schoennenbeck opened this pull request 2 months ago
[Misc][LoRA] Support loading LoRA weights for target_modules in reg format

jeejeelee opened this pull request 2 months ago
[Usage]: Manually Increasing inference time

Playerrrrr opened this issue 2 months ago
Max num seqs

seungrokj opened this pull request 2 months ago
[Usage]: blip2 inference code

zhaoxueqi6666 opened this issue 2 months ago
[RFC]: Make device agnostic for diverse hardware support

wangshuai09 opened this issue 2 months ago
[CI/Build] mypy: Resolve some errors from checking vllm/engine

russellb opened this pull request 2 months ago
Pytorch hete spec

jiqing-feng opened this pull request 2 months ago
[Feature]: Improve Logging For Embedding Models

robertgshaw2-neuralmagic opened this issue 2 months ago
[Frontend, Core] Adding stop and stop_token_ids for beam search.

nFunctor opened this pull request 2 months ago
[Bug]: AsyncLLMEngine stuck on a single too long request

rickyyx opened this issue 2 months ago
[misc] hide best_of from engine

youkaichao opened this pull request 2 months ago
[Bug]: Streaming response fails after one token (0.5.3.post1)

NeonDaniel opened this issue 2 months ago
[CI/Build] Adopt Mergify for auto-labeling PRs

russellb opened this pull request 2 months ago
[torch.compile] generic decorators

youkaichao opened this pull request 2 months ago
[Doc] Improve quickstart documentation

rafvasq opened this pull request 2 months ago
[Usage]: running gated models offline

SamuelBG13 opened this issue 2 months ago
[Bugfix][CI/Build] Fix docker build where CUDA archs < 7.0 are being detected

LucasWilkinson opened this pull request 2 months ago
[Bug]: new beam search implementation ignores stop conditions

nFunctor opened this issue 2 months ago
[Misc] Standardize RoPE handling for Qwen2-VL

DarkLight1337 opened this pull request 2 months ago
[Model] Add Qwen2-Audio model support

faychu opened this pull request 2 months ago
[Kernel] adding fused moe kernel config for L40S TP4

bringlein opened this pull request 2 months ago
[Model] Add GLM-4v support and meet vllm==0.6.2

sixsixcoder opened this pull request 2 months ago
Questions about the inference performance of the GPTQ model

Rssevenyu opened this issue 2 months ago
[Model] support input image embedding for minicpmv

whyiug opened this pull request 2 months ago
[Misc] Fix sampling from sonnet for long context case

Imss27 opened this pull request 2 months ago
[Misc] Collect model support info in a single process per model

DarkLight1337 opened this pull request 2 months ago
[Feature] vLLM ARM Enablement for AARCH64 CPUs

sanketkaleoss opened this pull request 2 months ago
[BugFix] Fix tool call finish reason in streaming case

maxdebayser opened this pull request 2 months ago
[Bugfix] Sets `is_first_step_output` for TPUModelRunner

allenwang28 opened this pull request 2 months ago
Add example of helm chart for vllm deployment on k8s

mfournioux opened this pull request 2 months ago
Bump actions/setup-python from 3 to 5

dependabot[bot] opened this pull request 2 months ago
[RFC]: Adopt mergify for auto-labeling PRs

russellb opened this issue 2 months ago
[Kernel][Model] Improve continuous batching for Jamba and Mamba

mzusman opened this pull request 2 months ago