Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://github.com/vllm-project/vllm

[Kernel]Enable HPU for Speculative Decoding

xuechendi opened this pull request about 1 month ago
[Mistral] FP8 format

patrickvonplaten opened this pull request about 1 month ago
[Bug]: can not serve microsoft/llava-med-v1.5-mistral-7b

cubense opened this issue about 1 month ago
Prefix Cache Aware Scheduling [1/n]

rickyyx opened this pull request about 1 month ago
[V1][Bugfix] Propagate V1 LLMEngine properly

comaniac opened this pull request about 1 month ago
[Usage]: VLLM failing to stream response after 512+ prompt tokens.

aghbd opened this issue about 1 month ago
[Core] Add padding-aware scheduling for 2D prefills

kzawora-intel opened this pull request about 1 month ago
[Usage]: Engine iteration timed out. (during using qwen2-vl-7b)

HuiyuanYan opened this issue about 1 month ago
[CI/Build] Always run mypy

russellb opened this pull request about 1 month ago
[V1] Allow piecewise cuda graphs to run with custom allreduce

SageMoore opened this pull request about 1 month ago
Fix quantization config of vl model

jinzhen-lin opened this pull request about 1 month ago
[New Model]: dunzhang/stella_en_1.5B_v5

cavities opened this issue about 1 month ago
[Doc]: follow the doc but got error

husheng-liu opened this issue about 2 months ago
[RFC]: Merge input processor and input mapper for multi-modal models

DarkLight1337 opened this issue about 2 months ago
[Hardware][CPU][torch.compile] integrate torch compile

bigPYJ1151 opened this pull request about 2 months ago
[Bugfix] Make image processor respect `mm_processor_kwargs` for Qwen2-VL

li-plus opened this pull request about 2 months ago
[Hardware][CPU][bugfix] Fix half dtype support on AVX2-only target

bigPYJ1151 opened this pull request about 2 months ago
[Hardware][XPU] AWQ/GPTQ support for xpu backend

yma11 opened this pull request about 2 months ago
[Misc] Add Gamma-Distribution Request Generation Support for Serving Benchmark.

spliii opened this pull request about 2 months ago
[Bug]: Engine loop has died for Meta-Llama-3.1-8B-Instruct TP=2

HaoyuWang4188 opened this issue about 2 months ago
[V1][BugFix] Fix Generator construction in greedy + seed case

njhill opened this pull request about 2 months ago
Add hf_transfer to testing image

mgoin opened this pull request about 2 months ago
[Kernel]Generalize Speculative decode from Cuda

xuechendi opened this pull request about 2 months ago
[Usage]: disable pydantic request validation

matbee-eth opened this issue about 2 months ago
Splitting attention kernel file

maleksan85 opened this pull request about 2 months ago
[Misc] Improve Web UI

rafvasq opened this pull request about 2 months ago
[CI/Build] Automate PR body text cleanup

russellb opened this pull request about 2 months ago
[Core] Add dynamic chunk size calculation

prashantgupta24 opened this pull request about 2 months ago
[Build] Fix for the Wswitch-bool clang warning

gshtras opened this pull request about 2 months ago
[Doc] Updated TPU install instructions

mikegre-google opened this pull request about 2 months ago
[Kernel] Refactor Cutlass c3x

varun-sundar-rabindranath opened this pull request about 2 months ago
[Benchmark] guided decoding

aarnphm opened this pull request about 2 months ago
[0/N] Rename `MultiModalInputs` to `MultiModalKwargs`

DarkLight1337 opened this pull request about 2 months ago
[Bug]: PyTorch 2.5.x vLLM 1.0.0 dev issue with tensor parallel size > 1

CortexEdgeUser opened this issue about 2 months ago
Online video support for VLMs

litianjian opened this pull request about 2 months ago
Adding cascade inference to vLLM

raywanb opened this pull request about 2 months ago
[WIP] Ray Backend V1

rkooo567 opened this pull request about 2 months ago
[Bugfix] Upgrade to pytorch 2.5.1

bnellnm opened this pull request about 2 months ago
[Doc] Update VLM doc about loading from local files

ywang96 opened this pull request about 2 months ago
[Bug]: last_token_time is equal to arrival_time

wolfgangsmdt opened this issue about 2 months ago
[Misc] Modify BNB parameter name

jeejeelee opened this pull request about 2 months ago
[Core] Use os.sched_yield in ShmRingBuffer instead of time.sleep

tlrmchlsmth opened this pull request about 2 months ago
[Bug]: Segment fault when import decord before import vllm

litianjian opened this issue about 2 months ago
[Performance]: FP8 performance worse than FP16 for Qwen2-VL-2B-Instruct

LinJianping opened this issue about 2 months ago
[Bug]: Llama3.2 tool calling OpenAI API not working

SinanAkkoyun opened this issue about 2 months ago
[Bug]: I cannot able to load the model on TESLA T4 GPU in Full precision

VpkPrasanna opened this issue about 2 months ago
[Bug]: internvl “max_dynamic_patch” not work, and add_special_tokens bug

wangpeng138375 opened this issue about 2 months ago
[Misc]Reduce BNB static variable

jeejeelee opened this pull request about 2 months ago
[Bugfix] Fix E2EL mean and median stats

daitran2k1 opened this pull request about 2 months ago
[5/N] pass the whole config to model

youkaichao opened this pull request about 2 months ago
[Encoder Decoder] Update Mllama to run with both FlashAttention and XFormers

sroy745 opened this pull request about 2 months ago
[4/N] make quant config first-class citizen

youkaichao opened this pull request about 2 months ago
[Feature]: do you plan to support "suffix" of "v1/completions"

qiao-wei opened this issue about 2 months ago
[Bugfix][OpenVINO] Fix circular reference #9939

MengqingCao opened this pull request about 2 months ago
[Bugfix] Fix `MQLLMEngine` hanging

robertgshaw2-neuralmagic opened this pull request about 2 months ago
[V1] Prefix caching (take 2)

comaniac opened this pull request about 2 months ago
[CI] Basic Integration Test For TPU

robertgshaw2-neuralmagic opened this pull request about 2 months ago
[help wanted]: fix broken xverse model

youkaichao opened this issue about 2 months ago
[Hardware][CPU] Add ARM CPU backend

ShawnD200 opened this pull request about 2 months ago
[V1][VLM] Enable proper chunked prefill for multimodal models

ywang96 opened this pull request about 2 months ago
[Bugfix] Fix Phi-3 BNB quantization with tensor parallel

Isotr0py opened this pull request about 2 months ago
[V1] Support per-request seed

njhill opened this pull request about 2 months ago
[Doc] Add documentation for Structured Outputs

ismael-dm opened this pull request about 2 months ago
Bump the patch-update group with 3 updates

dependabot[bot] opened this pull request about 2 months ago
[Core]Add New Run:ai Streamer Load format.

pandyamarut opened this pull request about 2 months ago
[CI] Prune tests/models/decoder_only/language/* tests

mgoin opened this pull request about 2 months ago
[Bug]: Phi-3 cannot be used with bitsandbytes

yananchen1989 opened this issue about 2 months ago
[CI] Prune down LM Eval test time

mgoin opened this pull request about 2 months ago
[ci/build] Have dependabot ignore pinned dependencies

khluu opened this pull request about 2 months ago
[CI] Prune back the number of tests in tests/kernels/*

mgoin opened this pull request about 2 months ago
[Bugfix] Fix pickle of input when async output processing is on

wallashss opened this pull request about 2 months ago
Doc: Improve benchmark documentation

rafvasq opened this pull request about 2 months ago
[RFC] Propose a vulnerability management team

russellb opened this pull request about 2 months ago
[Doc] Move CONTRIBUTING to docs site

russellb opened this pull request about 2 months ago
[Frontend] Automatic detection of chat content format from AST

DarkLight1337 opened this pull request about 2 months ago
[Bug]: illegal memory access error when using prefix caching

StevenTang1998 opened this issue about 2 months ago
[Bugfix] Fix MiniCPMV and Mllama BNB bug

jeejeelee opened this pull request about 2 months ago