Ecosyste.ms: Open Collective
An open API service for software projects hosted on Open Collective.

vLLM
vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective: https://opencollective.com/vllm
Host: opensource
Code: https://github.com/vllm-project/vllm
[V1] Add all_token_ids attribute to Request
github.com/vllm-project/vllm - WoosukKwon opened this pull request 3 months ago
Rename vllm.logging to vllm.logging_utils
github.com/vllm-project/vllm - flozi00 opened this pull request 3 months ago
[help wanted]: rename vllm/logging module to avoid shadowing builtin logging module
github.com/vllm-project/vllm - youkaichao opened this issue 3 months ago
[Feature] [Spec decode]: Enable MLPSpeculator/Medusa and `prompt_logprobs` with ChunkedPrefill
github.com/vllm-project/vllm - NickLucche opened this pull request 3 months ago
[Kernel]Enable HPU for Speculative Decoding
github.com/vllm-project/vllm - xuechendi opened this pull request 3 months ago
[Mistral] FP8 format
github.com/vllm-project/vllm - patrickvonplaten opened this pull request 3 months ago
[Bug]: can not serve microsoft/llava-med-v1.5-mistral-7b
github.com/vllm-project/vllm - cubense opened this issue 3 months ago
[WIP] Prefix Cache Aware Scheduling [1/n]
github.com/vllm-project/vllm - rickyyx opened this pull request 3 months ago
[V1][Bugfix] Propagate V1 LLMEngine properly
github.com/vllm-project/vllm - comaniac opened this pull request 3 months ago
[Usage]: VLLM failing to stream response after 512+ prompt tokens.
github.com/vllm-project/vllm - aghbd opened this issue 3 months ago
[Core] Add padding-aware scheduling for 2D prefills
github.com/vllm-project/vllm - kzawora-intel opened this pull request 3 months ago
[Usage]: Engine iteration timed out. (during using qwen2-vl-7b)
github.com/vllm-project/vllm - HuiyuanYan opened this issue 3 months ago
[CI/Build] Always run mypy
github.com/vllm-project/vllm - russellb opened this pull request 3 months ago
[V1] Allow piecewise cuda graphs to run with custom allreduce
github.com/vllm-project/vllm - SageMoore opened this pull request 3 months ago
Fix quantization config of vl model
github.com/vllm-project/vllm - jinzhen-lin opened this pull request 3 months ago
[New Model]: dunzhang/stella_en_1.5B_v5
github.com/vllm-project/vllm - cavities opened this issue 3 months ago
[Bug]: vllm0.6.3.post1 7B model can not use cmd vllm.entrypoints.openai.api_server on wsl
github.com/vllm-project/vllm - xiezhipeng-git opened this issue 3 months ago
[Doc]: follow the doc but got error
github.com/vllm-project/vllm - husheng-liu opened this issue 3 months ago
[RFC]: Merge input processor and input mapper for multi-modal models
github.com/vllm-project/vllm - DarkLight1337 opened this issue 3 months ago
[Hardware][CPU][torch.compile] integrate torch compile
github.com/vllm-project/vllm - bigPYJ1151 opened this pull request 3 months ago
[Bugfix] Make image processor respect `mm_processor_kwargs` for Qwen2-VL
github.com/vllm-project/vllm - li-plus opened this pull request 3 months ago
[Bug]: When apply continue_final_message for OpenAI server, the `"echo":false` is ignored.
github.com/vllm-project/vllm - DIYer22 opened this issue 3 months ago
[Hardware][CPU][bugfix] Fix half dtype support on AVX2-only target
github.com/vllm-project/vllm - bigPYJ1151 opened this pull request 3 months ago
[Hardware][XPU] AWQ/GPTQ support for xpu backend
github.com/vllm-project/vllm - yma11 opened this pull request 3 months ago
[Misc] Add Gamma-Distribution Request Generation Support for Serving Benchmark.
github.com/vllm-project/vllm - spliii opened this pull request 3 months ago
[Bug]: Engine loop has died for Meta-Llama-3.1-8B-Instruct TP=2
github.com/vllm-project/vllm - HaoyuWang4188 opened this issue 3 months ago
[V1][BugFix] Fix Generator construction in greedy + seed case
github.com/vllm-project/vllm - njhill opened this pull request 3 months ago
Add hf_transfer to testing image
github.com/vllm-project/vllm - mgoin opened this pull request 3 months ago
[Kernel]Generalize Speculative decode from Cuda
github.com/vllm-project/vllm - xuechendi opened this pull request 3 months ago
[Usage]: disable pydantic request validation
github.com/vllm-project/vllm - matbee-eth opened this issue 3 months ago
Splitting attention kernel file
github.com/vllm-project/vllm - maleksan85 opened this pull request 3 months ago
[Feature]: Enhance integration with advanced LB/gateways with better load/cost reporting and LoRA management
github.com/vllm-project/vllm - liu-cong opened this issue 3 months ago
[CI/Build] Automate PR body text cleanup
github.com/vllm-project/vllm - russellb opened this pull request 3 months ago
[Bug]:Structured outputs inference often took a very long time,and eventually causing a timeout and vLLM engine crushing.
github.com/vllm-project/vllm - hpx502766238 opened this issue 3 months ago
[Feature]: Add Gamma Distribution Request Support for Serving Benchmark.
github.com/vllm-project/vllm - spliii opened this issue 3 months ago
[Performance]: Throughput and Latency degradation with a single LoRA adapter on A100 40 GB
github.com/vllm-project/vllm - kaushikmitr opened this issue 3 months ago
[Core] Add dynamic chunk size calculation
github.com/vllm-project/vllm - prashantgupta24 opened this pull request 3 months ago
[Build] Fix for the Wswitch-bool clang warning
github.com/vllm-project/vllm - gshtras opened this pull request 3 months ago
[Doc] Updated TPU install instructions
github.com/vllm-project/vllm - mikegre-google opened this pull request 3 months ago
[Kernel] Refactor Cutlass c3x
github.com/vllm-project/vllm - varun-sundar-rabindranath opened this pull request 3 months ago
[Benchmark] guided decoding
github.com/vllm-project/vllm - aarnphm opened this pull request 3 months ago
[0/N] Rename `MultiModalInputs` to `MultiModalKwargs`
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 3 months ago
[Bug]: PyTorch 2.5.x vLLM 1.0.0 dev issue with tensor parallel size > 1
github.com/vllm-project/vllm - CortexEdgeUser opened this issue 3 months ago
Online video support for VLMs
github.com/vllm-project/vllm - litianjian opened this pull request 3 months ago
[Bugfix] Free cross attention block table for preempted-for-recompute sequence group.
github.com/vllm-project/vllm - kathyyu-google opened this pull request 3 months ago
Adding cascade inference to vLLM
github.com/vllm-project/vllm - raywanb opened this pull request 3 months ago
[Bug]: vLLM multi-step scheduling crashes when input prompt is long
github.com/vllm-project/vllm - Terranlee opened this issue 3 months ago
[Bugfix] Upgrade to pytorch 2.5.1
github.com/vllm-project/vllm - bnellnm opened this pull request 3 months ago
[BugFix] Do not raise a `ValueError` when `tool_choice` is set to the supported `none` option and `tools` are not defined.
github.com/vllm-project/vllm - gcalmettes opened this pull request 3 months ago
[Doc] Update VLM doc about loading from local files
github.com/vllm-project/vllm - ywang96 opened this pull request 3 months ago
[Bug]: last_token_time is equal to arrival_time
github.com/vllm-project/vllm - wolfgangsmdt opened this issue 3 months ago
[Misc] Modify BNB parameter name
github.com/vllm-project/vllm - jeejeelee opened this pull request 3 months ago
[Core] Enhance memory profiling in determine_num_available_blocks with error handling and fallback
github.com/vllm-project/vllm - Ahmed14z opened this pull request 3 months ago
[Bug]: For speculative decoding with a draft model, the "determine_num_available_blocks" only considers the memory usage of the target model
github.com/vllm-project/vllm - hustxiayang opened this issue 3 months ago
[Core] Use os.sched_yield in ShmRingBuffer instead of time.sleep
github.com/vllm-project/vllm - tlrmchlsmth opened this pull request 3 months ago
[Bug]: Segment fault when import decord before import vllm
github.com/vllm-project/vllm - litianjian opened this issue 3 months ago
[Performance]: FP8 performance worse than FP16 for Qwen2-VL-2B-Instruct
github.com/vllm-project/vllm - LinJianping opened this issue 3 months ago
[Bug]: Llama3.2 tool calling OpenAI API not working
github.com/vllm-project/vllm - SinanAkkoyun opened this issue 3 months ago
[Bug]: I cannot able to load the model on TESLA T4 GPU in Full precision
github.com/vllm-project/vllm - VpkPrasanna opened this issue 3 months ago
[Bug]: internvl “max_dynamic_patch” not work, and add_special_tokens bug
github.com/vllm-project/vllm - wangpeng138375 opened this issue 3 months ago
[Bug]: [Regression Issue] The output from Qwen2 VL are different between vLLM v0.6.3-post1 and vLLM v0.6.1-post2
github.com/vllm-project/vllm - tjtanaa opened this issue 3 months ago
[Misc]Reduce BNB static variable
github.com/vllm-project/vllm - jeejeelee opened this pull request 3 months ago
[Bug]: Deploying glm4 reported an error:"auto" tool choice requires --enable-auto-tool-choice and --tool-call-parser to be set
github.com/vllm-project/vllm - shnyyds opened this issue 3 months ago
[Usage]: Are there any batch size requirements for offline batch inference? For example, is 10,000 okay?
github.com/vllm-project/vllm - joyyyhuang opened this issue 3 months ago
[Bugfix] Fix E2EL mean and median stats
github.com/vllm-project/vllm - daitran2k1 opened this pull request 3 months ago
[5/N] pass the whole config to model
github.com/vllm-project/vllm - youkaichao opened this pull request 3 months ago
[Encoder Decoder] Update Mllama to run with both FlashAttention and XFormers
github.com/vllm-project/vllm - sroy745 opened this pull request 3 months ago
[Installation]: Model Architectures FalconMambaForCasualLM are not supported for now.
github.com/vllm-project/vllm - RohithDAces opened this issue 3 months ago
[4/N] make quant config first-class citizen
github.com/vllm-project/vllm - youkaichao opened this pull request 3 months ago
[Feature]: do you plan to support "suffix" of "v1/completions"
github.com/vllm-project/vllm - qiao-wei opened this issue 3 months ago
[Bugfix][OpenVINO] Fix circular reference #9939
github.com/vllm-project/vllm - MengqingCao opened this pull request 3 months ago
[Bugfix] Fix `MQLLMEngine` hanging
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 3 months ago
[V1] Prefix caching (take 2)
github.com/vllm-project/vllm - comaniac opened this pull request 3 months ago
[Doc] correct schema in example batch jsonl file: max_completion_tokens -> max_tokens
github.com/vllm-project/vllm - staeiou opened this pull request 3 months ago
[CI] Basic Integration Test For TPU
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 3 months ago
[Usage]: How to use `llava-hf/llava-1.5-7b-hf` with bitsandbytes quantization in `vllm serve`?
github.com/vllm-project/vllm - asadfgglie opened this issue 3 months ago
[Bug]: ValueError:Could not broadcast input array from shape (542,) into shape (512,)
github.com/vllm-project/vllm - sherlockma11 opened this issue 3 months ago
[help wanted]: fix broken xverse model
github.com/vllm-project/vllm - youkaichao opened this issue 3 months ago
[Hardware][CPU] Add ARM CPU backend
github.com/vllm-project/vllm - ShawnD200 opened this pull request 3 months ago
[BugFix]: properly deserialize `tool_calls` iterator before processing by mistral-common when MistralTokenizer is used
github.com/vllm-project/vllm - gcalmettes opened this pull request 3 months ago
[V1][VLM] Enable proper chunked prefill for multimodal models
github.com/vllm-project/vllm - ywang96 opened this pull request 3 months ago
[Bugfix] Fix Phi-3 BNB quantization with tensor parallel
github.com/vllm-project/vllm - Isotr0py opened this pull request 3 months ago
[V1] Support per-request seed
github.com/vllm-project/vllm - njhill opened this pull request 3 months ago
[Model] Adding Support for Qwen2VL as an Embedding Model. Using MrLight/dse-qwen2-2b-mrl-v1
github.com/vllm-project/vllm - FurtherAI opened this pull request 3 months ago
[Doc] Add documentation for Structured Outputs
github.com/vllm-project/vllm - ismael-dm opened this pull request 3 months ago
Bump the patch-update group with 3 updates
github.com/vllm-project/vllm - dependabot[bot] opened this pull request 3 months ago
[Core]Add New Run:ai Streamer Load format.
github.com/vllm-project/vllm - pandyamarut opened this pull request 3 months ago
[CI] Prune tests/models/decoder_only/language/* tests
github.com/vllm-project/vllm - mgoin opened this pull request 3 months ago
[Bug]: from vllm.platforms import current_platform infinite loop error with OpenVino Build.
github.com/vllm-project/vllm - CalebXDonoho opened this issue 3 months ago
[Bug]: Phi-3 cannot be used with bitsandbytes
github.com/vllm-project/vllm - yananchen1989 opened this issue 3 months ago
[CI] Prune down LM Eval test time
github.com/vllm-project/vllm - mgoin opened this pull request 3 months ago
[ci/build] Have dependabot ignore pinned dependencies
github.com/vllm-project/vllm - khluu opened this pull request 3 months ago
[CI] Prune back the number of tests in tests/kernels/*
github.com/vllm-project/vllm - mgoin opened this pull request 3 months ago
[Bugfix] Fix pickle of input when async output processing is on
github.com/vllm-project/vllm - wallashss opened this pull request 3 months ago
[misc] Allow partial prefix benchmarking & random input generation for prefix benchmarking
github.com/vllm-project/vllm - rickyyx opened this pull request 3 months ago
Doc: Improve benchmark documentation
github.com/vllm-project/vllm - rafvasq opened this pull request 3 months ago
[RFC] Propose a vulnerability management team
github.com/vllm-project/vllm - russellb opened this pull request 3 months ago