Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
vLLM
vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective: https://opencollective.com/vllm (Host: opensource)
Code: https://github.com/vllm-project/vllm
[Usage]: guided_regex in offline model
github.com/vllm-project/vllm - RonanKMcGovern opened this issue 21 days ago
[Model][VLM] Add multi-video support for LLaVA-Onevision
github.com/vllm-project/vllm - litianjian opened this pull request 21 days ago
[CI/Build] setuptools-scm fixes
github.com/vllm-project/vllm - dtrifiro opened this pull request 21 days ago
[Performance]: Talk about the model parallelism
github.com/vllm-project/vllm - baifanxxx opened this issue 21 days ago
[Hardware][intel GPU] add async output process for xpu
github.com/vllm-project/vllm - jikunshang opened this pull request 21 days ago
[Model] Support Qwen2.5-Math-RM-72B
github.com/vllm-project/vllm - zhuzilin opened this pull request 21 days ago
[Bug]: AssertionError When deploy API serve of Qwen2-VL-72B
github.com/vllm-project/vllm - niuwa2333 opened this issue 21 days ago
[Bug]: RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
github.com/vllm-project/vllm - Hothan01 opened this issue 21 days ago
[Bugfix] Support testing prefill throughput with benchmark_serving.py --hf-output-len 1
github.com/vllm-project/vllm - heheda12345 opened this pull request 21 days ago
[Bug]: Variance Between Multiple Prefix Cache Example runs
github.com/vllm-project/vllm - Imss27 opened this issue 21 days ago
[Bugfix] Fix PP for Multi-Step
github.com/vllm-project/vllm - varun-sundar-rabindranath opened this pull request 21 days ago
[Bugfix] Fix multi nodes TP+PP for XPU
github.com/vllm-project/vllm - yma11 opened this pull request 21 days ago
[Bug]: assert len(self._async_stopped) == 0
github.com/vllm-project/vllm - sfc-gh-zhwang opened this issue 21 days ago
[Usage]: OOM when using Llama-3.2-11B-Vision-Instruct
github.com/vllm-project/vllm - hrson-1203 opened this issue 21 days ago
[Installation]: Cannot compile flash attention when building from source
github.com/vllm-project/vllm - KevinRSX opened this issue 21 days ago
[CI/Build] Update models tests & examples
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 22 days ago
[BugFix] Fix seeded random sampling with encoder-decoder models
github.com/vllm-project/vllm - njhill opened this pull request 22 days ago
[Bugfix] Fix print_warning_once's line info
github.com/vllm-project/vllm - tlrmchlsmth opened this pull request 22 days ago
[Core] Refactor GGUF parameters packing and forwarding
github.com/vllm-project/vllm - Isotr0py opened this pull request 22 days ago
support input embeddings for qwen2vl
github.com/vllm-project/vllm - whyiug opened this pull request 22 days ago
[Installation]: Cannot install with Poetry
github.com/vllm-project/vllm - LLIALLIJLblK opened this issue 22 days ago
[CI/Build] Per file CUDA Archs (improve wheel size and dev build times)
github.com/vllm-project/vllm - LucasWilkinson opened this pull request 22 days ago
[Bug]: exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
github.com/vllm-project/vllm - ZHUHF123 opened this issue 22 days ago
[Spec Decode] (1/2) Remove batch expansion
github.com/vllm-project/vllm - LiuXiaoxuanPKU opened this pull request 22 days ago
[Misc] Update config loading for Qwen2-VL and remove Granite
github.com/vllm-project/vllm - ywang96 opened this pull request 22 days ago
[CI/Build] Fix/skip failed tests due to `transformers` v4.45.0 release
github.com/vllm-project/vllm - ywang96 opened this pull request 22 days ago
Llama3.2 Vision Model: Guides and Issues
github.com/vllm-project/vllm - simon-mo opened this issue 23 days ago
[Bugfix] Block manager v2 with preemption and lookahead slots
github.com/vllm-project/vllm - sroy745 opened this pull request 23 days ago
[Core] Improve choice of Python multiprocessing method
github.com/vllm-project/vllm - russellb opened this pull request 23 days ago
[Bug]: Later version have degradation based on `vllm:time_to_first_token_seconds_sum` metric
github.com/vllm-project/vllm - oandreeva-nv opened this issue 23 days ago
[New Model]: allenai/Molmo-7B-0-0924 VisionLM
github.com/vllm-project/vllm - K-Mistele opened this issue 23 days ago
[Bug]: Decrease generation quality Mixtral
github.com/vllm-project/vllm - thies1006 opened this issue 23 days ago
[Core] Combined support for multi-step scheduling, chunked prefill & prefix caching
github.com/vllm-project/vllm - afeldman-nm opened this pull request 23 days ago
[Misc]: Strange `leaked shared_memory` warnings reported by multiprocessing when using vLLM
github.com/vllm-project/vllm - shaoyuyoung opened this issue 23 days ago
[Feature]: LoRA support for Pixtral
github.com/vllm-project/vllm - spring-anth opened this issue 23 days ago
[Kernel] Enable BFloat16 inputs in fused Marlin MoE kernels
github.com/vllm-project/vllm - ElizaWszola opened this pull request 23 days ago
[Bugfix] Fix bug in convert_fp8
github.com/vllm-project/vllm - CharlesRiggins opened this pull request 23 days ago
[ci] Add CODEOWNERS for test directories
github.com/vllm-project/vllm - khluu opened this pull request 23 days ago
[Bug]: Port binding failure when using pp > 1 after commit 7c7714d856eee6fa94aade729b67f00584f72a4c
github.com/vllm-project/vllm - dengminhao opened this issue 23 days ago
[Bugfix] load fc bias from config for eagle
github.com/vllm-project/vllm - sohamparikh opened this pull request 23 days ago
[Bugfix] load bias from config for Eagle model
github.com/vllm-project/vllm - sohampnow opened this pull request 23 days ago
[Installation]: Installing vLLM on ROCm - Distro:Gentoo
github.com/vllm-project/vllm - rohitnanda1443 opened this issue 23 days ago
Add RWKV v5 (Eagle) support
github.com/vllm-project/vllm - harrisonvanderbyl opened this pull request 23 days ago
[Tracking Issue][Help Wanted]: FlashInfer backend improvements
github.com/vllm-project/vllm - comaniac opened this issue 23 days ago
[Int4-AWQ] Fix ROCm AWQ Marlin check
github.com/vllm-project/vllm - hegemanjw4amd opened this pull request 23 days ago
[Bug]: Disabling Marlin by setting --quantization gptq doesn't work when using a draft model
github.com/vllm-project/vllm - cduk opened this issue 24 days ago
[Bug]: Decode n tokens gives different output for first seq position compared to decode 1 token
github.com/vllm-project/vllm - 0amp opened this issue 24 days ago
[RFC]: Add Goodput Metric to Benchmark Serving
github.com/vllm-project/vllm - Imss27 opened this issue 24 days ago
[Bugfix] No num_gpus for ROCm and XPU when connecting to a ray cluster
github.com/vllm-project/vllm - HollowMan6 opened this pull request 24 days ago
Fix test_schedule_swapped_simple in test_scheduler.py
github.com/vllm-project/vllm - sroy745 opened this pull request 24 days ago
[Bug]: LLMEngine cannot be pickled error vllm 0.6.1.post2
github.com/vllm-project/vllm - stikkireddy opened this issue 24 days ago
[DoNotMerge] [CI] [Documentation][ROCm] ci fix and doc update
github.com/vllm-project/vllm - hongxiayang opened this pull request 24 days ago
[Bugfix][Kernel] Implement acquire/release polyfill for Pascal
github.com/vllm-project/vllm - sasha0552 opened this pull request 24 days ago
[CI/Build] migrate project metadata from setup.py to pyproject.toml
github.com/vllm-project/vllm - dtrifiro opened this pull request 24 days ago
[CI/Build] fix setuptools-scm usage
github.com/vllm-project/vllm - dtrifiro opened this pull request 24 days ago
[Hardware][CPU] Enable mrope and support Qwen2-VL on CPU backend
github.com/vllm-project/vllm - Isotr0py opened this pull request 24 days ago
[Usage]: Total generated tokens in benchmarking script
github.com/vllm-project/vllm - double-vin opened this issue 24 days ago
[Misc] Upgrade bitsandbytes to the latest version 0.44.0
github.com/vllm-project/vllm - jeejeelee opened this pull request 24 days ago
[Bugfix] Ray 2.9.x doesn't expose available_resources_per_node
github.com/vllm-project/vllm - darthhexx opened this pull request 24 days ago
[misc] soft drop beam search
github.com/vllm-project/vllm - youkaichao opened this pull request 24 days ago
[Frontend] MQLLMEngine supports profiling.
github.com/vllm-project/vllm - abatom opened this pull request 24 days ago
[Core] Rename `PromptInputs` and `inputs`, with backwards compatibility
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 24 days ago
[CI/Build] Add examples folder into Docker image so that we can leverage the templates*.jinja when serving models
github.com/vllm-project/vllm - panpan0000 opened this pull request 24 days ago
[Bug]: use cpu_offload_gb in gguf failed.
github.com/vllm-project/vllm - Minami-su opened this issue 24 days ago
[Bug]: OLMoForCausalLM not supported
github.com/vllm-project/vllm - sert121 opened this issue 24 days ago
Fix tests in test_chunked_prefill_scheduler which fail with BlockManager V2
github.com/vllm-project/vllm - sroy745 opened this pull request 24 days ago
[Kernel][Quantization] Custom Floating-Point Runtime Quantization
github.com/vllm-project/vllm - AlpinDale opened this pull request 24 days ago
[Bugfix] Fix torch dynamo fixes caused by `replace_parameters`
github.com/vllm-project/vllm - LucasWilkinson opened this pull request 25 days ago
[Bug]: OLMoE produces incorrect output with TP>1
github.com/vllm-project/vllm - mgoin opened this issue 25 days ago
[Hardware][Neuron] Add on-device sampling support for Neuron
github.com/vllm-project/vllm - chongmni-aws opened this pull request 25 days ago
Why is the bitsandbytes model significantly slower than the AWQ model?
github.com/vllm-project/vllm - hahmad2008 opened this issue 25 days ago
[Feature]: Support Inference Overrides for mm_processor_kwargs
github.com/vllm-project/vllm - alex-jw-brooks opened this issue 25 days ago
[Bugfix] Fix Marlin MoE act order when is_k_full == False
github.com/vllm-project/vllm - ElizaWszola opened this pull request 25 days ago
[Bug]: stuck at "generating GPU P2P access cache in /home/luban/.cache/vllm/gpu_p2p_access_cache_for_0,1.json"
github.com/vllm-project/vllm - immusferr opened this issue 25 days ago
[Misc] Add conftest plugin for applying forking decorator
github.com/vllm-project/vllm - kevin314 opened this pull request 25 days ago
[Misc]: Unit test failures with BlockManager v2
github.com/vllm-project/vllm - sroy745 opened this issue 26 days ago
[Core][VLM] Test registration for OOT multimodal models
github.com/vllm-project/vllm - ywang96 opened this pull request 26 days ago
[Kernel][Hardware][AMD][ROCm] Fix rocm/attention.cu compilation on ROCm 6.0.3
github.com/vllm-project/vllm - HollowMan6 opened this pull request 26 days ago
[Bugfix] fix tool_parser error handling when serving a model that does not support it
github.com/vllm-project/vllm - liuyanyi opened this pull request 26 days ago
Feature 'f16 arithemetic and compare instructions' requires .target sm_53 or higher
github.com/vllm-project/vllm - shahizat opened this issue 26 days ago
Fix paligemma, fuyu and persimmon with transformers 4.45 : use config.text_config.vocab_size
github.com/vllm-project/vllm - janimo opened this pull request 26 days ago
[Usage]: Is there any difference between max_tokens and max_model_len?
github.com/vllm-project/vllm - DankoZhang opened this issue 26 days ago
[Misc] Use NamedTuple in Multi-image example
github.com/vllm-project/vllm - alex-jw-brooks opened this pull request 26 days ago
[Core] Deprecating block manager v1 and make block manager v2 default
github.com/vllm-project/vllm - KuntaiDu opened this pull request 26 days ago
[MISC] rename CudaMemoryProfiler to DeviceMemoryProfiler
github.com/vllm-project/vllm - statelesshz opened this pull request 26 days ago
[Bugfix] Avoid some bogus messages RE CUTLASS's revision when building
github.com/vllm-project/vllm - tlrmchlsmth opened this pull request 26 days ago
[SpecDec][Misc] Cleanup, remove bonus token logic.
github.com/vllm-project/vllm - LiuXiaoxuanPKU opened this pull request 26 days ago
[Feature]: Support for Seq classification/Reward models
github.com/vllm-project/vllm - ariaattar opened this issue 27 days ago
[ci][build] fix vllm-flash-attn
github.com/vllm-project/vllm - youkaichao opened this pull request 27 days ago
[Bug]: Low throughput on AMD MI250 using Llama 3.1 (6 toks/s)
github.com/vllm-project/vllm - huberemanuel opened this issue 27 days ago
[Bug]: AssertionError when loading Qwen 2.5 GGUF q3 model in vLLM
github.com/vllm-project/vllm - frei-x opened this issue 27 days ago
[Model] Support pp for qwen2-vl
github.com/vllm-project/vllm - liuyanyi opened this pull request 27 days ago
[Core] Enable Memory Tiering for vLLM
github.com/vllm-project/vllm - PanJason opened this pull request 27 days ago
[Bug]: Pixtral-12B not supported on CPU
github.com/vllm-project/vllm - joelimgu opened this issue 27 days ago
[MISC] Support multi node inference with Neuron
github.com/vllm-project/vllm - sssrijan-amazon opened this pull request 27 days ago