Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
vLLM
vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective: https://opencollective.com/vllm
Host: opensource
Code: https://github.com/vllm-project/vllm
Support softcap in ROCm Flash Attention
github.com/vllm-project/vllm - hliuca opened this pull request about 1 month ago
[CI/Build] Dockerfile build for ARM64 / GH200
github.com/vllm-project/vllm - drikster80 opened this pull request about 1 month ago
[Bugfix] GPU memory profiling should be per LLM instance
github.com/vllm-project/vllm - tjohnson31415 opened this pull request about 1 month ago
[Frontend] Add Command-R and Llama-3 chat template
github.com/vllm-project/vllm - ccs96307 opened this pull request about 1 month ago
[Misc] Increase default video fetch timeout
github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 1 month ago
[Bugfix] Embedding model pooling_type equals ALL and multi input's bug
github.com/vllm-project/vllm - BBuf opened this pull request about 1 month ago
[Bug]: Error when calling vLLM with audio input using Qwen/Qwen2-Audio-7B-Instruct model
github.com/vllm-project/vllm - jiahansu opened this issue about 1 month ago
[V1] Replace traversal search with lookup table
github.com/vllm-project/vllm - Abatom opened this pull request about 1 month ago
[Bugfix] Handle transformers v4.47 and fix placeholder matching in merged multi-modal processors
github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 1 month ago
Add support for reporting metrics in completion response headers in o…
github.com/vllm-project/vllm - coolkp opened this pull request about 1 month ago
[torch.compile] limit inductor threads and lazy import quant
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[Usage]: VSCode debugger is hanging
github.com/vllm-project/vllm - jeejeelee opened this issue about 1 month ago
[Bug]: vLLM CPU mode broken Unable to get JIT kernel for brgemm
github.com/vllm-project/vllm - samos123 opened this issue about 1 month ago
[Usage]: Cant use vllm on a multiGPU node
github.com/vllm-project/vllm - 4k1s opened this issue about 1 month ago
[Misc] Add multipstep chunked-prefill support for FlashInfer
github.com/vllm-project/vllm - elfiegg opened this pull request about 1 month ago
[Bugfix]: allow extra fields in requests to openai compatible server
github.com/vllm-project/vllm - gcalmettes opened this pull request about 1 month ago
[Core] Add Sliding Window Support with Flashinfer
github.com/vllm-project/vllm - pavanimajety opened this pull request about 1 month ago
[Bugfix] Fix the LoRA weight sharding in ColumnParallelLinearWithLoRA
github.com/vllm-project/vllm - jeejeelee opened this pull request about 1 month ago
[Pixtral-Large] Pixtral actually has no bias in vision-lang adapter
github.com/vllm-project/vllm - patrickvonplaten opened this pull request about 1 month ago
[misc][plugin] improve plugin loading
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[Bug]: Speculative decoding + guided decoding not working
github.com/vllm-project/vllm - arunpatala opened this issue about 1 month ago
[CI][CPU] adding numa node number as container name suffix
github.com/vllm-project/vllm - zhouyuan opened this pull request about 1 month ago
[Bug]: Input prompt (35247 tokens) is too long and exceeds limit of 1000
github.com/vllm-project/vllm - Crista23 opened this issue about 1 month ago
[Bug]: Unable to run Qwen2.5-0.5B-Instruct model in v0.6.4.post1 version, Error: No available memory for the cache blocks
github.com/vllm-project/vllm - Valdanitooooo opened this issue about 1 month ago
[Misc] Avoid misleading warning messages
github.com/vllm-project/vllm - jeejeelee opened this pull request about 1 month ago
[6/N] torch.compile rollout to users
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[ci/build] Have dependabot ignore all patch update
github.com/vllm-project/vllm - khluu opened this pull request about 1 month ago
Compressed tensors w8a8 tpu
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request about 1 month ago
[CI/Build] Update Dockerfile.rocm
github.com/vllm-project/vllm - Alexei-V-Ivanov-AMD opened this pull request about 1 month ago
Add openai.beta.chat.completions.parse example to structured_outputs.rst
github.com/vllm-project/vllm - mgoin opened this pull request about 1 month ago
[Bug]: vllm server crash when num-scheduler-steps > 1 and max_tokens=0
github.com/vllm-project/vllm - atanikan opened this issue about 1 month ago
[ci][bugfix] fix kernel tests
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[Bugfix] Guard for negative counter metrics to prevent crash
github.com/vllm-project/vllm - tjohnson31415 opened this pull request about 1 month ago
[Doc]: Pages were moved without a redirect
github.com/vllm-project/vllm - shannonxtreme opened this issue about 1 month ago
[Doc]: Migrate to Markdown
github.com/vllm-project/vllm - rafvasq opened this issue about 1 month ago
Fix open_collective value in FUNDING.yml
github.com/vllm-project/vllm - andrew opened this pull request about 1 month ago
[Doc] Update doc for LoRA support in GLM-4V
github.com/vllm-project/vllm - B-201 opened this pull request about 1 month ago
[CI/Build] Support compilation with local cutlass path (#10423)
github.com/vllm-project/vllm - wchen61 opened this pull request about 1 month ago
[Feature]: Add Support for Specifying Local CUTLASS Source Directory via Environment Variable
github.com/vllm-project/vllm - wchen61 opened this issue about 1 month ago
[Misc] Reduce medusa weight
github.com/vllm-project/vllm - skylee-01 opened this pull request about 1 month ago
Fix: Build error seen on Power Architecture
github.com/vllm-project/vllm - mikejuliet13 opened this pull request about 1 month ago
[Model][LoRA]LoRA support added for glm-4v
github.com/vllm-project/vllm - B-201 opened this pull request about 1 month ago
[Bugfix]Fix Phi-3 BNB online quantization
github.com/vllm-project/vllm - jeejeelee opened this pull request about 1 month ago
[Bug]: Encountered issues when deploying Llama-3.2-11B-Vision-Instruct for online inference.
github.com/vllm-project/vllm - CapitalLiu opened this issue about 1 month ago
[Model] Remove transformers attention porting in VITs
github.com/vllm-project/vllm - Isotr0py opened this pull request about 1 month ago
Bump the patch-update group with 2 updates
github.com/vllm-project/vllm - dependabot[bot] opened this pull request about 1 month ago
[core] Bump ray to use _overlap_gpu_communication in compiled graph tests
github.com/vllm-project/vllm - ruisearch42 opened this pull request about 1 month ago
[Bug]: (Program crashes after increasing --tensor-parallel-size) with error pynvml.NVMLError_InvalidArgument: Invalid Argument
github.com/vllm-project/vllm - JohnConnor123 opened this issue about 1 month ago
[Bug]: Deploying Qwen2vl with vllm and transformer gives inconsistent outputs for the same image
github.com/vllm-project/vllm - Apricot1225 opened this issue about 1 month ago
[5/N][torch.compile] torch.jit.script --> torch.compile
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[Model][Bugfix] Support TP for PixtralHF ViT
github.com/vllm-project/vllm - mgoin opened this pull request about 1 month ago
[platforms] refactor cpu code
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[4/N][torch.compile] clean up set_torch_compile_backend
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
Support Cross encoder models
github.com/vllm-project/vllm - maxdebayser opened this pull request about 1 month ago
[3/N][torch.compile] consolidate custom op logging
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[BugFix] Fix hermes tool parser output error stream arguments in some cases (#10395)
github.com/vllm-project/vllm - xiyuan-lee opened this pull request about 1 month ago
[V1] Add code owners for V1
github.com/vllm-project/vllm - WoosukKwon opened this pull request about 1 month ago
[BugFix] Fix hermes tool parser output error stream arguments in some cases
github.com/vllm-project/vllm - xiyuan-lee opened this pull request about 1 month ago
[Bug]: Hermes tool parser output error stream arguments in some cases.
github.com/vllm-project/vllm - xiyuan-lee opened this issue about 1 month ago
[Bugfix][Hardware][CPU] Fix CPU embedding runner with tensor parallel
github.com/vllm-project/vllm - Isotr0py opened this pull request about 1 month ago
[Misc] Enhance offline_inference to support user-configurable paramet…
github.com/vllm-project/vllm - wchen61 opened this pull request about 1 month ago
[Feature]: Enhance offline_inference.py with Configurable Parameters for Greater Flexibility
github.com/vllm-project/vllm - wchen61 opened this issue about 1 month ago
Add ngram speculation to API
github.com/vllm-project/vllm - flozi00 opened this pull request about 1 month ago
[Bug]: v0.6.4.post1 crashed:Error in model execution: CUDA error: an illegal memory access was encountered
github.com/vllm-project/vllm - wciq1208 opened this issue about 1 month ago
[Bugfix] Fix M-RoPE position calculation when chunked prefill is enabled
github.com/vllm-project/vllm - imkero opened this pull request about 1 month ago
[Misc]: Ask for the roadmap of async output processing support for speculative decoding
github.com/vllm-project/vllm - Lin-Qingyang-Alec opened this issue about 1 month ago
[misc][plugin] improve log messages
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[BugFix] [Kernel] Fix GPU SEGV occuring in fused_moe kernel
github.com/vllm-project/vllm - rasmith opened this pull request about 1 month ago
[CI/Build] Fix IDC hpu [Device not found] issue
github.com/vllm-project/vllm - xuechendi opened this pull request about 1 month ago
[2/N][torch.compile] make compilation cfg part of vllm cfg
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[v1] V1EngineArgs for better config handling
github.com/vllm-project/vllm - rickyyx opened this pull request about 1 month ago
[BugFix] [Kernel] Fix GPU SEGV occuring in fused_moe kernel
github.com/vllm-project/vllm - rasmith opened this pull request about 1 month ago
Fix integer overflow causing gpu segfault
github.com/vllm-project/vllm - rasmith opened this pull request about 1 month ago
[Bug]: Granite 3.0 disconnect between parser and example template
github.com/vllm-project/vllm - wilbry opened this issue about 1 month ago
Test k8s agent
github.com/vllm-project/vllm - dhonnappa-amd opened this pull request about 1 month ago
[Feature]: NVIDIA Triton GenAI Perf Benchmark
github.com/vllm-project/vllm - simon-mo opened this issue about 1 month ago
[Bug]: Guided Decoding Broken in Streaming mode
github.com/vllm-project/vllm - JC1DA opened this issue about 1 month ago
[Bugfix] Ignore ray reinit error when current platform is ROCm or XPU
github.com/vllm-project/vllm - HollowMan6 opened this pull request about 1 month ago
[V1] Refactor model executable interface for all text-only language models
github.com/vllm-project/vllm - ywang96 opened this pull request about 1 month ago
[Bug]: VLM benchmark_serving request not working
github.com/vllm-project/vllm - gracehonv opened this issue about 1 month ago
[doc] add doc for the plugin system
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[Doc] Add the start of an arch overview page
github.com/vllm-project/vllm - russellb opened this pull request about 1 month ago
[CI/Build] Add sphinx/rst linter for docs
github.com/vllm-project/vllm - rafvasq opened this pull request about 1 month ago
[Misc] Medusa supports custom bias
github.com/vllm-project/vllm - skylee-01 opened this pull request about 1 month ago
[Bug]: contine generation but do not return the output
github.com/vllm-project/vllm - siyuyuan opened this issue about 1 month ago
[Platform][Refactor] Extract func `get_default_attn_backend` to `Platform`
github.com/vllm-project/vllm - MengqingCao opened this pull request about 1 month ago
[Hardware][CPU] Support chunked-prefill and prefix-caching on CPU
github.com/vllm-project/vllm - bigPYJ1151 opened this pull request about 1 month ago
Add KV-Cache int8 quant support
github.com/vllm-project/vllm - YanyunDuanIEI opened this pull request about 1 month ago
[Core] Interface for accessing model from engine
github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 1 month ago
[Bugfix] Fix fully sharded LoRA bug
github.com/vllm-project/vllm - jeejeelee opened this pull request about 1 month ago
[Misc] Consolidate pooler config overrides
github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 1 month ago
[Bugfix] Qwen-vl output is inconsistent in speculative decoding
github.com/vllm-project/vllm - skylee-01 opened this pull request about 1 month ago
[Misc] Fix import error in tensorizer tests and cleanup some code
github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 1 month ago
[Doc] Remove float32 choice from --lora-dtype
github.com/vllm-project/vllm - xyang16 opened this pull request about 1 month ago
Add default value to avoid Falcon crash (#5363)
github.com/vllm-project/vllm - wchen61 opened this pull request about 1 month ago
[DRAFT] Cutlass 2:4
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request about 1 month ago
[Usage]: cuda oom when serving multi task on same server
github.com/vllm-project/vllm - reneix opened this issue about 1 month ago
[Misc]: Snowflake Arctic out of memory error with TP-8
github.com/vllm-project/vllm - rajagond opened this issue about 1 month ago
[Feature]: Allow head_size smaller than 128 on TPU with Pallas backend
github.com/vllm-project/vllm - manninglucas opened this issue about 1 month ago