Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
vLLM
vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective: https://opencollective.com/vllm (host: opensource)
Code: https://github.com/vllm-project/vllm
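Many of the items below concern vLLM's OpenAI-compatible serving endpoints. As a minimal, hedged sketch (the port and model id are placeholder assumptions, not taken from this feed; the server would be started separately, e.g. with vLLM's OpenAI API server entrypoint), a chat-completion request body looks like:

```python
import json

# Sketch of a request payload for vLLM's OpenAI-compatible server.
# Assumptions (not from this page): the server listens on localhost:8000
# and the model id below is a placeholder.
url = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "placeholder-model-id",          # hypothetical model id
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.0,                       # greedy-like sampling
    "max_tokens": 64,
}

# Serialize the body as it would be POSTed with Content-Type: application/json.
body = json.dumps(payload)
print(body)
```

The same payload shape works with any OpenAI-compatible client pointed at the vLLM server's base URL.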
[vlm] Remove vision language config.
github.com/vllm-project/vllm - xwjiang2010 opened this pull request 4 months ago
[ Misc ] Expand Fp8 MoE Support to Qwen
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 4 months ago
[Usage]: Why is usage information missing in streaming calls when non-streaming calls include it?
github.com/vllm-project/vllm - alfgo opened this issue 4 months ago
[ Misc ] Refactor Marlin Python Utilities
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 4 months ago
[Kernel][Attention] Separate `Attention.kv_scale` into `k_scale` and `v_scale`
github.com/vllm-project/vllm - mgoin opened this pull request 4 months ago
[Feature]: Add readiness endpoint /ready and return /health earlier (vLLM on Kubernetes)
github.com/vllm-project/vllm - frittentheke opened this issue 4 months ago
[Bug]: Loading LoRA is super slow when using tensor parallel
github.com/vllm-project/vllm - markovalexander opened this issue 4 months ago
[Gemma 2 27B]: Update docker hub image to support gemma-2-27B-it
github.com/vllm-project/vllm - vipulgote1999 opened this issue 4 months ago
[Usage]: how to initialize gemma2-27b with 4-bit quantization?
github.com/vllm-project/vllm - maxin9966 opened this issue 4 months ago
[Feature]: support Ascend 910B in the future
github.com/vllm-project/vllm - jkl375 opened this issue 4 months ago
[Bug]: benchmark_serving.py cannot calculate Median TTFT correctly
github.com/vllm-project/vllm - Sekri0 opened this issue 4 months ago
[Installation]: how to disable NCCL support on Jetson devices
github.com/vllm-project/vllm - thunder95 opened this issue 4 months ago
[Bug]: ValidationError using langchain_community.llms.VLLM
github.com/vllm-project/vllm - santurini opened this issue 4 months ago
[Doc] Reinstate doc dependencies
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 4 months ago
[Bug]: Garbled tokens appear in vLLM generation results every time a new LLM model is loaded (Qwen)
github.com/vllm-project/vllm - Jason-csc opened this issue 4 months ago
[Usage]: How to use beam search when request OpenAI Completions API
github.com/vllm-project/vllm - nguyenhoanganh2002 opened this issue 4 months ago
[Usage]: How to use --pipeline-parallel-size
github.com/vllm-project/vllm - XiaoYu2022 opened this issue 4 months ago
[Kernel] Unify the kernel used in flash attention backend
github.com/vllm-project/vllm - LiuXiaoxuanPKU opened this pull request 4 months ago
[Kernel][Model] logits_soft_cap for Gemma2 with flashinfer
github.com/vllm-project/vllm - LiuXiaoxuanPKU opened this pull request 4 months ago
Benchmark: add H100 suite
github.com/vllm-project/vllm - simon-mo opened this pull request 4 months ago
[Bug]: call for stack trace for "Watchdog caught collective operation timeout"
github.com/vllm-project/vllm - youkaichao opened this issue 4 months ago
[Kernel] Support Microsoft Runtime Kernel Lib for our Low Precision Computation
github.com/vllm-project/vllm - LeiWang1999 opened this pull request 4 months ago
[Bugfix] Make spec. decode respect per-request seed.
github.com/vllm-project/vllm - tdoublep opened this pull request 4 months ago
[Core] Introduce SPMD worker execution using Ray accelerated DAG
github.com/vllm-project/vllm - ruisearch42 opened this pull request 4 months ago
Support for quantized kv cache (compressed-tensors)
github.com/vllm-project/vllm - dbogunowicz opened this pull request 4 months ago
[Bug]: Producer process has been terminated before all shared CUDA tensors released (v 0.5.0 post1, v 0.4.3)
github.com/vllm-project/vllm - yaronr opened this issue 4 months ago
[Bug]: There are differences in the output results of the same prompt between vllm offline and online calls
github.com/vllm-project/vllm - ArlanCooper opened this issue 4 months ago
[New Model]: facebook/seamless-m4t-v2-large
github.com/vllm-project/vllm - frittentheke opened this issue 4 months ago
[ci][distributed] add distributed test gptq_marlin with tp = 2
github.com/vllm-project/vllm - llmpros opened this pull request 4 months ago
[Hardware][Intel CPU] Adding intel openmp tunings in Docker file
github.com/vllm-project/vllm - zhouyuan opened this pull request 4 months ago
[Feature][Hardware][AMD] Enable Scaled FP8 GEMM on ROCm
github.com/vllm-project/vllm - HaiShaw opened this pull request 4 months ago
[ci][distributed] fix some cuda init that makes it necessary to use spawn
github.com/vllm-project/vllm - youkaichao opened this pull request 4 months ago
[ci][distributed] fix phi-3v test failure
github.com/vllm-project/vllm - youkaichao opened this pull request 4 months ago
[CI/Build] Temporarily Remove Phi3-Vision from TP Test
github.com/vllm-project/vllm - ywang96 opened this pull request 4 months ago
[CI/Build] Reuse code for checking output consistency
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 4 months ago
[BugFix] Ensure worker model loop is always stopped at the right time
github.com/vllm-project/vllm - njhill opened this pull request 4 months ago
[Frontend] Bad words sampling parameter
github.com/vllm-project/vllm - Alvant opened this pull request 4 months ago
[New Model]: support for BartForSequenceClassification
github.com/vllm-project/vllm - Sapessii opened this issue 4 months ago
[Bug]: TypeError: FlashAttentionMetadata.__init__() missing 10 required positional arguments
github.com/vllm-project/vllm - lonngxiang opened this issue 4 months ago
[Bug]: AttributeError: 'NoneType' object has no attribute 'prefill_metadata'
github.com/vllm-project/vllm - lonngxiang opened this issue 4 months ago
[Misc] Update Phi-3-Vision Example
github.com/vllm-project/vllm - ywang96 opened this pull request 4 months ago
[wip][Core] Introduce SPMD worker execution using Ray accelerated DAG
github.com/vllm-project/vllm - stephanie-wang opened this pull request 4 months ago
[misc][doc] try to add warning for latest html
github.com/vllm-project/vllm - youkaichao opened this pull request 4 months ago
[Bugfix][TPU] Fix TPU sampler output
github.com/vllm-project/vllm - WoosukKwon opened this pull request 4 months ago
[Bugfix][TPU] Fix pad slot id
github.com/vllm-project/vllm - WoosukKwon opened this pull request 4 months ago
[Kernel] Expand FP8 support to Ampere GPUs using FP8 Marlin
github.com/vllm-project/vllm - mgoin opened this pull request 4 months ago
[Core] Optimize `SequenceStatus.is_finished` by switching to IntEnum
github.com/vllm-project/vllm - Yard1 opened this pull request 4 months ago
[Draft][Core] Refactor _prepare_model_input_tensors
github.com/vllm-project/vllm - comaniac opened this pull request 4 months ago
[Misc] Fix `get_min_capability`
github.com/vllm-project/vllm - dsikka opened this pull request 4 months ago
[ Misc ] Isolate Fp8Moe From Mixtral
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 4 months ago
[Bug]: Phi-3 vision crash: TypeError: only integer tensors of a single element can be converted to an index
github.com/vllm-project/vllm - pseudotensor opened this issue 4 months ago
[misc][optimization] optimize data structure in allocator
github.com/vllm-project/vllm - youkaichao opened this pull request 4 months ago
[CI/Build] [3/3] Reorganize entrypoints tests
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 4 months ago
[Model] Changes to MLPSpeculator to support tie_weights and input_scale
github.com/vllm-project/vllm - tdoublep opened this pull request 4 months ago
[Bugfix] Fix Engine Failing After Invalid Request - AsyncEngineDeadError
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 4 months ago
Unmark more files as executable
github.com/vllm-project/vllm - tlrmchlsmth opened this pull request 4 months ago
[Bug]: vLLM crash when running Phi-3-small-8k-instruct with enable-chunked-prefill
github.com/vllm-project/vllm - yaronr opened this issue 4 months ago
[Core] Adding Priority Scheduling
github.com/vllm-project/vllm - apatke opened this pull request 4 months ago
[Bug]: qwen1.5-32b-chat no response
github.com/vllm-project/vllm - linpan opened this issue 4 months ago
Add support for multi-node on CI
github.com/vllm-project/vllm - khluu opened this pull request 4 months ago
[Bugfix] Support `eos_token_id` from `config.json`
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 4 months ago
[Bug]: Chunked prefill vs. non-chunked output is different for a long prompt
github.com/vllm-project/vllm - felixzhu555 opened this issue 4 months ago
[Bugfix][CI/Build][Hardware][AMD] Install matching torchvision to fix AMD tests
github.com/vllm-project/vllm - mawong-amd opened this pull request 4 months ago
add FAQ doc under 'serving'
github.com/vllm-project/vllm - llmpros opened this pull request 4 months ago
[Usage]: can I save log to a file?
github.com/vllm-project/vllm - chenchunhui97 opened this issue 4 months ago
[Hardware][Intel CPU] Use ipex varlen attention to compute prompts for better performance
github.com/vllm-project/vllm - jikunshang opened this pull request 4 months ago
[core][optimization] use a pool of numpy ndarray to hold seq data
github.com/vllm-project/vllm - youkaichao opened this pull request 4 months ago
[Kernel] Add per-tensor and per-token AZP epilogues
github.com/vllm-project/vllm - ProExpertProg opened this pull request 4 months ago
[ Misc ] Refactor w8a8 to use `process_weights_after_load` (Simplify Weight Loading)
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 4 months ago
[Kernel] Raise an exception in MoE kernel if the batch size is larger than 65k
github.com/vllm-project/vllm - comaniac opened this pull request 4 months ago
Virtual Office Hours: July 9 and July 25
github.com/vllm-project/vllm - mgoin opened this issue 4 months ago
[Bugfix] Only add `Attention.kv_scale` if kv cache quantization is enabled
github.com/vllm-project/vllm - mgoin opened this pull request 4 months ago
[Frontend] OpenAI base64 embedding: remove the message blocker for base64 embedding
github.com/vllm-project/vllm - llmpros opened this pull request 4 months ago
[Kernel] Add punica dimensions for Granite 3b and 8b
github.com/vllm-project/vllm - joerunde opened this pull request 4 months ago
[Bugfix] fix missing last itl in openai completions benchmark
github.com/vllm-project/vllm - mcalman opened this pull request 4 months ago
[Misc] Extend vLLM Metrics logging API
github.com/vllm-project/vllm - SolitaryThinker opened this pull request 4 months ago
[Frontend] Support for chat completions input in the tokenize endpoint
github.com/vllm-project/vllm - sasha0552 opened this pull request 4 months ago
[ Bugfix ] Enabling Loading Models With Fused QKV/MLP on Disk with FP8
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 4 months ago
[Model] Initial support for BLIP-2
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 4 months ago
[Kernel] Prototype integration of bytedance/flux kernels
github.com/vllm-project/vllm - tlrmchlsmth opened this pull request 4 months ago
[Bug]: FP8 checkpoints with fused linear modules fail to load scales correctly
github.com/vllm-project/vllm - mgoin opened this issue 4 months ago
[Bugfix] Fix precisions in Gemma 1
github.com/vllm-project/vllm - WoosukKwon opened this pull request 4 months ago
[Lora] Use safetensor keys instead of adapter_config.json to find unexpected modules.
github.com/vllm-project/vllm - rkooo567 opened this pull request 4 months ago
[Bug]: TRACKING ISSUE: CUDA OOM with Logprobs
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this issue 4 months ago
[Bug]: TRACKING ISSUE: `AsyncEngineDeadError`
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this issue 4 months ago
[Bug]: Inconsistent responses with vLLM when batch size > 1, even with temperature = 0
github.com/vllm-project/vllm - gjgjos opened this issue 4 months ago