Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
vLLM
vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective - Host: opensource - https://opencollective.com/vllm - Code: https://github.com/vllm-project/vllm
[Doc]: AutoAWQ quantization example fails
github.com/vllm-project/vllm - stas00 opened this issue 2 months ago
[Frontend] Improve Startup Failure UX
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 2 months ago
[Multi-step] Remove redundant CPU to GPU transfer for non-last rank PP/TP
github.com/vllm-project/vllm - SolitaryThinker opened this pull request 2 months ago
[Bug]: Unable to use fp8 kv cache with chunked prefill on ampere
github.com/vllm-project/vllm - w013nad opened this issue 2 months ago
[Hardware][Intel GPU] refactor xpu_model_runner, fix xpu tensor parallel
github.com/vllm-project/vllm - jikunshang opened this pull request 2 months ago
[Model] Fix Phi-3.5-vision-instruct 'num_crops' issue
github.com/vllm-project/vllm - zifeitong opened this pull request 2 months ago
Fix ShardedStateLoader for vllm fp8 quantization
github.com/vllm-project/vllm - sfc-gh-zhwang opened this pull request 2 months ago
[Spec Decoding] Use target model max length as default for draft model
github.com/vllm-project/vllm - njhill opened this pull request 2 months ago
[ci] Cleanup & refactor Dockerfile to pass different Python versions and sccache bucket
github.com/vllm-project/vllm - khluu opened this pull request 2 months ago
[Kernel] (2/N) Machete - Integrate into CompressedTensorsWNA16 and GPTQMarlin
github.com/vllm-project/vllm - LucasWilkinson opened this pull request 2 months ago
[Usage]: how to abort request?
github.com/vllm-project/vllm - TaxiF-D opened this issue 2 months ago
[Bug]: vLLM inconsistently crashes on startup for multinode cluster
github.com/vllm-project/vllm - jgreer013 opened this issue 2 months ago
[BugFix] Avoid premature async generator exit and raise all exception variations
github.com/vllm-project/vllm - njhill opened this pull request 2 months ago
[Kernel] Support prefill and decode attention kernel in parallel
github.com/vllm-project/vllm - cassiewilliam opened this pull request 2 months ago
[Usage]: Periodic snapshots for spot instances
github.com/vllm-project/vllm - ma9o opened this issue 2 months ago
[RFC]: Add Ascend NPU as a new backend
github.com/vllm-project/vllm - wangshuai09 opened this issue 2 months ago
[WIP][Model][Kernel][Bugfix] Commits for new MSFT PhiMoE model
github.com/vllm-project/vllm - wenxcs opened this pull request 2 months ago
[Feature]: Overlap model weight loading and model prefill
github.com/vllm-project/vllm - candyzone opened this issue 2 months ago
[Usage]: Qwen2 GGUF model can't run successfully
github.com/vllm-project/vllm - QB-Chen opened this issue 2 months ago
[OpenVINO] Updated documentation
github.com/vllm-project/vllm - ilya-lavrenov opened this pull request 2 months ago
[Hardware] [Intel GPU] refactor xpu worker/executor
github.com/vllm-project/vllm - jikunshang opened this pull request 2 months ago
[Intel GPU] fix xpu not support punica kernel (which use torch.library.custom_op)
github.com/vllm-project/vllm - jikunshang opened this pull request 2 months ago
[Feature]: Pipeline Parallelism support for the Vision Language Models
github.com/vllm-project/vllm - Manikandan-Thangaraj-ZS0321 opened this issue 2 months ago
[Misc]: how to add tests for new backends?
github.com/vllm-project/vllm - ilya-lavrenov opened this issue 2 months ago
[Bug]: when `echo=True`, vllm will append chat template(`assistant`) after the last message
github.com/vllm-project/vllm - DIYer22 opened this issue 2 months ago
[Usage, bug]: vLLM Docker | ValueError: OpenTelemetry packages must be installed before configuring 'otlp_traces_endpoint' during vLLM startup
github.com/vllm-project/vllm - vipulgote1999 opened this issue 2 months ago
[Feature]: Please Support FATRelu
github.com/vllm-project/vllm - ForcewithMe66 opened this issue 2 months ago
[Usage]: how to use guided_decoding_backend?
github.com/vllm-project/vllm - estuday opened this issue 2 months ago
[misc][cuda] add warning for pynvml user
github.com/vllm-project/vllm - youkaichao opened this pull request 2 months ago
[misc] add nvidia related library in collect env
github.com/vllm-project/vllm - youkaichao opened this pull request 2 months ago
[Core] Refactor executor classes to make it easier to inherit GPUExecutor
github.com/vllm-project/vllm - jikunshang opened this pull request 2 months ago
[Model][Bugfix] Add glm-4v Model and Fix bnb Quantization Issue
github.com/vllm-project/vllm - alexw994 opened this pull request 2 months ago
[Bug]: Streaming API: Abort functionality not working as expected
github.com/vllm-project/vllm - jonzhep opened this issue 2 months ago
[XPU] fallback to native implementation for xpu custom op
github.com/vllm-project/vllm - jianyizh opened this pull request 2 months ago
[Bug]: Mismatch in the number of image tokens and placeholders during batch inference
github.com/vllm-project/vllm - sayanbiswas59 opened this issue 2 months ago
[Bug]: when using llama-3.1-70b-instruct for inference, input with large number of tokens(>8k) will result in endless output
github.com/vllm-project/vllm - YinSonglin1997 opened this issue 2 months ago
[ci] Install Buildkite test suite analysis
github.com/vllm-project/vllm - khluu opened this pull request 2 months ago
[Frontend][Core] Move logits processor construction to engine
github.com/vllm-project/vllm - joerunde opened this pull request 2 months ago
[Bugfix] use StoreBoolean instead of type=bool for --disable-logprobs-during-spec-decoding
github.com/vllm-project/vllm - tjohnson31415 opened this pull request 2 months ago
[Bugfix] Don't disable existing loggers
github.com/vllm-project/vllm - a-ys opened this pull request 2 months ago
[Core] Add `AttentionState` abstraction
github.com/vllm-project/vllm - Yard1 opened this pull request 2 months ago
[Feature]: GGUF quantization with tensor parallelism
github.com/vllm-project/vllm - chrismrutherford opened this issue 2 months ago
[Misc] Allow for unsigned zero NAN representation in ScalarType
github.com/vllm-project/vllm - LucasWilkinson opened this pull request 2 months ago
[TPU] Fix redundant input tensor cloning
github.com/vllm-project/vllm - WoosukKwon opened this pull request 2 months ago
[doc] fix doc build error caused by msgspec
github.com/vllm-project/vllm - youkaichao opened this pull request 2 months ago
[Kernel] Add opcheck tests for punica kernels
github.com/vllm-project/vllm - bnellnm opened this pull request 2 months ago
Virtual Office Hours: August 8 and August 21
github.com/vllm-project/vllm - mgoin opened this issue 2 months ago
[Feature]: json_schema support in OpenAI compat server
github.com/vllm-project/vllm - rockwotj opened this issue 2 months ago
[Frontend] add json_schema support from OpenAI protocol
github.com/vllm-project/vllm - rockwotj opened this pull request 2 months ago
[Bug]: Error happened with Large scale requests based on 0.5.4 vllm
github.com/vllm-project/vllm - TangJiakai opened this issue 2 months ago
[Core] Logprobs support in Multi-step
github.com/vllm-project/vllm - afeldman-nm opened this pull request 2 months ago
[Kernel/Model] Migrate mamba_ssm and causal_conv1d kernels to vLLM
github.com/vllm-project/vllm - mzusman opened this pull request 2 months ago
[Bug]: Gemma2 models inference using vLLM 0.5.4 produces incorrect responses
github.com/vllm-project/vllm - eduardzl opened this issue 2 months ago
fix fused_moe get_config_file_name func bytes decode error
github.com/vllm-project/vllm - BBuf opened this pull request 2 months ago
[Misc] Refactor a Dockerfile for CPU backend
github.com/vllm-project/vllm - PHILO-HE opened this pull request 2 months ago
[WIP][SPMD] Support spec decoding
github.com/vllm-project/vllm - rkooo567 opened this pull request 2 months ago
[Misc] Use torch.compile for GemmaRMSNorm
github.com/vllm-project/vllm - WoosukKwon opened this pull request 2 months ago
[Frontend] Publish Prometheus metrics in run_batch API
github.com/vllm-project/vllm - pooyadavoodi opened this pull request 2 months ago
[Bugfix] Fix run_batch logger
github.com/vllm-project/vllm - pooyadavoodi opened this pull request 2 months ago
[Misc] Remove Gemma RoPE
github.com/vllm-project/vllm - WoosukKwon opened this pull request 2 months ago
[Misc] Refactor Llama3 RoPE initialization
github.com/vllm-project/vllm - WoosukKwon opened this pull request 2 months ago
[TPU] Optimize RoPE forward_native2
github.com/vllm-project/vllm - WoosukKwon opened this pull request 2 months ago
[Misc]: TTFT profiling with respect to prompt length
github.com/vllm-project/vllm - luowenjie14 opened this issue 2 months ago
[TPU] Use mark_dynamic only for dummy run
github.com/vllm-project/vllm - WoosukKwon opened this pull request 2 months ago
[Feature]: Exit on failures
github.com/vllm-project/vllm - pseudotensor opened this issue 2 months ago
[Bug]: assert num_new_tokens > 0 crashes entire worker instead of just failing single API call
github.com/vllm-project/vllm - pseudotensor opened this issue 2 months ago
[Encoder decoder] Add cuda graph support during decoding for encoder-decoder models
github.com/vllm-project/vllm - sroy745 opened this pull request 2 months ago
[TPU] Skip creating empty tensor
github.com/vllm-project/vllm - WoosukKwon opened this pull request 2 months ago
[ci][test] allow longer wait time for api server
github.com/vllm-project/vllm - youkaichao opened this pull request 2 months ago
[Bug]: OpenGVLab/InternVL-Chat-V1-5 never stops properly
github.com/vllm-project/vllm - pseudotensor opened this issue 2 months ago
[Documentation request]: Add documentation on lossless guarantees of speculative decoding in vLLM
github.com/vllm-project/vllm - jmkuebler opened this issue 2 months ago
[Misc]Fix BitAndBytes exception messages
github.com/vllm-project/vllm - jeejeelee opened this pull request 2 months ago
[Bug]: AttributeError: Model BitsAndBytesModelLoader does not support BitsAndBytes quantization yet
github.com/vllm-project/vllm - yananchen1989 opened this issue 2 months ago
[Doc]: Has the offline chat inference function been updated?
github.com/vllm-project/vllm - waylonli opened this issue 2 months ago
[Bug]: After deploying base and LoRA models with vllm server, requests to the LoRA model fail
github.com/vllm-project/vllm - cat-knight opened this issue 2 months ago
[ci][test] fix engine/logger test
github.com/vllm-project/vllm - youkaichao opened this pull request 2 months ago
[core][misc] update libcudart finding
github.com/vllm-project/vllm - youkaichao opened this pull request 2 months ago
[Performance]: Block manager v2 has low throughput with prefix caching warmup
github.com/vllm-project/vllm - comaniac opened this issue 2 months ago
[Bugfix] Clear engine reference in AsyncEngineRPCServer
github.com/vllm-project/vllm - ruisearch42 opened this pull request 2 months ago
[Bugfix] Fix custom_ar support check
github.com/vllm-project/vllm - bnellnm opened this pull request 2 months ago
[CI] Organizing performance benchmark files
github.com/vllm-project/vllm - KuntaiDu opened this pull request 2 months ago
[Model] Add UltravoxModel and UltravoxConfig
github.com/vllm-project/vllm - petersalas opened this pull request 2 months ago
[Bug]: CUDA error: an illegal memory access was encountered when running autofp8
github.com/vllm-project/vllm - sfc-gh-zhwang opened this issue 2 months ago
Support vLLM single and multi-host TPUs on GKE
github.com/vllm-project/vllm - richardsliu opened this pull request 2 months ago
[Bugfix] add >= 1.0 constraint for openai dependency
github.com/vllm-project/vllm - metasyn opened this pull request 2 months ago
[Model] Align nemotron config with final HF state and fix lm-eval-small
github.com/vllm-project/vllm - mgoin opened this pull request 2 months ago
[Build/CI] Enabling passing AMD tests.
github.com/vllm-project/vllm - Alexei-V-Ivanov-AMD opened this pull request 2 months ago
[Installation]: container images - too big and need to publish also cpu versions
github.com/vllm-project/vllm - yairyairyair opened this issue 2 months ago
[Bug]: ModuleNotFoundError: No module named 'openai.types'
github.com/vllm-project/vllm - metasyn opened this issue 2 months ago
[Feature]: Enable Prefix caching kernel on Pallas for TPU backend
github.com/vllm-project/vllm - miladm opened this issue 2 months ago
[MISC] Add prefix cache hit rate to metrics
github.com/vllm-project/vllm - comaniac opened this pull request 2 months ago
[Build/CI] Adjusting AMD test structure.
github.com/vllm-project/vllm - Alexei-V-Ivanov-AMD opened this pull request 2 months ago
[Model] Pipeline parallel support for JAIS
github.com/vllm-project/vllm - mrbesher opened this pull request 2 months ago
[CI/Build] Remove failing Minitron from LM Eval Small Test
github.com/vllm-project/vllm - mgoin opened this pull request 2 months ago
[aDAG] Unflake aDAG + PP tests
github.com/vllm-project/vllm - rkooo567 opened this pull request 2 months ago
[Misc] Add logging for engine and executor cleanup
github.com/vllm-project/vllm - ruisearch42 opened this pull request 2 months ago
[Kernel] fix types used in aqlm and ggml kernels to support dynamo
github.com/vllm-project/vllm - bnellnm opened this pull request 2 months ago