Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
vLLM
vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective: https://opencollective.com/vllm
Host: opensource
Code: https://github.com/vllm-project/vllm
[Bug]: vLLM was installed and working without issues, but recently, under heavier usage, it suddenly throws an error on a particular request and stops working entirely; even nvidia-smi returns no output
github.com/vllm-project/vllm - alexchenyu opened this issue 5 days ago
[Bug]: When vLLM is deployed as an OpenAI-compatible API and Llama 3 8B Instruct is used as the generation model for a RAG task, the model generates without stopping
github.com/vllm-project/vllm - asilverlight opened this issue 5 days ago
[Bug]: Installed vllm successfully for AMD MI60 but inference is failing
github.com/vllm-project/vllm - Said-Akbar opened this issue 5 days ago
[Usage]: [rank0]: AttributeError: 'LLMEngine' object has no attribute 'driver_worker'
github.com/vllm-project/vllm - xuyuemei opened this issue 5 days ago
[CI] Fix merge conflict
github.com/vllm-project/vllm - LiuXiaoxuanPKU opened this pull request 5 days ago
[Bug]: KeyError during loading of Mixtral 8x22B in FP8
github.com/vllm-project/vllm - IowaSovereign opened this issue 6 days ago
[help wanted]: write tests for python-only development
github.com/vllm-project/vllm - youkaichao opened this issue 6 days ago
[RFC]: Let every model be a reward model/embedding model for PRMs
github.com/vllm-project/vllm - zhuzilin opened this issue 6 days ago
[Bug]: different generation result when changing parameters using `copy_` and `=` method
github.com/vllm-project/vllm - hxdtest opened this issue 6 days ago
[Bug]: api_server.py: error: argument --tool-call-parser: invalid choice: 'llama3_json' (choose from 'mistral', 'hermes')
github.com/vllm-project/vllm - joestein-ssc opened this issue 6 days ago
[Bugfix] Update grafana dashboard
github.com/vllm-project/vllm - zhan9san opened this pull request 6 days ago
[Bug]: vllm mistralai--Codestral-22B-v0.1 response is truncated
github.com/vllm-project/vllm - Fly-Pluche opened this issue 6 days ago
[Misc][Installation] Improve source installation script and related documentation
github.com/vllm-project/vllm - cermeng opened this pull request 6 days ago
[Bug]: Process group watchdog thread terminated with exception: CUDA error: an illegal memory access was encountered
github.com/vllm-project/vllm - eyuansu62 opened this issue 6 days ago
[Bug]: latest docker build (0.6.2) got error due to VLLM_MAX_SIZE_MB
github.com/vllm-project/vllm - ZJLi2013 opened this issue 6 days ago
[Bug]: Failed to pickle inputs of failed execution: CUDA error: an illegal memory access was encountered
github.com/vllm-project/vllm - Clint-chan opened this issue 6 days ago
[Installation]: vllm installation error
github.com/vllm-project/vllm - leoneyar opened this issue 6 days ago
[Bug]: Hermes 2 Pro Tool parser could not locate tool call start/end tokens in the tokenizer!
github.com/vllm-project/vllm - LuckLittleBoy opened this issue 6 days ago
[Model] VLM2Vec, the first multimodal embedding model in vLLM
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 6 days ago
[core] try to remove seq group from core
github.com/vllm-project/vllm - youkaichao opened this pull request 6 days ago
[Quantization][TPU] `compressed-tensors` integration for TPU
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 6 days ago
[misc] Fine-grained CustomOp enabling mechanism
github.com/vllm-project/vllm - ProExpertProg opened this pull request 6 days ago
[Bugfix] Fix support for dimension like integers and ScalarType
github.com/vllm-project/vllm - bnellnm opened this pull request 6 days ago
[SpecDec] Remove Batch Expansion (2/3)
github.com/vllm-project/vllm - LiuXiaoxuanPKU opened this pull request 6 days ago
[CI/Build] Adds a test for multi step with TPUs
github.com/vllm-project/vllm - allenwang28 opened this pull request 7 days ago
[Frontend] merge beam search implementations
github.com/vllm-project/vllm - LunrEclipse opened this pull request 7 days ago
[bugfix] fix f-string for error
github.com/vllm-project/vllm - prashantgupta24 opened this pull request 7 days ago
[New Model]: meta-llama/Llama-Guard-3-1B
github.com/vllm-project/vllm - ayeganov opened this issue 7 days ago
[Misc] Add environment variables collection in collect_env.py tool
github.com/vllm-project/vllm - ycool opened this pull request 7 days ago
[Model] Support Mamba2 (Codestral Mamba)
github.com/vllm-project/vllm - tlrmchlsmth opened this pull request 7 days ago
[Feature] [Spec decode]: Combine chunked prefill with speculative decoding
github.com/vllm-project/vllm - NickLucche opened this pull request 7 days ago
[Bug]: Out of memory with large multi-step and large gpu-memory-utilization values - `--num-scheduler-steps 16 --gpu-memory-utilization 0.941`
github.com/vllm-project/vllm - varun-sundar-rabindranath opened this issue 7 days ago
[Doc] Remove outdated comment to avoid misunderstanding
github.com/vllm-project/vllm - homeffjy opened this pull request 7 days ago
[Bugfix] Fix MiniCPM's LoRA bug
github.com/vllm-project/vllm - jeejeelee opened this pull request 7 days ago
Fixes a typo about 'max_decode_seq_len' which causes crashes with cuda graph.
github.com/vllm-project/vllm - sighingnow opened this pull request 7 days ago
[Bug]: Simultaneous mm calls lead to permanently degraded performance.
github.com/vllm-project/vllm - SeanIsYoung opened this issue 7 days ago
[Bug]: Does MiniCPM3-4B support LoRA via --enable-lora?
github.com/vllm-project/vllm - ML-GCN opened this issue 7 days ago
`seed_everything` doesn't handle HPU
github.com/vllm-project/vllm - SanjuCSudhakaran opened this pull request 7 days ago
[Bug]: VLLM doesn't support LoRA with config `modules_to_save`
github.com/vllm-project/vllm - fahadh4ilyas opened this issue 7 days ago
[Bugfix][CI/Build][Hardware][AMD] Shard ID parameters in AMD tests running parallel jobs
github.com/vllm-project/vllm - hissu-hyvarinen opened this pull request 7 days ago
[CI] add `ignore_eos` for `benchmark_serving.py`
github.com/vllm-project/vllm - jikunshang opened this pull request 7 days ago
[Bugfix] Fix priority in multiprocessing engine
github.com/vllm-project/vllm - schoennenbeck opened this pull request 7 days ago
[Bugfix] fix error due to an uninitialized tokenizer when using `skip_tokenizer_init` with `num_scheduler_steps`
github.com/vllm-project/vllm - junstar92 opened this pull request 7 days ago
[Misc][LoRA] Support loading LoRA weights for target_modules in reg format
github.com/vllm-project/vllm - jeejeelee opened this pull request 7 days ago
[Usage]: Manually Increasing inference time
github.com/vllm-project/vllm - Playerrrrr opened this issue 7 days ago
[Usage]: VLLM 0.6.2 includes vllm-flash-attn, is it no longer necessary to install flash-attn separately?
github.com/vllm-project/vllm - Rssevenyu opened this issue 7 days ago
[Bug]: priority scheduling doesn't work on vllm-0.6.3.dev152+gde895f16.d20241010
github.com/vllm-project/vllm - tonyaw opened this issue 7 days ago
[Usage]: blip2 inference code
github.com/vllm-project/vllm - zhaoxueqi6666 opened this issue 7 days ago
[RFC]: Make device agnostic for diverse hardware support
github.com/vllm-project/vllm - wangshuai09 opened this issue 7 days ago
[CI/Build] mypy: Resolve some errors from checking vllm/engine
github.com/vllm-project/vllm - russellb opened this pull request 7 days ago
[Feature]: Improve Logging For Embedding Models
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this issue 7 days ago
[Frontend, Core] Adding stop and stop_token_ids for beam search.
github.com/vllm-project/vllm - nFunctor opened this pull request 7 days ago
[Bug]: AsyncLLMEngine stuck on a single too long request
github.com/vllm-project/vllm - rickyyx opened this issue 7 days ago
[ci/build] Add placeholder command for custom models test and add comments
github.com/vllm-project/vllm - khluu opened this pull request 7 days ago
[misc] hide best_of from engine
github.com/vllm-project/vllm - youkaichao opened this pull request 7 days ago
[Bug]: Streaming response fails after one token (0.5.3.post1)
github.com/vllm-project/vllm - NeonDaniel opened this issue 8 days ago
[CI/Build] Adopt Mergify for auto-labeling PRs
github.com/vllm-project/vllm - russellb opened this pull request 8 days ago
[torch.compile] generic decorators
github.com/vllm-project/vllm - youkaichao opened this pull request 8 days ago
[Doc][Neuron] add note to neuron documentation about resolving triton issue
github.com/vllm-project/vllm - omrishiv opened this pull request 8 days ago
[Usage]: running gated models offline
github.com/vllm-project/vllm - SamuelBG13 opened this issue 8 days ago
[Bugfix][CI/Build] Fix docker build where CUDA archs < 7.0 are being detected
github.com/vllm-project/vllm - LucasWilkinson opened this pull request 8 days ago
[Bug]: new beam search implementation ignores stop conditions
github.com/vllm-project/vllm - nFunctor opened this issue 8 days ago
[Misc] Standardize RoPE handling for Qwen2-VL
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 8 days ago
[Model] Add Qwen2-Audio model support
github.com/vllm-project/vllm - faychu opened this pull request 8 days ago
[Doc]: The relationship between FlashAttentionBackend and paged_attention_kernel
github.com/vllm-project/vllm - zhaotyer opened this issue 8 days ago
[Kernel] adding fused moe kernel config for L40S TP4
github.com/vllm-project/vllm - bringlein opened this pull request 8 days ago
[Model] Add GLM-4v support and meet vllm==0.6.2
github.com/vllm-project/vllm - sixsixcoder opened this pull request 8 days ago
Questions about the inference performance of the GPTQ model
github.com/vllm-project/vllm - Rssevenyu opened this issue 8 days ago
[Model] support input image embedding for minicpmv
github.com/vllm-project/vllm - whyiug opened this pull request 8 days ago
[Bug]: AssertionError When deploy API serve of Qwen2-VL-72B in Docker
github.com/vllm-project/vllm - FBR65 opened this issue 8 days ago
[Misc] Fix sampling from sonnet for long context case
github.com/vllm-project/vllm - Imss27 opened this pull request 8 days ago
[issue tracker] make quantization compatible with dynamo dynamic shape
github.com/vllm-project/vllm - youkaichao opened this issue 8 days ago
[Misc] Collect model support info in a single process per model
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 8 days ago
[Bug]: AssertionError: Error in memory profiling. Initial free memory 85470478336, current free memory 85470478336. This happens when the GPU memory was not properly cleaned up before initializing the vLLM instance.
github.com/vllm-project/vllm - imrankh46 opened this issue 8 days ago
[Bug]: Stress-testing Qwen2.5-72B-Instruct triggers AsyncLLMEngine has failed, terminating server process
github.com/vllm-project/vllm - WangJianQ-0118 opened this issue 8 days ago
vLLM ARM Enablement for AARCH64 CPUs
github.com/vllm-project/vllm - sanketkaleoss opened this pull request 8 days ago
[Bug]: Could not `pip install vllm` inside dockerfile after certain commit in `main` branch
github.com/vllm-project/vllm - fahadh4ilyas opened this issue 8 days ago
[VLM] Enable overriding whether post layernorm is used in vision encoder
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 8 days ago
[BugFix] Fix tool call finish reason in streaming case
github.com/vllm-project/vllm - maxdebayser opened this pull request 8 days ago
[Bugfix] Sets `is_first_step_output` for TPUModelRunner
github.com/vllm-project/vllm - allenwang28 opened this pull request 9 days ago
Bump actions/setup-python from 3 to 5
github.com/vllm-project/vllm - dependabot[bot] opened this pull request 9 days ago
[RFC]: Adopt mergify for auto-labeling PRs
github.com/vllm-project/vllm - russellb opened this issue 9 days ago
[Performance]: phi 3.5 vision model consuming high CPU RAM and the process getting killed
github.com/vllm-project/vllm - kuladeephx opened this issue 9 days ago
[Misc]: Repeat the sample sonnet.txt contents to accommodate large seq lengths in benchmarking
github.com/vllm-project/vllm - Bihan opened this issue 9 days ago
[Installation]: pip install vllm-0.6.2.zip err:setuptools-scm was unable to detect version for /tmp/pip-req-build-7ptioibj
github.com/vllm-project/vllm - uRENu opened this issue 9 days ago
[Core]: (2/N) Support prefill only models by Workflow Defined Engine - Prefill only scheduler
github.com/vllm-project/vllm - noooop opened this pull request 9 days ago
[Bugfix] Fix lora loading for Compressed Tensors in #9120
github.com/vllm-project/vllm - fahadh4ilyas opened this pull request 9 days ago
[TPU] Fix memory profiling
github.com/vllm-project/vllm - WoosukKwon opened this pull request 9 days ago
[Bug]: quantization does not work with dummy weight format
github.com/vllm-project/vllm - youkaichao opened this issue 9 days ago
[Bug]: Extreme low throughput when using pipeline parallelism when Batch Size(running req) is small
github.com/vllm-project/vllm - AlvL1225 opened this issue 9 days ago
[Bug]: Error Running Qwen2.5-7B-Instruct on CPU
github.com/vllm-project/vllm - xiayouran opened this issue 9 days ago
[Model] Remap FP8 kv_scale in CommandR and DBRX
github.com/vllm-project/vllm - hliuca opened this pull request 9 days ago
Update link to KServe deployment guide
github.com/vllm-project/vllm - terrytangyuan opened this pull request 9 days ago
[Bug]: Port binding keep failing due to unnecessary code
github.com/vllm-project/vllm - James4Ever0 opened this issue 9 days ago