Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
vLLM
vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective: https://opencollective.com/vllm
Host: opensource
Code: https://github.com/vllm-project/vllm
[bitsandbytes]: support read bnb pre-quantized model
github.com/vllm-project/vllm - thesues opened this pull request 4 months ago
[Core] Support sparse KV cache framework
github.com/vllm-project/vllm - chizhang118 opened this pull request 4 months ago
[RFC]: Support sparse KV cache framework
github.com/vllm-project/vllm - chizhang118 opened this issue 4 months ago
compressed-tensors accuracy testing
github.com/vllm-project/vllm - dsikka opened this pull request 4 months ago
[Bug]: Detokenizer stage is causing a significant delay
github.com/vllm-project/vllm - hbikki opened this issue 4 months ago
[Core] Add fault tolerance for `RayTokenizerGroupPool`
github.com/vllm-project/vllm - Yard1 opened this pull request 4 months ago
[Bug]: 'int' object has no attribute 'expansion'
github.com/vllm-project/vllm - RobertFischer opened this issue 4 months ago
[ci][test] fix ca test in main
github.com/vllm-project/vllm - youkaichao opened this pull request 4 months ago
[Doc] Documentation on supported hardware for quantization methods
github.com/vllm-project/vllm - mgoin opened this pull request 4 months ago
[BugFix] [Kernel] Add Cutlass2x fallback kernels
github.com/vllm-project/vllm - varun-sundar-rabindranath opened this pull request 4 months ago
[ROCm] add some utility apis and fix some unit test based on torch version
github.com/vllm-project/vllm - hongxiayang opened this pull request 4 months ago
[Frontend] Continuous usage stats in OpenAI completion API
github.com/vllm-project/vllm - jvlunteren opened this pull request 4 months ago
[Feature]: Need CPU inferencing support for non-x86 architectures
github.com/vllm-project/vllm - ChipKerchner opened this issue 4 months ago
[Bug]: KeyError: '/psm_ed65b7e3'
github.com/vllm-project/vllm - randydl opened this issue 4 months ago
[Bugfix] fix the bug for lora request
github.com/vllm-project/vllm - InkdyeHuang opened this pull request 4 months ago
[Bug]: VLLM usage on AWS Inferentia instances
github.com/vllm-project/vllm - ashutoshsaboo opened this issue 4 months ago
[Usage]: has vllm supported encoder-only model such as bge-m3?
github.com/vllm-project/vllm - chenchunhui97 opened this issue 4 months ago
[Bug]: which torchvision version required
github.com/vllm-project/vllm - tusharraskar opened this issue 4 months ago
[Draft] Tensor parallel for CPU
github.com/vllm-project/vllm - bigPYJ1151 opened this pull request 4 months ago
[Feature]: Support for OpenAIEmbeddings with Langchain
github.com/vllm-project/vllm - yuhon0528 opened this issue 4 months ago
[LoRA] Adds support for bias in LoRA
github.com/vllm-project/vllm - followumesh opened this pull request 4 months ago
[Bug]: asyncio.exceptions.CancelledError asyncio.exceptions.TimeoutError
github.com/vllm-project/vllm - ZZhangxian opened this issue 4 months ago
api_server.py: error: unrecognized arguments: --tool-use-prompt-template --enable-api-tools --enable-auto-tool-choice
github.com/vllm-project/vllm - lk1983823 opened this issue 4 months ago
[Misc]: how to understand: NUM_ELEMS_PER_THREAD = HEAD_SIZE / THREAD_GROUP_SIZE
github.com/vllm-project/vllm - ZJLi2013 opened this issue 4 months ago
[Bug]: Two V100 server with a total of 16GPU running Distributed Inference and Serving Vllm with error
github.com/vllm-project/vllm - warlockedward opened this issue 4 months ago
[Bugfix] support `tie_word_embeddings` for all models
github.com/vllm-project/vllm - zijian-hu opened this pull request 4 months ago
[RFC]: Add runtime weight update API
github.com/vllm-project/vllm - lyuqin-scale opened this issue 4 months ago
[New Model]: Support Nemotron-4-340B
github.com/vllm-project/vllm - dskhudia opened this issue 4 months ago
[New Model]: Chameleon support
github.com/vllm-project/vllm - nopperl opened this issue 4 months ago
[Bug fix]: enumerate's seq is not equal to quant_states'key
github.com/vllm-project/vllm - thesues opened this pull request 4 months ago
[Distributed] Add send and recv helpers
github.com/vllm-project/vllm - andoorve opened this pull request 4 months ago
[Frontend] Add FlexibleArgumentParser to support both underscore and dash in names
github.com/vllm-project/vllm - mgoin opened this pull request 4 months ago
[Kernel][CPU] Add Quick `gelu` to CPU
github.com/vllm-project/vllm - ywang96 opened this pull request 4 months ago
[Bug]: "Triton Error [CUDA]: device kernel image is invalid" when loading Mixtral-8x7B-Instruct-v0.1 in fused_moe.py
github.com/vllm-project/vllm - xiangcao opened this issue 4 months ago
[Model] Support Qwen-VL and Qwen-VL-Chat models with text-only inputs
github.com/vllm-project/vllm - DamonFool opened this pull request 4 months ago
[Feature]: Continuous streaming of `UsageInfo`
github.com/vllm-project/vllm - tdoublep opened this issue 4 months ago
[Misc]: I ran into this situation during chat conversations with the OpenAI API served by vLLM
github.com/vllm-project/vllm - ArboterJams opened this issue 4 months ago
max_tokens must be at least 1, got -160
github.com/vllm-project/vllm - njhouse365 opened this issue 4 months ago
[Misc] optimize sampler with top_p=1 and top_k>0
github.com/vllm-project/vllm - gx16377 opened this pull request 4 months ago
[Bugfix] Add verbose error if scipy is missing for blocksparse attention
github.com/vllm-project/vllm - JGSweets opened this pull request 4 months ago
[Bug]: vision chat completion output with odd Instruction/Output prompting.
github.com/vllm-project/vllm - pseudotensor opened this issue 4 months ago
[Bug]: Qwen2-57B-A14B inference fails with an error on two GPUs
github.com/vllm-project/vllm - CXLiang123 opened this issue 4 months ago
[WIP] [Speculative Decoding] Use MQA kernel for target model verification
github.com/vllm-project/vllm - LiuXiaoxuanPKU opened this pull request 4 months ago
[Installation]: poetry add vllm not working on my Mac -- xformers (0.0.26.post1) not supporting PEP 517 builds.
github.com/vllm-project/vllm - srushti98 opened this issue 4 months ago
[Hardware][Intel GPU] Refactor distributed Executor for xpu device
github.com/vllm-project/vllm - jikunshang opened this pull request 4 months ago
[Bug]: `flash_attn_cuda.varlen_fwd` may output a bad result when enabling prefix caching
github.com/vllm-project/vllm - syGOAT opened this issue 4 months ago
[Misc] Making launch_tgi_server.sh script parameterizable
github.com/vllm-project/vllm - AllenDou opened this pull request 4 months ago
[Bug]: ValueError: The input size is not aligned with the quantized weight shape. This can be caused by too large tensor parallel size.
github.com/vllm-project/vllm - QuanhuiGuan opened this issue 4 months ago
[Installation]: pip install -e failed
github.com/vllm-project/vllm - chunniunai220ml opened this issue 4 months ago
[WIP][Misc] Create setup_files dir for cleanup
github.com/vllm-project/vllm - WoosukKwon opened this pull request 4 months ago
[Installation]: Build from source: Could NOT find Python. Could not build wheels for vllm.
github.com/vllm-project/vllm - Brennanzuz opened this issue 4 months ago
[BugFix] exclude version 1.15.0 for modelscope
github.com/vllm-project/vllm - zhyncs opened this pull request 4 months ago
[Bug]: Enabling Prefix-Caching doesn't speed up inference
github.com/vllm-project/vllm - yangelaboy opened this issue 4 months ago
[Usage]: Does class LLM support inference quantization on CPU?
github.com/vllm-project/vllm - rsong0606 opened this issue 4 months ago
[Bug]: Qwen2-72B-Instruct-gptq-int4 Repetitive issues
github.com/vllm-project/vllm - Storm0921 opened this issue 4 months ago
[Bug]: Ray distributed backend does not support out-of-tree models via ModelRegistry APIs
github.com/vllm-project/vllm - SamKG opened this issue 4 months ago
IFEval metrics not consistent across different vLLM versions
github.com/vllm-project/vllm - akjindal53244 opened this issue 4 months ago
Support CPU inference with VSX PowerPC ISA
github.com/vllm-project/vllm - ChipKerchner opened this pull request 4 months ago
[Feature] OpenAI-Compatible Tools API + Streaming for Hermes & Mistral models
github.com/vllm-project/vllm - K-Mistele opened this pull request 4 months ago
[build][misc] remove nvidia runtime docker base image
github.com/vllm-project/vllm - youkaichao opened this pull request 4 months ago
[Bugfix] [Core] don't schedule prefill if freeing kv cache
github.com/vllm-project/vllm - toslunar opened this pull request 4 months ago
[Installation]: Is installing vLLM on the Windows operating system supported?
github.com/vllm-project/vllm - hiahia121 opened this issue 4 months ago
[Misc] Add param max-model-len in benchmark_latency.py
github.com/vllm-project/vllm - DearPlanet opened this pull request 4 months ago
[Bugfix] Fix Phi-3 Long RoPE scaling implementation
github.com/vllm-project/vllm - ShukantPal opened this pull request 4 months ago
Why is the GPU KV cache usage very low?
github.com/vllm-project/vllm - tammypi opened this issue 4 months ago
[Misc] Remove import from transformers logging
github.com/vllm-project/vllm - CatherineSue opened this pull request 4 months ago
[ci] Deprecate original CI template
github.com/vllm-project/vllm - khluu opened this pull request 4 months ago
[CI/Build][Misc] Update Pytest Marker for VLMs
github.com/vllm-project/vllm - ywang96 opened this pull request 4 months ago
[Bugfix][Model] Fix inability to run inference on vLLM after MiniCPM training or AWQ quantization
github.com/vllm-project/vllm - LDLINGLINGLING opened this pull request 4 months ago
[Bug]: many models may not load the weights correctly if `tie_word_embeddings` is enabled
github.com/vllm-project/vllm - zijian-hu opened this issue 4 months ago
[misc][typo] fix typo
github.com/vllm-project/vllm - youkaichao opened this pull request 4 months ago
[misc][distributed] use localhost for single-node
github.com/vllm-project/vllm - youkaichao opened this pull request 4 months ago
[Usage]: Qwen2-7B-Instruct got stuck in infinite loop using vllm==0.5.0 with tp = 2
github.com/vllm-project/vllm - YanXingyu1998 opened this issue 4 months ago
[CI][Hardware][Intel GPU] add Intel GPU(XPU) ci pipeline
github.com/vllm-project/vllm - jikunshang opened this pull request 4 months ago
[CI] Avoid naming different metrics with the same name in performance benchmark
github.com/vllm-project/vllm - KuntaiDu opened this pull request 4 months ago
[Doc] Update docker references
github.com/vllm-project/vllm - rafvasq opened this pull request 4 months ago
[Bug]: Failed: /home/runner/work/vllm/vllm/csrc/custom_all_reduce.cuh:310 'invalid argument'
github.com/vllm-project/vllm - XianmingJin08 opened this issue 4 months ago
[bugfix][distributed] do not error if two processes do not agree on p2p capability
github.com/vllm-project/vllm - youkaichao opened this pull request 4 months ago
[Model] Add support for Qwen2 for embeddings
github.com/vllm-project/vllm - mgoin opened this pull request 4 months ago
[ci] Setup Release pipeline and build release wheels with cache
github.com/vllm-project/vllm - khluu opened this pull request 4 months ago
[Feature]: Initial LLM token
github.com/vllm-project/vllm - CHesketh76 opened this issue 4 months ago
feat: adds user information to the input of the scheduler
github.com/vllm-project/vllm - FerranAgulloLopez opened this pull request 4 months ago
[Bug]: Concurrent requests messing up GREEDY responses
github.com/vllm-project/vllm - prashantgupta24 opened this issue 4 months ago
[Fix] Use utf-8 encoding in entrypoints/openai/run_batch.py
github.com/vllm-project/vllm - zifeitong opened this pull request 4 months ago
[Feature]: Access to user information in scheduler
github.com/vllm-project/vllm - FerranAgulloLopez opened this issue 4 months ago
[bugfix][distributed] fix 16 gpus local rank arrangement
github.com/vllm-project/vllm - youkaichao opened this pull request 4 months ago
[LoRA] Add support for pinning lora adapters in the LRU cache
github.com/vllm-project/vllm - rohithkrn opened this pull request 4 months ago
[Core] Optimize block_manager_v2 vs block_manager_v1 (to make V2 default)
github.com/vllm-project/vllm - alexm-neuralmagic opened this pull request 4 months ago
RuntimeError: No suitable kernel. h_in=16 h_out=4096 dtype=Float out_dtype=Half
github.com/vllm-project/vllm - yangelaboy opened this issue 4 months ago
[Feature]: support Qwen2 embedding
github.com/vllm-project/vllm - DavidPeleg6 opened this issue 4 months ago
[Core] Add use_dummy_driver to parallel config
github.com/vllm-project/vllm - DriverSong opened this pull request 4 months ago
[Feature]: Add config of use_dummy_driver rather than default 'False'
github.com/vllm-project/vllm - DriverSong opened this issue 4 months ago
[Bug]: GPTQ-Marlin kernel illegal memory access with `group_size=32`, `desc_act=True`, `tp=4`
github.com/vllm-project/vllm - danieldk opened this issue 4 months ago
[Model] Rename Phi3 rope scaling type
github.com/vllm-project/vllm - garg-amit opened this pull request 4 months ago