Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://github.com/vllm-project/vllm

[ci] Diff check step

khluu opened this pull request 7 months ago
[CI/Build] Disable LLaVA-NeXT CPU test

DarkLight1337 opened this pull request 7 months ago
[Core][Distributed] improve p2p cache generation

youkaichao opened this pull request 7 months ago
[CI/Build] [1/3] Reorganize entrypoints tests

DarkLight1337 opened this pull request 7 months ago
[Core] Remove duplicate processing in async engine

DarkLight1337 opened this pull request 7 months ago
[Misc] Fix arg names

AllenDou opened this pull request 7 months ago
bump version to v0.5.0.post1

simon-mo opened this pull request 7 months ago
[Bug]: Shutdown error when using multiproc_gpu_executor

wooyeonlee0 opened this issue 7 months ago
[RFC]: Usage Data Enhancement for v0.5.*

simon-mo opened this issue 7 months ago
Limit visible devices for 2gpu tests

khluu opened this pull request 7 months ago
Add basic correctness 2 GPU tests to 4 GPU pipeline

Yard1 opened this pull request 7 months ago
[Kernel] Fix CUTLASS 3.x custom broadcast load epilogue

tlrmchlsmth opened this pull request 7 months ago
[Misc] Log cudagraph memory usage

ymwangg opened this pull request 7 months ago
[Kernel] Update Cutlass int8 kernel configs for SM90

varun-sundar-rabindranath opened this pull request 7 months ago
[Bug]: Error loading FP8 weights for `gpt_bigcode` model

tdoublep opened this issue 7 months ago
[misc][distributed] fix benign error in `is_in_the_same_node`

youkaichao opened this pull request 7 months ago
[misc] fix format.sh

youkaichao opened this pull request 7 months ago
[CI/Build] Disable test_fp8.py

tlrmchlsmth opened this pull request 7 months ago
[Bugfix]typofix

AllenDou opened this pull request 7 months ago
[Bug]: Illegal memory access in CUTLASS FP8 kernels

tlrmchlsmth opened this issue 7 months ago
[Kernel] Disable CUTLASS kernels for fp8

tlrmchlsmth opened this pull request 7 months ago
[Bug]: ModuleNotFoundError: No module named 'bitsandbytes'

emillykkejensen opened this issue 7 months ago
support load qwen2-72b-instruct lora

NiuBlibing opened this pull request 7 months ago
[Bug]: Qwen/Qwen2-72B-Instruct 128k server down

junior-zsy opened this issue 7 months ago
[Bug]: ray not work when tp>=2

Jimmy-Lu opened this issue 7 months ago
[Usage]: How do I get the FP8 scaling factors for KV cache?

CharlesRiggins opened this issue 7 months ago
[Hardware][Intel] fp8 kv cache support for CPU

jikunshang opened this pull request 7 months ago
Enable random seed option to make latency benchmarking more configurable

qingquansong opened this pull request 7 months ago
[Bug]: NCCL hangs and causes timeout

wjj19950828 opened this issue 7 months ago
[Misc] add code to get git hash info for vllm

dhuangnm opened this pull request 7 months ago
[CI/Build] Update CPU tests to include all "standard" tests

DarkLight1337 opened this pull request 7 months ago
Add `cuda_device_count_stateless`

Yard1 opened this pull request 7 months ago
[Doc] Update documentation on Tensorizer

sangstar opened this pull request 7 months ago
[ci] Upload wheels

khluu opened this pull request 7 months ago
[misc] add hint for AttributeError

youkaichao opened this pull request 7 months ago
[Bug]: Torch2.3 run fail

lucasjinreal opened this issue 7 months ago
[Bugfix] Enable loading FP8 checkpoints for gpt_bigcode models

tdoublep opened this pull request 7 months ago
[Feature]: PagedAttention multiple of 8

barschiiii opened this issue 7 months ago
[Bug]: Error when --tensor-parallel-size > 1

javi111717 opened this issue 7 months ago
[Installation]: M2 Mac Dependency Torch 2.1.2 (Incompatible)

velocity33 opened this issue 7 months ago
[Bug]: Outdated binaries when re-building vLLM from source

DarkLight1337 opened this issue 7 months ago
[Bugfix] Skip test temporarily; failing quantization test

dsikka opened this pull request 7 months ago
[Usage] Clarify and Update Argument for Specifying Model Revisions

Etelis opened this pull request 7 months ago
[Hardware][Intel] Support CPU inference with AVX2 ISA

DamonFool opened this pull request 7 months ago
[Bugfix] Fix wrong multi_modal_input format for CPU runner

Isotr0py opened this pull request 7 months ago
[Bug]: vllm v0.5.0 internal assert failed

changshivek opened this issue 7 months ago
[Usage]: How to serve embedding model and LLM at the same time

weiyunfei opened this issue 7 months ago
[Model] Bert Embedding Model

laishzh opened this pull request 7 months ago
Error when calling qwen2-1.5b with multilora_inference

zigangzhao-ai opened this issue 7 months ago
[Bugfix] TYPE_CHECKING for MultiModalData

kimdwkimdw opened this pull request 7 months ago
[Bug]: v0.4.3 AsyncEngineDeadError

changshivek opened this issue 7 months ago
[Bugfix] Avoid to warmup when world size is 1

kerthcet opened this pull request 7 months ago
[Kernel] Add punica dimension for Qwen2 LoRA

jinzhen-lin opened this pull request 7 months ago
[Bug]: TypeError: a bytes-like object is required, not 'str'

yaoyasong opened this issue 7 months ago
[Bug]: resource_tracker unregister error with 2*3090

xuhao916 opened this issue 7 months ago
[Doc] Update debug docs

DarkLight1337 opened this pull request 7 months ago
[Doc] Update LLaVA docs

DarkLight1337 opened this pull request 7 months ago
`compressed-tensors` marlin 24 support

dsikka opened this pull request 7 months ago
[Feature]: Add guided-* Parameters to Sampling Parameters

zhanghx0905 opened this issue 7 months ago
[ Misc ] Rs/compressed tensors cleanup

robertgshaw2-neuralmagic opened this pull request 7 months ago
[Feature]: Support [RecurrentGemmaForCausalLM]

sung-ho-moon opened this issue 7 months ago
[Bugfix] fix lora_dtype value type in arg_utils.py - part 2

c3-ali opened this pull request 7 months ago
[Docs] [Spec decode] Fix docs error in code example

cadedaniel opened this pull request 7 months ago
[Feature]: ci test with vGPU

youkaichao opened this issue 7 months ago
[Frontend] Add "input speed" to tqdm postfix alongside output speed

mgoin opened this pull request 7 months ago
cache image build

khluu opened this pull request 7 months ago
[Doc]: Urgent MoE question

ymmm-4 opened this issue 7 months ago