Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://github.com/vllm-project/vllm

[Misc] Improve type annotations for `support_torch_compile`
DarkLight1337 opened this pull request 23 days ago

support download Lora Model from ModelScope and download private mode…
AlphaINF opened this pull request 23 days ago

[platform] Add verify_quantization in platform.
wangxiyuan opened this pull request 23 days ago

[Misc]: Qwen2VL Vision ID Support
yusufani opened this issue 23 days ago

[Feature]: Beam search: top_p, min_p and logit processors
denadai2 opened this issue 23 days ago

[Feature]: Enable `/score` endpoint for all embedding models
maxdebayser opened this issue 24 days ago

[Model] Clean up MiniCPMV
DarkLight1337 opened this pull request 24 days ago

Configuration of the model parallelism does not make sense
fajavadi opened this pull request 24 days ago

[Misc][XPU] Avoid torch compile for XPU platform
yma11 opened this pull request 24 days ago

[Misc] typo find in sampling_metadata.py
noooop opened this pull request 24 days ago

[V1] Optimize the CPU overheads in FlashAttention custom op
WoosukKwon opened this pull request 24 days ago

[doc]Update config docstring
wangxiyuan opened this pull request 24 days ago

[WIP][V1] Ray executor
rkooo567 opened this pull request 24 days ago

[Doc]: BNB 8 bit quantization is undocumented
molereddy opened this issue 24 days ago

[Bugfix] Fix BNB loader target_modules
jeejeelee opened this pull request 25 days ago

[Model] Update multi-modal processor to support Mantis(LLaVA) model
DarkLight1337 opened this pull request 25 days ago

[Bug]: VLLM run very very slow in ARM cpu
feikiss opened this issue 25 days ago

[WIP][CI]add genai-perf benchmark in nightly benchmark
jikunshang opened this pull request 25 days ago

[V1] Initial support of multimodal models for V1 re-arch
ywang96 opened this pull request 25 days ago

[Model] Implement merged input processor for LLaVA model
DarkLight1337 opened this pull request 26 days ago

[RFC]: Make any vLLM model a pooling model
DarkLight1337 opened this issue 26 days ago

[Doc] Add github links for source code references
russellb opened this pull request 26 days ago

[V1] VLM - Run the mm_mapper preprocessor in the frontend process
alexm-neuralmagic opened this pull request 27 days ago

[Model] Enable optional prefix when loading embedding models
DarkLight1337 opened this pull request 27 days ago

[Bug]: Authorization ignored when root_path is set
chaunceyjiang opened this pull request 28 days ago

[fix] Correct num_accepted_tokens counting
KexinFeng opened this pull request 28 days ago

[doc] update the code to add models
youkaichao opened this pull request 28 days ago

Revert "[CI/Build] Print running script to enhance CI log readability"
youkaichao opened this pull request 28 days ago

[Bug]: GGUF Model Output Repeats Nonsensically
Mayflyyh opened this issue 28 days ago

[model][utils] add extract_layer_index utility function
youkaichao opened this pull request 28 days ago

[Misc]Further reduce BNB static variable
jeejeelee opened this pull request 28 days ago

[CI/Build] Print running script to enhance CI log readability
jeejeelee opened this pull request 29 days ago

[Interleaved ATTN] Support for Mistral-8B
patrickvonplaten opened this pull request 29 days ago

[Kernel] Remove hard-dependencies of Speculative decode to CUDA workers
xuechendi opened this pull request 29 days ago

[Bug]: Duplicate request_id breaks the engine
tjohnson31415 opened this issue 29 days ago

[Core] Update to outlines > 0.1.4
russellb opened this pull request 30 days ago

[V1] Refactor model executable interface for multimodal models
ywang96 opened this pull request 30 days ago

[Hardware][Intel-Gaudi] Enable LoRA support for Intel Gaudi (HPU)
SanjuCSudhakaran opened this pull request 30 days ago

[Docs] Add dedicated tool calling page to docs
mgoin opened this pull request about 1 month ago

[Usage]: How to use ROPE scaling for llama3.1 and gemma2?
hahmad2008 opened this issue about 1 month ago

[CI][Installation] Avoid uploading CUDA 11.8 wheel
cermeng opened this pull request about 1 month ago

[Usage]: Fail to load config.json
dequeueing opened this issue about 1 month ago

[Bug]: vllm failed to run two instance with one gpu
pandada8 opened this issue about 1 month ago

Add Sageattention backend
flozi00 opened this pull request about 1 month ago

[Bug]: Authorization ignored when root_path is set
OskarLiew opened this issue about 1 month ago

[Misc] Suppress duplicated logging regarding multimodal input pipeline
ywang96 opened this pull request about 1 month ago

[8/N] enable cli flag without a space
youkaichao opened this pull request about 1 month ago

[V1] Fix Compilation config & Enable CUDA graph by default
WoosukKwon opened this pull request about 1 month ago

[Feature]: Additional possible value for `tool_choice`: `required`
fahadh4ilyas opened this issue about 1 month ago

[Bug]: Gemma2 becomes a fool.
Foreist opened this issue about 1 month ago

fix the issue that len(tokenizer(prompt)["input_ids"]) > prompt_len
sywangyi opened this pull request about 1 month ago

[Kernel] Register punica ops directly
jeejeelee opened this pull request about 1 month ago

[platforms] improve error message for unspecified platforms
youkaichao opened this pull request about 1 month ago

[Misc] Enable vLLM to Dynamically Load LoRA from a Remote Server
angkywilliam opened this pull request about 1 month ago

[Model] Expose `dynamic_image_size` as mm_processor_kwargs for InternVL2 models
Isotr0py opened this pull request about 1 month ago

[Feature]: Manually inject Prefix KV Cache
toilaluan opened this issue about 1 month ago

[Model]: Add support for Aria model
xffxff opened this pull request about 1 month ago

[Doc] fix a small typo in docstring of llama_tool_parser
FerdinandZhong opened this pull request about 1 month ago

[core] overhaul memory profiling and fix backward compatibility
youkaichao opened this pull request about 1 month ago

[Feature]: Multimodel prefix-caching features
justzhanghong opened this issue about 1 month ago

[Usage]:
Lukas-123 opened this issue about 1 month ago

[Platforms] Add `device_type` in `Platform`
MengqingCao opened this pull request about 1 month ago

[WIP][v1] Refactor KVCacheManager for more hash input than token ids
rickyyx opened this pull request about 1 month ago

Need to update the jax and jaxlib version
vanbasten23 opened this pull request about 1 month ago

Turn on V1 for H200 build
simon-mo opened this pull request about 1 month ago

Metrics model name when using multiple loras
mces89 opened this issue about 1 month ago

[Model] Add OLMo November 2024 model
2015aroras opened this pull request about 1 month ago

[Core] Implement disagg prefill by StatelessProcessGroup
KuntaiDu opened this pull request about 1 month ago

Setting default for EmbeddingChatRequest.add_generation_prompt to False
noamgat opened this pull request about 1 month ago