Ecosyste.ms: Open Collective
An open API service for software projects hosted on Open Collective.

vLLM
vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).

Collective: https://opencollective.com/vllm
Host: opensource
Code: https://github.com/vllm-project/vllm
Support `BERTModel` (first `encoder-only` embedding model)
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 15 days ago
[CI] set block manager to v1 when running spec decode compatibility test
github.com/vllm-project/vllm - KuntaiDu opened this pull request 15 days ago
Fix failing spec decode test
github.com/vllm-project/vllm - sroy745 opened this pull request 15 days ago
[Bug]: Different behavior with tool-use response parsing with streaming vs non-streaming when using max_tokens
github.com/vllm-project/vllm - tjohnson31415 opened this issue 15 days ago
[Usage]: LLama-3.1-405B Inference with vLLM TPU
github.com/vllm-project/vllm - ryanaoleary opened this issue 15 days ago
[Usage]: Tried using vllm with GGUF models. Got an infer device type error.
github.com/vllm-project/vllm - asokans11 opened this issue 15 days ago
[Bug]: TypeError: inputs must be a string, TextPrompt, or TokensPrompt
github.com/vllm-project/vllm - johnathanchiu opened this issue 15 days ago
[Bugfix] Fix IndexError when choosing tool while having a tool parser
github.com/vllm-project/vllm - tjohnson31415 opened this pull request 15 days ago
[Misc] Enable multi-step output streaming by default
github.com/vllm-project/vllm - mgoin opened this pull request 15 days ago
[Model] Support NVLM-D and fix QK Norm in InternViT
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 15 days ago
[Bugfix][Hardware][CPU] Fix CPU model input for decode
github.com/vllm-project/vllm - Isotr0py opened this pull request 15 days ago
[Installation]: Cuda 11.8 binaries for vLLM v0.6.2
github.com/vllm-project/vllm - bakszero opened this issue 15 days ago
[Bugfix] Weight loading fix for OPT model
github.com/vllm-project/vllm - domenVres opened this pull request 15 days ago
[Hardware][PowerPC] Make oneDNN dependency optional for Power
github.com/vllm-project/vllm - varad-ahirwadkar opened this pull request 15 days ago
[Bugfix] Fix incorrect updates to num_computed_tokens in multi-step scheduling
github.com/vllm-project/vllm - varun-sundar-rabindranath opened this pull request 15 days ago
[Bug]: Error Running Llama 3.2 1B on CPU
github.com/vllm-project/vllm - kunalmohan opened this issue 15 days ago
Support Pixtral models in the HF Transformers format
github.com/vllm-project/vllm - mgoin opened this pull request 15 days ago
llama32 support in rocm vllm
github.com/vllm-project/vllm - maleksan85 opened this pull request 15 days ago
[Bugfix] Fix vLLM UsageInfo and logprobs None AssertionError with empty token_ids
github.com/vllm-project/vllm - CatherineSue opened this pull request 15 days ago
[Bug]: VLLM_USE_MODELSCOPE didn't work
github.com/vllm-project/vllm - TaoSeekAI opened this issue 15 days ago
[Performance]: Transformer 4.45.1 slows down `outlines` guided decoding
github.com/vllm-project/vllm - joerunde opened this issue 16 days ago
mLlama load error with non-default vocabulary sizes
github.com/vllm-project/vllm - Reichenbachian opened this pull request 16 days ago
[misc] add forward context for attention
github.com/vllm-project/vllm - youkaichao opened this pull request 16 days ago
[Bug]: Continuous usage stats are incorrect when chunked prefill is enabled
github.com/vllm-project/vllm - tdoublep opened this issue 16 days ago
[Frontend] Tool calling parser for granite-8b-instruct
github.com/vllm-project/vllm - maxdebayser opened this pull request 16 days ago
[Bugfix] Fix bug of xformer prefill for encoder-decoder
github.com/vllm-project/vllm - xiangxu-google opened this pull request 16 days ago
[Doc] Update Granite model docs
github.com/vllm-project/vllm - njhill opened this pull request 16 days ago
[Bug]: When I use cpu VLLM, it show [IndexError: list index out of range]
github.com/vllm-project/vllm - WangRongsheng opened this issue 16 days ago
[Frontend] Don't log duplicate error stacktrace for every request in the batch
github.com/vllm-project/vllm - wallashss opened this pull request 16 days ago
[Performance]: Why is Llama 3.1 405B 5 times faster than 70B on benchmarks?
github.com/vllm-project/vllm - tommy-function opened this issue 16 days ago
[Misc]: Missing cu118 wheels for 0.6.2 release
github.com/vllm-project/vllm - tderrmann opened this issue 16 days ago
[BugFix] Enforce Mistral ToolCall id constraint when using the Mistral tool call parser
github.com/vllm-project/vllm - gcalmettes opened this pull request 16 days ago
[Bug]: ToolCall IDs generated by Mistral tool call parser do not comply with Mistral tool calls and template constraints
github.com/vllm-project/vllm - gcalmettes opened this issue 16 days ago
[BugFix] Prevent exporting duplicate OpenTelemetry spans
github.com/vllm-project/vllm - ronensc opened this pull request 16 days ago
[Model] Molmo vLLM Integration
github.com/vllm-project/vllm - mrsalehi opened this pull request 16 days ago
[Kernel] Explicitly specify other value in tl.load calls
github.com/vllm-project/vllm - angusYuhao opened this pull request 16 days ago
[Bugfix] Add random_seed to sample_hf_requests in benchmark_serving script
github.com/vllm-project/vllm - wukaixingxp opened this pull request 16 days ago
Fix incorrect image channels when dealing with 1x1 image #8954
github.com/vllm-project/vllm - zyddnys opened this pull request 17 days ago
[Doc]: Why is FP8 static quantization marked as deprecated?
github.com/vllm-project/vllm - dongluw opened this issue 17 days ago
[Model] add a bunch of supported lora modules for mixtral
github.com/vllm-project/vllm - prashantgupta24 opened this pull request 17 days ago
[Bugfix] example template should not add parallel_tool_prompt if tools is none
github.com/vllm-project/vllm - tjohnson31415 opened this pull request 17 days ago
[Roadmap] vLLM Roadmap Q4 2024
github.com/vllm-project/vllm - simon-mo opened this issue 17 days ago
[Model] Support Gemma2 embedding model
github.com/vllm-project/vllm - xyang16 opened this pull request 17 days ago
[Model] New interface and automatic detection for PP support
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 17 days ago
Adds truncate_prompt_tokens param for embeddings creation
github.com/vllm-project/vllm - flaviabeo opened this pull request 17 days ago
[Usage]: Not able to run LLMEngine in two sequential tests using pytest
github.com/vllm-project/vllm - fgebhart opened this issue 17 days ago
[Bug]: Vllm server (docker) fails to load mistralai/Mixtral-8x22B-Instruct-v0.1
github.com/vllm-project/vllm - ggbetz opened this issue 17 days ago
[Bug]: Unable to load the tokenizers of certain models
github.com/vllm-project/vllm - Wafaa014 opened this issue 17 days ago
[Bug]: openai.serving_chat tries to call _create_chat_logprobs when the output.text is empty
github.com/vllm-project/vllm - CatherineSue opened this issue 17 days ago
[Doc] Update list of supported models
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 17 days ago
[Bug]: Crash with num-scheduler-steps > 1 and response_format type json object
github.com/vllm-project/vllm - warlock135 opened this issue 17 days ago
[Bugfix] Fixes for Phi3v and Ultravox Multimodal EmbeddingInputs Support
github.com/vllm-project/vllm - hhzhang16 opened this pull request 18 days ago
[Usage]: Serving Llama 3.2 `llama-3-2-11b-vision-instruct` hangs
github.com/vllm-project/vllm - rchen19 opened this issue 18 days ago
[Misc] Update Default Image Mapper Error Log
github.com/vllm-project/vllm - alex-jw-brooks opened this pull request 18 days ago
[Bug]: Bus error (core dumped)
github.com/vllm-project/vllm - SpaceHunterInf opened this issue 18 days ago
[Kernel] Zero point support in fused MarlinMoE kernel + AWQ Fused MoE
github.com/vllm-project/vllm - ElizaWszola opened this pull request 18 days ago
[Misc] log when using default MoE config
github.com/vllm-project/vllm - divakar-amd opened this pull request 18 days ago
[Doc]: Offline Inference Distributed
github.com/vllm-project/vllm - dakies opened this issue 18 days ago
[Core]: (Last/N) Support prefill only models by Workflow Defined Engine
github.com/vllm-project/vllm - noooop opened this pull request 18 days ago
[Hardware][Misc] Make device agnostic
github.com/vllm-project/vllm - wangshuai09 opened this pull request 18 days ago
[Bugfix] Fix order of arguments matters in config.yaml
github.com/vllm-project/vllm - Imss27 opened this pull request 18 days ago
[Misc] Adjust max_position_embeddings for LoRA compatibility
github.com/vllm-project/vllm - jeejeelee opened this pull request 18 days ago
[Installation]: vllm ROCm failed to build on Docker.
github.com/vllm-project/vllm - limyenkai opened this issue 18 days ago
[Bugfix] Revert incorrect updates to num_computed_tokens
github.com/vllm-project/vllm - varun-sundar-rabindranath opened this pull request 18 days ago
[torch.compile] initial integration
github.com/vllm-project/vllm - youkaichao opened this pull request 19 days ago
[Bug]: AsyncLLMEngine CUDA runtime error 'device-side assert triggered'
github.com/vllm-project/vllm - Ouna-the-Dataweaver opened this issue 19 days ago
[Bug]: vllm serve --config.yaml - Order of arguments matters?
github.com/vllm-project/vllm - FloWsnr opened this issue 19 days ago
[Question]: Apply LoRA adapter on quantized model
github.com/vllm-project/vllm - Tejaswgupta opened this issue 19 days ago
[Misc] Fix typo in BlockSpaceManagerV1
github.com/vllm-project/vllm - juncheoll opened this pull request 19 days ago
[Model][LoRA]LoRA support added for MiniCPMV2.6
github.com/vllm-project/vllm - jeejeelee opened this pull request 19 days ago
[Frontend] Added support for HF's new `continue_final_message` parameter
github.com/vllm-project/vllm - danieljannai21 opened this pull request 19 days ago
[Feature]: Qwen2.5 bitsandbytes support
github.com/vllm-project/vllm - hanan9m opened this issue 19 days ago
[Usage]: caching with different batches
github.com/vllm-project/vllm - KevinZeng08 opened this issue 19 days ago
[Bug]: --served-model-name doesn't work with OpenAI benchmarking script
github.com/vllm-project/vllm - samos123 opened this issue 19 days ago
[Bug]: Error when using tensor_parallel in v0.6.1.post1 or 0.6.2
github.com/vllm-project/vllm - ruleGreen opened this issue 19 days ago
[Bug]: v0.6.2 Shows a Significant Accuracy Drop Serving Qwen2-VL Model
github.com/vllm-project/vllm - thiner opened this issue 19 days ago
speedup hash_of_block function
github.com/vllm-project/vllm - flc666star opened this pull request 19 days ago
[Bug]: Vllm0.6.2 UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
github.com/vllm-project/vllm - Clint-chan opened this issue 19 days ago
Hardware Backend Deprecation Policy
github.com/vllm-project/vllm - youkaichao opened this issue 19 days ago
[doc] organize installation doc and expose per-commit docker
github.com/vllm-project/vllm - youkaichao opened this pull request 19 days ago
[Build/CI] Set FETCHCONTENT_BASE_DIR to one location for better caching
github.com/vllm-project/vllm - tlrmchlsmth opened this pull request 20 days ago
[Bug]: Getting `ValueError: XFormers does not support attention logits soft capping.` in colab on T4
github.com/vllm-project/vllm - brand17 opened this issue 20 days ago
[Frontend] Make beam search emulator temperature modifiable
github.com/vllm-project/vllm - nFunctor opened this pull request 20 days ago
[CI/Build][CPU] temporarily disable failed CPU W8A8 test
github.com/vllm-project/vllm - bigPYJ1151 opened this pull request 20 days ago
[Feature]: Get logits instead of lobprobs for distillation
github.com/vllm-project/vllm - nivibilla opened this issue 20 days ago
[CI/Build] Add test decorator for minimum GPU memory
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 20 days ago
[Bugfix] Error handling when model multimodal config initialisation fails
github.com/vllm-project/vllm - AminAlam opened this pull request 20 days ago
[Bug]: Model multimodal config initialisation unhandled and irrelevant error when no architectures found
github.com/vllm-project/vllm - AminAlam opened this issue 20 days ago
[Misc] Remove vLLM patch of `BaichuanTokenizer`
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 20 days ago
[misc] fix wheel name
github.com/vllm-project/vllm - youkaichao opened this pull request 21 days ago
[Performance] TTFT regression from v0.5.4 to 0.6.2
github.com/vllm-project/vllm - rickyyx opened this issue 21 days ago
Add stream support for Granite 20b Tool Use
github.com/vllm-project/vllm - maxdebayser opened this pull request 21 days ago
[Misc] Separate total and output tokens in benchmark_throughput.py
github.com/vllm-project/vllm - mgoin opened this pull request 21 days ago
[RFC]: QuantizationConfig and QuantizeMethodBase Refactor for Simplifying Kernel Integrations
github.com/vllm-project/vllm - LucasWilkinson opened this issue 21 days ago
[Core] Support all head sizes up to 256 with FlashAttention backend
github.com/vllm-project/vllm - njhill opened this pull request 21 days ago
[Misc] Directly use compressed-tensors for checkpoint definitions
github.com/vllm-project/vllm - mgoin opened this pull request 21 days ago
[Usage]: LLM with tensor_parallel_size larger than n. gpus in one node
github.com/vllm-project/vllm - gpucce opened this issue 21 days ago