Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.
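For a sense of how the service is consumed, here is a minimal sketch of querying its REST API from Python. The base URL, the /projects endpoint, and the response field names are assumptions for illustration only; check the service's own API documentation for the real routes and schema.

    # Sketch only: the endpoint path and response fields below are assumed,
    # not taken from this page. Adjust to the documented API before use.
    import requests

    BASE_URL = "https://opencollective.ecosyste.ms/api/v1"  # assumed base URL

    def list_projects(page=1, per_page=10):
        """Fetch one page of tracked projects (assumed endpoint and params)."""
        resp = requests.get(
            f"{BASE_URL}/projects",
            params={"page": page, "per_page": per_page},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()

    if __name__ == "__main__":
        for project in list_projects():
            # "name" and "html_url" are assumed field names.
            print(project.get("name"), project.get("html_url"))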

vLLM

vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective - Host: opensource - https://opencollective.com/vllm - Code: https://github.com/vllm-project/vllm
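As context for the activity feed below, a minimal offline-inference sketch using vLLM's Python API; the model checkpoint and sampling settings are illustrative choices, not taken from this page.

    # Minimal vLLM offline-inference example (illustrative model and settings).
    from vllm import LLM, SamplingParams

    # Small example checkpoint; any supported model name works here.
    llm = LLM(model="facebook/opt-125m")
    params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

    outputs = llm.generate(["Hello, my name is"], params)
    for output in outputs:
        print(output.prompt, output.outputs[0].text)

The same engine can also be exposed as an OpenAI-compatible HTTP server (e.g. via python -m vllm.entrypoints.openai.api_server), which is the component several of the pull requests and RFCs below touch.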

[Misc] Use torch.compile for basic custom ops

github.com/vllm-project/vllm - WoosukKwon opened this pull request 3 months ago
[Bugfix] fix spec decode with cuda graph

github.com/vllm-project/vllm - aurickq opened this pull request 3 months ago
Integrate fused Mixtral MoE with Marlin kernels

github.com/vllm-project/vllm - ElizaWszola opened this pull request 3 months ago
[Core] Asynchronous Output Processor

github.com/vllm-project/vllm - megha95 opened this pull request 3 months ago
[BugFix] Fix multiprocessing shutdown errors

github.com/vllm-project/vllm - njhill opened this pull request 3 months ago
[Usage]: weird GPU RAM usage

github.com/vllm-project/vllm - hieunguyenquoc opened this issue 3 months ago
Add Classifier free guidance

github.com/vllm-project/vllm - zhaoyinglia opened this pull request 3 months ago
[core] Multi Step Scheduling

github.com/vllm-project/vllm - SolitaryThinker opened this pull request 3 months ago
[CI/Build] bump minimum cmake version

github.com/vllm-project/vllm - dtrifiro opened this pull request 3 months ago
[Doc] Proofreading documentation

github.com/vllm-project/vllm - sgolebiewski-intel opened this pull request 3 months ago
[WIP] Add Fused MoE W8A8 (Int8) Support

github.com/vllm-project/vllm - qingquansong opened this pull request 3 months ago
[CI/Build][ROCm] Enabling tensorizer tests for ROCm

github.com/vllm-project/vllm - alexeykondrat opened this pull request 3 months ago
[Feature]: Support rerank models

github.com/vllm-project/vllm - etwk opened this issue 3 months ago
[Model] Pipeline parallel support for Qwen2

github.com/vllm-project/vllm - xuyi opened this pull request 3 months ago
merge to main

github.com/vllm-project/vllm - wbdr opened this pull request 3 months ago
[Core] generate from input embeds

github.com/vllm-project/vllm - Nan2018 opened this pull request 3 months ago
[Model] Teleflm Support

github.com/vllm-project/vllm - horizon94 opened this pull request 3 months ago
[CI/Build] upgrade Dockerfile to ubuntu 22.04

github.com/vllm-project/vllm - samos123 opened this pull request 3 months ago
[RFC]: Performance Roadmap

github.com/vllm-project/vllm - simon-mo opened this issue 3 months ago
[RFC]: Isolate OpenAI Server Into Separate Process

github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this issue 3 months ago
[CI] Reproduce SGLANG benchmark results

github.com/vllm-project/vllm - KuntaiDu opened this pull request 3 months ago
[Bugfix] Add synchronize to prevent possible data race

github.com/vllm-project/vllm - tlrmchlsmth opened this pull request 3 months ago
[Feature]: ngram-spec-decode

github.com/vllm-project/vllm - chenglu66 opened this issue 3 months ago
[Core] Use array to speedup padding

github.com/vllm-project/vllm - peng1999 opened this pull request 3 months ago
[Bug]: --max-model-len configuration robustness

github.com/vllm-project/vllm - gargnipungarg opened this issue 3 months ago
[Feature]: chat API assistant prefill

github.com/vllm-project/vllm - pseudotensor opened this issue 3 months ago
[wip] spmd delta optimization

github.com/vllm-project/vllm - rkooo567 opened this pull request 3 months ago
[Kernel] Add Fused Layernorm + Dynamic-Per-Token Quant Kernels

github.com/vllm-project/vllm - varun-sundar-rabindranath opened this pull request 3 months ago
[Bugfix] Fix decode tokens w. CUDA graph

github.com/vllm-project/vllm - comaniac opened this pull request 3 months ago
[Bugfix] Fix awq_marlin and gptq_marlin flags

github.com/vllm-project/vllm - alexm-neuralmagic opened this pull request 3 months ago
[Bugfix] Fix speculative decode seeded test

github.com/vllm-project/vllm - njhill opened this pull request 3 months ago
[Model][Jamba] Mamba cache single buffer

github.com/vllm-project/vllm - mzusman opened this pull request 3 months ago
[Bugfix] Fix speculative decode seeded test

github.com/vllm-project/vllm - tdoublep opened this pull request 3 months ago
[Feature]: Add support to Llama-3.1

github.com/vllm-project/vllm - KaifAhmad1 opened this issue 3 months ago
[Bugfix] Fix ModelScope compatibility issue

github.com/vllm-project/vllm - liuyhwangyh opened this pull request 3 months ago
Adjust/openai api server turbo 20240724 v2

github.com/vllm-project/vllm - zyearw1024 opened this pull request 3 months ago
[Feature]: vllm support for Ascend NPU

github.com/vllm-project/vllm - hi-yifeng opened this issue 3 months ago