Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

vLLM

vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective - Host: opensource - https://opencollective.com/vllm - Code: https://github.com/vllm-project/vllm

GGUF support

github.com/vllm-project/vllm - viktor-ferenczi opened this issue about 1 year ago
Can the Qwen/Qwen-VL-Chat model work well?

github.com/vllm-project/vllm - wangschang opened this issue about 1 year ago
Is authentication supported?

github.com/vllm-project/vllm - mluogh opened this issue about 1 year ago
Loading Model through Multi-Node Ray Cluster Fails

github.com/vllm-project/vllm - VarunSreenivasan16 opened this issue about 1 year ago
Sagemaker support for inference

github.com/vllm-project/vllm - Tarun3679 opened this issue about 1 year ago
Support for RLHF (ILQL)-trained Models

github.com/vllm-project/vllm - ojus1 opened this issue about 1 year ago
vLLM full name

github.com/vllm-project/vllm - designInno opened this issue about 1 year ago
Stuck in Initializing an LLM engine

github.com/vllm-project/vllm - EvilCalf opened this issue about 1 year ago
Feature request: Support for embedding models

github.com/vllm-project/vllm - mantrakp04 opened this issue about 1 year ago
Testing the qwen-7b-chat model produces incorrect output

github.com/vllm-project/vllm - dachengai opened this issue about 1 year ago
How to deploy vLLM with quantization

github.com/vllm-project/vllm - xxm1668 opened this issue about 1 year ago
Issue with raylet error

github.com/vllm-project/vllm - ZihanWang314 opened this issue about 1 year ago
Installing with ROCM

github.com/vllm-project/vllm - baderex opened this issue about 1 year ago
Cannot get a simple example working with multi-GPU

github.com/vllm-project/vllm - brevity2021 opened this issue about 1 year ago
How to use multiple GPUs?

github.com/vllm-project/vllm - xxm1668 opened this issue about 1 year ago
Flash Attention V2

github.com/vllm-project/vllm - nivibilla opened this issue over 1 year ago
Faster model loading

github.com/vllm-project/vllm - imoneoi opened this issue over 1 year ago
+34% higher throughput?

github.com/vllm-project/vllm - naed90 opened this issue over 1 year ago
Support Multiple Models

github.com/vllm-project/vllm - aldrinc opened this issue over 1 year ago
Feature request: support ExLlama

github.com/vllm-project/vllm - alanxmay opened this issue over 1 year ago
8bit support

github.com/vllm-project/vllm - mymusise opened this issue over 1 year ago
Require a "Wrapper" feature

github.com/vllm-project/vllm - jeffchy opened this issue over 1 year ago
CTranslate2

github.com/vllm-project/vllm - Matthieu-Tinycoaching opened this issue over 1 year ago
Remove Ray from the dependencies

github.com/vllm-project/vllm - lanking520 opened this issue over 1 year ago
CUDA error: out of memory

github.com/vllm-project/vllm - SunixLiu opened this issue over 1 year ago
Can I directly obtain the logits here?

github.com/vllm-project/vllm - SparkJiao opened this issue over 1 year ago
Whisper support

github.com/vllm-project/vllm - gottlike opened this issue over 1 year ago
Build failure due to CUDA version mismatch

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Support custom models

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Add docstrings to some modules and classes

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Minor code cleaning for SamplingParams

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Add CD to PyPI

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Enhance SamplingParams

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Implement presence and frequency penalties

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Support top-k sampling

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Avoid sorting waiting queue & Minor code cleaning

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Support string-based stopping conditions

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Rename variables and methods

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Log system stats

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Update example prompts in `simple_server.py`

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Support various sampling parameters

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Make sure the system can run on T4 and V100

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Clean up the scheduler code

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Add a system logger

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Use slow tokenizer for LLaMA

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Enhance model loader

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Refactor system architecture

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Use runtime profiling to replace manual memory analyzers

github.com/vllm-project/vllm - zhuohan123 opened this pull request over 1 year ago
Bug in LLaMA fast tokenizer

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
[Minor] Fix a dtype bug

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Specify python package dependencies in requirements.txt

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Clean up Megatron-LM code

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Add license

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Implement client API

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Add docstring

github.com/vllm-project/vllm - zhuohan123 opened this issue over 1 year ago
Use mypy

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Support FP32

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Dangerous floating point comparison

github.com/vllm-project/vllm - merrymercy opened this issue over 1 year ago
Replace FlashAttention with xformers

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Decrease the default size of swap space

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Fix a bug in attention kernel

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Add documents on how to add new models

github.com/vllm-project/vllm - zhuohan123 opened this issue over 1 year ago
Enhance model mapper

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Use dtype from model config & Add Dolly V2

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Support BLOOM

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Add support for GPT-2

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Profile memory usage

github.com/vllm-project/vllm - zhuohan123 opened this issue over 1 year ago
Use pytest for unit tests

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Add dependencies in setup.py

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Support GPT-2

github.com/vllm-project/vllm - WoosukKwon opened this issue over 1 year ago
Support bfloat16 data type

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Refactor attention kernels

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
New weight loader without np copy

github.com/vllm-project/vllm - zhuohan123 opened this pull request over 1 year ago
Add an option to launch cacheflow without ray

github.com/vllm-project/vllm - zhuohan123 opened this pull request over 1 year ago
Add support for GPT-NeoX (Pythia)

github.com/vllm-project/vllm - WoosukKwon opened this pull request over 1 year ago
Add plot scripts

github.com/vllm-project/vllm - Ying1123 opened this pull request over 1 year ago
Improve Weight Loading

github.com/vllm-project/vllm - zhuohan123 opened this issue over 1 year ago
Frontend Improvements

github.com/vllm-project/vllm - zhuohan123 opened this issue over 1 year ago