Ecosyste.ms: Open Collective
An open API service for software projects hosted on Open Collective.

vLLM
vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
- Collective: https://opencollective.com/vllm
- Host: opensource
- Code: https://github.com/vllm-project/vllm
[Usage]: How to serve fine-tuned torchtune model with vllm
github.com/vllm-project/vllm - Some-random opened this issue 5 months ago
[Frontend][OpenAI] Add support for OpenAI tools calling
github.com/vllm-project/vllm - Xwdit opened this pull request 5 months ago
[Bug]: Ray on multi machine cluster fails to detect all nodes.
github.com/vllm-project/vllm - bks5881 opened this issue 5 months ago
[Bug]: NCCL timed out during inference
github.com/vllm-project/vllm - enkiid opened this issue 5 months ago
[Model] Snowflake arctic model implementation
github.com/vllm-project/vllm - sfc-gh-hazhang opened this pull request 5 months ago
[Bug]: openapi running but "POST /v1/chat/completions HTTP/1.1" 404 Not Found
github.com/vllm-project/vllm - yebangyu opened this issue 5 months ago
[Scheduler] Warning upon preemption and Swapping
github.com/vllm-project/vllm - rkooo567 opened this pull request 5 months ago
[CORE] Adding support for insertion of soft-tuned prompts
github.com/vllm-project/vllm - SwapnilDreams100 opened this pull request 5 months ago
[Frontend][OpenAI] Support for returning max_model_len on /v1/models response
github.com/vllm-project/vllm - Avinash-Raj opened this pull request 5 months ago
fix MiniCPM tie_word_embeddings
github.com/vllm-project/vllm - Receiling opened this pull request 5 months ago
[Bug]: with `worker_use_ray = true`, and tensor_parallel_size > 1, the process is pending forever
github.com/vllm-project/vllm - depenglee1707 opened this issue 5 months ago
[Frontend] Dynamic RoPE scaling
github.com/vllm-project/vllm - sasha0552 opened this pull request 5 months ago
[CI] Add llama 3 model test
github.com/vllm-project/vllm - rkooo567 opened this pull request 5 months ago
[Model] Add support for IBM Granite Code models
github.com/vllm-project/vllm - yikangshen opened this pull request 5 months ago
[CI] Add retry for agent lost
github.com/vllm-project/vllm - cadedaniel opened this pull request 6 months ago
[Performance] [Speculative decoding]: Support draft model on different tensor-parallel size than target model
github.com/vllm-project/vllm - cadedaniel opened this issue 6 months ago
Update lm-format-enforcer to 0.10.1
github.com/vllm-project/vllm - noamgat opened this pull request 6 months ago
[Speculative decoding] [Help wanted] [Performance] Optimize draft-model speculative decoding
github.com/vllm-project/vllm - cadedaniel opened this issue 6 months ago
[CI] use ccache actions properly in release workflow
github.com/vllm-project/vllm - simon-mo opened this pull request 6 months ago
[Kernel] Flashinfer for prefill & decode, with Cudagraph support for decode
github.com/vllm-project/vllm - LiuXiaoxuanPKU opened this pull request 6 months ago
[Bug]: Add logger and redirect logs to a file
github.com/vllm-project/vllm - hahmad2008 opened this issue 6 months ago
[Bugfix] Fine-tune gptq_marlin configs to be more similar to marlin
github.com/vllm-project/vllm - alexm-nm opened this pull request 6 months ago
[Misc]: int4 support on CPU backend
github.com/vllm-project/vllm - leiwen83 opened this issue 6 months ago
[Bugfix] Fix `asyncio.Task` not being subscriptable
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 6 months ago
[Usage]: doubt on computational complexity
github.com/vllm-project/vllm - Juelianqvq opened this issue 6 months ago
[Bug]: `v0.4.2` python3.8 `TypeError: 'type' object is not subscriptable` (python3.9 syntax)
github.com/vllm-project/vllm - Theodotus1243 opened this issue 6 months ago
[Usage]: How to install vllm in cuda10.2? Cuda version cannot be upgraded due to environmental issues
github.com/vllm-project/vllm - 1193700079 opened this issue 6 months ago
[Bug]: always returns invalid tokens in FP8 static mode
github.com/vllm-project/vllm - AnyISalIn opened this issue 6 months ago
[Core] Update `_earliest_arrival_time` calculation of the waiting seq_groups
github.com/vllm-project/vllm - Felix-Zhenghao opened this pull request 6 months ago
Main backup 2024 05 05
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 6 months ago
Upstream sync 2024 05 05
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 6 months ago
Revert to previous main
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request 6 months ago
[Bugfix] Fixed error in slice_lora_b for MergedQKVParallelLinearWithLora
github.com/vllm-project/vllm - FurtherAI opened this pull request 6 months ago
[Usage]: Cannot run the starter code in tutorial
github.com/vllm-project/vllm - zhimin-z opened this issue 6 months ago
[Core][Optimization] change python dict to pytorch tensor
github.com/vllm-project/vllm - youkaichao opened this pull request 6 months ago
[Bug]: when dtype='bfloat16', batch_size will cause different inference results
github.com/vllm-project/vllm - yananchen1989 opened this issue 6 months ago
[Bug]: local variable 'lora_b_k' referenced before assignment
github.com/vllm-project/vllm - LucienShui opened this issue 6 months ago
[Bug]: RuntimeError: CUDA error: no kernel image is available for execution on the device
github.com/vllm-project/vllm - JPonsa opened this issue 6 months ago
chunked-prefill-doc-syntax
github.com/vllm-project/vllm - simon-mo opened this pull request 6 months ago
[CI] Reduce wheel size by not shipping debug symbols
github.com/vllm-project/vllm - simon-mo opened this pull request 6 months ago
[CI/Build] from scratch build for dockerfile
github.com/vllm-project/vllm - youkaichao opened this pull request 6 months ago
bump version to v0.4.2
github.com/vllm-project/vllm - simon-mo opened this pull request 6 months ago
[Bug]: I used vllm=0.4.1 to run the squeezellm, I meet the bug RuntimeError: t == DeviceType::CUDA INTERNAL ASSERT FAILED at "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/torch/include/c10/cuda/impl/CUDAGuardImpl.h":25, please report a bug to PyTorch.
github.com/vllm-project/vllm - RyanWMHI opened this issue 6 months ago
[Bugfix] add truncate_prompt_tokens to work offline, directly from LLM class.
github.com/vllm-project/vllm - yecohn opened this pull request 6 months ago
[CI] Make mistral tests pass
github.com/vllm-project/vllm - rkooo567 opened this pull request 6 months ago
[RFC][WIP] Use llama-3 instead of llama-2 for basic testing
github.com/vllm-project/vllm - rkooo567 opened this pull request 6 months ago
[Core] Optimize sampler get_logprobs
github.com/vllm-project/vllm - rkooo567 opened this pull request 6 months ago
[BugFix] Fix fp8 quantizer
github.com/vllm-project/vllm - Kev1ntan opened this pull request 6 months ago
[Dynamic Spec Decoding] Auto-disable by the running queue size
github.com/vllm-project/vllm - comaniac opened this pull request 6 months ago
[Core][Distributed] refactor pynccl to hold multiple communicators
github.com/vllm-project/vllm - youkaichao opened this pull request 6 months ago
[Bugfix] fix func var in cpuworker.execute_model() [bug 4568]
github.com/vllm-project/vllm - peterauyeung opened this pull request 6 months ago
[Bug]: Loading GenerationConfig to SamplingParams.stop_token_ids interfere with ignore_eos=True
github.com/vllm-project/vllm - CatherineSue opened this issue 6 months ago
[Usage]: Difference in language model usage post updating versions form 0.2 to 0.4
github.com/vllm-project/vllm - servient-ashwin opened this issue 6 months ago
[Bug]: vllm 0.4.1 crashing after checking P2P status on single GPU
github.com/vllm-project/vllm - alexandergagliano opened this issue 6 months ago
[Bugfix] Allow "None" or "" to be passed to CLI for string args that default to None
github.com/vllm-project/vllm - mgoin opened this pull request 6 months ago
[Bug]: `vllm.entrypoints.openai.api_server` CLI command doesn't accept `None` value for `--quantization`
github.com/vllm-project/vllm - dbarbuzzi opened this issue 6 months ago
[Bug]: Tensorizer model loader blocks multi-GPU loading even for HF serialized models
github.com/vllm-project/vllm - bbrowning opened this issue 6 months ago
[Bug]: guided jsons with date fields are not valid
github.com/vllm-project/vllm - andreas-22 opened this issue 6 months ago
add spec infer related into prometheus metrics.
github.com/vllm-project/vllm - leiwen83 opened this pull request 6 months ago
[Doc]: i want to know. How to run vllms with remote ray cluster
github.com/vllm-project/vllm - Prashantsaini25 opened this issue 6 months ago
[Doc] Chunked Prefill Documentation
github.com/vllm-project/vllm - rkooo567 opened this pull request 6 months ago
[Misc]: openai compatible server
github.com/vllm-project/vllm - aqx95 opened this issue 6 months ago
[Bug]: Special tokens split when decoding after 0.4.0.post1
github.com/vllm-project/vllm - DreamGenX opened this issue 6 months ago
[Core] Log more GPU memory reservation info
github.com/vllm-project/vllm - rkooo567 opened this pull request 6 months ago
[Misc] add installation time env vars
github.com/vllm-project/vllm - youkaichao opened this pull request 6 months ago
[Bugfix][Kernel] allow non-power-of-2 for prefix prefill with alibi
github.com/vllm-project/vllm - DefTruth opened this pull request 6 months ago
[Doc] add env vars to the doc
github.com/vllm-project/vllm - youkaichao opened this pull request 6 months ago
[Misc] remove chunk detected debug logs
github.com/vllm-project/vllm - DefTruth opened this pull request 6 months ago
[Kernel] Make static FP8 scaling more robust
github.com/vllm-project/vllm - pcmoritz opened this pull request 6 months ago
[CI][Contribution Welcomed] Conditional Testing
github.com/vllm-project/vllm - simon-mo opened this issue 6 months ago
[Bug]: Query to the openapi server with cpu backend is throwing error
github.com/vllm-project/vllm - navpreet-np7 opened this issue 6 months ago
[BugFix] Prevent the task of `_force_log` from being garbage collected
github.com/vllm-project/vllm - Atry opened this pull request 6 months ago
[Core][Distributed] enable allreduce for multiple tp groups
github.com/vllm-project/vllm - youkaichao opened this pull request 6 months ago
[RFC]: Automate Speculative Decoding
github.com/vllm-project/vllm - LiuXiaoxuanPKU opened this issue 6 months ago
Update requirements-dev.txt
github.com/vllm-project/vllm - yecohn opened this pull request 6 months ago
[Misc]: Server Does Not Follow Scheduler Policy
github.com/vllm-project/vllm - Bojun-Feng opened this issue 6 months ago
[BugFix] Include target-device specific requirements.txt in sdist
github.com/vllm-project/vllm - markmc opened this pull request 6 months ago
[CI/Build] Unpin outlines
github.com/vllm-project/vllm - br3no opened this pull request 6 months ago
[Core] Ignore infeasible swap requests.
github.com/vllm-project/vllm - rkooo567 opened this pull request 6 months ago
[Bug]: Scheduler fail with assertion on "meta-llama/Meta-Llama-3-70B-Instruct" when calling concurrently
github.com/vllm-project/vllm - tsvisab opened this issue 6 months ago
[mypy][7/N] Cover all directories
github.com/vllm-project/vllm - rkooo567 opened this pull request 6 months ago
[Usage]: Experiencing weird import bugs and errors after installing with pip install -e .
github.com/vllm-project/vllm - KevinCL16 opened this issue 6 months ago
[Bug]: AssertionError in neuron_model_runner.py assert len(block_table) == 1
github.com/vllm-project/vllm - calvintwr opened this issue 6 months ago
[Misc] Exclude the `tests` directory from being packaged
github.com/vllm-project/vllm - itechbear opened this pull request 6 months ago
[Bug fix][Core] fixup ngram not setup correctly
github.com/vllm-project/vllm - leiwen83 opened this pull request 6 months ago
[WIP] Enhance MoE Triton kernel & tuning
github.com/vllm-project/vllm - WoosukKwon opened this pull request 6 months ago
[Misc] centralize all usage of environment variables
github.com/vllm-project/vllm - youkaichao opened this pull request 6 months ago
[Bug]: v0.4.1 The output results of the MoE kinds models are incorrect on the V100
github.com/vllm-project/vllm - keyword1983 opened this issue 6 months ago
[Core] Sliding window for block manager v2
github.com/vllm-project/vllm - mmoskal opened this pull request 6 months ago
[Installation]: vLLM does not work on old CPU
github.com/vllm-project/vllm - dimaioksha opened this issue 6 months ago
[Misc][Refactor] Introduce ExecuteModelData
github.com/vllm-project/vllm - comaniac opened this pull request 6 months ago
[Core] Add MultiprocessingGPUExecutor
github.com/vllm-project/vllm - njhill opened this pull request 6 months ago
Virtual Office Hours: May 15 2pm ET
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this issue 6 months ago