Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
vLLM
vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective - Host: opensource - https://opencollective.com/vllm - Code: https://github.com/vllm-project/vllm
[Bug]: Error when using --tensor-parallel-size 4 on Qwen2.5-72B-Instruct
github.com/vllm-project/vllm - DtYXs opened this issue 27 days ago
[Bugfix]Enable __ldcv in custome allreduce and remove memory fence
github.com/vllm-project/vllm - HydraQYH opened this pull request 27 days ago
[Doc] Fix typo in AMD installation guide
github.com/vllm-project/vllm - Imss27 opened this pull request 27 days ago
[Core] Rename input data types
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 27 days ago
[VLM] Use `SequenceData.from_token_counts` to create dummy data
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 27 days ago
[Bug]: RuntimeError on A800 using vllm0.6.1.post2
github.com/vllm-project/vllm - double-vin opened this issue 27 days ago
[New Model][Format]: Support the HF-version of Pixtral
github.com/vllm-project/vllm - mgoin opened this issue 27 days ago
[beam search] add output for manually checking the correctness
github.com/vllm-project/vllm - youkaichao opened this pull request 27 days ago
[Feature]: improve distributed backend selection
github.com/vllm-project/vllm - youkaichao opened this issue 28 days ago
[Doc]: Is Qwen2-VL-72B supported?
github.com/vllm-project/vllm - pseudotensor opened this issue 28 days ago
[Bug]: QLoRA inference returns alternating output
github.com/vllm-project/vllm - rafvasq opened this issue 28 days ago
[Usage]: VLLM serve Gemma 2 9B it with more than 4096 tokens
github.com/vllm-project/vllm - agorodetzky opened this issue 28 days ago
[Doc]: How to Specify System CUTLASS/CUTE Path?
github.com/vllm-project/vllm - zhanwenchen opened this issue 28 days ago
[Not to be Submitted] [WIP] Force Unit tests to run with BlockManager V2
github.com/vllm-project/vllm - sroy745 opened this pull request 28 days ago
[Bug]: Neuron + Vllm inference broken with backward incompatible change
github.com/vllm-project/vllm - sssrijan-amazon opened this issue 28 days ago
[Hardware][AWS] update neuron to 2.20
github.com/vllm-project/vllm - omrishiv opened this pull request 28 days ago
[Core] Factor out common code in `SequenceData` and `Sequence`
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 28 days ago
[Hardware][AMD] ROCm6.2 upgrade
github.com/vllm-project/vllm - hongxiayang opened this pull request 28 days ago
[Core] Rename `PromptInputs` to `PromptType`, and `inputs` to `prompt`
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 28 days ago
[Frontend] OpenAI server: propagate usage accounting to FastAPI middleware layer
github.com/vllm-project/vllm - agt opened this pull request 28 days ago
[Doc] neuron documentation update
github.com/vllm-project/vllm - omrishiv opened this pull request 28 days ago
[Feature]: support out tree multimodal models
github.com/vllm-project/vllm - Jack47 opened this issue 28 days ago
[Installation]: can not install vllm in GPU
github.com/vllm-project/vllm - lambda7xx opened this issue 28 days ago
[Model] Add GLM-4v support and meet vllm==0.6.1.post2+cu123
github.com/vllm-project/vllm - sixsixcoder opened this pull request 28 days ago
[Kernel] Split Marlin MoE kernels into multiple files
github.com/vllm-project/vllm - ElizaWszola opened this pull request 28 days ago
[Bug]: Gemma2 model not working with vLLM 0.6.0 CPU backend
github.com/vllm-project/vllm - jerin-scalers-ai opened this issue 28 days ago
[Model] Expose Phi3v num_crops as a mm_processor_kwarg
github.com/vllm-project/vllm - alex-jw-brooks opened this pull request 28 days ago
[Core][Frontend] Support Passing Multimodal Processor Kwargs
github.com/vllm-project/vllm - alex-jw-brooks opened this pull request 28 days ago
[Bugfix] Refactor composite weight loading logic
github.com/vllm-project/vllm - Isotr0py opened this pull request 28 days ago
[Bug]: RuntimeError in gptq_marlin_24_gemm
github.com/vllm-project/vllm - leoyuppieqnew opened this issue 28 days ago
[Misc] add non cuda hf benchmark_througput
github.com/vllm-project/vllm - park12sj opened this pull request 28 days ago
[Bug]: AttributeError: module 'cv2.dnn' has no attribute 'DictValue'
github.com/vllm-project/vllm - eyuansu62 opened this issue 28 days ago
[Misc] Show AMD GPU topology in `collect_env.py`
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 28 days ago
[Frontend] Batch inference for llm.chat() API
github.com/vllm-project/vllm - aandyw opened this pull request 28 days ago
[Kernel][Triton][AMD] Remove tl.atomic_add from awq_gemm_kernel, 2-5x speedup MI300, minor improvement for MI250
github.com/vllm-project/vllm - rasmith opened this pull request 28 days ago
[Core] CUDA Graphs for Multi-Step + Chunked-Prefill
github.com/vllm-project/vllm - varun-sundar-rabindranath opened this pull request 29 days ago
[Kernel][Bugfix] Delete some more useless code in marlin_moe_ops.cu
github.com/vllm-project/vllm - tlrmchlsmth opened this pull request 29 days ago
[Bugfix][Core] Fix tekken edge case for mistral tokenizer
github.com/vllm-project/vllm - patrickvonplaten opened this pull request 29 days ago
[Performance]: The accept rate of typical acceptance sampling
github.com/vllm-project/vllm - hustxiayang opened this issue 29 days ago
[Bugfix] Handle `best_of>1` & `use_beam_search` by disabling multi-step scheduling.
github.com/vllm-project/vllm - afeldman-nm opened this pull request 29 days ago
[Feature]: OpenAI o1-like Chain-of-thought (CoT) inference workflow
github.com/vllm-project/vllm - kozuch opened this issue 29 days ago
[Bug]: OpenGVLab/InternVL2-Llama3-76B: view size is not compatible with input tensor's size and stride
github.com/vllm-project/vllm - erkintelnyx opened this issue 29 days ago
[Bug]: MistralTokenizer Detokenization Issue
github.com/vllm-project/vllm - ywang96 opened this issue 29 days ago
qwen2-vl: AttributeError: '_OpNamespace' '_C' object has no attribute 'gelu_quick'
github.com/vllm-project/vllm - xiangxinhello opened this issue 29 days ago
[Feature]: Output logps of given output
github.com/vllm-project/vllm - lycheeyolo opened this issue 29 days ago
[Bug]: vllm deploy medusa, draft acceptance rate: 0.000
github.com/vllm-project/vllm - xhjcxxl opened this issue 29 days ago
[Doc] Add documentation for GGUF quantization
github.com/vllm-project/vllm - Isotr0py opened this pull request 29 days ago
[Usage]: Number of requests currently in the queue
github.com/vllm-project/vllm - shubh9m opened this issue 29 days ago
[Misc] Support FP8 MoE for compressed-tensors
github.com/vllm-project/vllm - mgoin opened this pull request 29 days ago
[Bugfix] Use heartbeats instead of health checks
github.com/vllm-project/vllm - joerunde opened this pull request 29 days ago
[Usage]: How to run VLLM on multiple tpu hosts V4-32
github.com/vllm-project/vllm - sparsh35 opened this issue 30 days ago
[Bug]: lm-format-enforcer guided decoding kills MQLLMEngine
github.com/vllm-project/vllm - joerunde opened this issue 30 days ago
[Core] Allow IPv6 in VLLM_HOST_IP with zmq
github.com/vllm-project/vllm - russellb opened this pull request 30 days ago
fix validation: Only set tool_choice `auto` if at least one tool is provided
github.com/vllm-project/vllm - chiragjn opened this pull request 30 days ago
[Feature]: Offline quantization for Pixtral-12B
github.com/vllm-project/vllm - KohakuBlueleaf opened this issue 30 days ago
Fix typical acceptance sampler with correct recovered token ids
github.com/vllm-project/vllm - jiqing-feng opened this pull request about 1 month ago
[Misc]: Create ProfileConfig for Profiling
github.com/vllm-project/vllm - sylviayangyy opened this issue about 1 month ago
[Bug]: Profiling RuntimeError when `with_stack=True`
github.com/vllm-project/vllm - sylviayangyy opened this issue about 1 month ago
[Usage]: Behavior with LoRA Ranks dynamic loading
github.com/vllm-project/vllm - zhao-lun opened this issue about 1 month ago
[Bugfix] Fix potentially unsafe custom allreduce synchronization
github.com/vllm-project/vllm - hanzhi713 opened this pull request about 1 month ago
[MISC] add support custom_op check
github.com/vllm-project/vllm - jikunshang opened this pull request about 1 month ago
[Bugfix] Config.__init__() got an unexpected keyword argument 'engine' api_server args
github.com/vllm-project/vllm - Juelianqvq opened this pull request about 1 month ago
[Bug]: Mistral file names are hardcoded in vllm, making fine tunes tough to use
github.com/vllm-project/vllm - dsingal0 opened this issue about 1 month ago
[Misc] Add argument to disable FastAPI docs
github.com/vllm-project/vllm - Jeffwan opened this pull request about 1 month ago
[Bug]: Online serving failing for Phi-3-vision-128k-instruct
github.com/vllm-project/vllm - Muhtasham opened this issue about 1 month ago
[Usage]: Controlling the number of requests in a batch
github.com/vllm-project/vllm - shubh9m opened this issue about 1 month ago
[CI/Build] Re-enabling Entrypoints tests on ROCm, excluding ones that fail
github.com/vllm-project/vllm - alexeykondrat opened this pull request about 1 month ago
[doc] improve installation doc
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
Enabling Agent Splitting.
github.com/vllm-project/vllm - Alexei-V-Ivanov-AMD opened this pull request about 1 month ago
[Bugfix] Validate SamplingParam n is an int
github.com/vllm-project/vllm - saumya-saran opened this pull request about 1 month ago
[Feature]: Quantisation Support with CPU Backend
github.com/vllm-project/vllm - Christofon opened this issue about 1 month ago
[Bugfix] [Encoder-Decoder] Bugfix for encoder specific metadata construction during decode of encoder-decoder models.
github.com/vllm-project/vllm - sroy745 opened this pull request about 1 month ago
[Bugfix] Fix TP > 1 for new granite
github.com/vllm-project/vllm - joerunde opened this pull request about 1 month ago
[Core] zmq: bind only to localhost for local-only usage
github.com/vllm-project/vllm - russellb opened this pull request about 1 month ago
[Bug]: deepseek_Coder_v2_Instruct give wrong output on vllm==0.5.4, 0.5.5, and 0.6.1.post2 (others not tried) with huggingface standard usage
github.com/vllm-project/vllm - iamhappytoo opened this issue about 1 month ago
[Doc] update the debugging document to add more explanation on `gpu_memory_utilization` and CUDA OOM issues
github.com/vllm-project/vllm - yangalan123 opened this pull request about 1 month ago
[CI/Build] fix Dockerfile.cpu on podman
github.com/vllm-project/vllm - dtrifiro opened this pull request about 1 month ago
[Misc]: RuntimeError: CUDA error: invalid configuration argument
github.com/vllm-project/vllm - YildizBurhan opened this issue about 1 month ago
[Bug]: Running Llama-3.1-405B on AMD MI300X with FP8 quantization fails
github.com/vllm-project/vllm - danielphilipp opened this issue about 1 month ago
[Bugfix] fix OpenAI API server startup with --disable-frontend-multiprocessing
github.com/vllm-project/vllm - dtrifiro opened this pull request about 1 month ago
[Bug]: Model load on 2 or 4-gpu A100 setup may cause default text encoding to be ascii, unless enforce_eager=True
github.com/vllm-project/vllm - cbquillen opened this issue about 1 month ago
[Installation]: Assets v0.6 for cuda 12+
github.com/vllm-project/vllm - GennVa opened this issue about 1 month ago
[CI/Build] Avoid CUDA initialization
github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 1 month ago
[Kernel][Model] Varlen prefill + Prefill chunking support for mamba kernels
github.com/vllm-project/vllm - mzusman opened this pull request about 1 month ago
[Installation]: vLLM build from source errors
github.com/vllm-project/vllm - Imss27 opened this issue about 1 month ago
[Performance]: Suitable draft model for llama3.1 8b
github.com/vllm-project/vllm - hustxiayang opened this issue about 1 month ago
ppc64le: Dockerfile and CI fix
github.com/vllm-project/vllm - sumitd2 opened this pull request about 1 month ago
[CI/Build][Misc] Comparing between block manager v1 and v2, under full prefix sharing and no prefix sharing case.
github.com/vllm-project/vllm - KuntaiDu opened this pull request about 1 month ago
[Misc] Don't dump contents of kvcache tensors on errors
github.com/vllm-project/vllm - njhill opened this pull request about 1 month ago
[torch.compile] register allreduce operations as custom ops
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[Frontend] Improve Nullable kv Arg Parsing
github.com/vllm-project/vllm - alex-jw-brooks opened this pull request about 1 month ago
[refactor] remove triton based sampler
github.com/vllm-project/vllm - simon-mo opened this pull request about 1 month ago
[Feature]: APC introspection interface
github.com/vllm-project/vllm - lun-4 opened this issue about 1 month ago
Adding metrics to external cache services
github.com/vllm-project/vllm - happyandslow opened this pull request about 1 month ago
[Misc][Bugfix] Disable guided decoding for mistral tokenizer
github.com/vllm-project/vllm - ywang96 opened this pull request about 1 month ago
[CI/Build] Excluding kernels/test_gguf.py from ROCm
github.com/vllm-project/vllm - alexeykondrat opened this pull request about 1 month ago