Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
vLLM
vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective: https://opencollective.com/vllm (host: opensource)
Code: https://github.com/vllm-project/vllm
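For context on what the project does, here is a minimal sketch of vLLM's offline inference API (the model name and sampling parameters are illustrative assumptions, not taken from this page):

    # Minimal sketch of offline inference with vLLM.
    # Model name and sampling parameters are illustrative assumptions.
    from vllm import LLM, SamplingParams

    llm = LLM(model="facebook/opt-125m")  # any supported Hugging Face model
    params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

    # generate() returns one RequestOutput per prompt.
    outputs = llm.generate(["What makes vLLM memory-efficient?"], params)
    for output in outputs:
        print(output.outputs[0].text)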
[Misc] Improve type annotations for `support_torch_compile`
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 22 days ago
Support downloading LoRA models from ModelScope and download private mode…
github.com/vllm-project/vllm - AlphaINF opened this pull request 23 days ago
TypeError: ChatGLMTokenizer._pad() got an unexpected keyword argument 'padding_side'
github.com/vllm-project/vllm - wenruihua opened this issue 23 days ago
[Usage]: Can the model's output be included in vLLM's log output? Sometimes the client side cannot see the result even though the model has already finished inference, so I would like to view the model's output on the server side.
github.com/vllm-project/vllm - WangJianQ-0118 opened this issue 23 days ago
[Usage]: ValueError: Model architectures ['Qwen2ForCausalLM'] failed to be inspected. Please check the logs for more details.
github.com/vllm-project/vllm - despzcm opened this issue 23 days ago
[Feature]: ChatCompletionRequest get default value from generation_config.json
github.com/vllm-project/vllm - zhaotyer opened this issue 23 days ago
[platform] Add verify_quantization in platform.
github.com/vllm-project/vllm - wangxiyuan opened this pull request 23 days ago
[Misc]: Qwen2VL Vision ID Support
github.com/vllm-project/vllm - yusufani opened this issue 23 days ago
[Usage]: How to use the `use_image_id` and `max_slice_num` parameters
github.com/vllm-project/vllm - 2U1 opened this issue 23 days ago
[Feature]: Beam search: top_p, min_p and logit processors
github.com/vllm-project/vllm - denadai2 opened this issue 23 days ago
[Bugfix] Prevent benchmark_throughput.py from using duplicated random prompts
github.com/vllm-project/vllm - mgoin opened this pull request 23 days ago
[Feature]: Enable `/score` endpoint for all embedding models
github.com/vllm-project/vllm - maxdebayser opened this issue 23 days ago
[Model] Clean up MiniCPMV
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 23 days ago
Configuration of the model parallelism does not make sense
github.com/vllm-project/vllm - fajavadi opened this pull request 23 days ago
[Misc][XPU] Avoid torch compile for XPU platform
github.com/vllm-project/vllm - yma11 opened this pull request 23 days ago
[Bug]: Making a request to the OpenAI API server with n=2 and best_of=2 fails
github.com/vllm-project/vllm - payoto opened this issue 23 days ago
[Misc] Fix typo found in sampling_metadata.py
github.com/vllm-project/vllm - noooop opened this pull request 23 days ago
[Model] Add has_weight to RMSNorm and re-enable weights loading tracker for Mamba
github.com/vllm-project/vllm - Isotr0py opened this pull request 23 days ago
[V1] Optimize the CPU overheads in FlashAttention custom op
github.com/vllm-project/vllm - WoosukKwon opened this pull request 24 days ago
[doc] Update config docstring
github.com/vllm-project/vllm - wangxiyuan opened this pull request 24 days ago
[Core] Refactoring disaggregated prefilling/decoding using Mooncake Transfer Engine
github.com/vllm-project/vllm - alogfans opened this pull request 24 days ago
[Doc]: BNB 8 bit quantization is undocumented
github.com/vllm-project/vllm - molereddy opened this issue 24 days ago
[Bugfix] Fix BNB loader target_modules
github.com/vllm-project/vllm - jeejeelee opened this pull request 24 days ago
[Model] Update multi-modal processor to support Mantis(LLaVA) model
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 24 days ago
[Bug]: vLLM runs very slowly on ARM CPUs
github.com/vllm-project/vllm - feikiss opened this issue 24 days ago
[WIP][CI] Add genai-perf benchmark to the nightly benchmarks
github.com/vllm-project/vllm - jikunshang opened this pull request 24 days ago
[V1] Initial support of multimodal models for V1 re-arch
github.com/vllm-project/vllm - ywang96 opened this pull request 24 days ago
[Bug]: v0.6.4.post1 Qwen2-VL-7B-Instruct-AWQ crash:shape mismatch
github.com/vllm-project/vllm - wciq1208 opened this issue 25 days ago
[V1] Adding min tokens/repetition/presence/frequency penalties to V1 sampler
github.com/vllm-project/vllm - sroy745 opened this pull request 25 days ago
[Model] Implement merged input processor for LLaVA model
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 25 days ago
[RFC]: Make any vLLM model a pooling model
github.com/vllm-project/vllm - DarkLight1337 opened this issue 25 days ago
[Doc] Add github links for source code references
github.com/vllm-project/vllm - russellb opened this pull request 25 days ago
[Feature]: Integrate with XGrammar for zero-overhead structured generation in LLM inference.
github.com/vllm-project/vllm - choisioo opened this issue 25 days ago
[V1] VLM - Run the mm_mapper preprocessor in the frontend process
github.com/vllm-project/vllm - alexm-neuralmagic opened this pull request 26 days ago
[Model] Enable optional prefix when loading embedding models
github.com/vllm-project/vllm - DarkLight1337 opened this pull request 26 days ago
[Usage]: What should the chat template for the `meta-llama/Llama-3.2-3B` be?
github.com/vllm-project/vllm - mrakgr opened this issue 26 days ago
[Bug]: Crash with Qwen2-Audio Model in vLLM During Audio Processing
github.com/vllm-project/vllm - jiahansu opened this issue 26 days ago
[Core][Bugfix] Use correct device to initialize GPU data during CUDA-graph-capture
github.com/vllm-project/vllm - IdoAsraff opened this pull request 27 days ago
[Bug]: When applying prompt_logprobs via the OpenAI server, the prompt_logprobs field in the response does not show which token was chosen
github.com/vllm-project/vllm - DIYer22 opened this issue 27 days ago
[Bug]: Authorization ignored when root_path is set
github.com/vllm-project/vllm - chaunceyjiang opened this pull request 27 days ago
[Usage]: Why was `use_beam_search` removed from `vllm.SamplingParams` in v0.6.3?
github.com/vllm-project/vllm - BAI-Yeqi opened this issue 27 days ago
[fix] Correct num_accepted_tokens counting
github.com/vllm-project/vllm - KexinFeng opened this pull request 27 days ago
[doc] update the code to add models
github.com/vllm-project/vllm - youkaichao opened this pull request 27 days ago
[Usage]: How to make model response information appear in the vllm backend logs
github.com/vllm-project/vllm - nora647 opened this issue 27 days ago
Revert "[CI/Build] Print running script to enhance CI log readability"
github.com/vllm-project/vllm - youkaichao opened this pull request 27 days ago
[Bug]: GGUF Model Output Repeats Nonsensically
github.com/vllm-project/vllm - Mayflyyh opened this issue 28 days ago
[model][utils] add extract_layer_index utility function
github.com/vllm-project/vllm - youkaichao opened this pull request 28 days ago
[Usage]: While loading model get 'layers.0.mlp.down_proj.weight' after merge_and_unload()
github.com/vllm-project/vllm - alex2romanov opened this issue 28 days ago
[Misc] Further reduce BNB static variable
github.com/vllm-project/vllm - jeejeelee opened this pull request 28 days ago
[CI/Build] Print running script to enhance CI log readability
github.com/vllm-project/vllm - jeejeelee opened this pull request 28 days ago
[Bugfix] Avoid import AttentionMetadata explicitly in Mllama and fix openvino import
github.com/vllm-project/vllm - Isotr0py opened this pull request 28 days ago
[Interleaved ATTN] Support for Mistral-8B
github.com/vllm-project/vllm - patrickvonplaten opened this pull request 28 days ago
[Bug] Streaming output error in tool calling has still not been resolved.
github.com/vllm-project/vllm - Sala8888 opened this issue 29 days ago
[Kernel] Remove hard-dependencies of Speculative decode to CUDA workers
github.com/vllm-project/vllm - xuechendi opened this pull request 29 days ago
[Bug]: Duplicate request_id breaks the engine
github.com/vllm-project/vllm - tjohnson31415 opened this issue 29 days ago
[Core] Update to outlines > 0.1.4
github.com/vllm-project/vllm - russellb opened this pull request 29 days ago
[Installation]: Segmentation fault when building Docker container on WSL
github.com/vllm-project/vllm - nlsferrara opened this issue 29 days ago
[V1] Refactor model executable interface for multimodal models
github.com/vllm-project/vllm - ywang96 opened this pull request 29 days ago
[Hardware][Intel-Gaudi] Enable LoRA support for Intel Gaudi (HPU)
github.com/vllm-project/vllm - SanjuCSudhakaran opened this pull request 29 days ago
[Docs] Add dedicated tool calling page to docs
github.com/vllm-project/vllm - mgoin opened this pull request 30 days ago
[Usage]: Can we extend the context length of gemma2 model or other models?
github.com/vllm-project/vllm - hahmad2008 opened this issue 30 days ago
[Feature]: Support for Registering Model-Specific Default Sampling Parameters
github.com/vllm-project/vllm - yansh97 opened this issue about 1 month ago
[Usage]: How to use ROPE scaling for llama3.1 and gemma2?
github.com/vllm-project/vllm - hahmad2008 opened this issue about 1 month ago
[CI][Installation] Avoid uploading CUDA 11.8 wheel
github.com/vllm-project/vllm - cermeng opened this pull request about 1 month ago
[Usage]: Fail to load config.json
github.com/vllm-project/vllm - dequeueing opened this issue about 1 month ago
[Bug]: vLLM fails to run two instances on one GPU
github.com/vllm-project/vllm - pandada8 opened this issue about 1 month ago
Add SageAttention backend
github.com/vllm-project/vllm - flozi00 opened this pull request about 1 month ago
[Bug]: Authorization ignored when root_path is set
github.com/vllm-project/vllm - OskarLiew opened this issue about 1 month ago
[Misc] Suppress duplicated logging regarding multimodal input pipeline
github.com/vllm-project/vllm - ywang96 opened this pull request about 1 month ago
[8/N] enable cli flag without a space
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[V1] Fix Compilation config & Enable CUDA graph by default
github.com/vllm-project/vllm - WoosukKwon opened this pull request about 1 month ago
[Usage]: Optimizing TTFT for Qwen2.5-72B Model Deployment on A800 GPUs for RAG Application
github.com/vllm-project/vllm - zhanghx0905 opened this issue about 1 month ago
[Feature]: Additional possible value for `tool_choice`: `required`
github.com/vllm-project/vllm - fahadh4ilyas opened this issue about 1 month ago
[Bug]: Gemma2 becomes a fool.
github.com/vllm-project/vllm - Foreist opened this issue about 1 month ago
Fix the issue where len(tokenizer(prompt)["input_ids"]) > prompt_len
github.com/vllm-project/vllm - sywangyi opened this pull request about 1 month ago
[Kernel] Register punica ops directly
github.com/vllm-project/vllm - jeejeelee opened this pull request about 1 month ago
[Usage]: When I set --tensor-parallel-size 4, the OpenAI server does not work and raises a new exception
github.com/vllm-project/vllm - Geek-Peng opened this issue about 1 month ago
[platforms] improve error message for unspecified platforms
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[Misc] Enable vLLM to Dynamically Load LoRA from a Remote Server
github.com/vllm-project/vllm - angkywilliam opened this pull request about 1 month ago
[Model] Expose `dynamic_image_size` as mm_processor_kwargs for InternVL2 models
github.com/vllm-project/vllm - Isotr0py opened this pull request about 1 month ago
[Usage]: What's the relationship between the KV cache and MAX_SEQUENCE_LENGTH?
github.com/vllm-project/vllm - GRuuuuu opened this issue about 1 month ago
[Bug]: Model does not split across multiple GPUs; instead it occupies the same amount of memory on each GPU
github.com/vllm-project/vllm - anilkumar0502 opened this issue about 1 month ago
[Feature]: Manually inject Prefix KV Cache
github.com/vllm-project/vllm - toilaluan opened this issue about 1 month ago
[Model]: Add support for Aria model
github.com/vllm-project/vllm - xffxff opened this pull request about 1 month ago
[Doc] fix a small typo in docstring of llama_tool_parser
github.com/vllm-project/vllm - FerdinandZhong opened this pull request about 1 month ago
[core] overhaul memory profiling and fix backward compatibility
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[Feature]: Multimodal prefix-caching features
github.com/vllm-project/vllm - justzhanghong opened this issue about 1 month ago
[Platforms] Add `device_type` in `Platform`
github.com/vllm-project/vllm - MengqingCao opened this pull request about 1 month ago
[WIP][v1] Refactor KVCacheManager for more hash input than token ids
github.com/vllm-project/vllm - rickyyx opened this pull request about 1 month ago
Need to update the jax and jaxlib version
github.com/vllm-project/vllm - vanbasten23 opened this pull request about 1 month ago
Turn on V1 for H200 build
github.com/vllm-project/vllm - simon-mo opened this pull request about 1 month ago
Metrics model name when using multiple LoRAs
github.com/vllm-project/vllm - mces89 opened this issue about 1 month ago
[Model] Add OLMo November 2024 model
github.com/vllm-project/vllm - 2015aroras opened this pull request about 1 month ago
[Core] Implement disagg prefill by StatelessProcessGroup
github.com/vllm-project/vllm - KuntaiDu opened this pull request about 1 month ago
Setting default for EmbeddingChatRequest.add_generation_prompt to False
github.com/vllm-project/vllm - noamgat opened this pull request about 1 month ago