Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
vLLM
vLLM is a high-throughput and memory-efficient inference and serving engine for large language models (LLMs).
Collective: https://opencollective.com/vllm (host: opensource)
Code: https://github.com/vllm-project/vllm
[Usage]: While loading model get 'layers.0.mlp.down_proj.weight' after merge_and_unload()
github.com/vllm-project/vllm - alex2romanov opened this issue about 1 month ago
[Misc] Further reduce BNB static variable
github.com/vllm-project/vllm - jeejeelee opened this pull request about 1 month ago
[CI/Build] Print running script to enhance CI log readability
github.com/vllm-project/vllm - jeejeelee opened this pull request about 1 month ago
[Bugfix] Avoid import AttentionMetadata explicitly in Mllama and fix openvino import
github.com/vllm-project/vllm - Isotr0py opened this pull request about 1 month ago
[Interleaved ATTN] Support for Mistral-8B
github.com/vllm-project/vllm - patrickvonplaten opened this pull request about 1 month ago
[Bug] Streaming output error of tool calling has still not been resolved.
github.com/vllm-project/vllm - Sala8888 opened this issue about 1 month ago
[Kernel] Remove hard-dependencies of Speculative decode to CUDA workers
github.com/vllm-project/vllm - xuechendi opened this pull request about 1 month ago
[Bug]: Duplicate request_id breaks the engine
github.com/vllm-project/vllm - tjohnson31415 opened this issue about 1 month ago
[Core] Update to outlines >= 0.1.8
github.com/vllm-project/vllm - russellb opened this pull request about 1 month ago
[Installation]: Segmentation fault when building Docker container on WSL
github.com/vllm-project/vllm - nlsferrara opened this issue about 1 month ago
[V1] Refactor model executable interface for multimodal models
github.com/vllm-project/vllm - ywang96 opened this pull request about 1 month ago
[Hardware][Intel-Gaudi] Enable LoRA support for Intel Gaudi (HPU)
github.com/vllm-project/vllm - SanjuCSudhakaran opened this pull request about 1 month ago
[Docs] Add dedicated tool calling page to docs
github.com/vllm-project/vllm - mgoin opened this pull request about 1 month ago
[Usage]: Can we extend the context length of gemma2 model or other models?
github.com/vllm-project/vllm - hahmad2008 opened this issue about 1 month ago
[Feature]: Support for Registering Model-Specific Default Sampling Parameters
github.com/vllm-project/vllm - yansh97 opened this issue about 1 month ago
[Usage]: How to use ROPE scaling for llama3.1 and gemma2?
github.com/vllm-project/vllm - hahmad2008 opened this issue about 1 month ago
[CI][Installation] Avoid uploading CUDA 11.8 wheel
github.com/vllm-project/vllm - cermeng opened this pull request about 1 month ago
[Usage]: Fail to load config.json
github.com/vllm-project/vllm - dequeueing opened this issue about 1 month ago
[Bug]: vLLM fails to run two instances on one GPU
github.com/vllm-project/vllm - pandada8 opened this issue about 1 month ago
Add Sageattention backend
github.com/vllm-project/vllm - flozi00 opened this pull request about 1 month ago
[Bug]: Authorization ignored when root_path is set
github.com/vllm-project/vllm - OskarLiew opened this issue about 1 month ago
[Misc] Suppress duplicated logging regarding multimodal input pipeline
github.com/vllm-project/vllm - ywang96 opened this pull request about 1 month ago
[8/N] enable cli flag without a space
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[V1] Fix Compilation config & Enable CUDA graph by default
github.com/vllm-project/vllm - WoosukKwon opened this pull request about 1 month ago
[Usage]: Optimizing TTFT for Qwen2.5-72B Model Deployment on A800 GPUs for RAG Application
github.com/vllm-project/vllm - zhanghx0905 opened this issue about 1 month ago
[Feature]: Additional possible value for `tool_choice`: `required`
github.com/vllm-project/vllm - fahadh4ilyas opened this issue about 1 month ago
[Bug]: Gemma2 becomes a fool.
github.com/vllm-project/vllm - Foreist opened this issue about 1 month ago
fix the issue that len(tokenizer(prompt)["input_ids"]) > prompt_len
github.com/vllm-project/vllm - sywangyi opened this pull request about 1 month ago
[Kernel] Register punica ops directly
github.com/vllm-project/vllm - jeejeelee opened this pull request about 1 month ago
[Usage]: When I set --tensor-parallel-size 4, the OpenAI server does not work and reports a new exception
github.com/vllm-project/vllm - Geek-Peng opened this issue about 1 month ago
[platforms] improve error message for unspecified platforms
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[Misc] Enable vLLM to Dynamically Load LoRA from a Remote Server
github.com/vllm-project/vllm - angkywilliam opened this pull request about 1 month ago
[Model] Expose `dynamic_image_size` as mm_processor_kwargs for InternVL2 models
github.com/vllm-project/vllm - Isotr0py opened this pull request about 1 month ago
[Usage]: What's the relationship between KV cache and MAX_SEQUENCE_LENGTH.
github.com/vllm-project/vllm - GRuuuuu opened this issue about 1 month ago
[Bug]: Model does not split in multiple Gpus instead it occupy same memory on each GPU
github.com/vllm-project/vllm - anilkumar0502 opened this issue about 1 month ago
[Feature]: Manually inject Prefix KV Cache
github.com/vllm-project/vllm - toilaluan opened this issue about 1 month ago
[Model]: Add support for Aria model
github.com/vllm-project/vllm - xffxff opened this pull request about 1 month ago
[Doc] fix a small typo in docstring of llama_tool_parser
github.com/vllm-project/vllm - FerdinandZhong opened this pull request about 1 month ago
[core] overhaul memory profiling and fix backward compatibility
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[Feature]: Multimodal prefix-caching features
github.com/vllm-project/vllm - justzhanghong opened this issue about 1 month ago
[Platforms] Add `device_type` in `Platform`
github.com/vllm-project/vllm - MengqingCao opened this pull request about 1 month ago
[WIP][v1] Refactor KVCacheManager for more hash input than token ids
github.com/vllm-project/vllm - rickyyx opened this pull request about 1 month ago
Need to update the jax and jaxlib version
github.com/vllm-project/vllm - vanbasten23 opened this pull request about 1 month ago
Turn on V1 for H200 build
github.com/vllm-project/vllm - simon-mo opened this pull request about 1 month ago
Metrics model name when using multiple loras
github.com/vllm-project/vllm - mces89 opened this issue about 1 month ago
[Model] Add OLMo November 2024 model
github.com/vllm-project/vllm - 2015aroras opened this pull request about 1 month ago
[Core] Implement disagg prefill by StatelessProcessGroup
github.com/vllm-project/vllm - KuntaiDu opened this pull request about 1 month ago
Setting default for EmbeddingChatRequest.add_generation_prompt to False
github.com/vllm-project/vllm - noamgat opened this pull request about 1 month ago
Support softcap in ROCm Flash Attention
github.com/vllm-project/vllm - hliuca opened this pull request about 1 month ago
[CI/Build] Dockerfile build for ARM64 / GH200
github.com/vllm-project/vllm - drikster80 opened this pull request about 1 month ago
[Bugfix] GPU memory profiling should be per LLM instance
github.com/vllm-project/vllm - tjohnson31415 opened this pull request about 1 month ago
[Frontend] Add Command-R and Llama-3 chat template
github.com/vllm-project/vllm - ccs96307 opened this pull request about 1 month ago
[Misc] Increase default video fetch timeout
github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 1 month ago
[Bugfix] Fix bug with embedding models when pooling_type equals ALL with multiple inputs
github.com/vllm-project/vllm - BBuf opened this pull request about 1 month ago
[Bug]: Error when calling vLLM with audio input using Qwen/Qwen2-Audio-7B-Instruct model
github.com/vllm-project/vllm - jiahansu opened this issue about 1 month ago
[V1] Replace traversal search with lookup table
github.com/vllm-project/vllm - Abatom opened this pull request about 1 month ago
[Bugfix] Handle transformers v4.47 and fix placeholder matching in merged multi-modal processors
github.com/vllm-project/vllm - DarkLight1337 opened this pull request about 1 month ago
Add support for reporting metrics in completion response headers in o…
github.com/vllm-project/vllm - coolkp opened this pull request about 1 month ago
[torch.compile] limit inductor threads and lazy import quant
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[Usage]: VSCode debugger is hanging
github.com/vllm-project/vllm - jeejeelee opened this issue about 1 month ago
[Bug]: vLLM CPU mode broken: unable to get JIT kernel for brgemm
github.com/vllm-project/vllm - samos123 opened this issue about 1 month ago
[Usage]: Can't use vLLM on a multi-GPU node
github.com/vllm-project/vllm - 4k1s opened this issue about 1 month ago
[Misc] Add multi-step chunked-prefill support for FlashInfer
github.com/vllm-project/vllm - elfiegg opened this pull request about 1 month ago
[Bugfix]: allow extra fields in requests to openai compatible server
github.com/vllm-project/vllm - gcalmettes opened this pull request about 1 month ago
[Core] Add Sliding Window Support with Flashinfer
github.com/vllm-project/vllm - pavanimajety opened this pull request about 1 month ago
[Installation]: vLLM on an ARM machine with GH200
github.com/vllm-project/vllm - Phimanu opened this issue about 1 month ago
[Bugfix] Fix the LoRA weight sharding in ColumnParallelLinearWithLoRA
github.com/vllm-project/vllm - jeejeelee opened this pull request about 1 month ago
[Pixtral-Large] Pixtral actually has no bias in vision-lang adapter
github.com/vllm-project/vllm - patrickvonplaten opened this pull request about 1 month ago
[Bug]: Requests to reward model report 500 Internal Server Error
github.com/vllm-project/vllm - hrdxwandg opened this issue about 1 month ago
[misc][plugin] improve plugin loading
github.com/vllm-project/vllm - youkaichao opened this pull request about 1 month ago
[Bug]: Speculative decoding + guided decoding not working
github.com/vllm-project/vllm - arunpatala opened this issue about 1 month ago
[CI][CPU] adding numa node number as container name suffix
github.com/vllm-project/vllm - zhouyuan opened this pull request about 1 month ago
[Bug]: Input prompt (35247 tokens) is too long and exceeds limit of 1000
github.com/vllm-project/vllm - Crista23 opened this issue about 1 month ago
[Bug]: Unable to run Qwen2.5-0.5B-Instruct model in v0.6.4.post1 version, Error: No available memory for the cache blocks
github.com/vllm-project/vllm - Valdanitooooo opened this issue about 1 month ago
[Misc] Avoid misleading warning messages
github.com/vllm-project/vllm - jeejeelee opened this pull request about 1 month ago
[6/N] torch.compile rollout to users
github.com/vllm-project/vllm - youkaichao opened this pull request about 2 months ago
[ci/build] Have dependabot ignore all patch updates
github.com/vllm-project/vllm - khluu opened this pull request about 2 months ago
Compressed tensors w8a8 tpu
github.com/vllm-project/vllm - robertgshaw2-neuralmagic opened this pull request about 2 months ago
[CI/Build] Update Dockerfile.rocm
github.com/vllm-project/vllm - Alexei-V-Ivanov-AMD opened this pull request about 2 months ago
Add openai.beta.chat.completions.parse example to structured_outputs.rst
github.com/vllm-project/vllm - mgoin opened this pull request about 2 months ago
[Bug]: vllm server crash when num-scheduler-steps > 1 and max_tokens=0
github.com/vllm-project/vllm - atanikan opened this issue about 2 months ago
[ci][bugfix] fix kernel tests
github.com/vllm-project/vllm - youkaichao opened this pull request about 2 months ago
[Bugfix] Guard for negative counter metrics to prevent crash
github.com/vllm-project/vllm - tjohnson31415 opened this pull request about 2 months ago
[Doc]: Pages were moved without a redirect
github.com/vllm-project/vllm - shannonxtreme opened this issue about 2 months ago
[Doc]: Migrate to Markdown
github.com/vllm-project/vllm - rafvasq opened this issue about 2 months ago
Fix open_collective value in FUNDING.yml
github.com/vllm-project/vllm - andrew opened this pull request about 2 months ago
[Doc] Update doc for LoRA support in GLM-4V
github.com/vllm-project/vllm - B-201 opened this pull request about 2 months ago
[CI/Build] Support compilation with local cutlass path (#10423)
github.com/vllm-project/vllm - wchen61 opened this pull request about 2 months ago
[Feature]: Add Support for Specifying Local CUTLASS Source Directory via Environment Variable
github.com/vllm-project/vllm - wchen61 opened this issue about 2 months ago
[Misc] Reduce medusa weight
github.com/vllm-project/vllm - skylee-01 opened this pull request about 2 months ago
Fix: Build error seen on Power Architecture
github.com/vllm-project/vllm - mikejuliet13 opened this pull request about 2 months ago
[Model][LoRA] LoRA support added for glm-4v
github.com/vllm-project/vllm - B-201 opened this pull request about 2 months ago
[Bugfix] Fix Phi-3 BNB online quantization
github.com/vllm-project/vllm - jeejeelee opened this pull request about 2 months ago
[Bug]: Encountered issues when deploying Llama-3.2-11B-Vision-Instruct for online inference.
github.com/vllm-project/vllm - CapitalLiu opened this issue about 2 months ago
[Model] Remove transformers attention porting in VITs
github.com/vllm-project/vllm - Isotr0py opened this pull request about 2 months ago
Bump the patch-update group with 2 updates
github.com/vllm-project/vllm - dependabot[bot] opened this pull request about 2 months ago