Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
https://github.com/sgl-project/sglang
Fix packet loss when deploy little model
sdli1995 opened this pull request 17 days ago
A better aio rwlock that guarantees the order
merrymercy opened this pull request 17 days ago
feat: support 2 kernels for mixed chunked prefill
chosen-ox opened this pull request 18 days ago
Updated documentation for Grammar Backend
shuaills opened this pull request 18 days ago
[Feature] Function Calling
Tushar-ml opened this pull request 18 days ago
[Misc] Fix metrics, weight update lock, request logging
merrymercy opened this pull request 18 days ago
[Feature] (Willing to PR) Avoid KV cache occupying GPU memory when not used
fzyzcjy opened this issue 18 days ago
fix #2528
zhyncs opened this pull request 19 days ago
formatted
yixin-huang1 opened this pull request 19 days ago
[Bug] `Qwen/QwQ-32B-Preview` undefined symbol error
HuanzhiMao opened this issue 19 days ago
[Feature] Why qwen2-vl not support radix cache
vchzls opened this issue 19 days ago
[Bug] Eagle2 has an unstable sampling rate during multi concurrency.
coolhok opened this issue 19 days ago
[Bug] Failed to launch engine when working with Ray Serve: signal only works in main thread of the main interpreter
pengye91 opened this issue 19 days ago
Enable Nvidia's ModelOpt fp8 quantized models
Edwardf0t1 opened this pull request 19 days ago
[Cache Offload] Improve radix cache offload benchmark
Edenzzzz opened this pull request 19 days ago
[Cache Offload] Remove device sync overhead
Edenzzzz opened this pull request 20 days ago
[Bug] Transformers doesn't recognize LLaVA variant architectures
amosyou opened this issue 20 days ago
[Feature] Add Docs For Quantization
binhtranmcs opened this issue 20 days ago
[Bug] SGLang v0.4.0 with AMD MI300X
BruceXcluding opened this issue 20 days ago
Add lora_paths to v1_chat_generate_request
ccchow opened this pull request 21 days ago
Add integration with gemlite weight only quant
jerryzh168 opened this pull request 21 days ago
[Bug] install from source cannot start
fansongfs opened this issue 21 days ago
[Feature] Support new parameter - EBNF in xgrammar
adarshxs opened this pull request 21 days ago
chore: bump v0.4.0.post2
zhyncs opened this pull request 21 days ago
fix followup #2517
zhyncs opened this pull request 21 days ago
docs: update sponsorship (DataCrunch)
zhyncs opened this pull request 21 days ago
Update pyproject.toml: add dependency "ninja"
adarshxs opened this pull request 21 days ago
fix: package data missing
yudian0504 opened this pull request 21 days ago
sglang for Qwen2.5-14b deploy Error
wuxianyess opened this issue 21 days ago
Is video inference supported?
wuxianyess opened this issue 21 days ago
[Bug] using xgrammar with json schema, performance is worse than no xgrammar and json schema
fansongfs opened this issue 21 days ago
fix: continue to use flashinfer 0.1.6 temporarily
zhyncs opened this pull request 21 days ago
docs: update README
zhyncs opened this pull request 21 days ago
feat: add llama3 eval
zhyncs opened this pull request 21 days ago
[Bug] RuntimeError: Ninja is required to load c++ extensions
Flynn-Zh opened this issue 21 days ago
Add generator-style run_batch function
xingyaoww opened this pull request 22 days ago
adapt custom allreduce for tensorrt llm
yizhang2077 opened this pull request 22 days ago
[Feature] Support for Evicting Specific KV Cache to Save GPU Memory
ChenlongDeng opened this issue 22 days ago
[kernel optimize] benchmark write_req_to_token_pool_triton and optimize kernel
BBuf opened this pull request 22 days ago
[Feature] do sample = False?
boqiny opened this issue 22 days ago
[Feature] Faster torch.compile
MichoChan opened this issue 22 days ago
[Feature] Integration SGLang into OpenRLHF
zhaochenyang20 opened this issue 22 days ago
[Feature] Add Tutorial for Constraint Decoding
zhaochenyang20 opened this issue 22 days ago
[Feature] Add Math in our CI
zhaochenyang20 opened this issue 22 days ago
Print progress bar during cuda graph capture
merrymercy opened this pull request 23 days ago
fix: add ninja as dependency for flashinfer v0.2
zhyncs opened this pull request 23 days ago
Update readme
merrymercy opened this pull request 23 days ago
Fix openai protocols and pass top_k, min_p
merrymercy opened this pull request 23 days ago
torcho gemlite integration
HDCharles opened this pull request 23 days ago
[Bug] got asyncio.exceptions.InvalidStateError: invalid state when concurrent request interface /get_server_info
Lzhang-hub opened this issue 23 days ago
improve performance by removing use_tensor_core dependency
bjmsong opened this pull request 23 days ago
Small fix for the order of apply_torchao_config
merrymercy opened this pull request 23 days ago
Add a benchmark script for in-batch prefix caching
merrymercy opened this pull request 23 days ago
Revert "Small fixes for torchao quant"
merrymercy opened this pull request 23 days ago
Temporarily disable unit test of torch native attention backend
merrymercy opened this pull request 23 days ago
Simplify pytorch sampling kernel and logit processor
merrymercy opened this pull request 23 days ago
minor: update flashinfer nightly
zhyncs opened this pull request 24 days ago
fix moe-ep accuracy issue for fp8
xiaobochen123 opened this pull request 24 days ago
[Feature] Benchmarking Performance on General Devices
zhaochenyang20 opened this issue 24 days ago
fix typo
zhyncs opened this pull request 25 days ago
[Benchmark] add a benchmark for hf/vllm/sglang rmsnorm
BBuf opened this pull request 25 days ago
hotfix: checking for HIP
zhyncs opened this pull request 26 days ago
Remove cuda graph batch size adjustment for dp attention
ispobock opened this pull request 26 days ago
format: add clang-format for sgl-kernel
zhyncs opened this pull request 26 days ago
[Bug] Accuracy is abnormal when EP MoE is enabled
ispobock opened this issue 26 days ago
sgl-kernel adapt tensorrt llm custom allreduce
yizhang2077 opened this pull request 26 days ago
Fix correctness issue for triton decoding kernel
ispobock opened this pull request 26 days ago
[Experimental] Add a gRPC server for completion request
MrAta opened this pull request 26 days ago
How to debug sglang using pdb?
sleepwalker2017 opened this issue 27 days ago
Small fixes for torchao quant
jerryzh168 opened this pull request 27 days ago
[FIX] Update EOS from config
zhengy001 opened this pull request 27 days ago
[Feature] request smoothquant (int8, W8A8) quantization on 40G A100
Hao-YunDeng opened this issue 27 days ago
[Minor] Fix grok model loader
merrymercy opened this pull request 27 days ago
[Feature] Integrate CUTLASS FP8 GEMM into sgl-kernel
zhyncs opened this issue 28 days ago
[Feature] FusedMoE H200 tuning
zhyncs opened this issue 28 days ago
[Bug] Different behavior benchmarking w/ request-range-range vs. separate request-rates
Mutinifni opened this issue 28 days ago
feat: support dev image
zhyncs opened this pull request 28 days ago
"GET / HTTP/1.1" 404 Not Found
LordEdison opened this issue 28 days ago
benchmark decoding attention kernel with cudnn
bjmsong opened this pull request 28 days ago
fix: set runtime path
zhyncs opened this pull request 28 days ago
[Bug] potential correctness with triton-attention-num-kv-splits > 1
HaiShaw opened this issue 28 days ago
Rename rust folder to sgl-router
MrAta opened this pull request 28 days ago
minor: update pypi tag
zhyncs opened this pull request 28 days ago
chore: bump v0.0.2 for sgl-kernel
zhyncs opened this pull request 28 days ago
[Feature] Do we have any plan for supporting MiniCPM-V 2.6?
Xeladoes opened this issue 28 days ago
[Bug] CUDA Graph Build Failure
dangxingyu opened this issue 28 days ago
Bump sglang-router to 0.1.1
MrAta opened this pull request 28 days ago
[Feature] MoE Expert Parallel with awq
Xu-Chen opened this issue 28 days ago
Clean up GPU memory after killing sglang processes
MrAta opened this pull request 28 days ago
Include version info into the router package
MrAta opened this pull request 28 days ago
[router] Release router 0.1.0 with dynamic scaling and fault tolerance
ByronHsu opened this pull request 29 days ago
[router] Update doc for dynamic scaling and fault tolerance
ByronHsu opened this pull request 29 days ago
[router] remove main.rs because only lib.rs is used for py binding
ByronHsu opened this pull request 29 days ago
[router] Add retries based fault tolerance
ByronHsu opened this pull request 29 days ago
[Bug] Gemma 2 GGUF
slivka83 opened this issue 29 days ago
[Feature]: Benchmarking H200
antferdom opened this issue 29 days ago
Fix warmup in bench_offline_throughput.py
merrymercy opened this pull request 29 days ago
Fix model loader for more quantization formats
merrymercy opened this pull request 29 days ago
chore: update ao v0.7.0
zhyncs opened this pull request 29 days ago
It's hard to install it
ToSev7en opened this issue 29 days ago