Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
https://github.com/sgl-project/sglang
[Bug] RuntimeRrror: Ninja is required to load c++ extensions
Flynn-Zh opened this issue 4 days ago
[Feature] TORCHINDUCTOR_CACHE_DIR not work ?
MichoChan opened this issue 5 days ago
fix typo
zhyncs opened this pull request 8 days ago
[Benchmark] add a benchmark for hf/vllm/sglang rmsnorm
BBuf opened this pull request 8 days ago
hotfix: checking for HIP
zhyncs opened this pull request 8 days ago
Remove cuda graph batch size adjustment for dp attention
ispobock opened this pull request 8 days ago
format: add clang-format for sgl-kernel
zhyncs opened this pull request 8 days ago
[Bug] Accuracy is abnormal when EP MoE is enabled
ispobock opened this issue 9 days ago
sgl-kernel adapt tensorrt llm custom allreduce
yizhang2077 opened this pull request 9 days ago
Fix correctness issue for triton decoding kernel
ispobock opened this pull request 9 days ago
[Experimental] Add a gRPC server for completion request
MrAta opened this pull request 9 days ago
How to debug sglang using pdb?
sleepwalker2017 opened this issue 10 days ago
Small fixes for torchao quant
jerryzh168 opened this pull request 10 days ago
[FIX] Update EOS from config
zhengy001 opened this pull request 10 days ago
[Feature] request smoothquant (int8, W8A8) quantization on 40G A100
Hao-YunDeng opened this issue 10 days ago
[Minor] Fix grok model loader
merrymercy opened this pull request 10 days ago
[Feature] Integrate CUTLASS FP8 GEMM into sgl-kernel
zhyncs opened this issue 10 days ago
[Feature] FusedMoE H200 tuning
zhyncs opened this issue 10 days ago
[Bug] Different behavior benchmarking w/ request-range-range vs. separate request-rates
Mutinifni opened this issue 10 days ago
feat: support dev image
zhyncs opened this pull request 10 days ago
"GET / HTTP/1.1" 404 Not Found
LordEdison opened this issue 10 days ago
benchmark decoding attention kernel with cudnn
bjmsong opened this pull request 11 days ago
fix: set runtime path
zhyncs opened this pull request 11 days ago
[Bug] potential correctness with triton-attention-num-kv-splits > 1
HaiShaw opened this issue 11 days ago
Rename rust folder to sgl-router
MrAta opened this pull request 11 days ago
minor: update pypi tag
zhyncs opened this pull request 11 days ago
chore: bump v0.0.2 for sgl-kernel
zhyncs opened this pull request 11 days ago
[Feature] Do we have any plan for supporting MiniCPM-V 2.6?
Xeladoes opened this issue 11 days ago
[Bug] CUDA Graph Build Failure
dangxingyu opened this issue 11 days ago
Bump sglang-router to 0.1.1
MrAta opened this pull request 11 days ago
[Feature] MoE Expert Parallel with awq
Xu-Chen opened this issue 11 days ago
Clean up GPU memory after killing sglang processes
MrAta opened this pull request 11 days ago
Include version info into the router package
MrAta opened this pull request 11 days ago
[router] Release router 0.1.0 with dynamic scaling and fault tolerance
ByronHsu opened this pull request 11 days ago
[router] Update doc for dynamic scaling and fault tolerance
ByronHsu opened this pull request 11 days ago
[router] remove main.rs because only lib.rs is used for py binding
ByronHsu opened this pull request 11 days ago
[router] Add retries based fault tolerance
ByronHsu opened this pull request 11 days ago
[Bug] Gemma 2 GGUF
slivka83 opened this issue 11 days ago
[Feature]: Benchmarking H200
antferdom opened this issue 12 days ago
Fix warmup in bench_offline_throughput.py
merrymercy opened this pull request 12 days ago
Fix model loader for more quantization formats
merrymercy opened this pull request 12 days ago
chore: update ao v0.7.0
zhyncs opened this pull request 12 days ago
It's hard to install it
ToSev7en opened this issue 12 days ago
Make request payload size configurable
MrAta opened this pull request 12 days ago
[Feature] support Llama3.3
win4r opened this issue 12 days ago
[Core] in batch prefix caching by delay scheduling
rkooo567 opened this pull request 12 days ago
[router] Use borrow if possible to save cost
ByronHsu opened this pull request 12 days ago
[router] Refactor: decouple select and send stage
ByronHsu opened this pull request 12 days ago
[Feature] Enhanced support/structure for Multi-modal models
tp-nan opened this issue 12 days ago
Add lora_path to chat completion
ccchow opened this pull request 12 days ago
Add support for IBM Granite 3.x models
frreiss opened this pull request 12 days ago
Make torch TP composable with torchao
kwen2501 opened this pull request 12 days ago
fix: compatible with PEP 440
zhyncs opened this pull request 12 days ago
fix: use manylinux2014_x86_64 tag
zhyncs opened this pull request 12 days ago
feat: support sgl-kernel PyPI
zhyncs opened this pull request 12 days ago
[Bug] vLLM ~0.6.5 with latest sglang producing garbage text on AMD GPUs
ozziemoreno opened this issue 12 days ago
[Bug] SGLang's OpenAI interface fails with Llama-3.2-1B due to missing chat template
NeilJohnson0930 opened this issue 13 days ago
[Bug] multiple `sgl.Runtime` instances compete for port 10000
mantle2048 opened this issue 13 days ago
[Feature] Function Call Support
chenweize1998 opened this issue 13 days ago
Best practices for deploying different models on different GPUs for offline generation
mantle2048 opened this issue 13 days ago
[Feature] Support General Reward Model
zhaochenyang20 opened this issue 13 days ago
ROCm support for sglang.check_env
hliuca opened this pull request 13 days ago
decoding attention kernel benchmark
bjmsong opened this pull request 13 days ago
Typo fix in router.md
adarshxs opened this pull request 13 days ago
Performance issues when scaling to multiple GPUs
FinnGu opened this issue 14 days ago
[Minor] Improve code style
merrymercy opened this pull request 14 days ago
Add InfiniteBench for long context benchmarking
iankur opened this pull request 14 days ago
[Bug] The first request with "regex" is too slow
sitabulaixizawaluduo opened this issue 14 days ago
[Minor] Improve code style
merrymercy opened this pull request 14 days ago
[Bug] File "/u02/liuys/sglang/python/sglang/srt/server.py", line 621, in _wait_and_warmup Killed
lys791227 opened this issue 14 days ago
Migrate llama_classification to use the /classify interface
merrymercy opened this pull request 14 days ago
Add a unittest for fused_moe
BBuf opened this pull request 14 days ago
[Bug] nsys will cause an error when TP=4 although I launched with --trace-fork-before-exec=true --cuda-graph-trace=node
jameswu2014 opened this issue 14 days ago
[Bug] XGrammar causes gibberish during parallel execution and cuts off other requests
remixer-dec opened this issue 14 days ago
[Router] fix interrupt from terminal
ByronHsu opened this pull request 14 days ago
[feat] Enable chunked prefill for llava-onevision
Ying1123 opened this pull request 14 days ago
[router] Improve cleanup logic
ByronHsu opened this pull request 14 days ago
reduce watchdog interval to 5s
ByronHsu opened this pull request 14 days ago
minor: add random flashinfer vs triton use case
zhyncs opened this pull request 14 days ago
minor: add random use case
zhyncs opened this pull request 14 days ago
feat: support custom task runner
zhyncs opened this pull request 14 days ago
minor: update correct measurement unit
zhyncs opened this pull request 15 days ago
Fix recv_requests
merrymercy opened this pull request 15 days ago
fix: specify dtype with begin_forward aka plan
zhyncs opened this pull request 15 days ago
Fix a bug with logprob streaming + chunked prefill
merrymercy opened this pull request 15 days ago
[Feature] add kernel level benchmark
zhyncs opened this issue 15 days ago
Remove unused vars in the triton backend
ispobock opened this pull request 15 days ago
[Feature] support llm_bench
zhyncs opened this issue 15 days ago
[Feature] support constrained decoding benchmark
zhyncs opened this issue 15 days ago
Simplify stream_output
merrymercy opened this pull request 15 days ago
Update killall_sglang.sh
merrymercy opened this pull request 15 days ago
[WIP] Add sampler logit processor
hongpeng-guo opened this pull request 15 days ago
[Bug] After deploying for a period of time (2 days), the speed slows down and the memory usage increases
lss15151161 opened this issue 15 days ago
Optimize Triton decoding kernel for long context
ispobock opened this pull request 15 days ago
[router] add health checking in router init
ByronHsu opened this pull request 15 days ago
[router] Health check on worker before added to the router
ByronHsu opened this pull request 15 days ago
minor: update killall script
zhyncs opened this pull request 16 days ago
fix: update xgrammar v0.1.6
zhyncs opened this pull request 16 days ago
[Feature] SGLang Router design discussion
zhyncs opened this issue 16 days ago
Fp8 MoE optimizations on AMD
HaiShaw opened this pull request 16 days ago