Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/sgl-project/sglang

SGLang is a fast serving framework for large language models and vision language models.
https://github.com/sgl-project/sglang

[Bug] RuntimeError: Ninja is required to load C++ extensions

Flynn-Zh opened this issue 4 days ago
[Feature] TORCHINDUCTOR_CACHE_DIR does not work?

MichoChan opened this issue 5 days ago
fix typo

zhyncs opened this pull request 8 days ago
[Benchmark] add a benchmark for hf/vllm/sglang rmsnorm

BBuf opened this pull request 8 days ago
hotfix: checking for HIP

zhyncs opened this pull request 8 days ago
Remove cuda graph batch size adjustment for dp attention

ispobock opened this pull request 8 days ago
format: add clang-format for sgl-kernel

zhyncs opened this pull request 8 days ago
[Bug] Accuracy is abnormal when EP MoE is enabled

ispobock opened this issue 9 days ago
sgl-kernel adapt tensorrt llm custom allreduce

yizhang2077 opened this pull request 9 days ago
Fix correctness issue for triton decoding kernel

ispobock opened this pull request 9 days ago
[Experimental] Add a gRPC server for completion request

MrAta opened this pull request 9 days ago
How to debug sglang using pdb?

sleepwalker2017 opened this issue 10 days ago
Small fixes for torchao quant

jerryzh168 opened this pull request 10 days ago
[FIX] Update EOS from config

zhengy001 opened this pull request 10 days ago
[Minor] Fix grok model loader

merrymercy opened this pull request 10 days ago
[Feature] Integrate CUTLASS FP8 GEMM into sgl-kernel

zhyncs opened this issue 10 days ago
[Feature] FusedMoE H200 tuning

zhyncs opened this issue 10 days ago
feat: support dev image

zhyncs opened this pull request 10 days ago
"GET / HTTP/1.1" 404 Not Found

LordEdison opened this issue 10 days ago
benchmark decoding attention kernel with cudnn

bjmsong opened this pull request 11 days ago
fix: set runtime path

zhyncs opened this pull request 11 days ago
Rename rust folder to sgl-router

MrAta opened this pull request 11 days ago
minor: update pypi tag

zhyncs opened this pull request 11 days ago
chore: bump v0.0.2 for sgl-kernel

zhyncs opened this pull request 11 days ago
[Feature] Do we have any plan for supporting MiniCPM-V 2.6?

Xeladoes opened this issue 11 days ago
[Bug] CUDA Graph Build Failure

dangxingyu opened this issue 11 days ago
Bump sglang-router to 0.1.1

MrAta opened this pull request 11 days ago
[Feature] MoE Expert Parallel with awq

Xu-Chen opened this issue 11 days ago
Clean up GPU memory after killing sglang processes

MrAta opened this pull request 11 days ago
Include version info into the router package

MrAta opened this pull request 11 days ago
[router] Release router 0.1.0 with dynamic scaling and fault tolerance

ByronHsu opened this pull request 11 days ago
[router] Update doc for dynamic scaling and fault tolerance

ByronHsu opened this pull request 11 days ago
[router] remove main.rs because only lib.rs is used for py binding

ByronHsu opened this pull request 11 days ago
[router] Add retries based fault tolerance

ByronHsu opened this pull request 11 days ago
[Bug] Gemma 2 GGUF

slivka83 opened this issue 11 days ago
[Feature]: Benchmarking H200

antferdom opened this issue 12 days ago
Fix warmup in bench_offline_throughput.py

merrymercy opened this pull request 12 days ago
Fix model loader for more quantization formats

merrymercy opened this pull request 12 days ago
chore: update ao v0.7.0

zhyncs opened this pull request 12 days ago
It's hard to install it

ToSev7en opened this issue 12 days ago
Make request payload size configurable

MrAta opened this pull request 12 days ago
[Feature] support Llama3.3

win4r opened this issue 12 days ago
[Core] in batch prefix caching by delay scheduling

rkooo567 opened this pull request 12 days ago
[router] Use borrow if possible to save cost

ByronHsu opened this pull request 12 days ago
[router] Refactor: decouple select and send stage

ByronHsu opened this pull request 12 days ago
Add lora_path to chat completion

ccchow opened this pull request 12 days ago
Add support for IBM Granite 3.x models

frreiss opened this pull request 12 days ago
Make torch TP composable with torchao

kwen2501 opened this pull request 12 days ago
fix: compatible with PEP 440

zhyncs opened this pull request 12 days ago
fix: use manylinux2014_x86_64 tag

zhyncs opened this pull request 12 days ago
feat: support sgl-kernel PyPI

zhyncs opened this pull request 12 days ago
[Bug] multiple `sgl.Runtime` instances compete for port 10000

mantle2048 opened this issue 13 days ago
[Feature] Function Call Support

chenweize1998 opened this issue 13 days ago
[Feature] Support General Reward Model

zhaochenyang20 opened this issue 13 days ago
ROCm support for sglang.check_env

hliuca opened this pull request 13 days ago
decoding attention kernel benchmark

bjmsong opened this pull request 13 days ago
Typo fix in router.md

adarshxs opened this pull request 13 days ago
Performance issues when scaling to multiple GPUs

FinnGu opened this issue 14 days ago
[Minor] Improve code style

merrymercy opened this pull request 14 days ago
Add InfiniteBench for long context benchmarking

iankur opened this pull request 14 days ago
[Bug] The first request with "regex" is too slow

sitabulaixizawaluduo opened this issue 14 days ago
[Minor] Improve code style

merrymercy opened this pull request 14 days ago
Migrate llama_classification to use the /classify interface

merrymercy opened this pull request 14 days ago
Add a unittest for fused_moe

BBuf opened this pull request 14 days ago
[Router] fix interrupt from terminal

ByronHsu opened this pull request 14 days ago
[feat] Enable chunked prefill for llava-onevision

Ying1123 opened this pull request 14 days ago
[router] Improve cleanup logic

ByronHsu opened this pull request 14 days ago
reduce watchdog interval to 5s

ByronHsu opened this pull request 14 days ago
minor: add random flashinfer vs triton use case

zhyncs opened this pull request 14 days ago
minor: add random use case

zhyncs opened this pull request 14 days ago
feat: support custom task runner

zhyncs opened this pull request 14 days ago
minor: update correct measurement unit

zhyncs opened this pull request 15 days ago
Fix recv_requests

merrymercy opened this pull request 15 days ago
fix: specify dtype with begin_forward aka plan

zhyncs opened this pull request 15 days ago
Fix a bug with logprob streaming + chunked prefill

merrymercy opened this pull request 15 days ago
[Feature] add kernel level benchmark

zhyncs opened this issue 15 days ago
Remove unused vars in the triton backend

ispobock opened this pull request 15 days ago
[Feature] support llm_bench

zhyncs opened this issue 15 days ago
[Feature] support constrained decoding benchmark

zhyncs opened this issue 15 days ago
Simplify stream_output

merrymercy opened this pull request 15 days ago
Update killall_sglang.sh

merrymercy opened this pull request 15 days ago
[WIP] Add sampler logit processor

hongpeng-guo opened this pull request 15 days ago
Optimize Triton decoding kernel for long context

ispobock opened this pull request 15 days ago
[router] add health checking in router init

ByronHsu opened this pull request 15 days ago
[router] Health check on worker before added to the router

ByronHsu opened this pull request 15 days ago
minor: update killall script

zhyncs opened this pull request 16 days ago
fix: update xgrammar v0.1.6

zhyncs opened this pull request 16 days ago
[Feature] SGLang Router design discussion

zhyncs opened this issue 16 days ago
Fp8 MoE optimizations on AMD

HaiShaw opened this pull request 16 days ago