Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
https://github.com/sgl-project/sglang
chore: bump v0.0.2.post17 for sgl-kernel
zhyncs opened this pull request about 22 hours ago
mirror fix for custom allreduce
yizhang2077 opened this pull request 1 day ago
update 3rdparty for sgl-kernel
zhyncs opened this pull request 1 day ago
fix: Fix deprecated max_tokens param in openai ChatCompletionRequest
mickqian opened this pull request 1 day ago
support fp32 in sampling_scaling_penalties kernel
BBuf opened this pull request 1 day ago
Add EngineFragment
fzyzcjy opened this pull request 1 day ago
[DO NOT MERGE] Bump CI to check
fzyzcjy opened this pull request 1 day ago
Split communication logic from computation logic into orchestrator
fzyzcjy opened this pull request 1 day ago
Let DetokenizerManager use TypeBasedDispatcher
fzyzcjy opened this pull request 1 day ago
Rename TokenizerManager to StdOrchestrator
fzyzcjy opened this pull request 1 day ago
Extract generation_manager from tokenizer_manager
fzyzcjy opened this pull request 1 day ago
First draft on GSM8K benchmark.
simveit opened this pull request 1 day ago
enable kv_scale for Gemma2
hliuca opened this pull request 1 day ago
[Bug] Service crashed with 4 H100s and QPS=25
yh-yao opened this issue 1 day ago
Set USE_VLLM_CUSTOM_ALLREDUCE to empty string
tot0 opened this pull request 1 day ago
Add step to update sgl-kernel whl index
ispobock opened this pull request 1 day ago
Add workflow for sgl-kernel cu118 release
ispobock opened this pull request 2 days ago
[Bug] Crash special token xgrammar
maximegmd opened this issue 2 days ago
minor: update sgl-kernel setup
zhyncs opened this pull request 2 days ago
[Bug] Qwen2-VL-7B with sglang has significant numerical calculation errors compared to HF Transformers
kritohyh opened this issue 2 days ago
minor: sync flashinfer and add turbomind as 3rdparty
zhyncs opened this pull request 2 days ago
[Bug] constrained decoding performance is worse when qps>2
qibaoyuan opened this issue 2 days ago
Batch inference over multiple nodes
boyang-nlp opened this issue 2 days ago
Question About Model Integration and Parameter Updates (update_weight) in Sglang
davidlvxin opened this issue 2 days ago
[Bug] The batch decoding speed of DeepSeek V3 is too slow.
SonChoulJun opened this issue 2 days ago
[Bug] Multi-node BUG
sitabulaixizawaluduo opened this issue 2 days ago
[Bug] Qwen2-VL Online Serving Issue
ywang96 opened this issue 2 days ago
Fix cu118 group gemm compile issue
ispobock opened this pull request 2 days ago
[Docs] minor update for phi-3 and phi-4
adarshxs opened this pull request 2 days ago
[router] Fix twine uploading
ByronHsu opened this pull request 2 days ago
bump router to 0.1.4
ByronHsu opened this pull request 2 days ago
Add shapes for int8 gemm benchmark
ispobock opened this pull request 2 days ago
[Feature] Support InterVL
zhaochenyang20 opened this issue 2 days ago
create All2All MoE module && place holder for EP group and token disp…
shawlleyw opened this pull request 2 days ago
[Feature] Add support for Phi4
Stealthwriter opened this issue 2 days ago
[Benchmarks] Can't run examples benchmark. Flashinfer error:
dsantiago opened this issue 2 days ago
feat: use sgl-kernel by default
zhyncs opened this pull request 3 days ago
chore: bump sgl-kernel 0.0.2.post16
zhyncs opened this pull request 3 days ago
feat: integrate sampling kernels into sgl-kernel
zhyncs opened this pull request 3 days ago
Add CPU affinity setting to latency benchmark
hubertlu-tw opened this pull request 3 days ago
[hotfix] fix test_sampling_scaling_penalties.py ci test
BBuf opened this pull request 3 days ago
Use flashinfer vec_dtypes in sgl_kernel
BBuf opened this pull request 3 days ago
sync flashinfer and update sgl-kernel tests
zhyncs opened this pull request 3 days ago
use env variable to control the build conf on the CPU build node
zhyncs opened this pull request 3 days ago
update version setup for sgl-kernel
zhyncs opened this pull request 3 days ago
fix build error for sgl-kernel
zhyncs opened this pull request 3 days ago
[Feature] docs: Improve documentation on how to use EAGLE speculative decoding
daviddl9 opened this issue 3 days ago
[Bug] DeepSeek-V3 load weights failed with --enable-ep-moe
MtFitzRoy opened this issue 3 days ago
Remove torch dependency in sgl-kernel
merrymercy opened this pull request 3 days ago
[Feature] Support service discovery on Kubernetes in router
gaocegege opened this issue 3 days ago
Some questions about layernorm in MLA code
hcyz33 opened this issue 3 days ago
use v0.6.4.post1 for sgl-kernel ci
zhyncs opened this pull request 3 days ago
[router] Forward all request headers from router to workers
ByronHsu opened this pull request 3 days ago
docs: update developer guide for sgl-kernel
zhyncs opened this pull request 3 days ago
docs: add developer guide for sgl-kernel
zhyncs opened this pull request 3 days ago
Revert "disable custom allreduce on HIP"
merrymercy opened this pull request 3 days ago
[Feature] Beam Search
laixinn opened this pull request 3 days ago
[Bug]ImportError: undefined symbol: cuModuleGetFunction when using lmsysorg/sglang:v0.4.1.post7-cu124
aooxin opened this issue 3 days ago
Indexing.cu:1255: indexSelectSmallIndex: block: [3,0,0], thread: [33,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
CallmeZhangChenchen opened this issue 3 days ago
[router] make error actionable
ByronHsu opened this pull request 3 days ago
Fix tp token sync for dp attention
merrymercy opened this pull request 3 days ago
Support loading of larger models with on-the-fly quantization
kwen2501 opened this pull request 3 days ago
Add some flags to allow sync token ids across TP ranks
merrymercy opened this pull request 4 days ago
[Bug] Problems with logit_bias.
cinjon opened this issue 4 days ago
disable custom allreduce on HIP
hliuca opened this pull request 4 days ago
add notice about flashinfer in sgl-kernel
zhyncs opened this pull request 4 days ago
feat: integrate bmm_fp8 kernel into sgl-kernel
zhyncs opened this pull request 4 days ago
fix rotary_embedding rope_scaling for phi
sudo-root-ns opened this pull request 4 days ago
minor: update header and use pytest
zhyncs opened this pull request 4 days ago
feat: integrate activation kernels into sgl-kernel
zhyncs opened this pull request 4 days ago
feat: integrate norm kernels into sgl-kernel
zhyncs opened this pull request 4 days ago
sync the upstream updates of flashinfer
zhyncs opened this pull request 4 days ago
[Bug] Decode Throughput Inconsistency Between bench_serving and Engine Logs
leepoly opened this issue 4 days ago
[Help wanted] CAN'T capture GPU activities using `nsight system`
sleepwalker2017 opened this issue 4 days ago
update norm cu
zhyncs opened this pull request 4 days ago
support w8a8 fp8 kernel with CUTLASS
HandH1998 opened this pull request 4 days ago
Fix sgl-kernel compile for sm80
ispobock opened this pull request 4 days ago
Fix the FP8 E4M3 parsing offline scales failure bug
sleepcoo opened this pull request 4 days ago
Modify the kernel test path & add it to the CI process.
sleepcoo opened this pull request 4 days ago
[Feature] Reasoning model API support
lambert0312 opened this issue 4 days ago
[Bug] Qwen2-VL-7B with sglang Performance Degradation
yileld opened this issue 4 days ago
[Feature] batch concurrent requests while streaming responses
moxiegushi opened this issue 4 days ago
Use int64 as indices for set_kv_buffer
merrymercy opened this pull request 4 days ago
[Doc]Update doc of profiling with PyTorch Profiler
Fridge003 opened this pull request 4 days ago
Allow local cutlass directory to be used in sgl-kernel build
trevor-m opened this pull request 4 days ago
fix pr-test-sgl-kernel
zhyncs opened this pull request 5 days ago
Support sm90 Int8 gemm
ispobock opened this pull request 5 days ago
Support int8 kvcache
sleepcoo opened this pull request 5 days ago
feat: add flashinfer as 3rdparty and use rmsnorm as example
zhyncs opened this pull request 5 days ago
[Feature] Support Beam Search
laixinn opened this issue 5 days ago
Can router support --api-key parameter
lambert0312 opened this issue 5 days ago
support lightning_attention_decode in sgl-kernel for MiniMax-Text-01
BBuf opened this pull request 5 days ago
Debug radixcache: refactor recursive helper methods
luzengxiangcn opened this pull request 5 days ago
refactor radix cache
luzengxiangcn opened this pull request 5 days ago
Add accuracy and latency tests of eagle into CI
merrymercy opened this pull request 5 days ago
upgrade torch version for sgl-kernel
zhyncs opened this pull request 5 days ago
minor: update Makefile for sgl-kernel
zhyncs opened this pull request 5 days ago
[Bug] [EAGLE2] CUDA errors occur under high concurrency.
Xu-Chen opened this issue 5 days ago
Minicpmo
mickqian opened this pull request 5 days ago
Fix flaky tests in test_programs.py
merrymercy opened this pull request 5 days ago