Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
https://github.com/sgl-project/sglang
[CI] Split test cases in CI for better load balancing
merrymercy opened this pull request 28 days ago
feat: add should_use_tensor_core
zhyncs opened this pull request 28 days ago
[Feature] Get the real logprobs to analyze decoding
Snowdar opened this issue 28 days ago
[Bug] frequency penalty
vivian0429 opened this issue 28 days ago
Update XGrammar to the latest API
Ubospica opened this pull request 28 days ago
[Fix] Avoid calling fill_vocab_mask for terminated requests
Ubospica opened this pull request 28 days ago
feat: fused_moe fp8 monkey patch
zhyncs opened this pull request 28 days ago
[feat] Refactor session control interface and add CI
Ying1123 opened this pull request 28 days ago
Question about ragged wrapper
ZhongYingMatrix opened this issue 28 days ago
[Performance]: Process affinity to CPU cores with multiple sockets support
HaiShaw opened this pull request 28 days ago
Replace prob based with threshold based load balancing
ByronHsu opened this pull request 28 days ago
Allow overwrite flashinfer use_tensorcore
merrymercy opened this pull request 28 days ago
[Feature] How to accelerate constrained decoding when regex needs to change with input?
GrittyChen opened this issue 28 days ago
[Fused moe] add tuning fused configs for qwen2 57b and mixtral 8x7b
BBuf opened this pull request 28 days ago
[Bug] cannot import name 'CachedGrammarCompiler' from 'xgrammar' (version 0.3.6)
Quang-elec44 opened this issue 28 days ago
test select concurrency
qeternity opened this pull request 29 days ago
Fix docs
merrymercy opened this pull request 29 days ago
Rename triton_fused_moe -> fused_moe_triton
merrymercy opened this pull request 29 days ago
Balance CI tests
merrymercy opened this pull request 29 days ago
fix: use torch.sum for compatible
zhyncs opened this pull request 29 days ago
[Bug] FusedMoE compatible with vllm 0.6.3.post1
zhyncs opened this issue 29 days ago
Update CI threshold & Improve code style
merrymercy opened this pull request 29 days ago
Fix mixed chunked prefill in overlap mode
merrymercy opened this pull request 29 days ago
fix: resolve end-of-file-fixer
zhyncs opened this pull request 29 days ago
feat: update other MoE models deps
zhyncs opened this pull request 29 days ago
feat: update gitignore and add tuning config for FusedMoE
zhyncs opened this pull request 29 days ago
Simplify `Scheduler.update_running_batch`
merrymercy opened this pull request 29 days ago
feat: remove the dependency on FusedMoE
zhyncs opened this pull request 29 days ago
Merged three native APIs into one: get_server_info
henryhmko opened this pull request 29 days ago
[Bug] llava use image hash as token,leading to cache bug
zwc163 opened this issue 29 days ago
Speculative EAGLE2
yukavio opened this pull request 29 days ago
Byhsu/fairness router
ByronHsu opened this pull request 29 days ago
Improve sglang router
ByronHsu opened this pull request 29 days ago
add prefix match for certain tenant
ByronHsu opened this pull request 29 days ago
Add more api routes (completion, health, etc) to the router
ByronHsu opened this pull request 29 days ago
[Draft] Resolving integration differences after XGrammar lauch refactoring
gittb opened this pull request 29 days ago
fix dp_rank env
ByronHsu opened this pull request 29 days ago
update router doc
ByronHsu opened this pull request 29 days ago
Bump sglang-router to 0.0.5
ByronHsu opened this pull request 30 days ago
[Bug] Error when using LLAVA 1.5 for llava bench
pspdada opened this issue 30 days ago
fix: resolve bench_serving args
zhyncs opened this pull request 30 days ago
Fix dp print message
merrymercy opened this pull request 30 days ago
[CI] Fix test cases
merrymercy opened this pull request 30 days ago
Add concurrency option for benchmark
cermeng opened this pull request 30 days ago
Add concurrency option in benchmark
cermeng opened this pull request 30 days ago
Fix grid size in Triton decoding kernel
ispobock opened this pull request 30 days ago
[Bug] Error when launching llava1.5
pspdada opened this issue 30 days ago
deps(flashinfer): fix `is_flashinfer_available()` and make `flashinfer` optional dependency
XuehaiPan opened this pull request about 1 month ago
[Feature] Support LLaMA-3.2 finetuned with Sentence Transformers !
thusinh1969 opened this issue about 1 month ago
Revert "Only stream output on tp rank 0"
merrymercy opened this pull request about 1 month ago
EAGLE2: general part [2]
yukavio opened this pull request about 1 month ago
EAGLE2: Eagle related part [1]
yukavio opened this pull request about 1 month ago
feat(pre-commit): trim unnecessary notebook metadata from git history
XuehaiPan opened this pull request about 1 month ago
fix: add xgrammar dependency
zhyncs opened this pull request about 1 month ago
minor: update gsm8k threshold
zhyncs opened this pull request about 1 month ago
Only stream output on tp rank 0
merrymercy opened this pull request about 1 month ago
add profile in offline benchmark & update doc
bjmsong opened this pull request about 1 month ago
[minor] Clean up unused imports
merrymercy opened this pull request about 1 month ago
Add initial support for intel Gaudi accelerators
ankurneog opened this pull request about 1 month ago
chore: bump v0.3.6
zhyncs opened this pull request about 1 month ago
Online weight update [WIP]
zhaochenyang20 opened this pull request about 1 month ago
Rename sglang.bench_latency to sglang.bench_one_batch
merrymercy opened this pull request about 1 month ago
[Bug] Unable to load GPTQ Mixtral 8x7 v0.1 with SGLang
DhruvaBansal00 opened this issue about 1 month ago
Turn off autotune for scaled mm for fp8 dynamic quant in torchao
jerryzh168 opened this pull request about 1 month ago
[router] add base_gpu_id server args & merged radix tree python reference
ByronHsu opened this pull request about 1 month ago
[router] cache-aware load-balancing router v1
ByronHsu opened this pull request about 1 month ago
[Feature] Inference example code for Qwen2-VL
YuanLiuuuuuu opened this issue about 1 month ago
[Bug] Qwen2-VL-7B with sglang Performance Degradation on MME benchmark
Mr-Loevan opened this issue about 1 month ago
ROCm: Fix MoE padding for none FP8 cases
HaiShaw opened this pull request about 1 month ago
Benchmark with Pytorch Profiler easily
bjmsong opened this pull request about 1 month ago
[Feature] Support for rerank models
dinhanhx opened this issue about 1 month ago
[Feature] Is Yarn supported in sglang?
klykq111 opened this issue about 1 month ago
Error out when torchao-config option is not recognized
jerryzh168 opened this pull request about 1 month ago
Fix #2037 - Context length check does not take into out pad tokens for visual models
jakep-allenai opened this pull request about 1 month ago
Enable overlap scheduler by default for the triton attention backend
merrymercy opened this pull request about 1 month ago
Move test_session_id.py to playground
merrymercy opened this pull request about 1 month ago
Allow skipping warmup in bench_offline_throughput.py
merrymercy opened this pull request about 1 month ago
[Bug] RuntimeError: Failed to allocate memory for batch_prefill_tmp_v with size 435814400 and alignment 16 in AlignedAllocator
yuki252111 opened this issue about 1 month ago
feat: use cascade attention kernel (single level)
james-p-xu opened this pull request about 1 month ago
Update nightly-eval.yml
merrymercy opened this pull request about 1 month ago
[Bug] canot load Gemma2 awq
Foreist opened this issue about 1 month ago
[Bug] big TPOT and ITL when running the offline benchmark
TraceIvan opened this issue about 1 month ago
Use native fp8 format on MI300X
HaiShaw opened this pull request about 1 month ago
minor: add dataset dump and questions shuffle
zhyncs opened this pull request about 1 month ago
Expose max total num tokens from Runtime & Engine API
henryhmko opened this pull request about 1 month ago
minor: update gsm8k eval
zhyncs opened this pull request about 1 month ago
[Bug] disk cache io error when simultaneously loading lots of sglang offline engine
LeeSureman opened this issue about 1 month ago
Use cuda event wait and synchronization instead of busy waiting
merrymercy opened this pull request about 1 month ago
Fix: incorrect top_logprobs in chat completion
ajwaitz opened this pull request about 1 month ago
[Feature, Performance] kv cache performance improvement
HaiShaw opened this issue about 1 month ago
Simplify logits penalizer
merrymercy opened this pull request about 1 month ago
Allow passing extra request body to bench_offline_throughput.py
merrymercy opened this pull request about 1 month ago
[Bug] Qwen-2.5-Math-7B-Instruct and Llama-3.1-8B-Instruct Produce Nonsensical Results
Broyojo opened this issue about 1 month ago
Fix chunked prefill with output logprob
merrymercy opened this pull request about 1 month ago
feat(srt): support prefill and generate with `input_embeds`
XuehaiPan opened this pull request about 1 month ago
Add simple CPU offloading support.
janimo opened this pull request about 1 month ago
[Feature] TorchAO support for Qwen 32B
grahama1970 opened this issue about 1 month ago
Rename layer_idx to layer_id for consistency
janimo opened this pull request about 1 month ago
docs: fix module docstrings and copyright headers
XuehaiPan opened this pull request about 1 month ago
[Performance] why so many bubbles between steps when running llava-one-vision?
sleepwalker2017 opened this issue about 1 month ago