Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
https://github.com/sgl-project/sglang
Dispatch flashinfer wrappers
hnyls2002 opened this pull request 4 months ago
[Refactor] Simplify io_struct and tokenizer_manager
Ying1123 opened this pull request 4 months ago
Fix bugs of `logprobs_nums`
hnyls2002 opened this pull request 4 months ago
Organize Attention Backends
hnyls2002 opened this pull request 4 months ago
Support qwen2 vl model
yizhang2077 opened this pull request 4 months ago
[Fix, LoRA] fix LoRA with updates in main
Ying1123 opened this pull request 4 months ago
Clean up batch data structures: Introducing ModelWorkerBatch
merrymercy opened this pull request 4 months ago
Rename InputMetadata -> ForwardBatch
merrymercy opened this pull request 4 months ago
Add support for Molmo-D-7B Model
BabyChouSr opened this pull request 4 months ago
Let ModelRunner take InputMetadata as input, instead of ScheduleBatch
merrymercy opened this pull request 4 months ago
[Refactor] Simplify io_struct and tokenizer_manager
Ying1123 opened this pull request 4 months ago
Process image in parallel
hnyls2002 opened this pull request 4 months ago
Move scheduler code from tp_worker.py to scheduler.py
merrymercy opened this pull request 4 months ago
fix ipv6 url when warm up model
cauyxy opened this pull request 4 months ago
[Fix] Fix AttributeError in Qwen2.5 LoRA: 'Qwen2ForCausalLM' object has no attribute 'get_hidden_dim'
mssongit opened this pull request 4 months ago
[Fix] Fix AttributeError in Qwen2.5(huggingface model) LoRA: 'Qwen2ForCausalLM' object has no attribute 'get_module_name'
mssongit opened this pull request 4 months ago
Improve process creation
merrymercy opened this pull request 4 months ago
[Bug] ValueError: The memory capacity is unbalanced
chuangzhidan opened this issue 4 months ago
Make detokenizer_manager.py not asyncio
merrymercy opened this pull request 4 months ago
Organize image inputs
hnyls2002 opened this pull request 4 months ago
Multiple minor fixes
merrymercy opened this pull request 4 months ago
[Event] Update meeting link
Ying1123 opened this pull request 4 months ago
Add float8 dynamic quant to torchao_utils
jerryzh168 opened this pull request 4 months ago
[Feature] VLLM 6.0 support
arunpatala opened this issue 4 months ago
[Bug] IndexError: list index out of range
lvxianfeng-git opened this issue 4 months ago
[Feature] Support reward model LxzGordon/URM-LLaMa-3.1-8B
Ying1123 opened this pull request 4 months ago
minor: fix config
hnyls2002 opened this pull request 4 months ago
[Feature] add support for llama 3.2
Stealthwriter opened this issue 4 months ago
[Bug] Unable to use gptq or awq with torch.compile (8*A40)
smallstepman opened this issue 4 months ago
[FIX] Catch syntax error of Regex Guide to avoid crash
du00cs opened this pull request 4 months ago
[bugfix]Add modelscope package to avoid docker image without modelscope
KylinMountain opened this pull request 4 months ago
Accuracy reduction of Lora
yileld opened this issue 4 months ago
Update Dockerfile
KylinMountain opened this pull request 4 months ago
[Bug] no module modelscope using docker compose to start sglang
KylinMountain opened this issue 4 months ago
How to study the code?
TJ949 opened this issue 4 months ago
[Feature] _get_pixel_values needs to return tgt_sizes
huangzl18883 opened this issue 4 months ago
[Fix] Ignore model import error
merrymercy opened this pull request 4 months ago
Release v0.3.2
Ying1123 opened this pull request 4 months ago
Revert "kernel: use tensor cores for flashinfer gqa kernels"
Ying1123 opened this pull request 4 months ago
[Fix] Fix clean_up_tokenization_spaces in tokenizer
merrymercy opened this pull request 4 months ago
[Bug] tensor parallel run error
jerryzh168 opened this issue 4 months ago
Add support for tie_word_embeddings when loading weights + support for SmolLM
TianyiQ opened this pull request 4 months ago
[CI] Update nightly eval
Ying1123 opened this pull request 4 months ago
[Bug] LLaVa-next does not work for single image processing
ThomasBenzshawel opened this issue 4 months ago
AWQ performance tracking
zhyncs opened this issue 4 months ago
Possible timing side-channels caused by shared prefix
Unik-lif opened this issue 4 months ago
Simplify bench_latency.py
merrymercy opened this pull request 4 months ago
Update test_srt_backend.py
merrymercy opened this pull request 4 months ago
[Bug] radixcache stack_overflow
luzengxiangcn opened this issue 4 months ago
[CI] Move AMD test to a separate file
merrymercy opened this pull request 4 months ago
debug radixcache stack_overflow
luzengxiangcn opened this pull request 4 months ago
Speculative decoding with EAGLE2
yukavio opened this pull request 4 months ago
MoE torch compile
ispobock opened this pull request 4 months ago
Fix the overhead due to penalizer in bench_latency
merrymercy opened this pull request 4 months ago
Fix RuntimeEndpoint.select method
jeffrey-fong opened this pull request 4 months ago
minor: add mla fp8 test
zhyncs opened this pull request 4 months ago
[Community] Add open collective sponsor link to README
Ying1123 opened this pull request 4 months ago
Update dockerfile to include datamodel_code_generator
merrymercy opened this pull request 4 months ago
Add AMD tests to CI
Ying1123 opened this pull request 4 months ago
[API, Feature] Support response prefill for openai API
Ying1123 opened this pull request 4 months ago
Add a unit test for data parallelism
merrymercy opened this pull request 4 months ago
Better unit tests for adding a new model
merrymercy opened this pull request 4 months ago
Development Roadmap (2024 Q4)
Ying1123 opened this issue 4 months ago
doc: update backend
zhyncs opened this pull request 4 months ago
[Bug] tp-4 start timeout
siddhatiwari opened this issue 4 months ago
Add MLA gsm8k eval
ispobock opened this pull request 4 months ago
chore: bump v0.3.1.post3
zhyncs opened this pull request 4 months ago
Fix triton head num
ispobock opened this pull request 4 months ago
fix incorrect links in documentation
rchen19 opened this pull request 4 months ago
[Feature, Hardware] Enable SGLang on XPU GPUs via PyTorch
liangan1 opened this pull request 4 months ago
[Bug] Deepseek-V2.5 capture cuda graph failed
halexan opened this issue 4 months ago
[Bug] The sglang cannot reach the preset concurrency level.
rangehow opened this issue 4 months ago
Add OLMoE
Muennighoff opened this pull request 4 months ago
minor: add quant eval compared with base
zhyncs opened this pull request 4 months ago
[Bug] The engine hangs after requesting health_generate 190 times.
unix1986 opened this issue 4 months ago
Fix env vars in bench_latency
merrymercy opened this pull request 4 months ago
[Performance] Add triton kernels for LoRA
Ying1123 opened this pull request 4 months ago
Release v0.3.1.post2
merrymercy opened this pull request 4 months ago
Fix padding in the cuda graph
merrymercy opened this pull request 4 months ago
[Bug] illegal memory access encountered
wonderisland opened this issue 4 months ago
[Bug] enable-mixed-chunk may cause the regex request get wrong result and output_token_logprobs
liuteng opened this issue 4 months ago
Debug schedule optimization
hnyls2002 opened this pull request 4 months ago
fix: creat new dict everytime for putting new frame
Luodian opened this pull request 4 months ago
[Bug] oom,torch.OutOfMemoryError: seems to only use one gpu on A800-80G,available 40g on each card
chuangzhidan opened this issue 4 months ago
[WIP] Prometheus Metrics
binarycrayon opened this pull request 4 months ago
[Question]Why is the default value of max_prefill_tokens 16384?
wjj19950828 opened this issue 4 months ago
Support double sparsity
andy-yang-1 opened this pull request 4 months ago
[Event] Add public meeting invite to README
Ying1123 opened this pull request 4 months ago
Fuse top_k and top_k in the sampler
merrymercy opened this pull request 4 months ago
Pr fix max workers
wellhowtosay opened this pull request 4 months ago
[Bug] OOM when runing `bench_serving` with DeepSeekCoder-V2-Lite.
zh-zheng opened this issue 4 months ago
Fix oom issues with fp8 for llama
merrymercy opened this pull request 4 months ago
[Bugfix] Enable SGLang on AMD GPUs via PyTorch for ROCm (#1419)
HaiShaw opened this pull request 4 months ago
Add bench_server_latency.py
merrymercy opened this pull request 4 months ago
Fix schedule bug
hnyls2002 opened this pull request 4 months ago
fix schedule bug
hnyls2002 opened this pull request 4 months ago
Fixed n>1 causing list index out of range with VLM
jasonyux opened this pull request 4 months ago
Fix attention backend
ispobock opened this pull request 4 months ago
Enable MLA by default
ispobock opened this pull request 4 months ago
[Bug] Performance issue on MoE with torch.compile
ispobock opened this issue 4 months ago