Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
https://github.com/sgl-project/sglang
Release v0.3.3
merrymercy opened this pull request 3 months ago
[Profile] Add pytorch profiler
Ying1123 opened this pull request 3 months ago
Remove references to squeezellm
janimo opened this pull request 3 months ago
[WIP] Support NVLM-D
amosyou opened this pull request 3 months ago
Update README.md
kushal34712 opened this pull request 3 months ago
Returning a per request metric for number of cached_tokens read
havetc opened this pull request 3 months ago
Optimize broadcast & Reorg code
merrymercy opened this pull request 3 months ago
Fix the port_args in bench_latency
merrymercy opened this pull request 3 months ago
Use is_flashinfer_available to replace is_hip for flashinfer check
merrymercy opened this pull request 3 months ago
Use `atexit` hook to implicitly shutdown `Runtime`
ByronHsu opened this pull request 3 months ago
Fix chunked prefill condition
ispobock opened this pull request 3 months ago
[Fix] Fix the case where prompt_len = 0
merrymercy opened this pull request 3 months ago
Fix modality for image inputs
merrymercy opened this pull request 3 months ago
Update README.md
merrymercy opened this pull request 3 months ago
Test consistency for single and batch seperately
ByronHsu opened this pull request 3 months ago
[Minor, Performance] Use torch.argmax for greedy sampling
Ying1123 opened this pull request 3 months ago
fix(docs): Improve grammar and readability in README
amantyagiprojects opened this pull request 3 months ago
[LoRA, Performance] Speedup multi-LoRA serving - Step 1
Ying1123 opened this pull request 3 months ago
Clean up event loop
merrymercy opened this pull request 3 months ago
[Bug] Fix decode stats error on output_len 1
HaiShaw opened this pull request 3 months ago
[Minor] Improve the style and fix flaky tests
merrymercy opened this pull request 3 months ago
Fix styling
ByronHsu opened this pull request 3 months ago
Fix runtime.generate when sampling param is not passed
ByronHsu opened this pull request 3 months ago
default sampling param should be deepcopied
ByronHsu opened this pull request 3 months ago
chore: update README.md
eltociear opened this pull request 3 months ago
[Bug] Fix the Image Input of Batch Generation
OBJECT907 opened this pull request 3 months ago
Update io_struct.py
OBJECT907 opened this pull request 3 months ago
[Easy] use .text() instead of .text
ByronHsu opened this pull request 3 months ago
Backend method not found when SRT Runtime is used
ByronHsu opened this pull request 3 months ago
[Bug] Inconsistent results when executing independent sglang functions in different orders
ByronHsu opened this issue 3 months ago
Refine the add request reasons to avoid corner cases.
hnyls2002 opened this pull request 3 months ago
Support min_tokens in sgl.gen
ByronHsu opened this pull request 3 months ago
[Event] Update README.md
Ying1123 opened this pull request 3 months ago
[Bug] `Meta-Llama-3.1-8B-Instruct` triggers "Detected errors during sampling! NaN in the probability." under high concurrency/RPS.
tongyx361 opened this issue 3 months ago
[LoRA, Performance] Speedup multi-LoRA serving - Step 1
Ying1123 opened this pull request 3 months ago
[Minifix] Remove extra space in cot example
FredericOdermatt opened this pull request 3 months ago
Make input_ids a torch.Tensor
merrymercy opened this pull request 3 months ago
Provide an offline engine API
ByronHsu opened this pull request 3 months ago
Use ipc instead of tcp in zmq
merrymercy opened this pull request 3 months ago
[doc] Chinese Documentation Translation Available for sglang
khum08 opened this issue 3 months ago
[Feature] Add `choices` in `/generate` endpoint and add `min_new_tokens` in `sgl.gen()`
TING2938 opened this issue 3 months ago
[Fix] Fix major performance bug in certain cases
Ying1123 opened this pull request 3 months ago
Organize sampling batch info better
merrymercy opened this pull request 3 months ago
Add llama implementation with no tensor parallel linears
jerryzh168 opened this pull request 3 months ago
Print out what the model saw?
cinjon opened this issue 3 months ago
[FP8 KV Cache] Avoid KeyError at loading pre-quantized FP8 model with kv_scale
HaiShaw opened this pull request 3 months ago
[Bug] Exception: Capture cuda graph failed: Triton Error [CUDA]: device kernel image is invalid
a136214808 opened this issue 3 months ago
Move status check in the memory pool to CPU
merrymercy opened this pull request 3 months ago
[Fix] Move ScheduleBatch out of SamplingInfo
merrymercy opened this pull request 3 months ago
[Fix] do not maintain regex_fsm in SamplingBatchInfo
merrymercy opened this pull request 3 months ago
[Performance, Hardware] MoE tuning on AMD MI300x GPUs
kkHuang-amd opened this pull request 3 months ago
[Fix] Fix all the Huggingface paths
tbarton16 opened this pull request 3 months ago
Simplify flashinfer dispatch
hnyls2002 opened this pull request 3 months ago
Llama3.2 vision model support
hnyls2002 opened this pull request 3 months ago
Dispatch flashinfer wrappers
hnyls2002 opened this pull request 3 months ago
[Refactor] Simplify io_struct and tokenizer_manager
Ying1123 opened this pull request 3 months ago
Fix bugs of `logprobs_nums`
hnyls2002 opened this pull request 3 months ago
Organize Attention Backends
hnyls2002 opened this pull request 3 months ago
Support qwen2 vl model
yizhang2077 opened this pull request 3 months ago
[Fix, LoRA] fix LoRA with updates in main
Ying1123 opened this pull request 3 months ago
Clean up batch data structures: Introducing ModelWorkerBatch
merrymercy opened this pull request 3 months ago
Rename InputMetadata -> ForwardBatch
merrymercy opened this pull request 3 months ago
Add support for Molmo-D-7B Model
BabyChouSr opened this pull request 3 months ago
Let ModelRunner take InputMetadata as input, instead of ScheduleBatch
merrymercy opened this pull request 3 months ago
[Refactor] Simplify io_struct and tokenizer_manager
Ying1123 opened this pull request 3 months ago
Process image in parallel
hnyls2002 opened this pull request 3 months ago
Move scheduler code from tp_worker.py to scheduler.py
merrymercy opened this pull request 3 months ago
fix ipv6 url when warm up model
cauyxy opened this pull request 3 months ago
[Fix] Fix AttributeError in Qwen2.5 LoRA: 'Qwen2ForCausalLM' object has no attribute 'get_hidden_dim'
mssongit opened this pull request 3 months ago
[Fix] Fix AttributeError in Qwen2.5(huggingface model) LoRA: 'Qwen2ForCausalLM' object has no attribute 'get_module_name'
mssongit opened this pull request 3 months ago
Improve process creation
merrymercy opened this pull request 3 months ago
[Bug] ValueError: The memory capacity is unbalanced
chuangzhidan opened this issue 3 months ago
Make detokenizer_manager.py not asyncio
merrymercy opened this pull request 3 months ago
Organize image inputs
hnyls2002 opened this pull request 3 months ago
Multiple minor fixes
merrymercy opened this pull request 3 months ago
[Event] Update meeting link
Ying1123 opened this pull request 3 months ago
Add float8 dynamic quant to torchao_utils
jerryzh168 opened this pull request 3 months ago
[Feature] VLLM 6.0 support
arunpatala opened this issue 3 months ago
[Bug] IndexError: list index out of range
lvxianfeng-git opened this issue 3 months ago
[Feature] Support reward model LxzGordon/URM-LLaMa-3.1-8B
Ying1123 opened this pull request 3 months ago
minor: fix config
hnyls2002 opened this pull request 4 months ago
[Feature] add support for llama 3.2
Stealthwriter opened this issue 4 months ago
[Bug] Unable to use gptq or awq with torch.compile (8*A40)
smallstepman opened this issue 4 months ago
[FIX] Catch syntax error of Regex Guide to avoid crash
du00cs opened this pull request 4 months ago
[bugfix]Add modelscope package to avoid docker image without modelscope
KylinMountain opened this pull request 4 months ago
Accuracy reduction of Lora
yileld opened this issue 4 months ago
Update Dockerfile
KylinMountain opened this pull request 4 months ago
[Bug] no module modelscope using docker compose to start sglang
KylinMountain opened this issue 4 months ago
How to study the code?
TJ949 opened this issue 4 months ago
[Feature] _get_pixel_values needs to return tgt_sizes
huangzl18883 opened this issue 4 months ago
[Fix] Ignore model import error
merrymercy opened this pull request 4 months ago
Release v0.3.2
Ying1123 opened this pull request 4 months ago
Revert "kernel: use tensor cores for flashinfer gqa kernels"
Ying1123 opened this pull request 4 months ago
[Fix] Fix clean_up_tokenization_spaces in tokenizer
merrymercy opened this pull request 4 months ago
[Bug] tensor parallel run error
jerryzh168 opened this issue 4 months ago
Add support for tie_word_embeddings when loading weights + support for SmolLM
TianyiQ opened this pull request 4 months ago
[CI] Update nightly eval
Ying1123 opened this pull request 4 months ago
[Bug] LLaVa-next does not work for single image processing
ThomasBenzshawel opened this issue 4 months ago
AWQ performance tracking
zhyncs opened this issue 4 months ago
Possible timing side-channels caused by shared prefix
Unik-lif opened this issue 4 months ago