Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
https://github.com/sgl-project/sglang
Add Support for XVERSE Models (Dense and MoE) to sglang
hxer7963 opened this pull request 4 months ago
[Feature] support awq of deepseek-v2 or deepseek-v2.5
tutu329 opened this issue 4 months ago
[Feature] need DeepSeek-v2 or deepseek-v2.5 awq support
tutu329 opened this issue 4 months ago
Remove synchronization in cuda graph replay
hnyls2002 opened this pull request 4 months ago
Add no commit to main rule
hnyls2002 opened this pull request 4 months ago
Optimize conflicts between CUDA graph and vocab mask tensors
hnyls2002 opened this pull request 4 months ago
[Bug] 'LlamaTokenizerFast' object has no attribute 'tokenizer'
zwc163 opened this issue 4 months ago
Improve error reporting during server launch
merrymercy opened this pull request 4 months ago
[Fix] Fix --disable-flashinfer
merrymercy opened this pull request 4 months ago
[Feature] Support torch profiler
danielhua23 opened this issue 4 months ago
[Feature] Can centos7 use this project?
luo647 opened this issue 4 months ago
[Bug] requests.exceptions.JSONDecodeError:
eyuansu62 opened this issue 4 months ago
remove assertion in triton attention and add an unit test
ByronHsu opened this pull request 4 months ago
[Feature] Support RM API
UbeCc opened this issue 4 months ago
Rewrite mixed chunked prefill
hnyls2002 opened this pull request 4 months ago
[Bug] too many processes
wellhowtosay opened this issue 4 months ago
Refactor attention backend
merrymercy opened this pull request 4 months ago
Deprecate --disable-flashinfer and introduce --attention-backend
merrymercy opened this pull request 4 months ago
[Minor] move triton attention kernels into a separate folder
merrymercy opened this pull request 4 months ago
Organize flashinfer indices update
hnyls2002 opened this pull request 4 months ago
[Do not merge] Test torchao
jerryzh168 opened this pull request 4 months ago
Fix vocab mask update bug
hnyls2002 opened this pull request 4 months ago
[Minor] improve kill scripts and torchao import
merrymercy opened this pull request 4 months ago
[Feature] 4-bit quantized prefix cache
josephrocca opened this issue 4 months ago
Fix CORS compatibility with OpenAI, vLLM, TGI, LMDeploy
josephrocca opened this pull request 4 months ago
deepseek-v2 torch.compile error
cdj0311 opened this issue 4 months ago
Support MiniCPM3
Achazwl opened this pull request 4 months ago
fix bug of `undefined is_single` in meth `create_abort_task`
wcsjtu opened this pull request 4 months ago
deepseek-v2 enable-mla 4x slower
cdj0311 opened this issue 4 months ago
[Docs] Improve documentations
merrymercy opened this pull request 4 months ago
BaiChuan2 Model
blacker521 opened this pull request 4 months ago
SGLang Discussion WeChat Group
qingkelab opened this issue 4 months ago
[Bug] Unable to see logprobs for prompt/input
dmakhervaks opened this issue 4 months ago
[Bug] Mixed chunked prefill is not compatible with vocab tensor mask
hnyls2002 opened this issue 4 months ago
Support OpenAI API json_schema response format
zifeitong opened this pull request 4 months ago
[Bug] sgLang v0.3 breaks TP8 Llama 3.1 405B FP8 on 8xH100
jischein opened this issue 4 months ago
[CI] Return output logprobs in unit test
Ying1123 opened this pull request 4 months ago
Unify forward mode
hnyls2002 opened this pull request 4 months ago
[Feature] Follow up on non power of 2 triton kernel
ByronHsu opened this issue 4 months ago
[Bug] it seems memory leak in sglang when longtime serving
CSEEduanyu opened this issue 4 months ago
[Minor] Many cleanup
merrymercy opened this pull request 4 months ago
[Feature] support LLaVA-NeXT-Video-32B-Qwen
HarperGG opened this issue 4 months ago
[Feature] smooth quant or other quant method
MichoChan opened this issue 4 months ago
[Feature] support qwen2 vl
zhyncs opened this issue 4 months ago
[Feature] KV Cache Quantization
ghost opened this issue 4 months ago
[Feature] DRY repetition penalty
vnkc1 opened this issue 4 months ago
[Bug] `served_model_name` argument in the server_arg.py is not checked
zhaochenyang20 opened this issue 4 months ago
[Feature] KV Cache Compression
ghost opened this issue 4 months ago
[Feat] Add modalities for vision server when handling pixel values for llava
kcz358 opened this pull request 4 months ago
Fix some online scheduling delay
hnyls2002 opened this pull request 4 months ago
[Bug] it didn't work when using tp on RTX 3090
milktea888 opened this issue 4 months ago
jinja2.exceptions.TemplateError: System role not supported
sdecoder opened this issue 4 months ago
Add torchao quant (int4/int8/fp8) to llama models
jerryzh168 opened this pull request 4 months ago
docs: add conclusion
zhyncs opened this pull request 4 months ago
Optimize schedule
hnyls2002 opened this pull request 4 months ago
[Bug] Multi machine, multi card, slow speed
guleng opened this issue 4 months ago
docs: highlight ttft itl and throughput
zhyncs opened this pull request 4 months ago
docs: update README
zhyncs opened this pull request 4 months ago
[Feature] Per-request random seed
laoconeth opened this issue 4 months ago
[Bug] ConnectionResetError: [Errno 104] Connection reset by peer
oliver-li opened this issue 4 months ago
[Bug] Unsupported architectures: ChatGLMForConditionalGeneration.
maxin9966 opened this issue 4 months ago
[Bug] Using 8 H20 GPUs, the deepseek-coder-v2-fp8 starts up normally, but there is no response to client requests.
fengyang95 opened this issue 4 months ago
Remove useless fields in global_config.py
merrymercy opened this pull request 4 months ago
docs: update news
zhyncs opened this pull request 4 months ago
Fix the flaky test test_moe_eval_accuracy_large.py
merrymercy opened this pull request 4 months ago
[Bug] T4 Crash
Abdulhanan535 opened this issue 4 months ago
[Bug] RuntimeError in ModelTpServer
Lzhang-hub opened this issue 4 months ago
[Feature] support smooth-quant?
Lzhang-hub opened this issue 4 months ago
[Bug] Facing Error When starting.
Abdulhanan535 opened this issue 4 months ago
chore: bump v0.3.0
zhyncs opened this pull request 4 months ago
misc: speedup load safetensors
zhyncs opened this pull request 4 months ago
Fix select by ensuring each request has at least one token
merrymercy opened this pull request 4 months ago
Fix llama2 weight loader
merrymercy opened this pull request 4 months ago
[Bug] Unable to fix model output
cherishhh opened this issue 4 months ago
The CPU is also occupied at 100% when there are no requests.
luhairong11 opened this issue 4 months ago
Update README.md for llava-onevision instructions
merrymercy opened this pull request 4 months ago
[Bug] gen with regex: Token fusion between input and output, try to avoid this by removing the space at the end of the input.
alanxmay opened this issue 4 months ago
Removed unused methods
janimo opened this pull request 4 months ago
[Bug] Update to 0.2.15 and torch compile leads to error
zhaochenyang20 opened this issue 4 months ago
Adding document for backend
zhaochenyang20 opened this pull request 4 months ago
[Fix] Reduce memory usage for loading llava model & Remove EntryClassRemapping
merrymercy opened this pull request 4 months ago
[Feature] Initial support for multi-LoRA serving
Ying1123 opened this pull request 4 months ago
Fix bugs in sampler with CUDA graph / torch.compile
hnyls2002 opened this pull request 4 months ago
feat: update linear deps 1/N
zhyncs opened this pull request 4 months ago
feat: update nightly gsm8k eval
zhyncs opened this pull request 4 months ago
Do you support frontend-language inference for Llava-OneVision ?
ehayeshaiper opened this issue 4 months ago
[Bug] A100 PCIE torch compile error
zhyncs opened this issue 4 months ago
Adding Documentation for installation
zhaochenyang20 opened this pull request 4 months ago
Support Phi3 mini and medium
janimo opened this pull request 4 months ago
[server] Passing `model_override_args` to `launch_server` via the CLI.
kevin85421 opened this pull request 4 months ago
Fix hang when doing s += None.
max99x opened this pull request 4 months ago
Fix regex mask
hnyls2002 opened this pull request 4 months ago
Release v0.2.15
merrymercy opened this pull request 4 months ago
[doc] Fix more broken links
ByronHsu opened this pull request 4 months ago
Fix the flaky tests in test_moe_eval_accuracy_large.py
merrymercy opened this pull request 4 months ago
[Feature] Correctness test for Triton kernels
ByronHsu opened this issue 4 months ago
ci: add nightly eval
zhyncs opened this pull request 4 months ago
fix: resolve fp8 for mixtral
zhyncs opened this pull request 4 months ago
[CI] merge all ci tests into one file
merrymercy opened this pull request 4 months ago