Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/sgl-project/sglang

SGLang is a fast serving framework for large language models and vision language models.
https://github.com/sgl-project/sglang

Add Support for XVERSE Models (Dense and MoE) to sglang
hxer7963 opened this pull request 4 months ago

[Feature] support awq of deepseek-v2 or deepseek-v2.5
tutu329 opened this issue 4 months ago

[Feature] need DeepSeek-v2 or deepseek-v2.5 awq support
tutu329 opened this issue 4 months ago

Remove synchronization in cuda graph replay
hnyls2002 opened this pull request 4 months ago

Add no commit to main rule
hnyls2002 opened this pull request 4 months ago

Optimize conflicts between CUDA graph and vocab mask tensors
hnyls2002 opened this pull request 4 months ago

Improve error reporting during server launch
merrymercy opened this pull request 4 months ago

[Fix] Fix --disable-flashinfer
merrymercy opened this pull request 4 months ago

[Feature] Support torch profiler
danielhua23 opened this issue 4 months ago

[Feature] Can centos7 use this project?
luo647 opened this issue 4 months ago

[Bug] requests.exceptions.JSONDecodeError:
eyuansu62 opened this issue 4 months ago

remove assertion in triton attention and add an unit test
ByronHsu opened this pull request 4 months ago

[Feature] Support RM API
UbeCc opened this issue 4 months ago

Rewrite mixed chunked prefill
hnyls2002 opened this pull request 4 months ago

[Bug] too many processes
wellhowtosay opened this issue 4 months ago

Refactor attention backend
merrymercy opened this pull request 4 months ago

Deprecate --disable-flashinfer and introduce --attention-backend
merrymercy opened this pull request 4 months ago

[Minor] move triton attention kernels into a separate folder
merrymercy opened this pull request 4 months ago

Organize flashinfer indices update
hnyls2002 opened this pull request 4 months ago

[Do not merge] Test torchao
jerryzh168 opened this pull request 4 months ago

Fix vocab mask update bug
hnyls2002 opened this pull request 4 months ago

[Minor] improve kill scripts and torchao import
merrymercy opened this pull request 4 months ago

[Feature] 4-bit quantized prefix cache
josephrocca opened this issue 4 months ago

Fix CORS compatibility with OpenAI, vLLM, TGI, LMDeploy
josephrocca opened this pull request 4 months ago

deepseek-v2 torch.compile error
cdj0311 opened this issue 4 months ago

Support MiniCPM3
Achazwl opened this pull request 4 months ago

fix bug of `undefined is_single` in meth `create_abort_task`
wcsjtu opened this pull request 4 months ago

deepseek-v2 enable-mla 4x slower
cdj0311 opened this issue 4 months ago

[Docs] Improve documentations
merrymercy opened this pull request 4 months ago

BaiChuan2 Model
blacker521 opened this pull request 4 months ago

SGLang Discussion WeChat Group
qingkelab opened this issue 4 months ago

[Bug] Unable to see logprobs for prompt/input
dmakhervaks opened this issue 4 months ago

Support OpenAI API json_schema response format
zifeitong opened this pull request 4 months ago

[Bug] sgLang v0.3 breaks TP8 Llama 3.1 405B FP8 on 8xH100
jischein opened this issue 4 months ago

[CI] Return output logprobs in unit test
Ying1123 opened this pull request 4 months ago

Unify forward mode
hnyls2002 opened this pull request 4 months ago

[Feature] Follow up on non power of 2 triton kernel
ByronHsu opened this issue 4 months ago

[Bug] it seems memory leak in sglang when longtime serving
CSEEduanyu opened this issue 4 months ago

[Minor] Many cleanup
merrymercy opened this pull request 4 months ago

[Feature] support LLaVA-NeXT-Video-32B-Qwen
HarperGG opened this issue 4 months ago

[Feature] smooth quant or other quant method
MichoChan opened this issue 4 months ago

[Feature] support qwen2 vl
zhyncs opened this issue 4 months ago

[Feature] KV Cache Quantization
ghost opened this issue 4 months ago

[Feature] DRY repetition penalty
vnkc1 opened this issue 4 months ago

[Bug] `served_model_name` argument in the server_arg.py is not checked
zhaochenyang20 opened this issue 4 months ago

[Feature] KV Cache Compression
ghost opened this issue 4 months ago

Fix some online scheduling delay
hnyls2002 opened this pull request 4 months ago

[Bug] it didn't work when using tp on RTX 3090
milktea888 opened this issue 4 months ago

jinja2.exceptions.TemplateError: System role not supported
sdecoder opened this issue 4 months ago

Add torchao quant (int4/int8/fp8) to llama models
jerryzh168 opened this pull request 4 months ago

docs: add conclusion
zhyncs opened this pull request 4 months ago

Optimize schedule
hnyls2002 opened this pull request 4 months ago

[Bug] Multi machine, multi card, slow speed
guleng opened this issue 4 months ago

docs: highlight ttft itl and throughput
zhyncs opened this pull request 4 months ago

docs: update README
zhyncs opened this pull request 4 months ago

[Feature] Per-request random seed
laoconeth opened this issue 4 months ago

[Bug] ConnectionResetError: [Errno 104] Connection reset by peer
oliver-li opened this issue 4 months ago

Remove useless fields in global_config.py
merrymercy opened this pull request 4 months ago

docs: update news
zhyncs opened this pull request 4 months ago

Fix the flaky test test_moe_eval_accuracy_large.py
merrymercy opened this pull request 4 months ago

[Bug] T4 Crash
Abdulhanan535 opened this issue 4 months ago

[Bug] RuntimeError in ModelTpServer
Lzhang-hub opened this issue 4 months ago

[Feature] support smooth-quant?
Lzhang-hub opened this issue 4 months ago

[Bug] Facing Error When starting.
Abdulhanan535 opened this issue 4 months ago

chore: bump v0.3.0
zhyncs opened this pull request 4 months ago

misc: speedup load safetensors
zhyncs opened this pull request 4 months ago

Fix select by ensuring each request has at least one token
merrymercy opened this pull request 4 months ago

Fix llama2 weight loader
merrymercy opened this pull request 4 months ago

[Bug] Unable to fix model output
cherishhh opened this issue 4 months ago

The CPU is also occupied at 100% when there are no requests.
luhairong11 opened this issue 4 months ago

Update README.md for llava-onevision instructions
merrymercy opened this pull request 4 months ago

Removed unused methods
janimo opened this pull request 4 months ago

[Bug] Update to 0.2.15 and torch compile leads to error
zhaochenyang20 opened this issue 4 months ago

Adding document for backend
zhaochenyang20 opened this pull request 4 months ago

[Feature] Initial support for multi-LoRA serving
Ying1123 opened this pull request 4 months ago

Fix bugs in sampler with CUDA graph / torch.compile
hnyls2002 opened this pull request 4 months ago

feat: update linear deps 1/N
zhyncs opened this pull request 4 months ago

feat: update nightly gsm8k eval
zhyncs opened this pull request 4 months ago

Do you support frontend-language inference for Llava-OneVision ?
ehayeshaiper opened this issue 4 months ago

[Bug] A100 PCIE torch compile error
zhyncs opened this issue 4 months ago

Adding Documentation for installation
zhaochenyang20 opened this pull request 4 months ago

Support Phi3 mini and medium
janimo opened this pull request 4 months ago

[server] Passing `model_override_args` to `launch_server` via the CLI.
kevin85421 opened this pull request 4 months ago

Fix hang when doing s += None.
max99x opened this pull request 4 months ago

Fix regex mask
hnyls2002 opened this pull request 4 months ago

Release v0.2.15
merrymercy opened this pull request 4 months ago

[doc] Fix more broken links
ByronHsu opened this pull request 4 months ago

Fix the flaky tests in test_moe_eval_accuracy_large.py
merrymercy opened this pull request 4 months ago

[Feature] Correctness test for Triton kernels
ByronHsu opened this issue 4 months ago

ci: add nightly eval
zhyncs opened this pull request 4 months ago

fix: resolve fp8 for mixtral
zhyncs opened this pull request 4 months ago

[CI] merge all ci tests into one file
merrymercy opened this pull request 4 months ago