Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/sgl-project/sglang

SGLang is a fast serving framework for large language models and vision language models.
https://github.com/sgl-project/sglang

Fix unit tests
merrymercy opened this pull request 3 months ago

Add a watch dog thread
merrymercy opened this pull request 3 months ago

Offline serving final
hnyls2002 opened this pull request 3 months ago

Update hyperparameter_tuning.md
merrymercy opened this pull request 3 months ago

profile of M sizes for Torch native and TE (ignore)
Zhuohao-Li opened this pull request 3 months ago

Improve the user control of new_token_ratio
merrymercy opened this pull request 3 months ago

Add openAI compatible API
zhaochenyang20 opened this pull request 3 months ago

Provide an argument to set the maximum batch size for cuda graph
merrymercy opened this pull request 3 months ago

Fix docs ci
zhaochenyang20 opened this pull request 3 months ago

Simplify our docs with complicated functions into utils
zhaochenyang20 opened this pull request 3 months ago

detach two CI for documentation
zhaochenyang20 opened this pull request 3 months ago

Update links
merrymercy opened this pull request 3 months ago

Update ci workflows
merrymercy opened this pull request 3 months ago

fix int conversion for `SGLANG_CPU_COUNT`
ByronHsu opened this pull request 3 months ago

Allow consecutive ports when launching multiple sglang servers.
hnyls2002 opened this pull request 3 months ago

Set `ZMQ` buffer size heuristic
hnyls2002 opened this pull request 3 months ago

Fix possible ZMQ hanging
hnyls2002 opened this pull request 3 months ago

move max_position_embeddings to the last
hliuca opened this pull request 3 months ago

[Fix] Fix --skip-tokenizer-init
merrymercy opened this pull request 3 months ago

Revert "Fix memory leak when doing chunked prefill"
merrymercy opened this pull request 3 months ago

Release v0.3.4.post2
merrymercy opened this pull request 3 months ago

Fix logprob in the overlapped mode
merrymercy opened this pull request 3 months ago

[Fix] Fix the log parsing in chunked prefill uni tests
merrymercy opened this pull request 3 months ago

Fix log parsing in the chunked prefill unit tests
merrymercy opened this pull request 3 months ago

[Bug] Got error with awq_marlin quantization args.
liangzelang opened this issue 3 months ago

[router] rust-based router
ByronHsu opened this pull request 3 months ago

Fix seq_lens_sum for cuda graph runner in padded cases
merrymercy opened this pull request 3 months ago

[Bug] cutlass group_gemm.initialize failed
senlice opened this issue 3 months ago

Fix memory leak when doing chunked prefill
hnyls2002 opened this pull request 3 months ago

add support for ipynb
zhaochenyang20 opened this pull request 3 months ago

Enhance the test case for chunked prefill and check memory leak
merrymercy opened this pull request 3 months ago

Create deploy-docs.yml
zhaochenyang20 opened this pull request 3 months ago

Re-introduce `get_cuda_graph_seq_len_fill_value`
merrymercy opened this pull request 3 months ago

[Fix] Fix cuda graph padding for triton attention backend
merrymercy opened this pull request 3 months ago

Shortfin Backend
stbaione opened this pull request 3 months ago

Qwen2vl support cuda graph and disable radix cache
yizhang2077 opened this pull request 3 months ago

[Fix] Fix NaN issues by fixing the cuda graph padding values for flashinfer
merrymercy opened this pull request 3 months ago

check user-specified model_max_len with hf derived max_model_len
BBuf opened this pull request 3 months ago

[Bug] Catch any errors caused by parsing json schema
zolinthecow opened this pull request 3 months ago

Fix MockTokenizer in the unit tests
merrymercy opened this pull request 3 months ago

Fix the perf regression due to additional_stop_token_ids
merrymercy opened this pull request 3 months ago

Crash the server on warnings in CI
merrymercy opened this pull request 3 months ago

Fix out of memory message.
hnyls2002 opened this pull request 3 months ago

Fix missing additional_stop_token_ids
merrymercy opened this pull request 3 months ago

Update docs
merrymercy opened this pull request 3 months ago

[Fix] Fix abort in data parallelism
merrymercy opened this pull request 3 months ago

Fix stop condition for <|eom_id|>
merrymercy opened this pull request 3 months ago

Fix perf regression for set_kv_buffer
merrymercy opened this pull request 3 months ago

[Feature] Multi options
QinghanLai opened this issue 3 months ago

[API] add get memory pool size
Ying1123 opened this pull request 3 months ago

[Bug] Unable to run Qwen2-VL with OpenAI server
Quang-elec44 opened this issue 3 months ago

Fuse more ops & Simplify token mapping
merrymercy opened this pull request 3 months ago

Add send request ipynb
zhaochenyang20 opened this pull request 3 months ago

Add Send request.ipynb
zhaochenyang20 opened this pull request 3 months ago

Why StreamingResponse 3s Delay to Abort Requests?
matthew-hippocratic opened this issue 3 months ago

minor: add human eval
zhyncs opened this pull request 3 months ago

[Performance] Support both xgrammar and outlines for constrained decoding
DarkSharpness opened this pull request 3 months ago

Release v0.3.4.post1
merrymercy opened this pull request 3 months ago

Update `max_req_len` and `max_req_input_len`
hnyls2002 opened this pull request 3 months ago

Fix edge case for truncated
ByronHsu opened this pull request 3 months ago

Fix sliding window attention and gemma-2 unit tests in CI
merrymercy opened this pull request 3 months ago

Introducing SGLang Guru on Gurubase.io
kursataktas opened this pull request 3 months ago

[Bug] Issue in latest sglang docker image
shubhamgajbhiye1994 opened this issue 3 months ago

Fix prefill oom
hnyls2002 opened this pull request 3 months ago

Maintain seq_lens_sum to make more FlashInfer operations non-blocking
merrymercy opened this pull request 3 months ago

Make token mapping non-blocking in the overlapped mode
merrymercy opened this pull request 3 months ago

[Bug] Prefill OOM!
yichuan520030910320 opened this issue 3 months ago

Faster overlap mode scheduler
merrymercy opened this pull request 3 months ago

misc: add CODEOWNERS
zhyncs opened this pull request 3 months ago

Add GLM-4 TextGeneration Model support for SGLang
sixsixcoder opened this pull request 3 months ago

Simplify batch result resolution
merrymercy opened this pull request 3 months ago

Simplify the usage of device
merrymercy opened this pull request 3 months ago

Add documentations for Installation
zhaochenyang20 opened this pull request 3 months ago

[Feature] Cache-aware Data Parallel Router
ByronHsu opened this issue 3 months ago

Optimize ZMQ receive operations to reduce idle CPU usage
zyearw1024 opened this pull request 3 months ago

[Bug] 100% CPU Usage When Idle in sglang
zyearw1024 opened this issue 3 months ago

[LoRA, Performance] Add gemm expand triton kernel for multi-LoRA
Ying1123 opened this pull request 3 months ago

[Bugfix] qwen2vl forward_extend
yizhang2077 opened this pull request 3 months ago

Split the overlapped version of TpModelWorkerClient into a separate file
merrymercy opened this pull request 3 months ago

Temporarily skip the test_mixed_batch for QWen2VL
merrymercy opened this pull request 3 months ago

Unify the memory pool api and tp worker API
merrymercy opened this pull request 3 months ago

docs: fix README
zhyncs opened this pull request 3 months ago

Update README.md
Ying1123 opened this pull request 3 months ago

Support qwen2 vl model
zhyncs opened this pull request 3 months ago

Update vllm to 0.6.3 (#1711)
zhyncs opened this pull request 3 months ago

CPU Inference
JocelynPanPan opened this issue 3 months ago

Simplify the interface of tp_worker
merrymercy opened this pull request 3 months ago

Created SECURITY.md
NishantRana07 opened this pull request 3 months ago

Update readme and workflow
merrymercy opened this pull request 3 months ago

[Feature] Cascade attention kernels
merrymercy opened this issue 3 months ago

Release v0.3.4
merrymercy opened this pull request 3 months ago

Update README.md
merrymercy opened this pull request 3 months ago