Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/sgl-project/sglang

SGLang is a fast serving framework for large language models and vision language models.
https://github.com/sgl-project/sglang

Release v0.3.3
merrymercy opened this pull request 3 months ago

[Profile] Add pytorch profiler
Ying1123 opened this pull request 3 months ago

Remove references to squeezellm
janimo opened this pull request 3 months ago

[WIP] Support NVLM-D
amosyou opened this pull request 3 months ago

Update README.md
kushal34712 opened this pull request 3 months ago

Returning a per request metric for number of cached_tokens read
havetc opened this pull request 3 months ago

Optimize broadcast & Reorg code
merrymercy opened this pull request 3 months ago

Fix the port_args in bench_latency
merrymercy opened this pull request 3 months ago

Use is_flashinfer_available to replace is_hip for flashinfer check
merrymercy opened this pull request 3 months ago

Use `atexit` hook to implicitly shutdown `Runtime`
ByronHsu opened this pull request 3 months ago

Fix chunked prefill condition
ispobock opened this pull request 3 months ago

[Fix] Fix the case where prompt_len = 0
merrymercy opened this pull request 3 months ago

Fix modality for image inputs
merrymercy opened this pull request 3 months ago

Update README.md
merrymercy opened this pull request 3 months ago

Test consistency for single and batch seperately
ByronHsu opened this pull request 3 months ago

[Minor, Performance] Use torch.argmax for greedy sampling
Ying1123 opened this pull request 3 months ago

fix(docs): Improve grammar and readability in README
amantyagiprojects opened this pull request 3 months ago

[LoRA, Performance] Speedup multi-LoRA serving - Step 1
Ying1123 opened this pull request 3 months ago

Clean up event loop
merrymercy opened this pull request 3 months ago

[Bug] Fix decode stats error on output_len 1
HaiShaw opened this pull request 3 months ago

[Minor] Improve the style and fix flaky tests
merrymercy opened this pull request 3 months ago

Fix styling
ByronHsu opened this pull request 3 months ago

Fix runtime.generate when sampling param is not passed
ByronHsu opened this pull request 3 months ago

default sampling param should be deepcopied
ByronHsu opened this pull request 3 months ago

chore: update README.md
eltociear opened this pull request 3 months ago

[Bug] Fix the Image Input of Batch Generation
OBJECT907 opened this pull request 3 months ago

Update io_struct.py
OBJECT907 opened this pull request 3 months ago

[Easy] use .text() instead of .text
ByronHsu opened this pull request 3 months ago

Backend method not found when SRT Runtime is used
ByronHsu opened this pull request 3 months ago

Refine the add request reasons to avoid corner cases.
hnyls2002 opened this pull request 3 months ago

Support min_tokens in sgl.gen
ByronHsu opened this pull request 3 months ago

[Event] Update README.md
Ying1123 opened this pull request 3 months ago

[LoRA, Performance] Speedup multi-LoRA serving - Step 1
Ying1123 opened this pull request 3 months ago

[Minifix] Remove extra space in cot example
FredericOdermatt opened this pull request 3 months ago

Make input_ids a torch.Tensor
merrymercy opened this pull request 3 months ago

Provide an offline engine API
ByronHsu opened this pull request 3 months ago

Use ipc instead of tcp in zmq
merrymercy opened this pull request 3 months ago

[doc] Chinese Documentation Translation Available for sglang
khum08 opened this issue 3 months ago

[Fix] Fix major performance bug in certain cases
Ying1123 opened this pull request 3 months ago

Organize sampling batch info better
merrymercy opened this pull request 3 months ago

Add llama implementation with no tensor parallel linears
jerryzh168 opened this pull request 3 months ago

Print out what the model saw?
cinjon opened this issue 3 months ago

Move status check in the memory pool to CPU
merrymercy opened this pull request 3 months ago

[Fix] Move ScheduleBatch out of SamplingInfo
merrymercy opened this pull request 3 months ago

[Fix] do not maintain regex_fsm in SamplingBatchInfo
merrymercy opened this pull request 3 months ago

[Performance, Hardware] MoE tuning on AMD MI300x GPUs
kkHuang-amd opened this pull request 3 months ago

[Fix] Fix all the Huggingface paths
tbarton16 opened this pull request 3 months ago

Simplify flashinfer dispatch
hnyls2002 opened this pull request 3 months ago

Llama3.2 vision model support
hnyls2002 opened this pull request 3 months ago

Dispatch flashinfer wrappers
hnyls2002 opened this pull request 3 months ago

[Refactor] Simplify io_struct and tokenizer_manager
Ying1123 opened this pull request 3 months ago

Fix bugs of `logprobs_nums`
hnyls2002 opened this pull request 3 months ago

Organize Attention Backends
hnyls2002 opened this pull request 3 months ago

Support qwen2 vl model
yizhang2077 opened this pull request 3 months ago

[Fix, LoRA] fix LoRA with updates in main
Ying1123 opened this pull request 3 months ago

Clean up batch data structures: Introducing ModelWorkerBatch
merrymercy opened this pull request 3 months ago

Rename InputMetadata -> ForwardBatch
merrymercy opened this pull request 3 months ago

Add support for Molmo-D-7B Model
BabyChouSr opened this pull request 3 months ago

Let ModelRunner take InputMetadata as input, instead of ScheduleBatch
merrymercy opened this pull request 3 months ago

[Refactor] Simplify io_struct and tokenizer_manager
Ying1123 opened this pull request 3 months ago

Process image in parallel
hnyls2002 opened this pull request 3 months ago

Move scheduler code from tp_worker.py to scheduler.py
merrymercy opened this pull request 3 months ago

fix ipv6 url when warm up model
cauyxy opened this pull request 3 months ago

Improve process creation
merrymercy opened this pull request 3 months ago

[Bug] ValueError: The memory capacity is unbalanced
chuangzhidan opened this issue 3 months ago

Make detokenizer_manager.py not asyncio
merrymercy opened this pull request 3 months ago

Organize image inputs
hnyls2002 opened this pull request 3 months ago

Multiple minor fixes
merrymercy opened this pull request 3 months ago

[Event] Update meeting link
Ying1123 opened this pull request 3 months ago

Add float8 dynamic quant to torchao_utils
jerryzh168 opened this pull request 3 months ago

[Feature] VLLM 6.0 support
arunpatala opened this issue 3 months ago

[Bug] IndexError: list index out of range
lvxianfeng-git opened this issue 3 months ago

[Feature] Support reward model LxzGordon/URM-LLaMa-3.1-8B
Ying1123 opened this pull request 3 months ago

minor: fix config
hnyls2002 opened this pull request 4 months ago

[Feature] add support for llama 3.2
Stealthwriter opened this issue 4 months ago

[Bug] Unable to use gptq or awq with torch.compile (8*A40)
smallstepman opened this issue 4 months ago

[FIX] Catch syntax error of Regex Guide to avoid crash
du00cs opened this pull request 4 months ago

[bugfix]Add modelscope package to avoid docker image without modelscope
KylinMountain opened this pull request 4 months ago

Accuracy reduction of Lora
yileld opened this issue 4 months ago

Update Dockerfile
KylinMountain opened this pull request 4 months ago

[Bug] no module modelscope using docker compose to start sglang
KylinMountain opened this issue 4 months ago

How to study the code?
TJ949 opened this issue 4 months ago

[Feature] _get_pixel_values needs to return tgt_sizes
huangzl18883 opened this issue 4 months ago

[Fix] Ignore model import error
merrymercy opened this pull request 4 months ago

Release v0.3.2
Ying1123 opened this pull request 4 months ago

Revert "kernel: use tensor cores for flashinfer gqa kernels"
Ying1123 opened this pull request 4 months ago

[Fix] Fix clean_up_tokenization_spaces in tokenizer
merrymercy opened this pull request 4 months ago

[Bug] tensor parallel run error
jerryzh168 opened this issue 4 months ago

[CI] Update nightly eval
Ying1123 opened this pull request 4 months ago

[Bug] LLaVa-next does not work for single image processing
ThomasBenzshawel opened this issue 4 months ago

AWQ performance tracking
zhyncs opened this issue 4 months ago

Possible timing side-channels caused by shared prefix
Unik-lif opened this issue 4 months ago