Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/sgl-project/sglang

SGLang is a fast serving framework for large language models and vision language models.
https://github.com/sgl-project/sglang

feat: replace get_act_fn for gpt_bigcode
zhyncs opened this pull request 5 months ago

[FIX] Wrong logger
havetc opened this pull request 5 months ago

Some questions about TTFT and TPOT benchmarks
sitabulaixizawaluduo opened this issue 5 months ago

[Minor] add delete test and delete tmp file on ci server
yichuan520030910320 opened this pull request 5 months ago

No such file or directory: '/sbin/ldconfig'
zwc163 opened this issue 5 months ago

Fix bench latency benchmark
hnyls2002 opened this pull request 5 months ago

Openvla
hnyls2002 opened this pull request 5 months ago

Torch compile CI throughput test
hnyls2002 opened this pull request 5 months ago

[FEAT] Support batches cancel
caiyueliang opened this pull request 5 months ago

Safety test
hnyls2002 opened this pull request 5 months ago

[CI] Parallelize unit tests in CI
wisclmy0611 opened this pull request 5 months ago

[Fix] Multi-images loading error
kcz358 opened this pull request 5 months ago

[CI] Fix CI
wisclmy0611 opened this pull request 5 months ago

[Feature] add option to use liger triton kernel
binarycrayon opened this issue 5 months ago

improve the threshold and ports in tests
wisclmy0611 opened this pull request 5 months ago

Update workflow files
merrymercy opened this pull request 5 months ago

Update CI runner docs
merrymercy opened this pull request 5 months ago

[Minor] improve CI and dependencies
hnyls2002 opened this pull request 5 months ago

Update CI workflows
merrymercy opened this pull request 5 months ago

[Feature] Support fp8 e5m2 kv cache with flashinfer
ispobock opened this pull request 5 months ago

Accuracy degrading in concurrent scenario
frankxyy opened this issue 5 months ago

Move sampler into CUDA graph
hnyls2002 opened this pull request 5 months ago

[Bug] enable-torch-compile error
siddhatiwari opened this issue 5 months ago

[Bug] Bad outputs with fp8 quantization at high RPS
siddhatiwari opened this issue 5 months ago

[Bug] Server crashes after loading (Mixtral 8x7b) on L4
nivibilla opened this issue 5 months ago

[Feature] Jamba 1.5 Support PLS
nivibilla opened this issue 5 months ago

[Bug] schedule_batch.py: IndexError: list index out of range
Quang-elec44 opened this issue 5 months ago

Dry sample
81549361 opened this pull request 5 months ago

Support Alibaba-NLP/gte-Qwen2-7B-instruct embedding Model
zhaochenyang20 opened this pull request 5 months ago

[Bug] vllm updated its get_model function
zhaochenyang20 opened this issue 5 months ago

[Minor] Improve logging and rename the health check endpoint name
merrymercy opened this pull request 5 months ago

[Feature] Repeated generation expression
laurens-gs opened this issue 5 months ago

[Bug] head_dim 96 not supported
ZX-ModelCloud opened this issue 5 months ago

[Feature] support W8A8(FP8) and KV Cache FP8 for DeepSeek V2
zhyncs opened this issue 5 months ago

chore: bump v0.2.14
zhyncs opened this pull request 5 months ago

[Tracker] OpenRouter LLM rankings tracking
zhyncs opened this issue 5 months ago

Save memory from interleaved attention
Ying1123 opened this pull request 5 months ago

[Feature] add disable-custom-all-reduce
Xu-Chen opened this pull request 5 months ago

Flex scheduler
yukavio opened this pull request 5 months ago

Optimize MLA/GQA/MQA Triton decoding
ispobock opened this pull request 5 months ago

[Bug] Llama3 70B A100 PCIE TP4 slow speed
zhyncs opened this issue 5 months ago

[Bug] Wrong tokens with mistral model
StevenZHB opened this issue 5 months ago

[Feature] Support TRI-ML/prismatic-vlms
Depetrol opened this issue 5 months ago

[RFC] Add an LLM engine
JianyuZhan opened this pull request 5 months ago

[FEAT] JSON constrained support
havetc opened this pull request 5 months ago

[Bug] I set `--host 0.0.0.0`, but it can't be called on another server
YinSonglin1997 opened this issue 5 months ago

[Feature] add disable_custom_all_reduce
Xu-Chen opened this issue 5 months ago

[Bug] After service, `torch.distributed.DistBackendError`
YinSonglin1997 opened this issue 5 months ago

[Feature] Do we have any plan for supporting Phi3V?
boqiny opened this issue 5 months ago

[Develop] Performance Improving Feature
yukavio opened this issue 5 months ago

[Bug] Low QPS for 1.2b model
lxww302 opened this issue 5 months ago

[Bug] Can't run Qwen2-57B-A14B-Instruct-GPTQ-Int4
xcxjack opened this issue 5 months ago

will triton kernels support cuda graph?
AlvL1225 opened this issue 5 months ago

[Bug] Always Watch Dog TimeOut
Rookie-Kai opened this issue 6 months ago

[Bug] nsys profile failed
zhangjun opened this issue 6 months ago

[Bug] T4 not work
zhyncs opened this issue 6 months ago

[Feature] Support InternVL 2
luohao123 opened this issue 6 months ago

Sequence Parallel
ZYHowell opened this pull request 6 months ago

[Feature] Allow arbitrary logit processors
iiLaurens opened this issue 6 months ago

[Bug] OOM for concurrent long requests
hahmad2008 opened this issue 6 months ago

[Bug] Multinode Llama 3.1 405B fp8
matthew-hippocratic opened this issue 6 months ago

Torch.compile Performance Tracking
merrymercy opened this issue 6 months ago

[Bug] backend stuck at Prefill batch
sophiapeng90 opened this issue 6 months ago

[Feature] DeepSeek-Coder-V2-Instruct-FP8 on 8xA100
halexan opened this issue 6 months ago

feat: frequency, min_new_tokens, presence, and repetition penalties
vhain opened this pull request 6 months ago

Add skip_tokenizer_init args.
gryffindor-rr opened this pull request 6 months ago

[Bug] Multinode cannot be started on runpod
Desmond819 opened this issue 6 months ago

[Bug] pt_main_thread uses 100% cpu all the time
wizd opened this issue 6 months ago

[Bug] FlashInfer support for <=sm_75
horiacristescu opened this issue 6 months ago

Inference Llama3-70b has an AssertionError
Ikkyu321 opened this issue 6 months ago

[Feature] Google TPU Support
RonanKMcGovern opened this issue 6 months ago

[Feature] Does sglang now support beam search
StevenZHB opened this issue 6 months ago

[Feature] Add a flag for computing the prompt's logprobs or not.
hnyls2002 opened this issue 6 months ago

run llama 3.1 405B with multi node has tp server error [Bug]
kinglion811 opened this issue 6 months ago

[Bug] AWQ Marlin not work with Torch Compile
zhyncs opened this issue 6 months ago

[Feature] plan to support medusa?
CSEEduanyu opened this issue 6 months ago

[Bug] Multi-Node communication issue
dmakhervaks opened this issue 6 months ago

[Feature] RadixCache: remove recursive logic
hnyls2002 opened this issue 6 months ago

[Feature] Frontend: be able to run generate super long text
xianbaoqian opened this issue 6 months ago

[Bug] Unable to install on mac
xianbaoqian opened this issue 6 months ago

ROCM
BasDiaz opened this issue 6 months ago

[Feature] Generation Inputs: input_embeds
AlekseyKorshuk opened this issue 6 months ago

Initialization failed. warmup error:
bravelll opened this issue 6 months ago

Support for WebAssembly models
jaanli opened this issue 6 months ago

Development Roadmap (2024 Q3)
Ying1123 opened this issue 6 months ago