Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/lm-sys/FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
https://github.com/lm-sys/FastChat

Does it support batch inference?

kangsan0420 opened this issue over 1 year ago
Add Code Llama Support and Fix empty system prompt for llama 2

woshiyyya opened this pull request over 1 year ago
Reduce gradio overhead

merrymercy opened this pull request over 1 year ago
2048 context length limit about qwen-7b-chat

Hspix opened this issue over 1 year ago
Improve gradio demo

merrymercy opened this pull request over 1 year ago
Can I finetune Llama-2-70B using 16 × A10 (16 × 23 GB)?

babytdream opened this issue over 1 year ago
Added support of google/flan models

wangzhen263 opened this pull request over 1 year ago
Allow registering custom OpenAI-compatible models

merrymercy opened this pull request over 1 year ago
How to finetune baichuan 13b?

renmengjie7 opened this issue over 1 year ago
Optimize for proper flash attn causal handling

siddartha-RE opened this pull request over 1 year ago
vicuna-7b-v1.5 err

gavinju opened this issue over 1 year ago
Fix the issue of API not stopping when passing in stop

Trangle opened this pull request over 1 year ago
"finish_reason": "length" --> how to increase max_new_tokens

2533245542 opened this issue over 1 year ago
Support codellama

obitolyz opened this issue over 1 year ago
Flash Attention Monkey Patch not working with CodeLlama-34B

michaelroyzen opened this issue over 1 year ago
Update conversation.py

epec254 opened this pull request over 1 year ago
Peft loading

Heckler-Dark opened this issue over 1 year ago
Correct prompt for Vicuna v1.5 7b in the case of RAG

Matthieu-Tinycoaching opened this issue over 1 year ago
llm_api reports an error when using gpt-3.5-turbo

zqt996 opened this issue over 1 year ago
XVERSE-13B needs support!

tms2003 opened this issue over 1 year ago
Make the arena page the default page

merrymercy opened this pull request over 1 year ago
AssertionError: Torch not compiled with CUDA enabled

Heckler-Dark opened this issue over 1 year ago
Support no user message in llama2

zeyugao opened this pull request over 1 year ago
Add new model to the arena

renatz opened this pull request over 1 year ago
Make chatglm2-6b load-8bit work on Mac M2 with MPS (fix bfloatxx error)

vaxilicaihouxian opened this issue over 1 year ago
WebUI is very slow, but API is normal

hustwyk opened this issue over 1 year ago
Make all tensors be on the same device

fan-chao opened this pull request over 1 year ago
support custom API endpoints for gen_api_answer.py in llm-judge

imoneoi opened this pull request over 1 year ago
Bug on llama2-chinese conversation templates

fan-chao opened this pull request over 1 year ago
Add realm to the arena

renatz opened this pull request over 1 year ago
LeetCode dataset

kkkparty opened this issue over 1 year ago
Async method calls sync code while also using a semaphore

colinguozizhong opened this issue over 1 year ago
Consider using a fixed version of GPT-4 for llm_judge

imoneoi opened this issue over 1 year ago
Make embedding API compatible with OpenAI

Trangle opened this pull request over 1 year ago
Bug on llama2-chinese conversation templates

fan-chao opened this issue over 1 year ago
output text is not a complete sentence

wqn1 opened this issue over 1 year ago
Add conversation support for VMware's OpenLLaMa OpenInstruct models

nicobasile opened this pull request over 1 year ago
Update openai_api_server.py

ArtificialZeng opened this pull request over 1 year ago
release v0.2.25

merrymercy opened this pull request over 1 year ago
Fix typos

merrymercy opened this pull request over 1 year ago
switch to aiohttp post request mode

leiwen83 opened this pull request over 1 year ago
[Minor] Style cleanup & fix embedding

merrymercy opened this pull request over 1 year ago
ConnectionError prevents running

YuamLu opened this issue over 1 year ago
What's qkv in the flash attention patch file?

DqEDC opened this issue over 1 year ago
lmsys/longchat-7b-v1.5-32k transformers version problem

JACKHAHA363 opened this issue over 1 year ago
chatglm2-6b-32k cannot output properly when it runs on multiple GPUs

dream20201212 opened this issue over 1 year ago
Twitter --> X

ut-kr opened this pull request over 1 year ago
Add group kv support and fix past kv from cache

siddartha-RE opened this pull request over 1 year ago
load peft-model error (gradio_web_server)

jackaihfia2334 opened this issue over 1 year ago
--device cpu --load-8bit ends in TypeError

leolivier opened this issue over 1 year ago
feat: consider template's stop_token_ids in gen_model_answer

congchan opened this pull request over 1 year ago
Improve indentation in openai_api_server.py

ArtificialZeng opened this pull request over 1 year ago
Does the FastChat model worker support exposing metrics?

leyao-daily opened this issue over 1 year ago
Does FastChat consider AbortController?

qftie opened this issue over 1 year ago
poor performance of httpx.AsyncClient in openai_api_server.py

leiwen83 opened this issue over 1 year ago
Official evaluation scores of QWen-7B-Chat

Lukeming-tsinghua opened this issue over 1 year ago
Fix support for GPU selection using CLI argument

laidybug opened this pull request over 1 year ago
Chatbot

Taichi331213 opened this issue over 1 year ago
Measure API Load

brandonbiggs opened this issue over 1 year ago
WizardCoder hallucinations or bug in inference settings?

Extremys opened this issue over 1 year ago
QLoRA accidentally results in CUDA Out of Memory

zycheiheihei opened this issue over 1 year ago
[Minor] Update the warning to follow the new conv_template file

persistz opened this pull request over 1 year ago
Add Intel AMX/AVX512 support to accelerate inference

LeiZhou-97 opened this pull request over 1 year ago
How do model_workers handle load balancing?

linpan opened this issue over 1 year ago
ModuleNotFoundError: No module named 'packaging'

sxunix opened this issue over 1 year ago
Update embedding logic

Trangle opened this pull request over 1 year ago
AsyncLLMEngine not found in vllm

HelloCard opened this issue over 1 year ago
Make FastChat API server run in multiprocessing easily

liunux4odoo opened this pull request over 1 year ago
ChatGLM errors after upgrading from 0.2.18 to 0.2.23

luefei opened this issue over 1 year ago
Update llama2 and starchat templates

Cyrilvallez opened this pull request over 1 year ago
[Minor] Fix typos

merrymercy opened this pull request over 1 year ago
How to specify GPU ID?

fan-chao opened this issue over 1 year ago
Add support for Vigogne models

bofenghuang opened this pull request over 1 year ago
Can an ARM64 CPU run it?

zds-yyds opened this issue over 1 year ago
AssertionError

haof-github opened this issue over 1 year ago
Will there be 70b Vicuna-v1.5?

PhanTask opened this issue over 1 year ago
Does FastChat support models like GPT3?

shuqike opened this issue over 1 year ago
Add conversation template parameter to vllm worker

alanxmay opened this pull request over 1 year ago
Is it possible to finetune with llama2-70b?

lanfengmo opened this issue over 1 year ago
Adjusting Token Limit in Fastchat with Llama2 Model

coding-alt opened this issue over 1 year ago