Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Does it support batch inference?
kangsan0420 opened this issue over 1 year ago
Add Code Llama Support and Fix empty system prompt for llama 2
woshiyyya opened this pull request over 1 year ago
Reduce gradio overhead
merrymercy opened this pull request over 1 year ago
2048 context length limit about qwen-7b-chat
Hspix opened this issue over 1 year ago
Improve gradio demo
merrymercy opened this pull request over 1 year ago
Can I finetune Llama-2-70B using 16 × A10 (16 × 23G)?
babytdream opened this issue over 1 year ago
Added support of google/flan models
wangzhen263 opened this pull request over 1 year ago
Allow register custom OpenAI compatible models
merrymercy opened this pull request over 1 year ago
Code Llama answers are all blank; is there any way to fix it?
Puzzledyy opened this issue over 1 year ago
Error when trying to finetune `lmsys/vicuna-7b-v1.5` with 6 A100 40G GPUs
kunqian-58 opened this issue over 1 year ago
How to finetune baichuan 13b?
renmengjie7 opened this issue over 1 year ago
RuntimeError when tuning 70B Llama 2: shape is invalid for input of size
PhanTask opened this issue over 1 year ago
Optimize for proper flash attn causal handling
siddartha-RE opened this pull request over 1 year ago
Vicuna v1.5 giving wrong responses in a different language when trying to do a vanilla inference
Akshay1-6180 opened this issue over 1 year ago
How to process requests with the FastChat API in parallel or in batch style?
BigAndSweet opened this issue over 1 year ago
vicuna-7b-v1.5 err
gavinju opened this issue over 1 year ago
Fix the issue of API not stopping when passing in stop
Trangle opened this pull request over 1 year ago
"finish_reason": "length" --> how to increase max_new_tokens
2533245542 opened this issue over 1 year ago
Support codellama
obitolyz opened this issue over 1 year ago
Why does the lmsys/vicuna-13b-v1.3 model output contain the prompt? And how to prepare finetuning data for this behavior?
kunqian-58 opened this issue over 1 year ago
Flash Attention Monkey Patch not working with CodeLlama-34B
michaelroyzen opened this issue over 1 year ago
Update conversation.py
epec254 opened this pull request over 1 year ago
Why is lmsys/vicuna-13b-v1.5 giving Chinese answers to small questions based on a custom code template?
Akshay1-6180 opened this issue over 1 year ago
Peft loading
Heckler-Dark opened this issue over 1 year ago
Correct prompt for Vicuna v1.5 7b in the case of RAG
Matthieu-Tinycoaching opened this issue over 1 year ago
Is it safe and faster to use multiprocessing to call response = openai.ChatCompletion.create()?
BigAndSweet opened this issue over 1 year ago
Error when using gpt-3.5-turbo llm_api
zqt996 opened this issue over 1 year ago
XVERSE-13B need support!
tms2003 opened this issue over 1 year ago
Make the arena page as the default page
merrymercy opened this pull request over 1 year ago
AssertionError: Torch not compiled with CUDA enabled
Heckler-Dark opened this issue over 1 year ago
Support no user message in llama2
zeyugao opened this pull request over 1 year ago
Add new model to the arena
renatz opened this pull request over 1 year ago
Make chatglm2-6b load-8bit work on Mac M2 with MPS (fix bfloatxx error)
vaxilicaihouxian opened this issue over 1 year ago
Add new model to the arena
renatz opened this pull request over 1 year ago
webui is very slow, but api is normal
hustwyk opened this issue over 1 year ago
Make all tensors to be on the same device
fan-chao opened this pull request over 1 year ago
support custom API endpoints for gen_api_answer.py in llm-judge
imoneoi opened this pull request over 1 year ago
Bug on llama2-chinese conversation templates
fan-chao opened this pull request over 1 year ago
add-realm-to-the-arena
renatz opened this pull request over 1 year ago
Make all tensors to be on the same device
fan-chao opened this pull request over 1 year ago
leetcode dataset
kkkparty opened this issue over 1 year ago
[Bug] Expected all tensors to be on the same device, but found at least two devices, cuda:6 and cuda:0!
fan-chao opened this issue over 1 year ago
Async method calls sync code while also using a semaphore
colinguozizhong opened this issue over 1 year ago
Consider using a fixed version of GPT-4 for llm_judge
imoneoi opened this issue over 1 year ago
Why does sequentially calling requests.post() without attaching conversation history still achieve good few-shot learning?
BigAndSweet opened this issue over 1 year ago
Make embedding API compatible with OpenAI
Trangle opened this pull request over 1 year ago
Bug on llama2-chinese conversation templates
fan-chao opened this issue over 1 year ago
output text is not a complete sentence
wqn1 opened this issue over 1 year ago
Add conversation support for VMware's OpenLLaMa OpenInstruct models
nicobasile opened this pull request over 1 year ago
Update compression: support multi-device when using compression with args.num_gpus and args.max_gpu_memory
hzg0601 opened this pull request over 1 year ago
Update openai_api_server.py
ArtificialZeng opened this pull request over 1 year ago
release v0.2.25
merrymercy opened this pull request over 1 year ago
Fix typos
merrymercy opened this pull request over 1 year ago
switch to aiohttp post request mode
leiwen83 opened this pull request over 1 year ago
[Minor] Style clean up & Fix embedding
merrymercy opened this pull request over 1 year ago
The ConnectionError can't run
YuamLu opened this issue over 1 year ago
What's qvk in flash attention patch file?
DqEDC opened this issue over 1 year ago
lmsys/longchat-7b-v1.5-32k transformer version problem
JACKHAHA363 opened this issue over 1 year ago
Why does the deepspeed command given in the documentation get affected by the position of the parameters?
liyifo opened this issue over 1 year ago
chatglm2-6b-32k cannot output properly when it runs on multiple GPUs
dream20201212 opened this issue over 1 year ago
twitter --> x
ut-kr opened this pull request over 1 year ago
How can I use the functions attribute provided by OpenAI with open-source models?
necromorph98 opened this issue over 1 year ago
Vicuna-1.5 Quantized using AWQ Not Working - CUDA Illegal Memory Access
mmaaz60 opened this issue over 1 year ago
Add group kv support and fix past kv from cache
siddartha-RE opened this pull request over 1 year ago
load peft-model error (gradio_web_server)
jackaihfia2334 opened this issue over 1 year ago
--device cpu --load-8bit ends in TypeError
leolivier opened this issue over 1 year ago
feat: consider template's stop_token_ids in gen_model_answer
congchan opened this pull request over 1 year ago
Improve indentation in openai_api_server.py
ArtificialZeng opened this pull request over 1 year ago
Does the FastChat model worker support exposing metrics?
leyao-daily opened this issue over 1 year ago
Does FastChat consider AbortController?
qftie opened this issue over 1 year ago
poor performance of httpx.AsyncClient in openai_api_server.py
leiwen83 opened this issue over 1 year ago
Official evaluation scores of QWen-7B-Chat
Lukeming-tsinghua opened this issue over 1 year ago
Fix support for GPU selection using CLI argument
laidybug opened this pull request over 1 year ago
Chatbot
Taichi331213 opened this issue over 1 year ago
Measure API Load
brandonbiggs opened this issue over 1 year ago
WizardCoder hallucinations or bug in inference settings?
Extremys opened this issue over 1 year ago
QLoRA accidentally results in CUDA Out of Memory
zycheiheihei opened this issue over 1 year ago
Can we use FastChat for full-parameter fine-tuning based on DeepSpeed?
mpdpey043 opened this issue over 1 year ago
[Minor] Update the warning to follow the new conv_template file
persistz opened this pull request over 1 year ago
Add Intel AMX/AVX512 support to accelerate inference
LeiZhou-97 opened this pull request over 1 year ago
How do model_workers do load balancing?
linpan opened this issue over 1 year ago
ModuleNotFoundError: No module named 'packaging'
sxunix opened this issue over 1 year ago
Update embedding logic
Trangle opened this pull request over 1 year ago
About the function of assign special token ids to the model.config object
auroua opened this issue over 1 year ago
AsyncLLMEngine not found in vllm
HelloCard opened this issue over 1 year ago
make fastchat api server run in multiprocessing easily
liunux4odoo opened this pull request over 1 year ago
ChatGLM errors from version 0.2.18 to 0.2.23
luefei opened this issue over 1 year ago
Update llama2 and starchat templates
Cyrilvallez opened this pull request over 1 year ago
[Minor] Fix typos
merrymercy opened this pull request over 1 year ago
How to specify gpu id?
fan-chao opened this issue over 1 year ago
Add support for Vigogne models
bofenghuang opened this pull request over 1 year ago
Can it run on an ARM64 CPU?
zds-yyds opened this issue over 1 year ago
AssertionError
haof-github opened this issue over 1 year ago
Will there be 70b Vicuna-v1.5?
PhanTask opened this issue over 1 year ago
Does FastChat support models like GPT3?
shuqike opened this issue over 1 year ago
Add conversation template parameter to vllm worker
alanxmay opened this pull request over 1 year ago
Is it possible to finetune with llama2-70b?
lanfengmo opened this issue over 1 year ago
Adjusting Token Limit in Fastchat with Llama2 Model
coding-alt opened this issue over 1 year ago