github.com/lm-sys/FastChat issues | Ecosyste.ms: OpenCollective

how to finetune llama-30b with fastchat?

dongguanting opened this issue over 1 year ago

lora vicuna-7b-v1.3 Error: RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

XJPeng12 opened this issue over 1 year ago

Is the OpenAI-compatible API sending the `functions` argument to the model?

AmbroxMr opened this issue over 1 year ago

How to support TheBloke/Falcon-180B-Chat-GPTQ

hustwyk opened this issue over 1 year ago

Add llama-2 template support for fine-tuning

karthik19967829 opened this pull request over 1 year ago

Add Ascend NPU support

zhangsibo1129 opened this pull request over 1 year ago

When I run gen_judgment.py with different parameters, it generates the corresponding gpt-4_single.jsonl or gpt-4_pair.jsonl file. The meaning of some fields in gpt-4_single.jsonl is not clear. Can you explain what the g1_winner, g2_winner, judge, and g1_user_prompt fields mean? Thank you.

wuQi-666 opened this issue over 1 year ago

When I run gen_judgment.py with different parameters, it generates the corresponding gpt-4_single.jsonl or gpt-4_pair.jsonl file. The meaning of some fields in gpt-4_single.jsonl is not clear. Can you explain what the g1_winner, g2_winner, judge, and g1_user_prompt fields mean? Thank you.

wuQi-666 opened this issue over 1 year ago

is it right this time? (added emotion analyzer module and section to use it in openai_api_server.py)

soulsyrup opened this pull request over 1 year ago

Add raw conversation template (#2417)

tobiabir opened this pull request over 1 year ago

[Feature Request] Support raw conversation

tobiabir opened this issue over 1 year ago

Add support for Phind-CodeLlama models (#2415)

tobiabir opened this pull request over 1 year ago

[Feature Request] Support Phind-CodeLlama

tobiabir opened this issue over 1 year ago

Can not use Intel Arc GPU

yxc890123 opened this issue over 1 year ago

There was an error loading the fine-tuned model. How to solve it?

asenasen123 opened this issue over 1 year ago

max input prompt

UncleFB opened this issue over 1 year ago

merge google/flan based adapters: T5Adapter, CodeT5pAdapter, FlanAdapter

wangzhen263 opened this pull request over 1 year ago

[Feature request] Support loading GGUF and GGML model format

nghidinhit opened this issue over 1 year ago

Update huggingface_api.py

merrymercy opened this pull request over 1 year ago

Add support for baichuan2 models

obitolyz opened this pull request over 1 year ago

Finetuning with LLaMA-Efficient-Tuning and deploying with fastchat, but getting poor results

myj951 opened this issue over 1 year ago

Rename twitter to X

karshPrime opened this pull request over 1 year ago

pip3 install "fschat[model_worker,webui]" failed at Collecting sentencepiece (from fschat[model_worker,webui]) Downloading sentencepiece-0.1.99-cp310-cp310-macosx_11_0_arm64.whl (1.2 MB)

change-since2022 opened this issue over 1 year ago

Fix model_worker error

wangxiyuan opened this pull request over 1 year ago

[Feature Request] Support InternLM Deploy

vansinhu opened this issue over 1 year ago

Added google/flan models and fixed AutoModelForSeq2SeqLM when loading T5 compression model

wangzhen263 opened this pull request over 1 year ago

Revert "add best_of and use_beam_search for completions interface"

merrymercy opened this pull request over 1 year ago

Revert "bugfix of openai_api_server for fastchat.serve.vllm_worker"

merrymercy opened this pull request over 1 year ago

vllm_worker returning LIST instead of STR causes error in openai_api_server.py

dhgarcia opened this issue over 1 year ago

bugfix of openai_api_server for fastchat.serve.vllm_worker

Rayrtfr opened this pull request over 1 year ago

strange output of Baichuan2 [by restful api server]

2286573608 opened this issue over 1 year ago

Fine-tuning on dual Quadro RTX 6000

nshern opened this issue over 1 year ago

LLMs Toxicity benchmark

eleluong opened this issue over 1 year ago

Baichuan2-13B-Chat model can be loaded, but cannot be tested with test

ye7love7 opened this issue over 1 year ago

Finetune on completions only

matankley opened this issue over 1 year ago

Spicyboros + airoboros 2.2 template update.

jondurbin opened this pull request over 1 year ago

Can I know when our model can be added to MT-Bench?

renatz opened this issue over 1 year ago

Use fsdp api for safe save

merrymercy opened this pull request over 1 year ago

Tokenization Mismatch on conversations with >2 turns

alwayshalffull opened this issue over 1 year ago

Enhance conv prompt and train

Trangle opened this pull request over 1 year ago

Update UI and sponsors

merrymercy opened this pull request over 1 year ago

Release new code

ZHUANGMINGXI opened this issue over 1 year ago

freezing or black screen when trying to get a response

ZHUANGMINGXI opened this issue over 1 year ago

Add falcon 180B chat conversation template

Btlmd opened this pull request over 1 year ago

Support custom conversation template in multi_model_worker

2533245542 opened this issue over 1 year ago

model_worker raises error

wangxiyuan opened this issue over 1 year ago

Make E5 adapter more restrictive to reduce mismatch

merrymercy opened this pull request over 1 year ago

How come the conversation template of vicuna is different from llama?

luffycodes opened this issue over 1 year ago

update monkey patch for llama2

merrymercy opened this pull request over 1 year ago

NaN or Inf found in input tensor

kkkparty opened this issue over 1 year ago

Need to support Baichuan2

boxter007 opened this issue over 1 year ago

Page

PageIV opened this pull request over 1 year ago

Error: Failed to load GPTQ-for-LLaMa. No module named 'llama' when I load quantized model https://huggingface.co/TheBloke/Llama-2-70B-chat-GPTQ

nghidinhit opened this issue over 1 year ago

[Feature] Support the new version of baichuan2

Tomorrowxxy opened this issue over 1 year ago

Inference: bf16 or fp16?

larekrow opened this issue over 1 year ago

add best_of and use_beam_search for completions interface

leiwen83 opened this pull request over 1 year ago

Improve doc

merrymercy opened this pull request over 1 year ago

Revert "add best_of and use_beam_search for completions interface"

merrymercy opened this pull request over 1 year ago

Extract upvote/downvote from log files

merrymercy opened this pull request over 1 year ago

have a question about this command

lpxiangyan9 opened this issue over 1 year ago

Update sponsor logos

merrymercy opened this pull request over 1 year ago

ValueError: Trying to set a tensor of shape torch.Size([1024, 8192]) in "weight" (which has shape torch.Size([8192, 8192])), this looks incorrect.

coding-alt opened this issue over 1 year ago

Add GPTQ via Transformers. [Basic]

digisomni opened this pull request over 1 year ago

Add GPTQ via Transformers. [Basic]

digisomni opened this pull request over 1 year ago

Correct prompt for fastchat-t5-3b-v1.0 in the case of RAG

Matthieu-Tinycoaching opened this issue over 1 year ago

Issues with VLLM Integration Speedup

chengyanwu opened this issue over 1 year ago

update compression.py: compress model with bitsandbytes

hzg0601 opened this pull request over 1 year ago

Unable to use --gpus to load model onto specific GPU

kikoreis opened this issue over 1 year ago

Loading vicuna-13b-1.5 with single GPU in 16GB VM triggers system OOM

kikoreis opened this issue over 1 year ago

can i use the latest code to conduct vicuna-bench's evaluation

renatz opened this issue over 1 year ago

if LOGDIR is empty, then don't try output log to local file

leiwen83 opened this pull request over 1 year ago

NaN or Inf found in input tensor

kkkparty opened this issue over 1 year ago

Simplify huggingface api example

merrymercy opened this pull request over 1 year ago

Flash attention for fine-tuning

prince14322 opened this issue over 1 year ago

Is vllm_worker considering making modifications for this PR to avoid Chinese output garbled characters?

aizaiyishunjian opened this issue over 1 year ago

controller

hongxiong1230 opened this issue over 1 year ago

Qwen training

kyriekevin opened this issue over 1 year ago

Fix Salesforce xgen inference

jaywonchung opened this pull request over 1 year ago

Any plans of vicuna-33b-v1.5 based on codellama-33b?

luffycodes opened this issue over 1 year ago

add best_of and use_beam_search for completions interface

leiwen83 opened this pull request over 1 year ago

Speed up inferencing

samarthsarin opened this issue over 1 year ago

added emotion_analyzer module and integrated it into openai_api_serv…

soulsyrup opened this pull request over 1 year ago

Why has the chat interface /v1/chat/completions been modified to be serial?

suqisuqi opened this issue over 1 year ago

AssertionError: Torch not compiled with CUDA enabled

congrashino422 opened this issue over 1 year ago

custom model path local when load the open source model from huggingface

502dxceit opened this issue over 1 year ago

Remove hardcode flash-attn disable setting

Trangle opened this pull request over 1 year ago

NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.

631068264 opened this issue over 1 year ago

When using the --load-8bit parameter with Baichuan-13B-chat, an error occurs: None of the inputs have requires_grad=True. Gradients will be None.

coding-alt opened this issue over 1 year ago

support azure gpt3.5 and gpt4

zhuangAnjun opened this pull request over 1 year ago

support azure gpt3.5 and gpt4

zhuangAnjun opened this pull request over 1 year ago

Document turning off proxy_buffering when api is streaming

nathanstitt opened this pull request over 1 year ago

ERROR: Exception in ASGI application

allenhaozi opened this issue over 1 year ago

Add scripts for chat data cleaning and analysis

merrymercy opened this pull request over 1 year ago

Revert "fix: llm_judge resume from breakpoint when judging"

merrymercy opened this pull request over 1 year ago

Enhancing Model Adapter Code with "model_id" and Priority Queue

Trangle opened this pull request over 1 year ago

[llm_judge] key mismatch during match list deduplication when mode=pairwise-baseline

johnheo opened this issue over 1 year ago

Add code llama info

merrymercy opened this pull request over 1 year ago

Vicuna-13b-16k with vllm not repeats a single word in output

s-dharmam opened this issue over 1 year ago

Fix docs

merrymercy opened this pull request over 1 year ago

[fix] lm-sys/FastChat/issues/2295

vaxilicaihouxian opened this pull request over 1 year ago