Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Make the hardcoded input limit customizable
funboarder13920 opened this issue over 1 year ago
Add xformer and support training on V100s
zhisbug opened this pull request over 1 year ago
How much GPU memory is needed for fine-tuning via LoRA?
JustinZou1 opened this issue over 1 year ago
Is it possible to train using a single A100 GPU (40GB)?
Minxiangliu opened this issue over 1 year ago
How to load the checkpoint from LoRA training?
Wangpeiyi9979 opened this issue over 1 year ago
[BUG] RWKV Models are not configured to use Cuda GPU lists.
chen369 opened this issue over 1 year ago
Add claude-instant-v1.1
mikelambert opened this pull request over 1 year ago
Improve openai compatible api for langchain support
andy-yang-1 opened this pull request over 1 year ago
Filter more key words in log analysis (e.g., Bard)
merrymercy opened this pull request over 1 year ago
Add support for BiLLa
Neutralzz opened this pull request over 1 year ago
get_model_answer.py generates `### Human:` after the response.
Wangpeiyi9979 opened this issue over 1 year ago
Finetune on Vicuna output is garbled
fucksmile opened this issue over 1 year ago
Add PaLM API
infwinston opened this pull request over 1 year ago
When I fine-tune the model, there is a warning; then the model does not produce any output or response.
plum-Yin opened this issue over 1 year ago
Does the conversion script work?
Kiswelrg opened this issue over 1 year ago
Deploying for inference
sho-87 opened this issue over 1 year ago
"load_model" function has different performance when being imported separately
sablin39 opened this issue over 1 year ago
Request: Add a quantized model to the Arena?
endolith opened this issue over 1 year ago
Training with Pythia instead of Llama
emnlpanon opened this issue over 1 year ago
piece is out of range
Malestudents opened this issue over 1 year ago
Use inference mode for embedding
merrymercy opened this pull request over 1 year ago
Fix missing imports in model_adapter.py
merrymercy opened this pull request over 1 year ago
Release v0.2.8
merrymercy opened this pull request over 1 year ago
Improve the error handling
merrymercy opened this pull request over 1 year ago
Bard support
suquark opened this pull request over 1 year ago
any performance difference between v0 and v1.1?
lazyfuzzypringle opened this issue over 1 year ago
RUN API
githubproposals opened this issue over 1 year ago
fix: import lru_cache for python below 3.9, close #1056
fecet opened this pull request over 1 year ago
can't stop generation words
hashiqi-233 opened this issue over 1 year ago
Improve SSE User Experience
VGEAREN opened this pull request over 1 year ago
add PR Dromedary
jingslunt opened this issue over 1 year ago
Why ignore tokenization mismatch examples?
Wangpeiyi9979 opened this issue over 1 year ago
The current `device_map` had weights offloaded to the disk. Please provide an `offload_folder` for them....
yuan2ai opened this issue over 1 year ago
Missing Imports in fastchat/model/model_adapter.py
johnswyou opened this issue over 1 year ago
"Assertion `srcIndex < srcSelectDimSize` failed" showed when I tried to train using my own script
fahadh4ilyas opened this issue over 1 year ago
Fix typos in arena.md
endolith opened this pull request over 1 year ago
Can I Override or Replace "I'm sorry, as an AI language model, I cannot" Response By Vicuna
fucksmile opened this issue over 1 year ago
Anonymity during Voting
kieranfraser opened this issue over 1 year ago
Add fastest gptq 4bit inference support
alanxmay opened this pull request over 1 year ago
NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE. (error_code: 4)
adhupraba opened this issue over 1 year ago
Catch more exceptions in the model worker
merrymercy opened this pull request over 1 year ago
RuntimeError: The detected CUDA version (12.1) mismatches the version that was used to compile PyTorch (11.7).
larawehbe opened this issue over 1 year ago
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
whk6688 opened this issue over 1 year ago
Reduce GPU memory in model_worker.get_embeddings
supdizh opened this issue over 1 year ago
Only one GPU works when setting "-num_gpu 2"
goldfishl opened this issue over 1 year ago
Finetune VICUNA-7b with 4*v100(32G)
fw2325 opened this issue over 1 year ago
Visual foundation model as plugin of Vicuna
lixin4ever opened this issue over 1 year ago
Error raised during finetuning: ValueError: Your setup doesn't support bf16/gpu. You need torch>=1.10, using Ampere GPU with cuda>=11.0
roshan-gopalakrishnan opened this issue over 1 year ago
Fix Chinese garbled code problem by filtering special characters \ufffd.
yxleung opened this pull request over 1 year ago
Improve OpenAI-Compatible Restful API (token usage, error handling, stream)
jstzwj opened this pull request over 1 year ago
ChatGLM-PTuning + FastChat
lovelucymuch opened this issue over 1 year ago
Support MiniGPT4 and other multi-modal models
thiner opened this issue over 1 year ago
Training new Vicuna based on fully open-source OpenLLaMA
wilhelmagren opened this issue over 1 year ago
2-node speed is not faster than 1 node
lmolhw5252 opened this issue over 1 year ago
fastchat-t5-3b error
samareshyadav55 opened this issue over 1 year ago
Claude-v1 is not available in the Side-By-Side selection dropdown but is found in battle
RageshAntony opened this issue over 1 year ago
Instructions to add a new model?
XReyRobert opened this issue over 1 year ago
feat: Add support for MPT
mariobm opened this pull request over 1 year ago
Fine-tuning hit OutOfMemoryError: CUDA out of memory.
JustinZou1 opened this issue over 1 year ago
--load-8bit not compatible with fastchat-t5-3b-v1.0
shm007g opened this issue over 1 year ago
pydantic.error_wrappers.ValidationError: 2 validation errors for ChatCompletionResponse
LvJC opened this issue over 1 year ago
The size of tensor a (32001) must match the size of tensor b (32000) at non-singleton dimension 0
xyk35182966 opened this issue over 1 year ago
FastChat/fastchat/serve/test_throughput.py
wei61547-jp opened this issue over 1 year ago
RuntimeError: FlashAttention is only supported on CUDA 11 and above
JustinZou1 opened this issue over 1 year ago
8x V100S 32G run gets killed, is something wrong?
yezhongxiuchan opened this issue over 1 year ago
ShareGPT conversation splits and "please continue"
float-trip opened this issue over 1 year ago
Why not use model.generate in generate_stream
vikigenius opened this issue over 1 year ago
ImportError: cannot import name 'cache' from 'functools' (/usr/lib/python3.8/functools.py)
mpetruc opened this issue over 1 year ago
The stop parameter in openai API doesn't work since v0.2.5
oreo-yum opened this issue over 1 year ago
Refactor to add MPT
hlzhang109 opened this pull request over 1 year ago
Byte deltas
RedmiS22018 opened this pull request over 1 year ago
Is there a way to optimize the output token per second?
vinvcn opened this issue over 1 year ago
Decouple LLM Interface Code for Improved Scalability
chen369 opened this issue over 1 year ago
Add support for MPT-7B.
digisomni opened this issue over 1 year ago
fastchat-t5 quantization support?
bash99 opened this issue over 1 year ago
How to break the 2048 token limit
rainbownmm opened this issue over 1 year ago
Run API with just CPU for Fastchat t5
djaffer opened this issue over 1 year ago
Error : model.embed_tokens.weight
JerryYao80 opened this issue over 1 year ago
TypeError: forward() got an unexpected keyword argument 'position_ids'
luochuwei opened this issue over 1 year ago
Support model list reload feature
Jeffwan opened this pull request over 1 year ago
Support model reload mode
Jeffwan opened this issue over 1 year ago
Error when saving the model after training
Puzzledyy opened this issue over 1 year ago
Connection timeout error
zxzhijia opened this issue over 1 year ago
Issue #270: add CI to support release and publish
yantao0527 opened this pull request over 1 year ago
Support StableVicuna
iRanadheer opened this issue over 1 year ago
How to use the API for T5 and an example dataset?
djaffer opened this issue over 1 year ago
Finetuning on 16 Tesla K80 GPUs on EC2 Instance (p2.16xlarge)
ItsCRC opened this issue over 1 year ago
Encountered a runtime error when training with LoRA and flash_attention together
Jeffwan opened this issue over 1 year ago
Model running on only 2 GPUs even when 4 GPUs are specified
SupreethRao99 opened this issue over 1 year ago
dockerize it
BillSchumacher opened this pull request over 1 year ago
The <eos> token randomly pops out during inference, making the text generation stop early.
BadisG opened this issue over 1 year ago
CUDA out of memory in CLI vicuna 7B
mpetruc opened this issue over 1 year ago
Update apply_delta.py to use tokenizer from delta weights
merrymercy opened this pull request over 1 year ago
fastchat-t5-3b-v1.0 on macOS?
fdstevex opened this issue over 1 year ago
Add ChatML inspired conversation style.
rwl4 opened this pull request over 1 year ago
Command to run train_flatT5.py
samarthsarin opened this issue over 1 year ago
[lmsys/fastchat-t5-3b-v1.0] Is the ShareGPT dataset suitable for commercial use?
yousifmansour opened this issue over 1 year ago
Get irrelevant answers when using fastchat.serve.cli on macOS MPS
kivvi3412 opened this issue over 1 year ago
Which model performs better? Vicuna-7B, Vicuna-13B or FastChat-T5?
chentao169 opened this issue over 1 year ago