
Gemma.cpp hangs on a Gemma 7B model that was fine-tuned using Hugging Face PEFT (QLoRA) #198

@webbigdata-jp

Description


Hi, thanks for the interesting project!

I created a Gemma 7B-based model, webbigdata/C3TR-Adapter.
It is in Hugging Face Transformers format: a translation-only model with its own prompt templates, fine-tuned with QLoRA.

So I converted it to a PyTorch checkpoint (.ckpt); the result is f32_merge_model.ckpt.
I have confirmed that f32_merge_model.ckpt works.

Then I ran this command, with no error message:
python3 convert_weights.py --tokenizer tokenizer.model --weight f32_merge_model.ckpt --output_file gemma_cpp_merge.bin --model_type 7b

Then I ran this command, with no error message:
./build/compress_weights --weights util/gemma_cpp_merge.bin --model 7b-pt --compressed_weights util/gemma_cpp_merge.sbs

Then I ran gemma.cpp, with no error message:
./build/gemma --tokenizer util/tokenizer.model --compressed_weights util/gemma_cpp_merge.sbs --model 7b-pt

and entered my prompt:
[### Instruction:\nTranslate English to Japanese.\n\n### Input:\nThis is a test input.\n\n### Response:\n]
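For reference, here is the prompt template above with the "\n" escapes expanded into real newlines, sketched in Python (an assumption on my part: the square brackets are delimiters in this post, not part of the prompt itself):

```python
# Sketch of the prompt template with the "\n" escapes expanded
# into real newlines. Assumption: the surrounding square brackets
# in the post are delimiters, not part of the prompt.
prompt = (
    "### Instruction:\n"
    "Translate English to Japanese.\n"
    "\n"
    "### Input:\n"
    "This is a test input.\n"
    "\n"
    "### Response:\n"
)
print(prompt)
```

Note that when a prompt like this is typed into an interactive shell or the gemma.cpp prompt, a literal backslash-n is not automatically turned into a newline character.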

but the model doesn't output anything.

Is there something wrong with the procedure?

(Screenshots attached: gemma-error, hungon)
