
Gemma.cpp hangs on a Gemma 7B model that was fine-tuned using Hugging Face PEFT (QLoRA) #198

@webbigdata-jp

Description


Hi, thanks for the interesting project!

I created a Gemma 7B-based model, webbigdata/C3TR-Adapter.
It is in Hugging Face Transformers format: a translation-only model with its own prompt templates, fine-tuned with QLoRA.

So I converted it to a PyTorch checkpoint (.ckpt); the result is f32_merge_model.ckpt.
I have confirmed that f32_merge_model.ckpt works.

Then I ran this command, with no error message:
python3 convert_weights.py --tokenizer tokenizer.model --weight f32_merge_model.ckpt --output_file gemma_cpp_merge.bin --model_type 7b

Then I ran this command, with no error message:
./build/compress_weights --weights util/gemma_cpp_merge.bin --model 7b-pt --compressed_weights util/gemma_cpp_merge.sbs

Then I ran gemma.cpp, with no error message:
./build/gemma --tokenizer util/tokenizer.model --compressed_weights util/gemma_cpp_merge.sbs --model 7b-pt

and entered my prompt:
[### Instruction:\nTranslate English to Japanese.\n\n### Input:\nThis is a test input.\n\n### Response:\n]
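For reference, here is the prompt template above with the "\n" escapes expanded into real newlines, sketched in Python (an assumption on my part: the square brackets are delimiters in this post, not part of the prompt itself):

```python
# Sketch of the prompt template with the "\n" escapes expanded
# into real newlines. Assumption: the surrounding square brackets
# in the post are delimiters, not part of the prompt.
prompt = (
    "### Instruction:\n"
    "Translate English to Japanese.\n"
    "\n"
    "### Input:\n"
    "This is a test input.\n"
    "\n"
    "### Response:\n"
)
print(prompt)
```

Note that when a prompt like this is typed into an interactive shell or the gemma.cpp prompt, a literal backslash-n is not automatically turned into a newline character.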

but the model doesn't output anything.

Is there something wrong with the procedure?

(Screenshots attached: gemma-error, hungon)
