It would be awesome if the repo supported quantization methods. Reference: [k-quants](https://github.com/ggerganov/llama.cpp/pull/1684)