RAM constraints make running LLMs on phones difficult, as do the more restrictive quantization schemes that NPUs require. 1B-8B LLMs are shockingly good when backed with RAG, but still kind of limited.
It seemed like BitNet would solve all that, but unfortunately the big model trainers have ignored it, or at least haven't told anyone about their experiments with it.
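To put rough numbers on the RAM point, here is a back-of-the-envelope sketch of weight memory at a few bit widths. It counts weights only (no KV cache, activations, or runtime overhead) and assumes ideal packing, so real footprints will be higher. The function name and the list of formats are just illustrative; the ternary figure assumes about log2(3) ≈ 1.58 bits per weight, as in BitNet b1.58.

```python
import math

def weight_memory_gib(params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB (ideal packing assumed)."""
    return params * bits_per_weight / 8 / 2**30

# Assumption: ternary weights need about log2(3) bits each when packed ideally.
TERNARY_BITS = math.log2(3)

if __name__ == "__main__":
    for params in (1e9, 8e9):
        for label, bits in (("fp16", 16), ("int4", 4), ("ternary", TERNARY_BITS)):
            gib = weight_memory_gib(params, bits)
            print(f"{params / 1e9:.0f}B {label}: {gib:.2f} GiB")
```

By this estimate an 8B model needs roughly 15 GiB for weights at fp16 and a bit under 4 GiB at 4 bits, which is why even the quantized version is a tight fit on most phones.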