• brucethemoose@lemmy.world
    3 months ago

    RAM constraints make running models on phones difficult, as do the more restricted quantization schemes NPUs require. 1B-8B LLMs are shockingly good when backed with RAG, but still kind of limited.

    It seemed like BitNet would solve all that, but unfortunately the big model trainers have ignored it, or at least haven't told anyone about their experiments with it.
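
    To put rough numbers on the RAM point, here is a back-of-the-envelope sketch of weight memory at a few bit widths. It counts weights only and ignores KV cache, activations, runtime overhead, and the scale metadata real quantization formats carry. The ~1.58 bits per weight figure assumes ternary weights (log2(3)), as in the BitNet b1.58 description; the function and format names are made up for illustration.

```python
import math

# Bits per weight for a few storage formats (illustrative, not exhaustive).
# The ternary entry assumes ideal packing of {-1, 0, +1} values: log2(3) bits.
FORMATS = {
    "fp16": 16.0,
    "4-bit": 4.0,
    "ternary (~1.58-bit)": math.log2(3),
}


def weight_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB."""
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 2**30


if __name__ == "__main__":
    # Model sizes in billions of parameters, spanning the 1B-8B range.
    for size in (1, 3, 8):
        row = ", ".join(
            f"{name}: {weight_gib(size, bits):.2f} GiB"
            for name, bits in FORMATS.items()
        )
        print(f"{size}B -> {row}")
```

    Even as an idealized estimate, it shows why an 8B model at fp16 is out of reach for most phones while a ternary one would plausibly fit next to everything else in memory.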