Took me some time to figure this one out, and unfortunately it requires a significantly larger image (it needs so much more of NVIDIA’s toolkit D: I couldn’t figure out a way to get around it…)
If people prefer a smaller image, I can start maintaining one with exllama and one without, but for now 1.0 is identical minus exllama support (and I guess it’s also from an older commit), so you can use that one until there’s actual new functionality :)
Thank you so much! I’d be happy to test it out.