RandomlyRight@sh.itjust.works (OP) to Selfhosted@lemmy.world • Faster Ollama alternative (English)
Yeah, but there are many open issues on GitHub about these settings not working right. I’m using the API and just couldn’t get it to work: a request to generate a JSON file never produced one longer than about 500 lines. With the same model on vLLM, it worked instantly and generated about 2000 lines.
It was multiple models, mainly in the 32–70B range.
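For context, this is roughly the request I was sending. `num_ctx` and `num_predict` are real Ollama API options (with `num_predict: -1` meant to remove the token cap), but per the GitHub issues above, they didn’t reliably take effect for me over the API. The model name here is just a placeholder for the 32–70B models I tried:

```python
import json

# Sketch of an Ollama /api/generate request body that should lift the
# output-length cap. POST this to http://localhost:11434/api/generate.
payload = {
    "model": "some-32b-model",  # placeholder; I tested several 32-70B models
    "prompt": "Generate the JSON file described above.",
    "format": "json",   # ask Ollama to constrain output to valid JSON
    "stream": False,
    "options": {
        "num_ctx": 16384,    # context window size
        "num_predict": -1,   # -1 = no limit on generated tokens
    },
}

body = json.dumps(payload)
```

In my case, even with options like these, the generation still stopped around 500 lines; the same prompt against vLLM’s OpenAI-compatible endpoint ran to completion.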