Well, my experience with the traditional models wasn't very pleasant, but it's not something that really worries me. What I want to know is: is there a particular way I should tell Llama what to do? Apparently there's a specific way to write the prompt that makes it work, and I imagine you've already been through this.
Sadly, in my brief testing I only checked that the model produced a response, not whether it actually wrote any code. My PC can run the models, but not very quickly.
I then tried llama3.2, and that worked better, though not every time. It does have the capacity for tool calls.
The issue you're likely having, and what you're referring to when you ask whether there's "a specific way to write the prompt that makes it work", is that the AI model needs to be capable of making tool calls. To my limited understanding, this is the model's ability to know which actions it can and can't take, by producing a structured output that invokes a tool: for example, reading the document, or stating what it wants to write to the document.
What I found online is that local models are not great at using tool calls reliably. For instance, when I asked for a heading to be added to the page, it called the image generator tool instead of creating the code for the heading and adding it to the page.
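For anyone curious what this looks like under the hood, here's a minimal sketch of a tool definition in the shape Ollama's `/api/chat` endpoint accepts. The tool itself (`add_element` and its parameters) is hypothetical, purely for illustration; only the surrounding structure follows Ollama's documented `tools` format:

```python
import json

# Hypothetical tool the model could call to add an element to the page.
# The "type"/"function"/"parameters" nesting is Ollama's tools schema.
add_element_tool = {
    "type": "function",
    "function": {
        "name": "add_element",  # hypothetical tool name
        "description": "Add an HTML element to the current page",
        "parameters": {
            "type": "object",
            "properties": {
                "tag": {"type": "string", "description": "HTML tag, e.g. 'h1'"},
                "text": {"type": "string", "description": "Text content of the element"},
            },
            "required": ["tag", "text"],
        },
    },
}

# The request body you would POST to http://localhost:11434/api/chat
request_body = {
    "model": "llama3.2",
    "messages": [{"role": "user", "content": "Add a heading that says 'Welcome'"}],
    "tools": [add_element_tool],
    "stream": False,
}

print(json.dumps(request_body, indent=2))
```

A tool-capable model is supposed to answer with a structured call to one of these tools rather than free text; a model that wasn't trained for this will just ignore the list or pick the wrong tool, which matches the behaviour described above.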
I tried using OpenAI’s ChatGPT Codex with their API and still found the AI agent very slow.
I'm still very new to local AI models, so unfortunately I'm not best placed to give advice.
I've had several local models running on an old i5 laptop with 4GB of RAM (and no GPU) that acts as a server in my house running Linux. The models on that machine run fairly well, but are all obviously too small to be anything more than toys.
Poking about with Ollama on Windows on my other laptop, an i7 with 16GB of RAM and a GTX 1080, felt much slower and didn't utilise the GPU. Part of this, I'm fairly sure, is that my GPU is too old and running on decade-old drivers; partly, the Ollama Windows app seems to be CPU-only.
What hardware are you running? I can give it my best shot at finding a model that could work better. I'd need the GPU make and model, its VRAM, and how much RAM your system has.
There's thankfully no shortage of models that can be run locally, but if you want something on par with ChatGPT in terms of utility, you're looking at 8+ CPU cores, 24GB of VRAM, and 64GB of RAM, on a system running either Linux or Windows with the Windows Subsystem for Linux installed.
It runs well, although I thought it would be slow. The only problem is that the prompt for tool calling has certain requirements that make it quite tedious; by the way, it can also create images. The problem is that they started using the ChatGPT API, but I think it's time they created their own servers and agents, because yesterday, when I tried to create an article titled ELON MUSK'S DEATH TRAP: HOW X BECAME A SNIPER'S CROSSHAIRS AND THE WORLD IS PULLING THE TRIGGER, it simply refused to create it. Gemini did the same thing to me, which shows that these AIs are directly or indirectly deciding what content to create and what not to. That's why I think they should work on an optional, secure AI that works like the previous version: one that didn't think, it just did what you told it to, and this is very important. You want to stay in control of whatever you're doing, because AI is a phenomenon that diverts attention from the central point.
Just tried the new AI feature against llama3.1:8b via Ollama on a PC with an RTX 2070 Super. I think it's a good feature, but not ready for real use at this stage. The model simply doesn't know what tools it could use, or even that there are underlying commands available to it in BSS. I asked a cloud AI (Copilot) if it knew the tool names for BSS, and it gave me things like readFiles and addElement, but I don't know if they're real, and I didn't succeed in using them with the Ollama model, even when indicating them in a prior system prompt to set the context. So, thanks for the intention, and awaiting the next step!
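One thing worth noting: with Ollama, listing tool names in the system prompt generally isn't enough, because tool calling works through the API's `tools` field, and a capable model replies with a structured `tool_calls` list rather than plain text. Here's a small sketch of reading that list out of a response. The tool name `readFiles` is just the (unverified) name Copilot suggested, used here for illustration, and the response is a hand-built stand-in shaped like Ollama's documented output:

```python
def extract_tool_calls(response: dict) -> list:
    """Pull any tool calls out of an Ollama /api/chat response body."""
    return response.get("message", {}).get("tool_calls", [])

# Hypothetical response, shaped like Ollama's documented tool-call output.
fake_response = {
    "model": "llama3.1:8b",
    "message": {
        "role": "assistant",
        "tool_calls": [
            {"function": {"name": "readFiles", "arguments": {"path": "index.html"}}}
        ],
    },
}

calls = extract_tool_calls(fake_response)
print(calls[0]["function"]["name"])  # readFiles
```

If the app only injects tool descriptions as text, a model that wasn't fine-tuned for tool use has no reliable way to produce this structured output, which would explain the behaviour I was seeing.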