Chris Eichler, AI-first Product & Marketing

AI Hardware Guide

That's the uncomfortable truth no hardware affiliate post will tell you: for local AI the bottleneck isn't the fastest component, it's the smallest one. A model that doesn't fit in memory doesn't run faster because your GPU was expensive. It doesn't run at all — or it spills to SSD and crawls.

So before you drop anything into your cart: understand what you actually want to do.


The two questions that decide everything

There is no "best AI hardware". Only hardware that fits what you do with it. And that splits into two camps:

Doing LLMs? Chat, coding assistant, RAG over your own documents. Then RAM (or unified memory) and memory bandwidth are what count. Apple Silicon is surprisingly strong here. A Mac with plenty of unified memory beats PCs that look more brutal on paper, at least for pure LLM inference.

Doing image and video? Stable Diffusion, ComfyUI, text-to-video. Then you need GPU VRAM, and it has to be Nvidia. CUDA is the de facto standard, and right now there is no way around it. Apple can do it too — slower, and with more patience.

If you want both, buy a PC with a fat Nvidia card and take the LLM performance as a bonus. Or accept that no single device is world-class in both disciplines. That's not a sales pitch. That's physics.

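A rule of thumb that beats any benchmark: read the parameter count in billions as gigabytes (roughly what the weights take at 8-bit quantisation), and that figure should occupy roughly 50–75 % of your available memory for things to run smoothly. So a 13B model realistically wants about 17–26 GB; on 16 GB it only fits comfortably with more aggressive quantisation. Do the math before you order.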
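
The rule of thumb can be written down as a small calculation. This is a minimal sketch, assuming the model needs about as many GB as it has billions of parameters (roughly one byte per parameter); the function name and the default shares are illustrative choices, not a standard formula.

```python
def memory_range_gb(params_billion, low_share=0.50, high_share=0.75):
    """Return (min_gb, max_gb) of total memory for a model of the given size.

    Assumption: the model needs about as many GB as it has billions of
    parameters (roughly one byte per parameter, i.e. 8-bit weights).
    """
    model_gb = float(params_billion)
    # The model should fill between low_share and high_share of memory,
    # so the smallest sensible machine is model_gb / high_share.
    return model_gb / high_share, model_gb / low_share


for size in (7, 13, 34, 70):
    smallest, largest = memory_range_gb(size)
    print(f"{size}B model: about {smallest:.0f} to {largest:.0f} GB of memory")
```

For a 13B model this works out to roughly 17 to 26 GB. Stronger quantisation shrinks the model and shifts the whole range down.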


Mac: the LLM and workflow champion

Entry — Mac mini M4 / MacBook Air M4

  • RAM: 16 GB
  • SSD: 512 GB

This is the honest entry point, not the toy.

7–8B models like Mistral 7B or Llama-2 7B run cleanly through Ollama or LM Studio — even as a local AI server on your LAN. Meaning: chat, summarising and translating documents, note and email analysis, light RAG workloads with tens of thousands of documents. Image AI works too, but at low resolution and with patience — the Metal backend is solid, but it's no CUDA.
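
To make the "local AI server on your LAN" idea concrete, here is a minimal sketch of a client that talks to Ollama's HTTP API from another machine. It uses Ollama's `/api/generate` endpoint on the default port 11434; the host name `mac-mini.local` and the model name `mistral` are placeholders, so substitute whatever your own setup uses.

```python
import json
import urllib.request


def build_generate_request(host, model, prompt, port=11434):
    """Build a POST request for Ollama's /api/generate endpoint."""
    # stream=False asks for one JSON object instead of a token stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"http://{host}:{port}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(host, model, prompt, timeout=120):
    """Send the prompt to the server and return the generated text."""
    request = build_generate_request(host, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


# Example call (placeholder host and model, needs a running server):
# print(ask("mac-mini.local", "mistral", "Summarise this note in one sentence: ..."))
```

Ollama listens only on localhost by default, so reaching it from other devices usually means setting the `OLLAMA_HOST` environment variable on the server; check the Ollama documentation for your version before relying on this.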

The Air is portable but thermally a bit more limited. If you run sustained load, the mini is the quieter call.

Sweet spot — Mac Studio M4 Max

  • RAM: 32 GB
  • SSD: 1–2 TB

This is where most people buy right — and it's where you should land if you want to work locally for real.

12–20B models such as Qwen 14B run well, and so does a quantised Mixtral 8x7B, which activates only around 13B of its roughly 47B parameters per token, though all of them still have to fit in memory. 20–30B has noticeable latency but is usable. That is enough for a serious local coding assistant, for RAG over large private knowledge bases with hundreds of thousands of tokens active in context, and even for several parallel services on one machine: an LLM server, local embeddings and TTS on the same network.

If you take only one recommendation from this whole post: for local LLM work, the 32 GB Mac is the best price-to-performance ratio on the market.

Pro — Mac Studio M3 Ultra

  • RAM: 64–128 GB
  • SSD: 2–4 TB

The AI studio. One note before you put money down: if you don't need it right now, wait until October — the M5 drops then. At this price tier, a quarter of patience pays off.

30–70B models become viable here, and according to field reports individual 100B models are doable from ~80 GB RAM upwards: heavily quantised, but doable. That opens up heavyweight RAG with large context and many parallel sessions, a code assistant for the whole team on the LAN including refactoring large repos, and complex agent setups with multiple models running in parallel (planner, tool-user, critic).
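
Why "heavily quantised" matters becomes clear once you estimate the weight size at different bit widths. The following is a rough sketch, not a precise sizing tool: it counts only the weights, and the 25 % headroom for the OS, context cache and other apps is an assumption chosen for illustration.

```python
def weights_gb(params_billion, bits_per_weight):
    """Approximate size of the model weights in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion, bits_per_weight, memory_gb, headroom=0.25):
    """True if the weights fit while leaving `headroom` of memory free.

    The 25 % headroom is an illustrative assumption, not a measured value.
    """
    return weights_gb(params_billion, bits_per_weight) <= memory_gb * (1 - headroom)


for params in (30, 70, 100):
    for bits in (16, 8, 4):
        size = weights_gb(params, bits)
        verdict = "fits" if fits(params, bits, memory_gb=128) else "too big"
        print(f"{params}B at {bits}-bit: ~{size:.0f} GB, {verdict} in 128 GB")
```

Under these assumptions a 100B model at 4-bit needs about 50 GB for its weights, which is consistent with the ~80 GB starting point mentioned above, while the same model at 8-bit would not fit.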

The honest limit: for the fastest image and video models, a strong Nvidia GPU stays ahead. Macs shine at LLM inference and workflow automation. If your day is video rendering, the expensive Mac is the wrong tool.


PC: the route when image and video matter

Entry — the honest starter

  • CPU: Ryzen 7 5700X or comparable Intel i5/i7
  • GPU: RTX 3060 12 GB VRAM or RTX 4060 8 GB
  • RAM: 32 GB
  • SSD: 1 TB NVMe

A tip that saves you frustration: take the 3060 with 12 GB, not the newer 4060 with 8 GB. For AI, more VRAM beats newer architecture almost every time. The 4060 looks more modern on paper. In memory-bound AI day-to-day, it's the worse deal.

7–13B models (quantised if needed) run in VRAM or split across GPU and RAM, and Stable Diffusion at 512×512 is very usable, including your first more complex ComfyUI pipelines. That gives you a local chat and coding assistant plus solid image generation for marketing assets, mockups and blog graphics. It is also the entry point to small fine-tunes and LoRAs for your own image styles.
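
Running a model "split across GPU and RAM" usually means placing as many layers as possible in VRAM and leaving the rest on the CPU; tools such as llama.cpp expose this as a number of GPU layers. The sketch below estimates that split under two simplifying assumptions: all layers are the same size, and 2 GB of VRAM is reserved for the context cache and the desktop. Both numbers are illustrative.

```python
import math


def gpu_layer_split(model_gb, n_layers, vram_gb, reserve_gb=2.0):
    """Return (layers_on_gpu, layers_on_cpu) for a simple offload plan.

    Assumptions: equally sized layers and a fixed VRAM reserve.
    """
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0)
    on_gpu = min(n_layers, math.floor(usable_gb / per_layer_gb))
    return on_gpu, n_layers - on_gpu


# A 13B model at 8-bit (about 13 GB, 40 layers) on the two entry cards:
for card, vram in (("RTX 3060 12 GB", 12), ("RTX 4060 8 GB", 8)):
    gpu, cpu = gpu_layer_split(model_gb=13, n_layers=40, vram_gb=vram)
    print(f"{card}: {gpu} layers on the GPU, {cpu} on the CPU")
```

The more layers end up on the CPU side, the slower generation gets, which is the practical reason the 12 GB card is the better deal here.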

Advanced — creator machine

  • CPU: Intel i7-14700K or current Ryzen 7/9
  • GPU: RTX 4070 Ti or 4080, 12–16 GB VRAM
  • RAM: 64 GB
  • SSD: 2 TB NVMe (better 2× 2 TB, OS and data separate)

13–34B models run well, plus several parallel instances of smaller models. 1024×1024 image generation is smooth, complex ComfyUI graphs with ControlNet, inpainting and upscaling work, and first local text-to-video experiments are possible: not in real time, but doable. This is the machine for real content production: thumbnails, product images, simple clips. It also suits in-house tools like an image asset generator for marketing plus an LLM assistant with RAG.

Pro — workstation / AI rig

  • CPU: Ryzen 9 9950X or Threadripper, alternatively a current Intel i9
  • GPU: RTX 4090 24 GB VRAM or RTX 5090 32 GB (for serious training: two cards)
  • RAM: 128 GB
  • SSD: 4 TB NVMe (better 2× 2 TB, OS and data separate)

This is the machine where the PC overtakes the Mac. 24–32 GB VRAM lifts image and video to another level: high-res ComfyUI pipelines, video generation without constant offloading, and above all the thing Macs aren't great at, which is training your own models and LoRAs locally instead of just running them. LLMs up to 70B run quantised across GPU+RAM.

Honestly: from here on you're paying for VRAM, not cleverness. A second 4090 helps training more than any CPU upgrade. And if you only care about LLMs and skip image/video — the Mac Studio at this price tier is the quieter, more power-efficient call. The rig pays off when pixels and frames are your day job.


Special case: NVIDIA DGX Spark

The DGX Spark (formerly "Project DIGITS") doesn't fit either box — it's its own category above the pro tier. Nvidia's answer to the Mac Studio: a desktop cube built for running large models locally.

  • Chip: GB10 Grace-Blackwell superchip
  • Memory: 128 GB unified memory
  • Models: LLMs up to ~200B parameters; two units linked push it further
  • OS: DGX OS (Linux/ARM), not Windows
  • Price: ~€3,000–4,000

Conceptually it's a Mac Studio with the Nvidia software stack: lots of coherent memory, small form factor, low power draw, but CUDA ecosystem instead of Metal.

Now the honest part the keynote leaves out: the Spark holds huge models, but it isn't fast. Memory bandwidth lands around 273 GB/s, well below the GDDR bandwidth of a 4090 (about 1 TB/s) or a 5090 (about 1.8 TB/s). Concretely: large LLMs fit, but token generation crawls compared to a beefy GPU. For image and video speed, a 5090 beats it clearly. Its superpower is capacity at low power, not speed.
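
The bandwidth point can be turned into a rough ceiling on generation speed. This sketch assumes a dense model where producing one token reads every weight once, so throughput is capped at bandwidth divided by model size. Real-world speeds are lower, and mixture-of-experts models behave differently. The 273 GB/s figure comes from the text; the two GPU values are published spec-sheet numbers added here for comparison.

```python
def max_tokens_per_second(bandwidth_gb_s, weights_gb):
    """Upper bound on decode speed for a dense, memory-bound model."""
    return bandwidth_gb_s / weights_gb


# Memory bandwidth in GB/s (GPU values are spec-sheet figures, not from the post).
devices = {"DGX Spark": 273, "RTX 4090": 1008, "RTX 5090": 1792}

# A 34B model at 4-bit is about 17 GB and fits on all three devices.
for name, bandwidth in devices.items():
    ceiling = max_tokens_per_second(bandwidth, weights_gb=17)
    print(f"{name}: at most ~{ceiling:.0f} tokens/s for a 17 GB model")

# A 70B model at 4-bit (about 35 GB) only fits on the Spark.
print(f"DGX Spark, 35 GB model: at most ~{max_tokens_per_second(273, 35):.1f} tokens/s")
```

The ratio between the devices is the useful part: the same model has a ceiling several times higher on the GPUs, but only as long as it fits in their VRAM.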

Buying logic: the Spark is worth it if you want to run very large models locally and live with moderate speed — prototyping, inference, developing against a 70–200B model without the cloud. If you want image/video or fast training, take the 4090/5090 rig. If you want the Spark's profile but in the Apple ecosystem, the Mac Studio is the alternative.
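
The buying logic above can be sketched as a handful of plain rules. This is only an illustration of the paragraph, not a complete buying guide: the function and parameter names are made up, the 70B threshold is taken from the 70–200B range in the text, and the fallback for smaller models is an assumption.

```python
def pick_machine(wants_image_video, wants_fast_training, largest_model_b, wants_apple):
    """Return a coarse recommendation following the buying logic above."""
    if wants_image_video or wants_fast_training:
        return "RTX 4090/5090 rig"
    if largest_model_b >= 70:  # the 70-200B range named in the text
        return "Mac Studio" if wants_apple else "DGX Spark"
    # Fallback for smaller models is an assumption, not stated in the text.
    return "a smaller tier is enough"


print(pick_machine(False, False, largest_model_b=120, wants_apple=False))
print(pick_machine(True, False, largest_model_b=120, wants_apple=False))
```

Note the order of the checks: image, video and training needs are decided first, because no amount of memory capacity makes up for missing GPU speed there.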

The bigger siblings — DGX Station (up to 784 GB) and the DGX servers with 8 GPUs for €300k+ — are datacentre hardware. Different budget, different topic, not "local" in the sense of this post.


What you should actually take away

Local AI isn't a hardware arms race. It's a fit question.

Don't buy the most expensive box. Buy the one whose memory matches the models you actually want to run. LLMs on Apple Silicon, image and video on Nvidia. Don't overspend for a "maybe later" — models and tools get more efficient faster than your hardware ages.

And the most important part: local doesn't automatically mean better. It means private, controllable and independent from API costs. When that pays off and when an API call is the more honest answer — that's the next question. We'll handle it elsewhere.
