TL;DR
An experiment shows that it is possible to run reasonably capable local AI models on a Mac M4 with 24GB RAM. While these models are not state-of-the-art, they can perform basic tasks, research, and planning without internet access, reducing dependence on large cloud services.
The experiment shows that a Mac M4 with 24GB of memory can handle basic AI tasks entirely offline. This matters because it offers a way to reduce reliance on cloud-based AI services while keeping data and workflows under the user's control.
The user experimented with several local models, ultimately settling on Qwen 3.5 9B at Q4 quantization, served through LM Studio at roughly 40 tokens per second with a 128K context window. The model handles research, basic coding, and planning tasks, but cannot match state-of-the-art (SOTA) models on complex problem-solving.
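A back-of-envelope memory estimate helps explain why a setup like this fits in 24GB, and where the pressure comes from. The parameter count, quantization overhead, and layer dimensions below are illustrative assumptions, not the model's published spec:

```python
# Rough memory budget for a quantized local model (all figures assumed).
PARAMS = 9e9
BITS_PER_WEIGHT = 4.5            # Q4 weights plus per-block scales
weights_gb = PARAMS * BITS_PER_WEIGHT / 8 / 1e9

# KV cache per token with grouped-query attention, fp16 cache:
# 2 (K and V) * layers * kv_heads * head_dim * 2 bytes.
# Layer/head figures are illustrative, not the model's actual spec.
N_LAYERS, N_KV_HEADS, HEAD_DIM = 40, 8, 128
kv_bytes_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * 2
kv_gb_128k = kv_bytes_per_token * 128_000 / 1e9

print(f"weights ≈ {weights_gb:.1f} GB")
print(f"fp16 KV cache at 128K context ≈ {kv_gb_128k:.1f} GB")
```

Under these assumptions the weights are modest (around 5 GB), but an fp16 KV cache for a full 128K context would by itself overwhelm 24GB, which is why local runtimes rely on KV-cache quantization or shorter effective contexts to stay within budget.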
Configuring the setup involved selecting appropriate models, adjusting parameters such as temperature and context length, and enabling features like ‘thinking mode.’ The user reported that while the model can perform useful functions, it sometimes misinterprets prompts or gets stuck in loops, reflecting its limitations compared to larger models.
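As a concrete illustration of the knobs involved, here is a minimal request body in the OpenAI-compatible format that LM Studio's local server accepts. The model identifier and parameter values are assumptions for illustration, not the experimenter's exact settings:

```python
import json

# Illustrative chat request for an OpenAI-compatible local server
# (LM Studio serves one at http://localhost:1234/v1 by default).
request_body = {
    "model": "qwen-9b-q4",   # use whatever identifier the server reports
    "messages": [
        {"role": "system", "content": "You are a concise research assistant."},
        {"role": "user", "content": "Summarize the trade-offs of Q4 quantization."},
    ],
    "temperature": 0.7,      # lower values can reduce rambling and loops
    "max_tokens": 1024,      # cap the reply; context length is set server-side
    "stream": False,
}

print(json.dumps(request_body, indent=2))
```

Context length and features such as thinking mode are typically configured in the server or model settings rather than per request, so tuning is split between the request body above and the runtime's own configuration.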
Why It Matters
This development is significant because it shows that capable local AI models can run on consumer-grade hardware. Local inference offers an alternative to cloud-based AI: it improves privacy, cuts recurring costs, and works offline. While these models are less powerful than SOTA solutions, they can still support productivity tasks and basic research, which could benefit individual users and small teams.

Background
Recent years have seen rapid growth in AI model sizes and cloud-based deployment, making advanced AI accessible primarily through large providers. Local deployment has remained challenging due to hardware limitations. However, recent experiments, including this one, show that with optimized models and configurations, it is possible to run useful AI models on consumer hardware like the Mac M4 with 24GB RAM. Prior efforts focused on smaller models or required more powerful hardware, but this development broadens the scope for local AI use.
“It is surprisingly feasible to run a model like Qwen 3.5 9B on a Mac M4 with 24GB RAM, achieving basic functionality for research and coding tasks.”
— the experimenter
“While these models can’t replace SOTA solutions for complex, long-term problem solving, their ability to run locally opens new possibilities for privacy-focused, offline AI use.”
— AI researcher

What Remains Unclear
It remains unclear how well these models will perform across different hardware configurations or with more complex tasks. The long-term stability and scalability of such setups are still being tested, and user experiences may vary depending on specific model choices and tuning.

What’s Next
Next steps include refining configurations for better stability and performance, exploring additional models, and testing in real-world scenarios. Further development may focus on optimizing models for even lower resource consumption and broader application use cases.
Key Questions
Can I run these models on my own Mac with 24GB RAM?
Yes, based on recent experiments, it is possible to set up and run models like Qwen 3.5 9B on a Mac M4 with 24GB RAM, though performance may vary depending on configuration and use case.
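For a quick sanity check before downloading anything, the feasibility question mostly reduces to whether the quantized weights plus operating-system headroom fit in RAM. The helper below is hypothetical, not from the article, and deliberately ignores the KV cache, which grows with context length:

```python
def fits_in_ram(params_b: float, bits_per_weight: float, ram_gb: float,
                headroom_gb: float = 8.0) -> bool:
    """Rough feasibility check: quantized weights plus OS/app headroom.

    Hypothetical helper; ignores the KV cache, which grows with context.
    """
    weights_gb = params_b * bits_per_weight / 8
    return weights_gb + headroom_gb <= ram_gb

# A 9B model at ~4.5 effective bits per weight on a 24GB machine:
print(fits_in_ram(9, 4.5, 24))   # weights ≈ 5.1 GB → True
```

By the same estimate, a 70B model at Q4 (roughly 39 GB of weights) would not fit, which matches the general experience that 24GB machines top out around the 9B–14B range for comfortable use.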
How does the performance of these local models compare to cloud-based SOTA models?
Local models like Qwen 3.5 9B are less capable in solving complex, long-term problems and may get distracted or stuck. They are suitable for basic tasks, research, and coding but do not match the depth and reliability of SOTA cloud models.
What are the main challenges in setting up local models on a Mac M4?
Challenges include selecting compatible models, fine-tuning configuration parameters, enabling features like ‘thinking mode,’ and managing performance limitations. It requires technical knowledge and patience to optimize the setup.
Will running local models become easier or more powerful in the future?
It is likely that ongoing optimizations, new model architectures, and software improvements will make local AI deployment more accessible and capable over time, but current setups still require technical effort.