If you want the deeper details, we've got an FAQ entry here: privatellm.app/en/faq#Run-i...
Just to confirm, is it on iOS or Mac?
On iOS, it's indeed a bug. We need to remove the "Show When Run" toggle from the shortcut action, because Apple doesn't allow background GPU execution on iOS.
Yes. LLM inference is memory-bound: memory capacity and memory bandwidth matter. For Macs, 64GB is a great sweet spot: you can run Llama 3.3 70B locally with GPT-4o-level reasoning. Rule of thumb: run the largest model your Mac can fit.
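To put rough numbers on that rule of thumb, here's a minimal sketch (our own illustration, not the app's internals) estimating weight memory as parameters × bits per weight, ignoring KV cache and runtime overhead:

```python
def model_ram_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough RAM footprint of the model weights alone.

    Ignores KV cache, activations, and runtime overhead, so treat
    the result as a lower bound on what the device needs.
    """
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # decimal GB

# Llama 3.3 70B at 4 bits per weight needs roughly 35 GB for weights,
# which is why it fits comfortably on a 64GB Apple Silicon Mac.
print(round(model_ram_gb(70, 4)))  # 35
```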
Thanks for the shout-out!
Glad you're enjoying Private LLM. The boost you're seeing is because we're not an MLX/llama.cpp wrapper like LM Studio or Ollama (slowllama?).
We quantize each model (OmniQuant/GPTQ) for Apple Silicon, so even low-RAM iPhones and Macs run fast and reason better.
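To illustrate the basic idea these methods build on (this is a toy sketch of group-wise absmax quantization, not the actual OmniQuant/GPTQ pipeline, which learns better parameters per layer):

```python
def quantize_group(weights, bits=4):
    """Absmax-quantize one group of weights to signed integer codes.

    Returns (codes, scale); reconstruct each weight as code * scale.
    """
    qmax = 2 ** (bits - 1) - 1                    # 7 for 4-bit signed
    scale = max(abs(w) for w in weights) / qmax or 1.0  # avoid 0 for all-zero groups
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize_group(codes, scale):
    return [c * scale for c in codes]

w = [0.12, -0.7, 0.33, 0.05]
codes, scale = quantize_group(w)
w_hat = dequantize_group(codes, scale)
# w_hat approximates w; per-weight error is at most scale / 2,
# and real methods shrink it further by optimizing the scales.
```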
Just learnt that Private LLM has been cited in an AI safety paper, because we let users download and use lots of uncensored models.
We are delighted to hear that. Please let us know if there's any particular model you'd like to see in the app.
We just shipped an update. More coming soon
OpenHands LM - coding-focused language model based on Qwen 2.5 Coder:
* 7B (iOS + macOS): 8GB RAM or more
* 32B (macOS only): 32GB RAM minimum
Handles bug fixing and code refactoring tasks. Trained on real GitHub issues via reinforcement learning.
Meta-Llama 3.1 8B SurviveV3 (3-bit iOS / 4-bit macOS)
Wilderness survival assistant, offline. Knows how to build shelters, find water, navigate terrain, etc.
Runs on any iOS/Mac device with 8GB+ RAM β even off-grid.
Llama 3.1 8B UltraMedical (3-bit iOS / 4-bit macOS)
Biomedical assistant for med students, researchers, and clinicians.
Answers board-exam style questions, explains research findings, and supports clinical reasoning - privately.
Runs on 8GB+ RAM.
Perplexity's R1 1776 Distill Llama 70B
Post-trained to eliminate refusal behavior on politically sensitive topics - while preserving full reasoning ability.
Built to refuse censorship: open dialogue, independent thought, and the right to answer freely.
macOS only. Needs 48GB+ RAM.
Amoral-Gemma3-1B-v2 & gemma-3-1b-it-abliterated
Uncensored 4-bit OmniQuant quantized fine-tunes of Gemma 3 1B.
For users who want unrestricted conversations, roleplay, and truth-seeking without moral filters. Fast and small. iOS and macOS.
Gemma 3 1B IT (4-bit QAT)
Instruction-tuned.
Multilingual. Full 32K context on iPhones with ≥ 6GB RAM.
Ideal for writing, Q&A, summarization - in 140+ languages.
Small enough to run on any supported iOS or Mac device.
Private LLM v1.9.7 (iOS) and v1.9.9 (macOS) are out.
This update brings Gemma 3 1B to all devices - iPhone, iPad, Mac.
And Perplexity's R1 1776 Distill Llama 70B to beefy Macs for uncensored reasoning.
Plus new models for coding, survival, and biomedicine - all local, all private. 🧵
🛠️ We've fixed a pesky crash that was affecting some newer models on older versions of macOS like Sonoma.
Also, we've updated our lineup by adding support for both 3-bit and 4-bit OmniQuant quantized versions of the EVA LLaMA 3.33 70B v0.1 model by @Nottlespike. Note that we've deprecated the previous version, EVA LLaMA 3.33 70B v0.0.
For Apple Silicon Mac users with 64GB or more RAM, we still recommend using the 4-bit OmniQuant-quantized version of 70B models.
💪 Power users, rejoice! The 5 new 3-bit OmniQuant-quantized 70B models on Mac from Private LLM v1.9.8 are here. These models consume around 5GB less RAM than their 4-bit counterparts, making them ideal for Apple Silicon Macs with 48GB of RAM.
Now, with Private LLM, you can see the context length right in the model quick switcher! This little upgrade makes a big difference, helping you choose the perfect model for your conversation or task at a glance.
Unleash your creativity with the Gemma 2 iFable 9B model from iFable! This top-tier creative writing model works on iPad Pros with 16GB of RAM or any Apple Silicon Mac with 16GB+ RAM. No other local LLM app lets you run 9B or 14B models on iOS like Private LLM can.
- Dolphin 3.0 Llama 3.1 8B - For iOS devices with 8GB or more RAM, like the iPhone 15 Pro or newer
These are currently the best uncensored LLMs that can fit in your pocket, no holds barred!
- Dolphin 3.0 Llama 3.2 3B - For those with 6GB+ RAM on their iOS devices or any Apple Silicon Mac
- Dolphin 3.0 Qwen 2.5 0.5B, 1.5B, 3B - Compatible with nearly all modern iPhones (iPhone 12 or newer) and Macs
🐬 Say hello to the uncensored freedom of Dolphin 3.0 models! From Cognitive Computations, these models are your ticket to unfiltered AI conversations.
- Dolphin 3.0 Llama 3.2 1B - Perfect for iPhones/iPads with 4GB+ RAM or any Apple Silicon Mac
Private LLM v1.9.6 for iOS and v1.9.8 for macOS are here with 12 new models! From uncensored chats to creative writing, there's something for everyone. Let's dive in! 🧵
Thank you, @soldaini.net! And huge congratulations on launching Ai2 OLMoE - love what you're doing for local AI!
We're excited to join the BlueSky community!