Apple's Foundation Models framework is the quiet sleeper of iOS 26. It does not get the Liquid Glass billing, it does not get the keynote montage, and it does not require you to register with a cloud API. What it does do is let you ship genuinely useful AI features — summarization, categorization, structured extraction, natural-language commands — that run entirely on the user's device, for free, with no telemetry leaving the phone.

If you are an indie developer who watched the last two years of "AI apps" and quietly wondered how you were supposed to pay for tokens, this is the answer. This guide walks through what the framework is, when to use it versus a cloud model, how to wire it into a real app, how to handle the cases where it falls over, and what Apple's reviewers will look for when you submit.

what the foundation models framework actually is#

The framework ships a 3-billion-parameter language model that sits locally on Apple Silicon devices — iPhone 15 Pro and up for phones, M-series iPads, and every M-series Mac. You call it from Swift with a small API surface, you get back structured responses, and nothing leaves the device.

The practical contract: if the user is on a supported device with Apple Intelligence enabled, you get a local LLM you can call like any other Swift API. If they are not, your call fails with a specific error and you fall back. That fallback branch is the single most important part of the integration; we will come back to it.

Three things it is not. Not a chatbot SDK — no pre-built UI, no "Talk to Apple Intelligence" button. Not a frontier model — it will not beat GPT-5 on reasoning. And not a general-purpose agent — it does not call tools or browse; that is App Intents and Siri territory.

What it is good at: text you already have. Rewriting, classifying, pulling structured data out, generating short completions. That is a huge surface area for real apps.

when to use on-device versus cloud#

The decision is almost always about three things: privacy, cost, and quality.

Privacy is the easy case. If the input is sensitive — health, finance, personal journaling, private messages — on-device is the default. You do not want a privacy policy update that reads "we now send your users' diary entries to a third-party LLM." The Foundation Models framework makes this a non-issue: the data never leaves the device, so the framework does not have to appear in your privacy manifest as a tracking destination.

Cost is the next filter. If the call happens once per user session, cloud is fine. If it happens every time the user types, pastes, or scrolls, you are either eating margin or passing a subscription on to your users. On-device inference is free at runtime forever. That changes which features you can justify shipping.

Quality is the trap. The 3B on-device model is surprisingly good at constrained tasks (extracting a date from a string, picking a category, rewriting a sentence) and surprisingly bad at open-ended ones (writing a whole blog post from a prompt). If you ask it to do something it cannot do, your users will see garbage, and you will ship a bug report. Design the feature around what the model can do reliably.

the minimum viable integration#

The API is deliberately small. Here is a complete example that summarizes a piece of text into a single sentence. You need iOS 26, Xcode 26, and the FoundationModels framework linked.

swift

import FoundationModels

struct SummaryGenerator {
    func summarize(_ text: String) async throws -> String {
        let session = LanguageModelSession(
            instructions: "Summarize the input in exactly one sentence. No preamble."
        )
        let response = try await session.respond(to: text)
        return response.content
    }
}

That is the integration. Three lines of interesting code, one framework import.

Structured output is where the framework earns its keep. You rarely want freeform text — you want a typed value you can plug into the rest of your app. The @Generable macro makes the model return a Swift struct directly:

swift

@Generable
struct ExtractedTask {
    @Guide(description: "A short action-oriented title.")
    let title: String
    @Guide(description: "Due date if present, otherwise nil.")
    let dueDate: Date?
    @Guide(description: "Priority: low, medium, or high.")
    let priority: Priority
}

let session = LanguageModelSession(instructions: "Extract a task from the user's note.")
let task = try await session.respond(to: userNote, generating: ExtractedTask.self)

The framework does the prompt engineering, the JSON parsing, and the type coercion. You get an ExtractedTask back. This is the pattern you will use for 80% of real integrations.

handling the fallback case, which you absolutely will hit#

Here is the case you cannot skip: the user is on an iPhone 14, or an iPhone 15 that has Apple Intelligence disabled, or an iPad in a region where Apple Intelligence is not yet available. Your API call will throw. Your feature needs a plan.

swift

func summarize(_ text: String) async -> String {
    do {
        let session = LanguageModelSession(
            instructions: "Summarize in one sentence."
        )
        return try await session.respond(to: text).content
    } catch LanguageModelSession.Error.unavailable {
        return heuristicSummary(text)  // first sentence, truncated to 120 chars
    } catch {
        return heuristicSummary(text)
    }
}

Three strategies for the fallback, in order of preference:

First, a heuristic. For a lot of tasks (first-sentence summaries, simple keyword extraction, basic category guessing), a dumb rule performs surprisingly well and costs nothing. This is the right default for non-critical features.

Second, degrade the feature. If the user does not have the model, just do not show the button. Graceful feature hiding is better than an error state, and "this works on newer iPhones" is a message users have internalized for a decade.

Third, cloud fallback. If the feature is load-bearing and the heuristic is bad, call out to a cloud model for the minority of users who need it. This is the most expensive path; treat it as the last resort, not the first.

prompt design that actually works on a 3b model#

The temptation with any LLM is to dump a wall of instructions and hope. On a 3B on-device model, that does not work. The prompts that perform well are short, constrained, and explicit about output shape.

Be specific about output shape. "Summarize in exactly one sentence" beats "summarize." "Return a JSON list of three tags, lowercase, single words" beats "give me tags."

Anchor with examples when the task is subjective. Two or three worked examples inline are worth more than a paragraph of description. The @Generable macro covers most of this for you.

Keep context small. The model's context is finite and degrades with length. Chunk long documents, summarize the chunks, summarize the summaries. Do not paste 50KB and hope.

Constrain with types. If you want a priority, define an enum. If you want a score, make it an Int in 1…5. The type system is a free prompt.

the privacy story, told correctly#

This is the part that matters for submission. Because the model runs on-device, the user's input and the model's output never leave the phone. That is a factual claim you can make to users, and you should make it prominently in any feature description. "This runs on your iPhone. Nothing is sent to our servers or to Apple" is the correct copy.

Two caveats. If the user explicitly invokes a Private Cloud Compute fallback (currently triggered for the larger server-side model, not Foundation Models itself), then Apple's privacy guarantees kick in but data has technically left the device. That is worth disclosing. And if you combine Foundation Models output with a cloud call downstream — for example, you extract an email address on-device and then send the email via your own backend — the email's journey to your server is still subject to normal privacy-manifest rules.

Your privacy manifest does not need to list Foundation Models as a data collection mechanism. It is a local API. But it should mention Apple Intelligence features in your app description if you are marketing the feature, because users now search for apps that have it.

submission, review, and the things that trip people up#

Three things the reviewer will look at that are specific to Apple Intelligence features.

First, device gating. If your marketing copy says "AI-powered summaries," your app must actually ship the feature on devices that support it and degrade gracefully elsewhere. Reviewers will test on a non-Apple-Intelligence device and will reject if the button appears and does nothing.

Second, the "AI" claim in metadata. If you use the phrase "AI" or "Apple Intelligence" in your screenshots, description, or keywords, your binary must contain an import FoundationModels (or a documented alternative) and actually invoke it. Apple has gotten aggressive about AI-washing in 2026; "powered by AI" as flavor text will get flagged.

Third, content safety. The on-device model has built-in safety filters, but reviewers will still try to break your feature with adversarial input. If your feature extracts text from user content, test it with the nastiest strings you can think of and make sure nothing crashes or produces harmful output.

a practical rollout plan#

If you are adding Foundation Models to an existing app, do it in this order:

Pick one narrow feature. Not "make the app AI-powered" — a concrete transformation like "auto-categorize the note the user just saved." Narrow features succeed; broad features ship broken.

Build the heuristic first. Before touching the framework, write the dumb-rule version. It is your fallback, and it forces you to understand the problem before you throw a model at it.

Wire the framework, add the fallback, test on a non-Intelligence device. That step catches 90% of launch bugs.

Update your privacy manifest if you are combining with cloud calls, and update your store listing to mention the feature. The on-device-only story is marketable; do not bury it.

Feature-flag the rollout. The framework is stable, but language coverage, regional availability, and device-class differences are real edge cases.

where stora fits#

When you are ready to submit, Stora's compliance engine has a specific check for Apple Intelligence features: it scans for AI-related language in your store listing, looks for the corresponding framework import in your binary, checks that your privacy manifest is consistent with your on-device-only claim, and flags mismatches before review. Our store listing generator also writes AI-feature copy that holds up to Apple's 2026 scrutiny — no "AI" filler, clear device-gating language, the structure Apple reviewers want to see.

The Foundation Models framework is one of the cheapest wins available to indie iOS developers in 2026. Free inference, strong privacy story, small API surface, and a framework reviewers actively want to see adopted. The apps that ship it well — narrow features, real fallbacks, honest marketing — are the ones that will stand out over the next twelve months. Ship small, ship careful, ship now.

How to Integrate Apple's Foundation Models Framework Into Your iOS App: A 2026 Guide