Amazon's 6 Rules for Building With AI

Beginner · 14m read · Full-stack developers

Amazon's retail engineering organization ("Stores") published a short list of AI-native engineering tenets: internal rules for how teams should adopt AI.

What You'll Learn

  • What the Tenets Are, and Why They're a Useful Lens
  • The Caveat: Policy Is Not Scripture
  • Tenet 6: No Black Boxes (Auditability Beats Opaque Wins)
  • Tenet 1: Delivery First, Cost Second
  • Tenet 2: AI-Native Is Not AI-Exclusive
  • Tenet 3: Cutting Edge, Not Bleeding Edge

Guide Curriculum

How to Read Amazon's AI Tenets (2 lessons)
  • What the Tenets Are, and Why They're a Useful Lens (1m)
  • The Caveat: Policy Is Not Scripture (1m)

Auditability First (1 lesson)
  • Tenet 6: No Black Boxes (Auditability Beats Opaque Wins) (3m)

Shipping Discipline: Speed, Tools, and Upgrades (3 lessons)
  • Tenet 1: Delivery First, Cost Second (2m)
  • Tenet 2: AI-Native Is Not AI-Exclusive (1m)
  • Tenet 3: Cutting Edge, Not Bleeding Edge (1m)

Ownership and Scale (2 lessons)
  • Tenet 4: With You, Not For You (1m)
  • Tenet 5: Not All Preferences Are Requirements (1m)

Put the Tenets Into Practice (2 lessons)
  • Score Your AI Stack Against Amazon's 6 Tenets (2m)
  • Sources and Further Reading (1m)

About the Author

@hiram-clark

Hiram Clark is the founder and managing editor of vybecoding.ai and sets editorial direction for the guides and news published here. Articles are drafted with AI assistance and edited before publication. He works hands-on with the AI development tools, workflows, and infrastructure covered on the site.

Full Guide Content

Module 1: How to Read Amazon's AI Tenets

1.1 What the Tenets Are, and Why They're a Useful Lens

Amazon's retail engineering organization ("Stores") published a short list of AI-native engineering tenets: internal rules for how teams should adopt AI. The list was reported by Business Insider from an internal document; Amazon confirmed the broad direction in a statement to the outlet.

This module frames how to use them. The goal is not to admire Amazon's culture. It is to treat the tenets as a lens on explicit tradeoffs (speed vs. cost, novelty vs. control, scale vs. niceties) that every team shipping AI will face, while staying honest about where internal corporate policy and engineering judgment blur together.

This guide walks through all six tenets in an order that matches how most production systems break: auditability first, then execution speed, tool choice, upgrade discipline, ownership, and organizational scaling. Each section states the rule plainly, what Amazon says it is optimizing for, and what it means for your codebase or platform. The aim is decisions you can implement this week, not slogans.

1.2 The Caveat: Policy Is Not Scripture

One caveat: internal policy at a trillion-dollar retailer is not scripture. The tenets bundle engineering judgment with organizational politics—central platforms need leverage, and "not all preferences are requirements" can double as cover for slow internal roadmaps. Read them as pressure-tested tradeoffs, then steal the mechanics that fit your size and risk profile.


Module 2: Auditability First

2.1 Tenet 6: No Black Boxes (Auditability Beats Opaque Wins)

The order matters: Amazon puts auditability ahead of raw performance, and so does this guide. This module covers Tenet 6 — the rule that every deployed AI system must be understandable and traceable, even at the cost of efficiency — because it's the foundation that makes every other tenet tractable.

The rule (plain language): Every deployed AI-related solution must be auditable, understandable, and traceable. Amazon's wording goes further: it will give up performance and cost improvements if those gains come at the expense of human understanding.

What Amazon is actually doing: Drawing a bright line between "works in demos" and "works under incident review." When something misbehaves (bad retrieval, wrong classification, a jailbreak, silent drift), the organization wants humans, and downstream automation, to reconstruct what happened without reverse-engineering tensor soup.

What it means for your codebase: Treat observability as a product requirement, not an afterthought. That includes:
  • Structured decision logs for anything that picks a model, a prompt variant, a tool, or a policy path. Free-form chat logs are not enough; you need machine-queryable fields.
  • Versioned prompts and configs checked into source control or a config service with history—so "what did we deploy Tuesday?" is answerable in minutes.
  • Explicit failure and escalation paths when confidence is low—instead of silently averaging away uncertainty.
Concrete example: When routing a request across models (for example Haiku vs. Sonnet), emit a single structured record per decision—not prose reasoning in the user thread, but an operational trace:
{
  "event": "model_route_decided",
  "request_id": "req_8f3a…",
  "chosen_model": "claude-sonnet-4",
  "reason": "input_complexity_score=0.82 > threshold_0.55",
  "candidates_evaluated": ["haiku", "sonnet"],
  "prompt_template_version": "support-agent@v37",
  "policy_flags": { "pii_redaction": true }
}

If you cannot produce that artifact during a five-alarm incident, you do not yet meet the spirit of this tenet.
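
As a rough illustration of how a record like the one above might be produced, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `RouteDecision` fields, the `route_model` and `emit` helpers, the reason codes, and the 0.55 threshold are hypothetical, not Amazon's implementation or any vendor's API. It splits the free-form reason string into separate fields so the score, threshold, and reason code are each queryable.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class RouteDecision:
    """One machine-queryable record per routing decision (hypothetical schema)."""
    request_id: str
    chosen_model: str
    reason_code: str               # stable, enumerable code instead of prose
    input_complexity_score: float
    threshold: float
    candidates_evaluated: list
    prompt_template_version: str
    policy_flags: dict = field(default_factory=dict)
    event: str = "model_route_decided"


def route_model(complexity_score, request_id, threshold=0.55,
                prompt_template_version="support-agent@v37"):
    """Pick a model from a complexity score and return the decision as data."""
    above = complexity_score > threshold
    return RouteDecision(
        request_id=request_id,
        chosen_model="sonnet" if above else "haiku",
        reason_code=("COMPLEXITY_ABOVE_THRESHOLD" if above
                     else "COMPLEXITY_AT_OR_BELOW_THRESHOLD"),
        input_complexity_score=complexity_score,
        threshold=threshold,
        candidates_evaluated=["haiku", "sonnet"],
        prompt_template_version=prompt_template_version,
        policy_flags={"pii_redaction": True},
    )


def emit(decision):
    """Serialize the decision as one JSON line.

    This sketch only prints; a real system would send the line to its log pipeline.
    """
    line = json.dumps(asdict(decision), sort_keys=True)
    print(line)
    return line


if __name__ == "__main__":
    emit(route_model(0.82, request_id="req_example"))
```

Because `route_model` returns the decision as plain data, the same object can be logged in production and asserted on in a test.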

Critical angle: This rule is expensive. Full traceability adds latency (logging I/O), storage, and process overhead. Amazon is stating a preference order: understanding > raw efficiency. Smaller teams sometimes invert that to ship faster; the tenet is a reminder that the bill often arrives as reputational or compliance debt, not as an immediate metric dip.

Where teams miss the bar: "We log prompts" without schemas; "we use OpenTelemetry" without linking spans to business outcomes; or shipping an agent that chains tools with no persisted graph of which tool ran when with what arguments. For regulated environments, auditability also overlaps with access controls on training data and inference logs, which is another reason Tenet 6 belongs at the front of the list.

Testing tie-in: If you cannot write an integration test that asserts "given this input, we route to model A and record reason code R," you probably do not understand your own system well enough to defend it in production.
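
One way to avoid the missing tool-call record described above is to wrap every tool invocation in a recorder. The sketch below is a minimal version under stated assumptions: `ToolTrace` and `ToolCall` are hypothetical names, the trace lives in memory, and tool arguments are assumed to be JSON-serializable. A real system would persist each line to durable storage.

```python
import json
import time
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ToolCall:
    """One step in an agent's tool chain (hypothetical schema)."""
    request_id: str
    step: int
    tool: str
    arguments: dict
    started_at: float          # wall-clock time, seconds since the epoch
    duration_s: float
    error: Optional[str]       # None when the tool returned normally


class ToolTrace:
    """Records which tool ran, in what order, with what arguments."""

    def __init__(self, request_id):
        self.request_id = request_id
        self.calls = []

    def run(self, tool_name, fn, **arguments):
        """Call fn(**arguments) and record the call whether it succeeds or raises."""
        started_at = time.time()
        t0 = time.perf_counter()
        error = None
        try:
            return fn(**arguments)
        except Exception as exc:
            error = repr(exc)
            raise
        finally:
            self.calls.append(ToolCall(
                request_id=self.request_id,
                step=len(self.calls) + 1,
                tool=tool_name,
                arguments=arguments,
                started_at=started_at,
                duration_s=time.perf_counter() - t0,
                error=error,
            ))

    def to_jsonl(self):
        """Render the trace as JSON Lines, one record per tool call."""
        return "\n".join(json.dumps(asdict(c), sort_keys=True) for c in self.calls)
```

Failed calls are recorded before the exception propagates, so the trace still shows the step that broke the chain.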

Module 3: Shipping Discipline: Speed, Tools, and Upgrades

3.1 Tenet 1: Delivery First, Cost Second

With auditability as the foundation, the next three tenets govern how you actually ship: deliver working solutions before optimizing cost, reach for AI only when it's genuinely the best tool, and upgrade models deliberately rather than chasing every release.

The rule: Prioritize working, effective solutions over cheap ones: build now, then optimize compute cost later.

What Amazon is optimizing for: Throughput of learning. If teams freeze because GPUs are pricey or token budgets are scary, adoption stalls and you lose calendar time, which is usually more expensive than inefficient inference for a bounded pilot.

Developer takeaway: Ship with guardrails, not with perfect FinOps on day one. Instrument spend early (even coarse dashboards), but do not let cost modeling block a validated use case. Pair this tenet with Tenet 6: cheap systems that nobody can debug are not truly "delivered."

When to invert the order: If your pilot could burn a material share of monthly runway in a week, cost is delivery risk: cap budgets, throttle concurrency, and ship smaller slices. The tenet assumes adult supervision on spend, not denial of arithmetic.

What is missing from the slogan: It says little about quality gates. Velocity without evaluation can ship harmful automation fast. Pair delivery speed with offline evals and staged rollouts so "ship now" does not mean "deceive later."
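
A budget cap of the kind described above can start as a few lines of code. The following is a minimal sketch, not a FinOps system: `SpendGuard`, `BudgetExceeded`, and the per-1k-token prices passed in are hypothetical, and the cost formula is a simple assumption (tokens times a flat rate).

```python
class BudgetExceeded(RuntimeError):
    """Raised when a request would push spend past the pilot's cap."""


class SpendGuard:
    """Coarse spend tracking with a hard cap (hypothetical helper)."""

    def __init__(self, budget_usd):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def reserve(self, input_tokens, output_tokens,
                usd_per_1k_input, usd_per_1k_output):
        """Estimate the cost of a call and refuse it if it would exceed the cap."""
        cost = (input_tokens / 1000 * usd_per_1k_input
                + output_tokens / 1000 * usd_per_1k_output)
        if self.spent_usd + cost > self.budget_usd:
            raise BudgetExceeded(
                f"would spend {self.spent_usd + cost:.2f} of {self.budget_usd:.2f} USD"
            )
        self.spent_usd += cost
        return cost

    def remaining_usd(self):
        """Budget left; useful for a coarse dashboard."""
        return self.budget_usd - self.spent_usd
```

Calling `reserve` before each model request gives both halves of the advice: spend is visible from day one, and the cap stops a runaway pilot without blocking launch on detailed cost modeling.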

3.2 Tenet 2: AI-Native Is Not AI-Exclusive

The rule: Use the best approach for the problem. Sometimes that is AI; sometimes it is an LLM; often it is neither.

What Amazon is pushing back on: LLM-as-default architecture, where every ticket becomes a chat completion and every workflow becomes RAG. That creates fragility and variable latency where deterministic code would be clearer.

Developer takeaway: Before you reach for an API call, ask whether rules, retrieval alone, classical ML, or plain CRUD solves 90% of cases. Reserve generative models for genuinely open-ended language tasks or synthesis, then wrap them with tests and evaluation harnesses. Your repo should contain non-AI paths that keep working when a provider blips.

Examples worth standardizing: Regex or parser-backed validators for structured forms; deterministic ranking for search previews; cached embeddings with explicit refresh policies; human-approved macros for support replies. Let the LLM draft paragraphs where variability is the product; let code enforce invariants where variability is a defect.
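
The "deterministic first, model second" pattern can be sketched as follows. The order-ID format, the `extract_order_id` function, and the `llm_fallback` callable are all hypothetical; the fallback stands in for whatever model call a team actually uses. The regex handles the common case, and the same regex validates anything the model returns.

```python
import re

# Hypothetical order-ID format for illustration: two capitals, a dash, six digits.
ORDER_ID = re.compile(r"\b[A-Z]{2}-\d{6}\b")


def extract_order_id(text, llm_fallback=None):
    """Try deterministic extraction first; use the model only if that fails.

    llm_fallback is an optional callable taking the text and returning a
    candidate string (or None). It is a placeholder for a real model call.
    """
    match = ORDER_ID.search(text)
    if match:
        return {"order_id": match.group(0), "path": "regex"}

    if llm_fallback is not None:
        candidate = llm_fallback(text)
        # Code still enforces the invariant on model output.
        if candidate and ORDER_ID.fullmatch(candidate):
            return {"order_id": candidate, "path": "llm"}

    return {"order_id": None, "path": "unresolved"}
```

With no fallback supplied, the function still works, which is the non-AI path that keeps running when a provider is unavailable. The `path` field records which branch answered, so the share of requests that truly needed the model is measurable.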

3.3 Tenet 3: Cutting Edge, Not Bleeding Edge

The rule: Do not try to keep pace with every AI release. Evaluate upgrades deliberately; switch when benefits outweigh costs, and accept that skipping the newest model can be rational.

What Amazon is avoiding: Permanent upgrade churn (new tokenizer, new failure modes, new eval gaps) without a migration budget.

Developer takeaway: Gate model changes behind benchmarks you actually run: regression suites, offline evals on held-out tasks, side-by-side shadow traffic. Document a model policy (who approves upgrades, what metrics must hold). This is how you keep Tenet 6 tractable: fewer surprises, clearer diffs between releases.

Hidden cost of "latest": Tokenization shifts, safety refusals, and tool-calling quirks can change overnight. Treat upgrades like database migrations: backups (prompt snapshots), replay tests, and rollback switches, not a Friday afternoon dependency bump.
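
An evaluation gate like the one described above can be expressed as a small comparison function. This is a sketch under assumptions: the metric names, the `upgrade_gate` function, and the 0.01 regression tolerance are hypothetical, and every metric is assumed to be "higher is better."

```python
def upgrade_gate(baseline, candidate, max_regression=0.01):
    """Compare candidate eval scores to the baseline; higher is assumed better.

    Returns (approved, failures). Every baseline metric must be present in the
    candidate run and must not drop by more than max_regression.
    """
    failures = []
    for metric, base_score in baseline.items():
        if metric not in candidate:
            failures.append(f"{metric}: missing from candidate eval run")
        elif candidate[metric] < base_score - max_regression:
            failures.append(f"{metric}: {base_score:.3f} -> {candidate[metric]:.3f}")
    return (len(failures) == 0, failures)


if __name__ == "__main__":
    # Illustrative numbers only, not real benchmark results.
    approved, failures = upgrade_gate(
        baseline={"accuracy": 0.90, "tool_call_success": 0.97},
        candidate={"accuracy": 0.93, "tool_call_success": 0.92},
    )
    print(approved, failures)
```

In the usage example the candidate improves one metric and regresses another, so the gate rejects it and names the metric that failed. That list of failures is the kind of artifact a model policy can require before anyone approves an upgrade.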

Module 4: Ownership and Scale

4.1 Tenet 4: With You, Not For You

The last two tenets are about people and platforms, not models. Central AI teams can't own every domain, and shared platforms can't honor every bespoke request. This module covers who must stay accountable (Tenet 4) and how to say no without losing customers (Tenet 5).

The rule: Central AI teams will not become domain experts in every product area. Pilot participants must bring domain expertise and time.

What Amazon is signaling: Platforms enable; they do not absorb accountability for business semantics. If product engineers ghost the pilot, you get generic widgets bolted onto the wrong workflow.

Developer takeaway: Staff AI projects with embedded domain owners: PM/engineer pairs who own metrics tied to customer outcomes. Require labeled examples, rubrics, and acceptance tests from those owners. Your stack should expose observability (Tenet 6) so domain experts can iterate prompts and policies without opaque handoffs.

Political reality: Central teams that "drop in AI" without embedded owners become scapegoats for bad metrics. Write a one-page pilot charter: who approves scope, who owns customer-visible failures, and what "done" means in numbers, not vibes.

4.2 Tenet 5: Not All Preferences Are Requirements

The rule: Aim to delight customers, but do not satisfy every preference. Optimize for many teams / broad scale, not bespoke exceptions everywhere.

What Amazon is optimizing for: Maintainability of shared platforms, similar to how core libraries avoid infinite flags.

Developer takeaway: Capture feedback, then prioritize with usage data. Say no (politely) to one-off knobs that explode test matrices. Offer extension points (webhooks, exports, configuration within bounds) instead of custom pipelines per stakeholder. This pairs with Tenet 3: fewer branches means fewer half-tested paths.

Customer nuance: External customers deserve clarity on what you will not customize; internal customers deserve the same. Platform teams fail when they confuse empathy with obligation: negotiate SLAs and interfaces, not infinite bespoke behavior.

Module 5: Put the Tenets Into Practice

5.1 Score Your AI Stack Against Amazon's 6 Tenets

Tenets are only useful if you measure yourself against them. This module turns the six rules into a self-audit you can run on your own stack today, plus the original source if you want to go deeper.

Use this as a blunt self-audit. For each line, pick Strong / Partial / Missing.

No black boxes
  • [ ] Decisions (routing, policy, tool choice) emit structured, queryable logs tied to request IDs.
  • [ ] Prompts/configs are versioned and deploys are reproducible.
  • [ ] On-call can trace an incident end-to-end without raw prompt dumps alone.
Delivery first, cost second
  • [ ] You have shipped at least one production path end-to-end; cost tuning is scheduled, not blocking launch.
  • [ ] Spend visibility exists—even if optimization lags.
AI-native is not AI-exclusive
  • [ ] Non-LLM baselines exist for core workflows; LLMs justify their use with measurable lift.
Cutting edge, not bleeding edge
  • [ ] Model upgrades require an explicit evaluation gate and rollback plan.
With you, not for you
  • [ ] Domain owners co-own metrics, datasets, and acceptance criteria—not an AI center of excellence acting alone.
Not all preferences are requirements
  • [ ] Roadmap reflects scaled impact; bespoke asks are triaged and mostly routed into platforms or extensions.

If your stack scores mostly Strong on Tenet 6 and Partial elsewhere, you are directionally aligned with what Amazon encoded: trustworthy systems first, throughput second, clever models third.
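
If you want to track the self-audit over time, the ratings can be tallied with a few lines of Python. This is only a sketch: the numeric weights (Strong = 2, Partial = 1, Missing = 0) and the `score_audit` and `weakest_tenet` helpers are assumptions for illustration, not part of Amazon's tenets.

```python
# Assumed weights for illustration; the checklist itself only names the labels.
POINTS = {"strong": 2, "partial": 1, "missing": 0}


def score_audit(audit):
    """Map {tenet: [rating, ...]} to {tenet: fraction of possible points}.

    Each tenet is assumed to have at least one rated checklist line.
    """
    scores = {}
    for tenet, ratings in audit.items():
        earned = sum(POINTS[r.strip().lower()] for r in ratings)
        scores[tenet] = earned / (2 * len(ratings))
    return scores


def weakest_tenet(scores):
    """Return the tenet with the lowest score, as a hint for where to start."""
    return min(scores, key=scores.get)


if __name__ == "__main__":
    # Example ratings only; replace with your own audit.
    audit = {
        "No black boxes": ["Strong", "Strong", "Partial"],
        "Delivery first, cost second": ["Partial", "Partial"],
        "Cutting edge, not bleeding edge": ["Missing"],
    }
    scores = score_audit(audit)
    print(scores, weakest_tenet(scores))
```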

5.2 Sources and Further Reading