Architecture

Nano Banana is an Nx monorepo with six packages and two Cloud Run services.

Repo layout

packages/
├── server/          # Genkit backend (web API + Cloud Tasks stage handlers)
├── mcp-server/      # Remote MCP server (OAuth 2.1 + Streamable HTTP)
├── pipeline-core/   # Shared lib: Firestore client, Cloud Tasks enqueue, GCS,
│                    # PipelineDocument state machine, subscribeToPipeline,
│                    # startGeneration / startArchitect
├── frontend/        # React + Vite + MUI web client
├── color-edit/      # Python HSL color replacement tool (subprocess stage)
└── docs/            # Docusaurus documentation site (this site)

skills/
└── nano-banana/ # Claude skill that wraps the MCP for AI clients

Pipeline stages

  • Architect (optional, kicked off by propose_concepts or by the web UI's ideation phase): produces 3–5 stylistically coherent concept proposals from a rough user idea. Each proposal carries a refined prompt + keywords.
  • Prompt engineer (optional, kicked off when no concept is provided): rewrites a raw prompt against the corporate style guide so the diffusion model receives brand-aware language. The web UI normally skips this stage because the architect has already refined the prompt; the MCP one-shot generate_image runs it. See Brand styling.
  • Generate image: calls Gemini / Imagen with the (refined) prompt + reference icons + optional sketch. Outputs intermediate image to GCS.
  • Enhance image (optional): post-processes for sharpness / aesthetic uplift.
  • Color adjust: Python subprocess (packages/color-edit/color_tool.py) replaces brand colors via HSL with luminance preservation.

Each stage is an independent HTTP handler on nano-server. Stages are connected via three Cloud Tasks queues (text-generation, image-generation, processing) with different concurrency and rate-limit profiles.
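As a rough illustration of how the optional stages and the three queues could fit together, here is a minimal sketch. The queue names come from the text above; the stage identifiers, the `planStages` function, the `GenerationInput` shape, and the stage-to-queue mapping are all assumptions made for this example, not names taken from the codebase. The architect is left out because it runs as its own ideation step before a generation starts.

```typescript
// Hypothetical stage identifiers; the real handler names may differ.
type StageName = 'prompt-engineer' | 'generate-image' | 'enhance-image' | 'color-adjust';

// Queue names as listed in the documentation.
type QueueName = 'text-generation' | 'image-generation' | 'processing';

// Assumed mapping: text stages, image stages, and post-processing each get
// their own queue so they can have separate concurrency and rate limits.
const QUEUE_FOR_STAGE: Record<StageName, QueueName> = {
  'prompt-engineer': 'text-generation',
  'generate-image': 'image-generation',
  'enhance-image': 'image-generation',
  'color-adjust': 'processing',
};

// Hypothetical input flags for one generation request.
interface GenerationInput {
  hasConcept: boolean; // true when an architect concept already refined the prompt
  enhance: boolean;    // true when the optional enhance stage is requested
}

// Build the ordered stage list for a single generation.
function planStages(input: GenerationInput): StageName[] {
  const order: StageName[] = [];
  if (!input.hasConcept) {
    order.push('prompt-engineer'); // only needed for raw, unrefined prompts
  }
  order.push('generate-image');
  if (input.enhance) {
    order.push('enhance-image');
  }
  order.push('color-adjust');
  return order;
}
```

Under these assumptions, an MCP one-shot request without a concept and with enhancement enabled would run all four stages, while a web UI request that already carries a concept and skips enhancement would run only image generation and color adjustment.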

Cloud Run services

| Service | Role | Public URL |
| --- | --- | --- |
| nano-server | Web API + Cloud Tasks stage handlers | https://nano-server-bc5eqn62ka-ez.a.run.app |
| nano-mcp-server | Remote MCP server (4 tools, OAuth 2.1) | https://mcp.nano.cpl.ai (Cloud Run domain mapping) |

Both services share @nano/pipeline-core and write to the same Firestore pipelines collection. The mcp-server enqueues Cloud Tasks that target nano-server's /tasks/* handlers — there's only ever one set of stage implementations.
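To make the "one set of stage implementations" point concrete, the sketch below builds the kind of HTTP task either service might hand to Cloud Tasks. The `parent` path format and the `httpRequest` fields follow the Cloud Tasks REST representation of a task, where the body is base64-encoded. Everything else is an assumption for illustration: `buildStageTask`, `StageTaskPayload`, the exact `/tasks/<stage>` path segment, and the payload fields are hypothetical and do not come from pipeline-core.

```typescript
// Hypothetical payload; the real handlers may expect different fields.
interface StageTaskPayload {
  pipelineId: string;
  stage: string;
}

// Shape of a create-task request in the Cloud Tasks REST representation.
interface HttpTaskRequest {
  parent: string;
  task: {
    httpRequest: {
      httpMethod: 'POST';
      url: string;
      headers: Record<string, string>;
      body: string; // base64-encoded JSON
    };
  };
}

interface BuildOptions {
  project: string;
  location: string;
  queue: string;
  serverBaseUrl: string; // base URL of nano-server, passed in by the caller
  payload: StageTaskPayload;
}

// Build a task that targets a stage handler on nano-server, regardless of
// which service (web API or MCP server) is doing the enqueueing.
function buildStageTask(opts: BuildOptions): HttpTaskRequest {
  const base = opts.serverBaseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return {
    parent: `projects/${opts.project}/locations/${opts.location}/queues/${opts.queue}`,
    task: {
      httpRequest: {
        httpMethod: 'POST',
        url: `${base}/tasks/${encodeURIComponent(opts.payload.stage)}`,
        headers: { 'Content-Type': 'application/json' },
        // btoa is enough here because the payload holds ASCII identifiers only.
        body: btoa(JSON.stringify(opts.payload)),
      },
    },
  };
}
```

Because the target URL always points at nano-server, the MCP server only needs the shared enqueue logic and never its own copy of a stage handler.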

Firestore state machine

Each generation creates a pipelines/<uuid> document:

{
  id: string;
  userId: string;
  status: 'pending' | 'running' | 'completed' | 'failed';
  stageOrder: string[]; // deterministic iteration order
  stages: Record<string, {
    status: 'pending' | 'queued' | 'running' | 'completed' | 'failed';
    startedAt?, completedAt?, error?, result?: unknown;
  }>;
  input: { prompt, aspectRatio, enhance, ... };
  results?: Array<{ image, image_uri, thumbnail }>;
  createdAt, updatedAt;
}

  • The frontend subscribes to the document with onSnapshot for live progress.
  • The mcp-server subscribes server-side with subscribeToPipeline(id, cb) and pushes MCP progress notifications to the connected Claude client. The client never touches Firestore directly.
  • markStage* and markPipeline* helpers (in pipeline-core/state.ts) are the single source of truth for transitions.
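The transition rules themselves are not spelled out on this page, so the following is only a sketch of what a guarded state machine over the status values above could look like. The forward-only transition table, the rule for deriving the pipeline status, and the names `canTransition`, `applyStageStatus`, `derivePipelineStatus`, and `progress` are assumptions for illustration; they are not the actual helpers in pipeline-core/state.ts.

```typescript
type StageStatus = 'pending' | 'queued' | 'running' | 'completed' | 'failed';
type PipelineStatus = 'pending' | 'running' | 'completed' | 'failed';

interface StageState { status: StageStatus; }

interface PipelineSnapshot {
  status: PipelineStatus;
  stageOrder: string[];
  stages: Record<string, StageState>;
}

// Assumed forward-only transitions; the real rules may allow retries.
const NEXT: Record<StageStatus, StageStatus[]> = {
  pending: ['queued'],
  queued: ['running'],
  running: ['completed', 'failed'],
  completed: [],
  failed: [],
};

function canTransition(from: StageStatus, to: StageStatus): boolean {
  return NEXT[from].indexOf(to) !== -1;
}

// Assumed rule: any failed stage fails the pipeline, all completed stages
// complete it, all pending stages leave it pending, anything else is running.
function derivePipelineStatus(
  order: string[],
  stages: Record<string, StageState>,
): PipelineStatus {
  const statuses = order.map((name) => {
    const s = stages[name];
    return s ? s.status : 'pending';
  });
  if (statuses.some((s) => s === 'failed')) return 'failed';
  if (statuses.length > 0 && statuses.every((s) => s === 'completed')) return 'completed';
  if (statuses.every((s) => s === 'pending')) return 'pending';
  return 'running';
}

// Return a new snapshot with one stage moved to a new status.
function applyStageStatus(
  doc: PipelineSnapshot,
  stage: string,
  to: StageStatus,
): PipelineSnapshot {
  const current = doc.stages[stage];
  if (!current) throw new Error(`unknown stage: ${stage}`);
  if (!canTransition(current.status, to)) {
    throw new Error(`illegal transition for ${stage}: ${current.status} -> ${to}`);
  }
  const stages = { ...doc.stages, [stage]: { ...current, status: to } };
  return { ...doc, stages, status: derivePipelineStatus(doc.stageOrder, stages) };
}

// What a subscriber callback might compute for a progress notification.
function progress(doc: PipelineSnapshot): { done: number; total: number } {
  const done = doc.stageOrder.filter((name) => {
    const s = doc.stages[name];
    return s !== undefined && s.status === 'completed';
  }).length;
  return { done, total: doc.stageOrder.length };
}
```

Routing every write through one guarded function like this is one way to keep the frontend listener and the server-side subscriber in agreement, since both only ever observe states that the helper allowed.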

Data storage

| What | Where |
| --- | --- |
| Pipeline state | Firestore (default) DB, pipelines collection |
| Reference icons | Firestore icons collection (vector RAG) |
| Conversation history | Firestore conversations, history |
| User settings | Firestore users/<uid>/settings |
| Generated images | GCS bucket cpl-gen-ai-marketing-images |
| OAuth state (MCP) | Firestore oauth_* + mcp_jwks collections |