
Brand styling

Every image that leaves Nano Banana passes through the corporate style guide. That style guide is a long prose document (packages/server/data/style-guide.md) extracted from a curated set of reference images by analyze-style.ts. It describes the company's visual language: palette, motifs, illustration register, composition rules, and what to avoid.

The style guide is never sent verbatim to the diffusion model. Instead, it shapes the prompt at one of two stages:

Two paths to a styled prompt

1. Architect path (concepts)

When a user (web UI or propose_concepts MCP tool) starts from a rough idea, the architect flow uses the style guide to produce 3–5 concept proposals. Each proposal has:

  • a short title
  • a description
  • a refined prompt already rewritten in the brand's voice
  • thematic keywords

When the user picks one and calls generate_image with the matching concept_id, the refined prompt feeds directly into image generation — no further engineering needed.
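
The proposal fields listed above could be modelled roughly as follows. This is a minimal sketch, not the project's actual types: the `ConceptProposal` interface, its field names, and the `pickRefinedPrompt` helper are all hypothetical and only illustrate how a chosen concept_id resolves to an already-refined prompt.

```typescript
// Hypothetical shape of one architect proposal; field names are assumptions.
interface ConceptProposal {
  conceptId: string;
  title: string;
  description: string;
  refinedPrompt: string; // already rewritten in the brand's voice
  keywords: string[];
}

// Look up the chosen concept and return the prompt that would feed image generation.
function pickRefinedPrompt(proposals: ConceptProposal[], conceptId: string): string {
  const match = proposals.find((p) => p.conceptId === conceptId);
  if (!match) {
    throw new Error(`Unknown concept_id: ${conceptId}`);
  }
  return match.refinedPrompt;
}
```

Because the refined prompt is stored with the proposal, the lookup is all that is needed on this path; no second model call is involved.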

2. Prompt-engineer path (one-shot)

When a client (MCP generate_image with a raw prompt, no concept) skips the architect flow, the pipeline runs the prompt-engineer stage first. That stage:

  1. Loads the style guide.
  2. Calls Gemini with prompts/prompt-engineer.prompt (style guide + user's raw prompt as input).
  3. Outputs a single refinedPrompt that the diffusion model receives.
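
The three steps above can be sketched as a single function. This is an illustration under stated assumptions: `RefineFn` stands in for the Gemini call made through prompts/prompt-engineer.prompt, and `refineRawPrompt` and `loadStyleGuide` are hypothetical names, injected so the sketch does not depend on any real SDK signature.

```typescript
// Hypothetical stand-in for the model call; the real call goes through prompt-engineer.prompt.
type RefineFn = (input: {
  styleGuide: string;
  rawPrompt: string;
}) => Promise<{ refinedPrompt: string }>;

async function refineRawPrompt(
  loadStyleGuide: () => Promise<string>,
  refine: RefineFn,
  rawPrompt: string,
): Promise<string> {
  // Step 1: load the style guide text.
  const styleGuide = await loadStyleGuide();
  // Step 2: pass the style guide and the user's raw prompt to the model together.
  const { refinedPrompt } = await refine({ styleGuide, rawPrompt });
  // Step 3: only the refined prompt moves on; the style guide itself is never forwarded.
  return refinedPrompt;
}
```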

This is what keeps an unstructured prompt like "hero image for the Gemini Enterprise launch" from producing generic stock-style output. For that input, the refined prompt would be something like "isometric illustration of a translucent data prism … in the corporate palette (#0066D4, #FFB300, #1A1A1A) with soft directional light from upper left, no text overlays", and that refined text is what the diffusion model actually sees.

When the engineer stage runs

| Caller path | Engineer runs? | Why |
| --- | --- | --- |
| Web UI (architect → pick concept → generate) | No | Architect already produced a refined prompt |
| MCP propose_concepts → generate_image with concept_id | No | Same as above |
| MCP generate_image with raw prompt | Yes | No upstream refinement; style would be lost |
| Web UI direct generate (no architect) | No* | Web UI doesn't expose a no-architect flow |

* If a future web flow skips the architect stage, it should set includePromptEngineer: true when calling startGeneration.

Implementation pointer

packages/pipeline-core/src/start-generation.ts exposes StartGenerationInput.includePromptEngineer?: boolean (default false). The mcp-server passes includePromptEngineer: !args.concept_id.
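
As a rough sketch of that mapping: only `includePromptEngineer` and the `!args.concept_id` expression come from the text above. The other fields, the `GenerateImageArgs` type, and the `toStartGenerationInput` helper are hypothetical and exist only to make the example self-contained.

```typescript
// Sketch of the input type; only includePromptEngineer is documented, other fields are assumed.
interface StartGenerationInput {
  prompt?: string;
  conceptId?: string; // hypothetical field name
  includePromptEngineer?: boolean; // default false
}

// Hypothetical arguments of the MCP generate_image tool.
interface GenerateImageArgs {
  prompt?: string;
  concept_id?: string;
}

function toStartGenerationInput(args: GenerateImageArgs): StartGenerationInput {
  return {
    prompt: args.prompt,
    conceptId: args.concept_id,
    // Run the engineer stage only when no architect concept was supplied.
    includePromptEngineer: !args.concept_id,
  };
}
```

Note that with this expression an empty-string concept_id is treated the same as a missing one, so the engineer stage would run in that case too.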

The server-side handler at POST /tasks/prompt-engineer reads the pipeline's input.prompt, runs the Genkit prompt-engineer.prompt with the style guide, marks the stage completed with { refinedPrompt, keywords }, then enqueues generate-image with the refined prompt in the payload.
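
The handler's sequence might look like the sketch below. Everything here is illustrative: `HandlerDeps`, `handlePromptEngineerTask`, and the payload shape are hypothetical, and the Genkit call is represented by an injected function rather than a real API. Only the stage output fields and the generate-image task name are taken from the description above.

```typescript
// Hypothetical stand-ins for the server's real types and services.
interface EngineerResult {
  refinedPrompt: string;
  keywords: string[];
}

interface HandlerDeps {
  // Stands in for running prompt-engineer.prompt with the style guide.
  runEngineerPrompt: (rawPrompt: string) => Promise<EngineerResult>;
  markStageCompleted: (pipelineId: string, stage: string, output: EngineerResult) => Promise<void>;
  enqueue: (task: string, payload: { pipelineId: string; prompt: string }) => Promise<void>;
}

async function handlePromptEngineerTask(
  pipeline: { id: string; input: { prompt: string } },
  deps: HandlerDeps,
): Promise<void> {
  // Read the raw prompt from the pipeline input and refine it.
  const result = await deps.runEngineerPrompt(pipeline.input.prompt);
  // Record the stage output before handing off.
  await deps.markStageCompleted(pipeline.id, "prompt-engineer", result);
  // The next stage receives the refined prompt, not the raw one.
  await deps.enqueue("generate-image", {
    pipelineId: pipeline.id,
    prompt: result.refinedPrompt,
  });
}
```

Injecting the three dependencies keeps the ordering (refine, mark completed, enqueue) testable without a model or a task queue.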