Early access · one key, every model

AI Image Generation API: the straight-answer comparison

An image generation API turns a text prompt into an image over HTTPS. Current per-image prices run from under a cent to $0.21 depending on model and quality tier. Below: exact prices, working code, a capability matrix and the licensing fine print, for every model that matters in 2026.

Unified schema across models · Per-image billing · Free tier at launch
The 30-second verdict · July 2026
Cheapest
FLUX schnell / dev

From ~$0.003 per image via fal, Together or Replicate

Best text rendering
Imagen 4 · Ideogram 3

Logos, posters, UI copy that actually spells

Best editing / instructions
gpt-image-2

Multi-turn edits, masks, reference images

Best open-weight
FLUX dev · SD 3.5

Self-host, fine-tune, own the stack

The money table

Per-image API pricing, model by model

List prices from provider docs and public aggregator rates, checked July 2026. Where a model has quality tiers, all tiers are shown; the 35x spread between gpt-image-2 low and high is the kind of thing vendors don't put in headlines.

Prices are per 1024x1024-class image unless noted. Aggregator hosting (fal, Replicate, Together) sometimes beats first-party pricing; always check both.
Model | Price per image | Max resolution | Edit / inpaint | Text rendering | Best at | Access
gpt-image-2 (OpenAI) | $0.006 low · $0.053 med · $0.211 high | 4K-class (3840x2160) | Yes: masks, refs, multi-turn | Strong | Instruction-following, iterative editing | OpenAI API
gpt-image-1-mini | ~$0.005 | 1024-class | Yes | Decent | High-volume drafts on OpenAI stack | OpenAI API
Imagen 4 (Google) [batch = 50% off] | $0.02 Fast · $0.04 Standard · ~$0.12 Ultra | 2K | Partial (via Gemini) | Excellent | Text-in-image, speed at quality | Gemini API, Vertex
Gemini 3 Pro Image (Nano Banana Pro) | ~$0.039 to 0.24 by tier | 4K | Yes: semantic masks, 14 ref images | Excellent | Complex composition, grounded generation | Gemini API
FLUX 2 pro / 1.1 pro (BFL) [price-performance king] | $0.03 to 0.06 pro · ~$0.003 schnell | 2K (4K upscale) | Yes: Kontext editing | Good | Photorealism per dollar | BFL API, fal, Replicate, Together
Ideogram 3 | $0.02 to 0.10 | 2K | Yes | Excellent | Typography, posters, logos | Ideogram API, aggregators
Stable Diffusion 3.5 [open weights] | $0.012 to 0.04 hosted | 2K | Yes | Fair | Self-hosting, fine-tuning, control | Stability API, aggregators, self-host
Recraft V3 | $0.03 to 0.09 | 2K + true SVG | Yes | Excellent | Brand design, vector output | Recraft API, aggregators
Seedream 4 (ByteDance) | ~$0.02 to 0.04 | 4K | Yes | Good | Value 4K generation | BytePlus, aggregators
Aggregators (fal · Replicate · Together) | $0.003 to 0.06 by model | Model-dependent | Model-dependent | Model-dependent | One key, model routing, often cheapest | Their unified APIs

Integration

The same request in cURL, Python and Node

Image APIs are synchronous at standard sizes: one request, one image back in 2 to 15 seconds. Here is the task "generate a 1024x1024 product photo" against our unified endpoint. The model is a string; swapping providers is a config change.

curl
curl -X POST \
 https://api.aiimagegenerationapi.com/v1/images \
 -H "Authorization: Bearer $KEY" \
 -H "Content-Type: application/json" \
 -d '{
  "model": "flux-2-pro",
  "prompt": "studio photo of a ceramic mug on walnut desk, soft window light",
  "size": "1024x1024"
 }'

# → { "url": "https://cdn...",
#     "model": "flux-2-pro",
#     "cost_usd": 0.03 }
python
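import os
import requests

# Assumed setup: BASE, KEY and prompt were left undefined in the original snippet.
BASE = "https://api.aiimagegenerationapi.com"
KEY = os.environ["KEY"]
prompt = "studio photo of a ceramic mug on walnut desk, soft window light"

resp = requests.post(
  f"{BASE}/v1/images",
  headers={"Authorization":
           f"Bearer {KEY}"},
  json={
    "model": "gpt-image-2",
    "prompt": prompt,
    "size": "1024x1024",
    "quality": "medium"
  },
  timeout=60,  # generation typically returns in 2 to 15 seconds
)
resp.raise_for_status()  # surface 4xx/5xx instead of parsing an error body
img = resp.json()

print(img["url"], img["cost_usd"])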
node
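// Assumed setup: BASE, KEY and prompt were left undefined in the original snippet.
// Run as an ES module so top-level await is available.
const BASE = "https://api.aiimagegenerationapi.com";
const KEY = process.env.KEY;
const prompt = "studio photo of a ceramic mug on walnut desk, soft window light";

const res = await fetch(
  `${BASE}/v1/images`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "imagen-4-fast",
    prompt,
    size: "1024x1024"
  })
});
// Surface HTTP errors instead of destructuring an error body.
if (!res.ok) throw new Error(`HTTP ${res.status}`);
const { url, cost_usd } =
  await res.json();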

Production notes: request 2048-class or 4K only when the use demands it, since price scales with pixels on most providers. Cache aggressively; the same prompt regenerated is money burned. And log per-request cost from day one, because "which feature is spending our image budget" is the first question finance asks.

Beyond generation

Capability matrix: editing, references, vectors

Raw text-to-image is table stakes. The differentiators in 2026 are edit operations and reference-image control:

Capability | What it does | Strongest options
Inpainting / masks | Regenerate only a masked region: swap a product, fix a hand | gpt-image-2, FLUX Kontext, SD 3.5
Instruction editing | "Make the background white" on an existing image, no mask needed | gpt-image-2, Gemini 3 Pro Image, FLUX Kontext
Reference images | Lock a face, product or style across generations | Gemini 3 Pro Image (14 refs), gpt-image-2, Seedream 4
Multi-turn editing | Conversational refinement: each request edits the last result | gpt-image-2, Gemini 3 Pro Image
Text-in-image | Legible typography: posters, packaging, UI mockups | Imagen 4, Ideogram 3, Recraft V3
Vector / SVG output | True scalable vectors, not raster tracings | Recraft V3 (unique at production quality)
Fine-tuning / LoRA | Train the model on your product or style | FLUX dev, SD 3.5 (open weights), via fal or Replicate
4K generation | Native 3840px-class output without upscaling | Gemini 3 Pro Image, gpt-image-2, Seedream 4

Cost at scale

What your monthly volume actually costs

1024-class images at the list price and quality tier named in each column header, no committed-use discounts. This is the table to screenshot for the budget meeting.

Monthly volume | FLUX schnell (~$0.003) | gpt-image-2 low ($0.006) | Imagen 4 Fast ($0.02) | FLUX 2 pro ($0.03) | gpt-image-2 high ($0.211)
1,000 images | $3 | $6 | $20 | $30 | $211
10,000 images | $30 | $60 | $200 | $300 | $2,110
100,000 images | $300 | $600 | $2,000 | $3,000 | $21,100

The 70x spread between the cheapest and priciest column ($0.003 versus $0.211 per image) is the whole argument for model routing. The pattern that works: generate drafts and thumbnails on a schnell-class model, publish hero assets from a pro-class one, and reserve premium tiers (gpt-image-2 high, Imagen Ultra) for images where text legibility or fine detail is the product. Google's batch API halves Imagen prices for non-realtime workloads; use it for anything a queue can absorb.
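To make the routing argument concrete, here is a rough sketch that prices a mixed workload using the per-image list prices from the table above. The model keys, the `ROUTES` policy and the 90/9/1 traffic split in the usage note are illustrative assumptions, not measured figures or real API identifiers.

```python
# Per-image list prices quoted in the table above (USD, 1024-class).
# The dictionary keys are hypothetical identifiers chosen for this sketch.
PRICES = {
    "flux-schnell": 0.003,
    "gpt-image-2-low": 0.006,
    "imagen-4-fast": 0.02,
    "flux-2-pro": 0.03,
    "gpt-image-2-high": 0.211,
}

# Hypothetical routing policy: asset purpose -> model key.
ROUTES = {
    "draft": "flux-schnell",
    "hero": "flux-2-pro",
    "premium": "gpt-image-2-high",
}


def route(purpose):
    """Pick a model for an asset purpose; unknown purposes fall back to the cheapest tier."""
    return ROUTES.get(purpose, "flux-schnell")


def monthly_cost(volume_by_purpose):
    """Sum volume x per-image price across purposes, rounded to cents."""
    total = sum(PRICES[route(purpose)] * count
                for purpose, count in volume_by_purpose.items())
    return round(total, 2)
```

With an assumed split of 90,000 drafts, 9,000 hero images and 1,000 premium images, `monthly_cost({"draft": 90000, "hero": 9000, "premium": 1000})` comes to $751 (270 + 270 + 211). The same 100,000 images all on gpt-image-2 high cost $21,100.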

The fine print

Licensing and commercial use, in plain words

Hosted closed models (gpt-image, Imagen, Ideogram, Recraft): outputs are yours to use commercially on paid tiers. You own what you generate to the extent the law allows; note that purely AI-generated images may not qualify for copyright protection in the US, which matters if exclusivity is the point.

Open-weight models: read the specific license. Stable Diffusion 3.5 is free under Stability's Community License up to $1M annual revenue, then requires an enterprise deal. FLUX schnell is Apache 2.0 (fully free), but FLUX dev is non-commercial without a paid license key from Black Forest Labs. Shipping FLUX dev output commercially from a self-hosted box without that key is the most common licensing mistake in this space.
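One way to avoid the FLUX dev mistake is a deploy-time guard. The sketch below encodes only the three rules as summarized in this section; the function name, model keys and parameters are hypothetical, and the actual license texts govern.

```python
SD35_REVENUE_CAP_USD = 1_000_000  # Community License threshold cited above


def commercial_use_allowed(model, annual_revenue_usd=0, has_bfl_commercial_license=False):
    """Return True if self-hosted commercial use fits the rules summarized above.

    A simplified reading of this section, covering three open-weight models only.
    """
    if model == "flux-schnell":
        return True  # Apache 2.0
    if model == "flux-dev":
        return has_bfl_commercial_license  # non-commercial without a paid license
    if model == "sd-3.5":
        return annual_revenue_usd < SD35_REVENUE_CAP_USD
    # Refuse to guess for models this sketch has no rule for.
    raise ValueError(f"no license rule recorded for {model!r}")
```

Raising on unknown models is deliberate: a guard that silently returns True for anything it does not recognize would hide exactly the mistake it is meant to catch.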

Client work: hosted-API outputs can be delivered to clients on all major providers. Keep generation records; ad platforms and stock sites increasingly require AI-provenance disclosure (C2PA metadata), and Meta and Google both auto-label detected synthetic media.

Use-case router

Which model for which job

You're generating | Use | Why
Ecommerce product shots | FLUX 2 pro, gpt-image-2 | Photorealism plus reference-image support keeps the product accurate.
Blog and social heroes at volume | FLUX schnell, Imagen 4 Fast | Cents per image; quality ceiling irrelevant at feed sizes.
Posters, packaging, anything with words | Imagen 4, Ideogram 3 | The only models where typography reliably spells.
Brand assets and logos | Recraft V3 | True SVG output; hand it straight to the design tool.
User avatars / personalization | FLUX dev + LoRA via fal or Replicate | Fine-tune once, generate per-user at open-weight prices.
Iterative creative tooling | gpt-image-2, Gemini 3 Pro Image | Multi-turn editing means users refine instead of re-rolling.

FAQ

Image generation API questions, answered

Is there a free AI image generation API?

Several providers offer free tiers or trial credits. Beyond those, hosted open-weight models are the cheapest real path at roughly $0.003 per image, which is effectively free at prototype volume. Fully unlimited free APIs tend to watermark output, rate-limit hard or monetize your data. Our unified API includes a free tier at launch; join early access above.

What is the cheapest image generation API?

FLUX schnell hosted on fal, Together or Replicate runs around $0.003 per 1024-class image, with gpt-image-1-mini (~$0.005) and gpt-image-2 low ($0.006) close behind. At 100k images per month the spread between cheap and premium tiers is $300 versus $21,100, so tier routing matters more than picking one "cheap" vendor.

How is image API pricing calculated: per image or per token?

Most providers price per image, scaled by resolution and quality tier. OpenAI internally meters image tokens (which is why gpt-image-2 prices vary by quality: $0.006 to $0.211), and Google halves Imagen prices through its batch API. Aggregators normalize everything to a flat per-image rate, which makes cost forecasting simpler.

Can I use generated images commercially?

Yes, on paid tiers of every major hosted provider. The traps are open-weight licenses (FLUX dev requires a paid license for commercial use; SD 3.5 is free only under $1M revenue) and copyright: purely AI-generated images may not be copyrightable in the US, in which case competitors could legally reuse your generated assets.

Which API is best for text rendering inside images?

Imagen 4 and Ideogram 3 lead for legible typography, with Recraft V3 strongest when the text is part of a designed asset like a logo or poster. Classic diffusion models still garble long strings; if words are central to the image, this single capability should drive your model choice.

Can these APIs edit existing images, not just generate new ones?

Yes, and it is the fastest-moving capability. gpt-image-2 and Gemini 3 Pro Image accept plain-language edit instructions on uploaded images, FLUX Kontext and SD 3.5 support mask-based inpainting, and reference-image inputs let you lock a face or product across generations. See the capability matrix above.

Should I use fal, Replicate or Together instead of going direct?

Aggregators give you one key across dozens of models, often at or below first-party pricing, with the freedom to reroute when the leaderboard flips. Go direct when you need the newest checkpoints on day one, enterprise SLAs or negotiated volume rates. Most product teams are better served by keeping the model name a config string.
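Keeping the model name a config string can be as small as the sketch below. The `IMAGE_MODEL` environment variable and the `build_request` helper are hypothetical names; the request fields mirror the integration examples earlier on this page.

```python
import os

DEFAULT_MODEL = "flux-2-pro"


def build_request(prompt, size="1024x1024", env=os.environ):
    """Build the JSON body; the model comes from configuration, not from code."""
    return {
        "model": env.get("IMAGE_MODEL", DEFAULT_MODEL),
        "prompt": prompt,
        "size": size,
    }
```

Switching models then means changing one environment variable, with no code change at the call sites.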

What resolution can image APIs generate?

1024x1024 is the standard unit. Gemini 3 Pro Image, gpt-image-2 and Seedream 4 generate native 4K-class output; most others top out at 2K with optional upscaling. Price scales with pixels, so generate at the size you will actually serve.
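A small helper can enforce "generate at the size you will actually serve". This is a sketch under assumptions: the three-step `SIZE_LADDER` is illustrative (each provider publishes its own supported sizes), and `smallest_sufficient_size` is a hypothetical name.

```python
# Hypothetical size ladder (long edge in pixels); check each provider's supported sizes.
SIZE_LADDER = [1024, 2048, 3840]


def smallest_sufficient_size(display_px, pixel_ratio=1.0):
    """Return the smallest ladder size that covers the displayed long edge."""
    needed = display_px * pixel_ratio
    for edge in SIZE_LADDER:
        if edge >= needed:
            return edge
    return SIZE_LADDER[-1]  # cap at the largest size on the ladder
```

For example, a 600px slot on a 2x display needs 1,200 pixels, so the helper picks 2048 instead of jumping straight to 4K.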

Can I self-host an image generation model?

Yes: FLUX dev/schnell and SD 3.5 run on a single 24GB-plus GPU. You gain fixed costs, privacy and fine-tuning freedom, and take on ops, safety filtering and license compliance (FLUX dev needs a commercial key). Break-even versus hosted APIs typically lands in the tens of thousands of images per month.
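The break-even claim is simple arithmetic: fixed monthly GPU cost divided by the hosted per-image price. The sketch below ignores ops time, utilization and license fees, and the $600 per month GPU figure in the usage note is an assumed number for illustration, not a quote.

```python
import math


def break_even_images(gpu_monthly_usd, hosted_price_per_image):
    """Monthly image count at which a fixed-cost GPU matches per-image hosted billing."""
    # Round before ceil to absorb floating-point noise in the division.
    return math.ceil(round(gpu_monthly_usd / hosted_price_per_image, 6))
```

With an assumed $600 per month GPU against a $0.03 hosted price, `break_even_images(600, 0.03)` gives 20,000 images per month. Against ~$0.003 schnell-class pricing the same box needs 200,000 images to pay off, so the answer depends heavily on which hosted tier you are replacing.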

Every image model. One API key.

Unified schema, per-image billing with cost returned on every response, and model routing across everything in the table above. Early access is open.