FLUX.2 [klein] 4B
FLUX.2 [klein] 4B is a 4-billion-parameter rectified flow transformer developed by Black Forest Labs, released January 15, 2026 under Apache 2.0. It is part of the FLUX.2 [klein] model family, BFL's fastest image models to date.

The architecture is a unified generative-editing backbone: the same weights handle text-to-image generation, single-reference editing, and multi-reference generation without switching pipelines. Built on rectified flow, the model learns the straightest possible path between noise and image, enabling high-quality generation in as few as 4 inference steps (sub-second on enterprise GPUs). This checkpoint is step-distilled for speed; the undistilled Base variant (FLUX.2-klein-base-4B) is available for LoRA training and fine-tuning.

The model fits in ~13GB VRAM (the full bf16 checkpoint is 23.7GB; quantized fp8/nvfp4/GGUF variants are available for tighter budgets) and is accessible on RTX 3090/4070 and above. Pixel-layer watermarking (C2PA standard) is implemented in the inference code for content provenance, and safety filtering of both inputs and outputs is encouraged for all deployments.
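For orientation, a minimal Diffusers sketch of the 4-step text-to-image path is below. It assumes the Flux2KleinPipeline class named later on this page and a hypothetical Hugging Face repo id; consult the official model card for the exact identifiers.

```python
import torch
from diffusers import Flux2KleinPipeline  # pipeline class as named on this page

# Repo id below is a hypothetical placeholder; check the official model card.
pipe = Flux2KleinPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-4B",
    torch_dtype=torch.bfloat16,  # full-precision bf16 weights
)
pipe.to("cuda")

# The step-distilled checkpoint targets as few as 4 inference steps.
image = pipe(
    "a lighthouse at dusk, long exposure, film grain",
    num_inference_steps=4,
    generator=torch.Generator("cuda").manual_seed(42),  # reproducible runs
).images[0]
image.save("lighthouse.png")
```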
Technical Specifications
Model Architecture & Performance
Pricing
Pay-per-use, no commitments
API Reference
Complete parameter documentation
| Parameter | Type | Default | Description |
|---|---|---|---|
| seed | number | -1 | Random seed for reproducible generation. Use -1 for random results. |
| aspect_ratio | string | 1:1 | Aspect ratio of the output image. Options: 1:1, 16:9, 21:9, 3:2, 2:3, 4:5, 5:4, 3:4, 4:3, 9:16, 9:21. |
| output_format | string | jpg | Format of the generated image. Options: png, jpg, webp. |
| output_quality | number | 80 | Compression quality for jpg/webp output (1–100). Not applicable for png outputs. |
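To show how these parameters compose into a request, here is a hedged Python sketch. The endpoint URL, auth header, prompt field, and raw-bytes response handling are assumptions for illustration only; the external API documentation linked below is authoritative.

```python
import os
import requests

resp = requests.post(
    "https://api.example.com/v1/flux2-klein-4b/generate",  # hypothetical endpoint
    headers={"Authorization": f"Bearer {os.environ['API_KEY']}"},
    json={
        "prompt": "isometric illustration of a small data center",
        "seed": -1,             # -1 selects a random seed
        "aspect_ratio": "16:9",
        "output_format": "jpg",
        "output_quality": 80,   # ignored for png output
    },
    timeout=120,
)
resp.raise_for_status()

# Assumes the API returns the image bytes directly; a JSON body with a
# download URL is equally plausible. Check the real response schema.
with open("output.jpg", "wb") as f:
    f.write(resp.content)
```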
Explore the full request and response schema in our external API documentation
Performance
Strengths & considerations
| Strengths | Considerations |
|---|---|
| 4B-parameter rectified flow transformer: sub-second inference in 4 steps | Distilled checkpoint is optimized for speed; for LoRA training or fine-tuning, use the Base variant (FLUX.2-klein-base-4B) |
| Unified generative-editing backbone: T2I, single-reference, and multi-reference from the same weights | May amplify biases observed in training data |
| Fits in ~13GB VRAM; accessible on RTX 3090/4070 and above | Not intended or able to provide factual information; text rendering in images may be inaccurate |
| Apache 2.0 license: fully open for commercial use with no restrictions | Prompt following is sensitive to prompting style |
| Rectified flow architecture: straight noise-to-image paths mean fewer steps and faster generation | Full bf16 checkpoint is 23.7GB; quantization (fp8/GGUF) required for strict 13GB deployments |
| Matches the quality of much larger models on the quality-vs-latency Pareto frontier | |
| fp8, nvfp4, and GGUF quantized variants available for sub-13GB deployment | |
| Pixel-layer C2PA watermarking built into the inference code for content provenance | |
| Diffusers-native via Flux2KleinPipeline | |
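On the VRAM consideration above: besides switching to a quantized variant, Diffusers' standard CPU offload can reduce peak GPU memory at some latency cost. A sketch, again using the hypothetical repo id:

```python
import torch
from diffusers import Flux2KleinPipeline

pipe = Flux2KleinPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-4B",  # hypothetical repo id
    torch_dtype=torch.bfloat16,
)
# Standard DiffusionPipeline feature (requires accelerate): each submodule
# is moved to the GPU only while it runs, lowering peak VRAM.
pipe.enable_model_cpu_offload()

image = pipe("product shot of a ceramic mug", num_inference_steps=4).images[0]
image.save("mug.png")
```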
Use cases
Recommended applications for this model
Enterprise
Platform Integration
Docker Support
Official Docker images for containerized deployments
Kubernetes Ready
Production-grade K8s manifests and Helm charts
SDK Libraries
Official SDKs for Python, JavaScript, Go, and Java
Don't let your AI control you. Control your AI the Qubrid way!
Have questions? Want to partner with us? Looking for larger deployments or custom fine-tuning? Let's collaborate on the right setup for your workloads.
"Qubrid helped us turn a collection of AI scripts into structured production workflows. We now have better reliability, visibility, and control over every run."
AI Infrastructure Team
Automation & Orchestration
