Image Generation - Rilo Documentation

Rilo’s ImageGenerationTool uses Google’s Gemini models to generate images from text prompts and to edit existing images. Use it to create visuals, edit images, or generate person photos from reference images.

Overview

ImageGenerationTool supports:

Text-to-Image: Generate images from text descriptions
Image Editing: Edit existing images with text prompts
Person Generation: Create person photos from reference images
Multiple Models: Flash (fast) and Pro (quality) models

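Image generation runs on Google’s Gemini models via Vertex AI. Each generated image is written to a temporary file, which your block copies to its output directory so that it becomes part of the workflow output (see Output Handling).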

Models

Gemini 2.5 Flash Image

Best for: Speed and efficiency

Aspect Ratios: 1:1, 16:9, 9:16, 3:4, 4:3
Image Size: Fixed (not configurable)
Max Reference Images: 3
Use Case: Quick image generation, rapid iterations

Gemini 3 Pro Image Preview

Best for: Quality and flexibility

Aspect Ratios: 1:1, 16:9, 9:16, 3:4, 4:3, 21:9
Image Sizes: 1K, 2K, 4K
Max Reference Images: 5
Use Case: High-quality images, detailed editing
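
The limits listed for the two models can be captured in a small lookup table and checked before a request is sent. The following is a minimal sketch: `MODEL_CAPS` and `validate_settings` are hypothetical names that are not part of Rilo's library, and the values are copied from the lists above.

```python
# Hypothetical helper, not part of Rilo's library.
# Capability limits are copied from the model lists in this section.
MODEL_CAPS = {
    "gemini-2.5-flash-image": {
        "aspect_ratios": {"1:1", "16:9", "9:16", "3:4", "4:3"},
        "image_sizes": set(),  # fixed size, not configurable
        "max_reference_images": 3,
    },
    "gemini-3-pro-image-preview": {
        "aspect_ratios": {"1:1", "16:9", "9:16", "3:4", "4:3", "21:9"},
        "image_sizes": {"1K", "2K", "4K"},
        "max_reference_images": 5,
    },
}


def validate_settings(model, aspect_ratio, image_size=None, reference_images=()):
    """Return a list of problems; an empty list means the settings look valid."""
    caps = MODEL_CAPS.get(model)
    if caps is None:
        return [f"unknown model: {model}"]
    problems = []
    if aspect_ratio not in caps["aspect_ratios"]:
        problems.append(f"{model} does not support aspect ratio {aspect_ratio}")
    if image_size is not None and image_size not in caps["image_sizes"]:
        problems.append(f"{model} does not support image size {image_size}")
    if len(reference_images) > caps["max_reference_images"]:
        problems.append(
            f"{model} accepts at most {caps['max_reference_images']} reference images"
        )
    return problems
```

For example, `validate_settings("gemini-2.5-flash-image", "21:9")` reports a problem because 21:9 is listed only for the Pro model.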

Use Cases

1. Text-to-Image Generation

Generate images from text descriptions:

from library.image_generation_tool import ImageGenerationTool, ImageGenerationConfig

tool = ImageGenerationTool()
config = ImageGenerationConfig(
    model="gemini-2.5-flash-image",
    aspect_ratio="16:9"
)

result = await tool.generate_image(
    prompt="A sunset over mountains with a lake in the foreground",
    config=config
)

2. Image Editing

Edit existing images with text prompts:

# Get image from previous block
input_image_path = inputs["previous_block"]["image_path"]

config = ImageGenerationConfig(
    model="gemini-3-pro-image-preview",
    aspect_ratio="1:1",
    reference_images=[input_image_path]  # Image to edit
)

result = await tool.generate_image(
    prompt="Add a rainbow in the sky of this image",
    config=config
)

3. Person Generation

Generate person photos from reference images:

# Reference images from config or previous blocks
person_photos = image_generation_config.get("reference_images", [])

config = ImageGenerationConfig(
    model="gemini-3-pro-image-preview",
    aspect_ratio="3:4",
    reference_images=person_photos
)

result = await tool.generate_image(
    prompt="A professional headshot of this person in business attire",
    config=config
)

Configuration

Config Fields

Create a single config field of type image_generation_config:

{
  "name": "image_generation_config",
  "value": {
    "model": "gemini-2.5-flash-image",
    "aspect_ratio": "16:9",
    "output_format": "png",
    "style_instructions": "Photorealistic, high quality"
  },
  "field_type": "image_generation_config"
}

Configuration Options

Field	Required	Description	Options
`model`	Yes	Model to use	`gemini-2.5-flash-image`, `gemini-3-pro-image-preview`
`aspect_ratio`	Yes	Image aspect ratio	Must be supported by selected model
`output_format`	No	Output format	`png` (default), `jpeg`
`style_instructions`	No	Style guidance	Text appended to prompts
`image_size`	No	Image size (Pro only)	`1K`, `2K`, `4K`
`reference_images`	No	Reference images	List of image paths
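
To illustrate how the required fields and defaults in the table fit together, here is a minimal sketch that parses a config field and fills in the documented default for `output_format`. The function `load_image_config` is a hypothetical helper for illustration, not part of Rilo's library, and how Rilo itself reads the field may differ.

```python
import json

REQUIRED_FIELDS = ("model", "aspect_ratio")  # marked "Yes" in the options table
DEFAULTS = {"output_format": "png"}  # default taken from the options table


def load_image_config(field_json):
    """Parse an image_generation_config field and fill in documented defaults."""
    field = json.loads(field_json)
    if field.get("field_type") != "image_generation_config":
        raise ValueError("field_type must be 'image_generation_config'")
    value = dict(field.get("value", {}))
    missing = [name for name in REQUIRED_FIELDS if name not in value]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    # Explicit values override the defaults.
    return {**DEFAULTS, **value}


example = """
{
  "name": "image_generation_config",
  "value": {"model": "gemini-2.5-flash-image", "aspect_ratio": "16:9"},
  "field_type": "image_generation_config"
}
"""
settings = load_image_config(example)
```

Here `settings` contains the two required fields plus `output_format` set to `png`, since the example omits it.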

Aspect Ratios

Choose aspect ratios based on your use case:

1:1

Square format. Good for social media posts.

16:9

Widescreen. Perfect for banners and headers.

9:16

Vertical. Ideal for mobile content.

3:4

Portrait. Great for photos and portraits.
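
When the target dimensions are known in advance (for example, a banner slot), the closest supported ratio can be picked programmatically. This is a sketch under stated assumptions: `closest_aspect_ratio` is a hypothetical helper, and its default list is the set of ratios listed for the Flash model; pass the Pro list to include 21:9.

```python
import math


def closest_aspect_ratio(width, height, supported=("1:1", "16:9", "9:16", "3:4", "4:3")):
    """Pick the supported ratio closest to width/height, compared on a log scale."""
    target = math.log(width / height)

    def distance(ratio):
        w, h = ratio.split(":")
        return abs(math.log(int(w) / int(h)) - target)

    return min(supported, key=distance)
```

Comparing on a log scale treats landscape and portrait symmetrically, so 1920x1080 maps to 16:9 and 1080x1920 maps to 9:16.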

Output Handling

Saving Generated Images

Generated images must be copied to the block’s output directory:

import shutil

# Block output directory (already exists).
# {block_name} is a placeholder: replace it with the name of your block.
block_output_dir = "data/{block_name}_data"

# Generate image
result = await tool.generate_image(prompt=prompt, config=config)

# Copy to block output directory
filename = f"generated_image.{result.format}"
final_path = f"{block_output_dir}/{filename}"
shutil.copy(result.file_path, final_path)

# Return output
output = {
    "image_path": final_path,
    "width": result.width,
    "height": result.height,
    "format": result.format
}

Always copy generated images to the block output directory. Temporary files are cleaned up after execution.
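
The copy step can be wrapped in a small function that fails loudly if the temporary file is already gone. This is a minimal sketch: `save_to_block_output` is a hypothetical helper, not part of Rilo's library, and it takes the temporary path and format as plain arguments rather than assuming the shape of the result object.

```python
import shutil
from pathlib import Path


def save_to_block_output(temp_path, block_output_dir, image_format, stem="generated_image"):
    """Copy a generated image out of temporary storage and return its final path."""
    source = Path(temp_path)
    if not source.is_file():
        raise FileNotFoundError(f"generated image not found: {source}")
    out_dir = Path(block_output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # harmless if it already exists
    final_path = out_dir / f"{stem}.{image_format}"
    shutil.copy(source, final_path)
    return str(final_path)
```

In a block, this would be called with `result.file_path` and `result.format` from the example above, and the returned path used as `image_path` in the output.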

Style Instructions

Add style guidance to your prompts:

config = ImageGenerationConfig(
    model="gemini-3-pro-image-preview",
    aspect_ratio="16:9",
    style_instructions="Photorealistic, cinematic lighting, high detail"
)

result = await tool.generate_image(
    prompt="A futuristic city at night",
    config=config
)

Style instructions are automatically appended to your prompt to ensure consistent styling.
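
To preview roughly what the combined prompt might look like, the appending step can be approximated as below. This is only an illustration: the exact separator and wording the tool uses are not documented here, so the `Style:` label and the blank line are assumptions, and `apply_style` is a hypothetical name.

```python
def apply_style(prompt, style_instructions=None):
    """Append style guidance to a prompt; the separator format is an assumption."""
    if not style_instructions or not style_instructions.strip():
        return prompt  # nothing to append
    return f"{prompt.rstrip()}\n\nStyle: {style_instructions.strip()}"
```

Because the tool appends the instructions itself, there is no need to call anything like this before `generate_image`; it is useful only for reasoning about the final prompt.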

Best Practices

Choose the Right Model

Use Flash for speed and Pro for quality. Flash is faster and cheaper; Pro produces higher-quality images.

Be Specific in Prompts

Detailed prompts produce better results. Include style, mood, and composition details.

Use Appropriate Aspect Ratios

Match the aspect ratio to your use case: 16:9 for banners, 1:1 for social media posts.

Save Images Properly

Always copy generated images to the block output directory. Don’t rely on temporary paths.

Limitations

Model Limitations

Flash Model: Fixed image size, limited reference images (3)
Pro Model: Higher cost, slower generation
Aspect Ratios: Must match model capabilities
Reference Images: Limited by model (3-5 images)

General Limitations

No animation: Static images only
No video: Cannot generate videos
File formats: PNG or JPEG only
Size limits: Maximum 4K for Pro model

Image generation costs 100 credits per image. Cost is charged per generation regardless of model or size.
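
Because the rate is flat, budgeting for a batch is simple arithmetic. The sketch below uses the 100-credit figure stated above; the function names are hypothetical and not part of Rilo's library.

```python
CREDITS_PER_IMAGE = 100  # flat rate stated above, independent of model or size


def generation_cost(image_count):
    """Total credits needed to generate image_count images."""
    return image_count * CREDITS_PER_IMAGE


def images_affordable(credit_balance):
    """Number of whole images a credit balance can pay for."""
    return credit_balance // CREDITS_PER_IMAGE
```

For example, three images cost 300 credits, and a balance of 250 credits covers two images.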

Troubleshooting

Image not generating

Check that the model supports the aspect ratio
Verify the prompt is clear and specific
Ensure config fields are set correctly
Check credit balance

Image quality issues

Try the Pro model for better quality
Add style instructions to the config
Be more specific in the prompt
Use a higher image size (Pro model only)

Reference images not working

Verify image paths are correct
Check that the number of reference images is within the model’s limit (3 for Flash, 5 for Pro)
Ensure reference images are accessible
Check image format (PNG/JPEG)
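
The reference image checks above can be run as a preflight step before calling the tool. This is a minimal sketch: `check_reference_images` is a hypothetical helper, and treating `.jpg` as well as `.jpeg` as JPEG is an assumption about file naming.

```python
from pathlib import Path

# Assumption: JPEG files may use either the .jpg or .jpeg extension.
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg"}


def check_reference_images(paths, max_images):
    """Return a list of human-readable problems with the given reference images."""
    problems = []
    if len(paths) > max_images:
        problems.append(f"too many reference images: {len(paths)} > {max_images}")
    for raw in paths:
        path = Path(raw)
        if not path.is_file():
            problems.append(f"not found or not a file: {path}")
        elif path.suffix.lower() not in ALLOWED_SUFFIXES:
            problems.append(f"unsupported format: {path.name}")
    return problems
```

Pass `max_images=3` for the Flash model or `max_images=5` for the Pro model; an empty list means every check passed.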

Related Features

Vision Analysis - Analyze generated images
Code Generation - How workflows use image generation
Configs - Configure image generation settings

Image generation is a powerful feature for creating visuals. Experiment with different models, aspect ratios, and prompts to get the best results.