Rilo’s ImageGenerationTool uses Google’s Gemini models to generate images from text prompts, edit existing images, and create person photos from reference images.

Overview

ImageGenerationTool supports:
  • Text-to-Image: Generate images from text descriptions
  • Image Editing: Edit existing images with text prompts
  • Person Generation: Create person photos from reference images
  • Multiple Models: Flash (fast) and Pro (quality) models
Image generation runs on Google’s Gemini models via Vertex AI. Generated images become part of your workflow output once you copy them to the block’s output directory (see Output Handling).

Models

Gemini 2.5 Flash Image

Best for: Speed and efficiency
  • Aspect Ratios: 1:1, 16:9, 9:16, 3:4, 4:3
  • Image Size: Fixed (not configurable)
  • Max Reference Images: 3
  • Use Case: Quick image generation, rapid iterations

Gemini 3 Pro Image Preview

Best for: Quality and flexibility
  • Aspect Ratios: 1:1, 16:9, 9:16, 3:4, 4:3, 21:9
  • Image Sizes: 1K, 2K, 4K
  • Max Reference Images: 5
  • Use Case: High-quality images, detailed editing
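
The two model profiles above can be checked in code before a request is sent. The following is a minimal sketch, not part of ImageGenerationTool: `MODEL_CAPABILITIES` and `check_request` are hypothetical names, and the values are copied from the lists above.

```python
from typing import List, Optional

# Capabilities copied from the model descriptions above (hypothetical lookup table).
MODEL_CAPABILITIES = {
    "gemini-2.5-flash-image": {
        "aspect_ratios": {"1:1", "16:9", "9:16", "3:4", "4:3"},
        "image_sizes": set(),  # fixed size, not configurable
        "max_reference_images": 3,
    },
    "gemini-3-pro-image-preview": {
        "aspect_ratios": {"1:1", "16:9", "9:16", "3:4", "4:3", "21:9"},
        "image_sizes": {"1K", "2K", "4K"},
        "max_reference_images": 5,
    },
}


def check_request(
    model: str,
    aspect_ratio: str,
    image_size: Optional[str] = None,
    reference_count: int = 0,
) -> List[str]:
    """Return a list of problems; an empty list means the request fits the model."""
    caps = MODEL_CAPABILITIES.get(model)
    if caps is None:
        return [f"unknown model: {model}"]

    problems = []
    if aspect_ratio not in caps["aspect_ratios"]:
        problems.append(f"{model} does not support aspect ratio {aspect_ratio}")
    if image_size is not None and image_size not in caps["image_sizes"]:
        problems.append(f"{model} does not support image size {image_size}")
    if reference_count > caps["max_reference_images"]:
        problems.append(
            f"{model} accepts at most {caps['max_reference_images']} reference images"
        )
    return problems
```

Returning a list of problems rather than raising on the first one lets a block report every mismatch at once.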

Use Cases

1. Text-to-Image Generation

Generate images from text descriptions:
from library.image_generation_tool import ImageGenerationTool, ImageGenerationConfig

tool = ImageGenerationTool()
config = ImageGenerationConfig(
    model="gemini-2.5-flash-image",
    aspect_ratio="16:9"
)

result = await tool.generate_image(
    prompt="A sunset over mountains with a lake in the foreground",
    config=config
)

2. Image Editing

Edit existing images with text prompts:
# Get image from previous block
input_image_path = inputs["previous_block"]["image_path"]

config = ImageGenerationConfig(
    model="gemini-3-pro-image-preview",
    aspect_ratio="1:1",
    reference_images=[input_image_path]  # Image to edit
)

result = await tool.generate_image(
    prompt="Add a rainbow in the sky of this image",
    config=config
)

3. Person Generation

Generate person photos from reference images:
# Reference images from config or previous blocks
person_photos = image_generation_config.get("reference_images", [])

config = ImageGenerationConfig(
    model="gemini-3-pro-image-preview",
    aspect_ratio="3:4",
    reference_images=person_photos
)

result = await tool.generate_image(
    prompt="A professional headshot of this person in business attire",
    config=config
)
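
Because image editing and person generation both depend on reference images, it can help to check the paths before calling the tool. This is a sketch under assumptions: `check_reference_images` is a hypothetical helper, and the accepted extensions are inferred from the PNG/JPEG note in Troubleshooting.

```python
from pathlib import Path
from typing import List, Sequence

# Assumed from the "PNG/JPEG" note in this page; adjust if the tool accepts more.
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg"}


def check_reference_images(paths: Sequence[str], max_images: int) -> List[str]:
    """Return a list of problems with the given reference image paths."""
    problems = []
    if len(paths) > max_images:
        problems.append(
            f"{len(paths)} reference images given, limit is {max_images}"
        )
    for raw_path in paths:
        path = Path(raw_path)
        if not path.is_file():
            problems.append(f"not found: {raw_path}")
        elif path.suffix.lower() not in ALLOWED_SUFFIXES:
            problems.append(f"unsupported format: {raw_path}")
    return problems
```

Pass `max_images=3` for the Flash model and `max_images=5` for the Pro model, matching the limits listed under Models.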

Configuration

Config Fields

Create a single config field of type image_generation_config:
{
  "name": "image_generation_config",
  "value": {
    "model": "gemini-2.5-flash-image",
    "aspect_ratio": "16:9",
    "output_format": "png",
    "style_instructions": "Photorealistic, high quality"
  },
  "field_type": "image_generation_config"
}

Configuration Options

Field              | Required | Description           | Options
model              | Yes      | Model to use          | gemini-2.5-flash-image, gemini-3-pro-image-preview
aspect_ratio       | Yes      | Image aspect ratio    | Must be supported by selected model
output_format      | No       | Output format         | png (default), jpeg
style_instructions | No       | Style guidance        | Text appended to prompts
image_size         | No       | Image size (Pro only) | 1K, 2K, 4K
reference_images   | No       | Reference images      | List of image paths
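
The required fields and the output_format default from the options above can be enforced on the config value before it is used. A minimal sketch, assuming the value arrives as a plain dict; `normalize_config_value` is a hypothetical helper, not part of the tool.

```python
from typing import Any, Dict

REQUIRED_FIELDS = ("model", "aspect_ratio")
OUTPUT_FORMATS = ("png", "jpeg")


def normalize_config_value(value: Dict[str, Any]) -> Dict[str, Any]:
    """Check required fields and fill the documented output_format default."""
    missing = [name for name in REQUIRED_FIELDS if not value.get(name)]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")

    normalized = dict(value)  # copy so the caller's dict is not modified
    normalized.setdefault("output_format", "png")
    if normalized["output_format"] not in OUTPUT_FORMATS:
        raise ValueError(
            f"output_format must be one of {OUTPUT_FORMATS}, "
            f"got {normalized['output_format']!r}"
        )
    return normalized
```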

Aspect Ratios

Choose aspect ratios based on your use case:

1:1

Square format. Good for social media posts.

16:9

Widescreen. Perfect for banners and headers.

9:16

Vertical. Ideal for mobile content.

3:4

Portrait. Great for photos and portraits.

Output Handling

Saving Generated Images

Generated images must be copied to the block’s output directory:
import shutil

# Block output directory (already exists).
# "{block_name}" is a placeholder: substitute your block's name.
block_output_dir = "data/{block_name}_data"

# Generate image
result = await tool.generate_image(prompt=prompt, config=config)

# Copy to block output directory
filename = f"generated_image.{result.format}"
final_path = f"{block_output_dir}/{filename}"
shutil.copy(result.file_path, final_path)

# Return output
output = {
    "image_path": final_path,
    "width": result.width,
    "height": result.height,
    "format": result.format
}
Always copy generated images to the block output directory. Temporary files are cleaned up after execution.
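
The copy-and-return steps above can be wrapped in one reusable function. A sketch assuming the result object exposes `file_path`, `width`, `height`, and `format` as in the example above; `save_to_block_output` and the `FakeResult` stand-in are hypothetical.

```python
import shutil
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FakeResult:
    """Stand-in for the tool's result, with only the fields used on this page."""
    file_path: str
    width: int
    height: int
    format: str


def save_to_block_output(
    result, block_output_dir: str, stem: str = "generated_image"
) -> dict:
    """Copy the generated image into the block output directory and build the output dict."""
    out_dir = Path(block_output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # harmless if it already exists
    final_path = out_dir / f"{stem}.{result.format}"
    shutil.copy(result.file_path, final_path)
    return {
        "image_path": str(final_path),
        "width": result.width,
        "height": result.height,
        "format": result.format,
    }
```

Passing a different `stem` per image avoids overwriting earlier files when a block generates more than one image.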

Style Instructions

Add style guidance to your prompts:
config = ImageGenerationConfig(
    model="gemini-3-pro-image-preview",
    aspect_ratio="16:9",
    style_instructions="Photorealistic, cinematic lighting, high detail"
)

result = await tool.generate_image(
    prompt="A futuristic city at night",
    config=config
)
Style instructions are automatically appended to your prompt to ensure consistent styling.
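
To make the appending behavior concrete, here is one way such a merge could work. The exact format the tool uses is not documented here; `build_prompt` and the "Style:" separator are assumptions for illustration.

```python
from typing import Optional


def build_prompt(prompt: str, style_instructions: Optional[str] = None) -> str:
    """Append style guidance to a prompt (assumed format, not the tool's actual one)."""
    prompt = prompt.strip()
    style = (style_instructions or "").strip()
    if not style:
        return prompt
    return f"{prompt}\n\nStyle: {style}"
```

Keeping the style text in the config rather than in each prompt means every image in a workflow gets the same styling without repeating it.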

Best Practices

Use Flash for speed, Pro for quality. Flash is faster and cheaper; Pro produces higher-quality images.
Detailed prompts produce better results. Include style, mood, and composition details.
Match aspect ratios to your use case. For example, use 16:9 for banners and 1:1 for social media.
Always copy generated images to the block output directory. Don’t rely on temporary paths.

Limitations

Model Limitations

  • Flash Model: Fixed image size, limited reference images (3)
  • Pro Model: Higher cost, slower generation
  • Aspect Ratios: Must match model capabilities
  • Reference Images: Limited by model (3 for Flash, 5 for Pro)

General Limitations

  • No animation: Static images only
  • No video: Cannot generate videos
  • File formats: PNG or JPEG only
  • Size limits: Maximum 4K for Pro model
Image generation consumes credits based on the model and image size. The Pro model and larger image sizes cost more credits.

Troubleshooting

Image generation fails

  • Check that the model supports the aspect ratio
  • Verify the prompt is clear and specific
  • Ensure config fields are set correctly
  • Check credit balance

Poor image quality

  • Try the Pro model for better quality
  • Add style instructions to the config
  • Be more specific in the prompt
  • Use a higher image size (Pro model)

Reference images not working

  • Verify image paths are correct
  • Check that the model supports the number of reference images
  • Ensure reference images are accessible
  • Check image format (PNG/JPEG)

Related

  • AI Module - Multi-agent system that uses image generation
  • Code Generation - How image generation code is created
  • Configs - Configure image generation settings

Image generation is a powerful feature for creating visuals. Experiment with different models, aspect ratios, and prompts to get the best results.