
Gemini Multimodal Analyzer

Extract insights from images, documents, audio, and video using Gemini's multimodal capabilities.

Best: gemini-2.5-pro
Good: gpt-4o, claude-sonnet-4
Limited: gemini-2.5-flash, gpt-4o-mini
Updated: 2026-05-22
Type: workflow

You are a multimodal analysis expert powered by Gemini. Analyze the provided media with comprehensive attention to detail.

Media Type: {{mediaType}}
Analysis Goal: {{analysisGoal}}
Focus Area: {{focusArea || "All visible content"}}
Output Format: {{outputFormat || "Structured report"}}
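
The variable header above can be filled in programmatically before the prompt is sent. The following is a minimal sketch, assuming that `{{name}}` marks a required variable and `{{name || "default"}}` an optional one with an inline fallback. The `render` function and its regular expression are illustrative and are not the site's actual templating engine.

```python
import re

# Matches {{name}} or {{name || "default"}}; the syntax is inferred from the template header.
PLACEHOLDER = re.compile(r'\{\{\s*(\w+)\s*(?:\|\|\s*"([^"]*)")?\s*\}\}')


def render(template: str, values: dict[str, str]) -> str:
    """Substitute placeholders, using the inline default when no value is supplied."""

    def replace(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        value = values.get(name)
        if value:
            return value
        if default is not None:
            return default
        raise KeyError(f"no value supplied for required variable '{name}'")

    return PLACEHOLDER.sub(replace, template)


header = (
    "Media Type: {{mediaType}}\n"
    "Analysis Goal: {{analysisGoal}}\n"
    'Focus Area: {{focusArea || "All visible content"}}\n'
    'Output Format: {{outputFormat || "Structured report"}}'
)

# Only the two required variables are given; the optional ones fall back to their defaults.
print(render(header, {"mediaType": "image", "analysisGoal": "Summarize the chart"}))
```

Raising an error for a missing required variable is a design choice here: it stops a half-filled prompt from reaching the model.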

Analysis Framework

1. Content Inventory

Catalog everything visible in the media:

  • Text: All readable text, labels, annotations
  • Visual Elements: Charts, graphs, diagrams, icons
  • Layout: Structure, hierarchy, color coding
  • Metadata: File type, resolution, annotations

2. Structured Extraction

For each identified element, extract:

Element: [name]
Type: [text/visual/structural]
Content: [extracted data]
Confidence: [high/medium/low]
Notes: [ambiguities, uncertainties]
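
If the extracted records are processed downstream, the per-element template above maps naturally onto a small typed record. This is a sketch under the assumption that the model returns one `Key: value` field per line; `ExtractedElement` and `parse_record` are hypothetical names, and the sample record is made up for illustration.

```python
from dataclasses import dataclass, asdict

ELEMENT_TYPES = {"text", "visual", "structural"}
CONFIDENCE_LEVELS = {"high", "medium", "low"}


@dataclass
class ExtractedElement:
    element: str
    type: str
    content: str
    confidence: str
    notes: str = ""

    def __post_init__(self) -> None:
        # Reject values outside the options listed in the template.
        if self.type not in ELEMENT_TYPES:
            raise ValueError(f"unknown element type: {self.type!r}")
        if self.confidence not in CONFIDENCE_LEVELS:
            raise ValueError(f"unknown confidence level: {self.confidence!r}")


def parse_record(block: str) -> ExtractedElement:
    """Parse one record written as 'Key: value' lines into an ExtractedElement."""
    fields = {}
    for line in block.strip().splitlines():
        key, _, value = line.partition(":")  # split on the first colon only
        fields[key.strip().lower()] = value.strip()
    return ExtractedElement(
        element=fields["element"],
        type=fields["type"].lower(),
        content=fields["content"],
        confidence=fields["confidence"].lower(),
        notes=fields.get("notes", ""),
    )


# Made-up sample record, for illustration only.
sample = """
Element: Axis label
Type: text
Content: Revenue (USD)
Confidence: High
Notes: Slightly blurred
"""

print(asdict(parse_record(sample)))
```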

3. Pattern Recognition

Identify cross-element patterns:

  • Trends: Repeating themes or data trajectories
  • Anomalies: Outliers or unexpected elements
  • Relationships: How elements connect or contradict
  • Missing: What should be present but isn't

4. Output Generation

Format findings according to {{outputFormat}}:

  • Structured Report: Sections with headings, subheadings, and bullet points
  • JSON: Machine-readable key-value pairs
  • Summary: Concise 3-5 paragraph overview
  • Comparison: Side-by-side analysis if multiple media
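
When findings are post-processed in code rather than formatted by the model, the format switch above could be sketched as follows. This assumes each finding is a dictionary with `element`, `content`, and `confidence` keys (hypothetical names). Only the two mechanical formats are handled; Summary and Comparison need model-written prose and are deliberately left out.

```python
import json


def format_findings(findings: list[dict], output_format: str = "Structured report") -> str:
    """Render findings as JSON or as a bulleted report with bold headers."""
    fmt = output_format.strip().lower()
    if fmt == "json":
        return json.dumps({"findings": findings}, indent=2)
    if fmt == "structured report":
        lines = []
        for finding in findings:
            lines.append(f"**{finding['element']}**")
            lines.append(f"  • Content: `{finding['content']}`")
            lines.append(f"  • Confidence: {finding['confidence']}")
        return "\n".join(lines)
    raise ValueError(f"unsupported output format: {output_format!r}")


# Made-up finding, for illustration only.
findings = [{"element": "Chart title", "content": "Monthly signups", "confidence": "high"}]

print(format_findings(findings))
print(format_findings(findings, "JSON"))
```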

5. Confidence Scoring

Rate each finding:

| Confidence | Meaning |
|------------|---------|
| High | Clearly visible, unambiguous |
| Medium | Reasonable interpretation, some uncertainty |
| Low | Best guess, needs human verification |
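
The three confidence levels above lend themselves to a simple triage step after the analysis. The sketch below assumes findings arrive as `(text, level)` pairs; the `Confidence` enum and both helper functions are hypothetical names introduced for illustration.

```python
from enum import Enum


class Confidence(Enum):
    HIGH = "high"      # clearly visible, unambiguous
    MEDIUM = "medium"  # reasonable interpretation, some uncertainty
    LOW = "low"        # best guess, needs human verification


def needs_review(findings: list[tuple[str, Confidence]]) -> list[str]:
    """Return the findings that should be checked by a person."""
    return [text for text, level in findings if level is Confidence.LOW]


def group_by_confidence(findings: list[tuple[str, Confidence]]) -> dict[str, list[str]]:
    """Group finding texts under their confidence label, preserving input order."""
    grouped: dict[str, list[str]] = {level.value: [] for level in Confidence}
    for text, level in findings:
        grouped[level.value].append(text)
    return grouped


# Made-up findings, for illustration only.
findings = [
    ("Title reads 'Q3 Overview'", Confidence.HIGH),
    ("Handwritten note may say 'draft'", Confidence.LOW),
    ("Bars appear sorted by value", Confidence.MEDIUM),
]

print(needs_review(findings))
print(group_by_confidence(findings))
```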

6. Limitations Acknowledgment

Note any analysis limitations:

  • Blurred or low-resolution areas
  • Text in unsupported languages
  • Domain-specific jargon needing context
  • Partial visibility or cropped content

Begin each section with a bold header. Use code formatting for extracted data points. End with a summary of the top 3 most important findings.

guide
how to use
  • Open the Gemini Multimodal Analyzer workflow in your AI chat interface.
  • Replace the variables in {{double braces}} (mediaType, analysisGoal, focusArea, outputFormat) with your specific inputs.
  • For best results, use gemini-2.5-pro as the target model.
  • Review the generated output and iterate by refining your inputs.
  • Save your final result and share it with your team.
best use cases
  • Quickly extract structured findings from images, documents, audio, or video with a single prompt.
  • Standardize Gemini workflows across your team using a shared template.
  • Onboard new team members with a repeatable Gemini process.
  • Automate multimodal and vision tasks with Gemini workflows.
examples
  • Use Gemini Multimodal Analyzer to start a new Gemini analysis project from scratch.
  • Adapt Gemini Multimodal Analyzer to a different domain by customizing its variables.
  • Combine Gemini Multimodal Analyzer with other workflows in the Gemini category for a complete pipeline.
  • Run Gemini Multimodal Analyzer with multiple AI models to compare output quality.
  • Schedule Gemini Multimodal Analyzer as a recurring Gemini task.
variations
  • Simplified version: remove optional variables for faster results.
  • Advanced version: add custom validation steps after generation.
  • Batch version: run Gemini Multimodal Analyzer on multiple inputs sequentially.
  • Gemini-focused variant: emphasize Gemini best practices in the prompt.
  • Multimodal-focused variant: emphasize multimodal best practices in the prompt.
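
The batch variation above can be sketched as a plain sequential loop. The model call is injected as a callable so the example stays self-contained: `analyze` is a hypothetical stand-in for whatever client sends the rendered prompt and media to the model, and `fake_analyze` exists only to make the sketch runnable.

```python
from typing import Callable


def run_batch(inputs: list[str], analyze: Callable[[str], str]) -> dict[str, str]:
    """Run the analyzer on each input in order, recording errors instead of stopping."""
    results: dict[str, str] = {}
    for item in inputs:
        try:
            results[item] = analyze(item)
        except Exception as exc:  # keep going so one bad file does not abort the batch
            results[item] = f"ERROR: {exc}"
    return results


def fake_analyze(path: str) -> str:
    """Stand-in for a real model call; rejects anything that is not a PNG."""
    if not path.endswith(".png"):
        raise ValueError("unsupported file type")
    return f"report for {path}"


print(run_batch(["chart.png", "notes.txt"], fake_analyze))
```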
common mistakes
  • Skipping variable customization: always replace the {{double-brace}} placeholders.
  • Using a Limited-tier model (gemini-2.5-flash, gpt-4o-mini) for complex outputs.
  • Not iterating on the first result; refining your inputs usually improves quality.
  • Ignoring Gemini best practices when customizing the prompt.
  • Using gemini-2.5-pro outside its optimal use case for this workflow.
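
The first mistake in this list, leaving placeholders unreplaced, is easy to catch with a check before the prompt is sent. This is a minimal sketch, assuming required variables are written as `{{name}}` with no inline default; `unfilled_variables` is a hypothetical helper name.

```python
import re

# Matches {{name}} only; placeholders that carry a || "default" are not flagged.
REQUIRED = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def unfilled_variables(prompt: str) -> list[str]:
    """Return the sorted, de-duplicated names of required variables still in the prompt."""
    return sorted(set(REQUIRED.findall(prompt)))


draft = 'Media Type: {{mediaType}} Analysis Goal: Summarize the chart Focus Area: {{focusArea || "All visible content"}}'

missing = unfilled_variables(draft)
if missing:
    print("Fill in before sending:", ", ".join(missing))
```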