How to Use Copilot Studio to Answer Questions About Images

March 4, 2025

We are always exploring the latest advancements in AI to help businesses optimize their workflows and leverage Microsoft Copilot Studio effectively. Copilot Studio can now answer questions based on uploaded images and documents, opening up new possibilities for AI-powered interactions.

This guide explains how to use Copilot Studio to answer questions about images stored in documents. We will cover uploading files, indexing content, querying the AI for insights, and real-world applications.

Setting Up Copilot Studio for Image-Based Queries

To take advantage of Copilot Studio’s document and image processing capabilities, start by creating an empty Copilot Studio agent and uploading a document to Copilot’s knowledge storage system.

Create New Agent In Copilot Studio & Upload Knowledge

For this example, a PDF file containing deck plans of a Holland America cruise ship is used. This document contains multiple floor layouts, images, and textual data, making it an ideal test case for AI-powered search and summarization.

How Copilot Studio Handles Uploaded Documents

When a document is uploaded to Copilot Studio, it gets stored in Microsoft Dataverse. The document then undergoes vectorization and chunking, which breaks the content down into searchable units. The system needs some time to process the file before it becomes fully indexed and ready for queries.

Here’s what happens in the background:

  1. Storage in Dataverse – The file is stored in the AI agent’s knowledge repository.
  2. Vectorization – The document's content, including text and images, is converted into an AI-searchable format.
  3. Indexing and Chunking – The AI system segments the document into smaller, easily retrievable chunks.
  4. Ready for Queries – Once processing is complete, Copilot Studio can retrieve relevant data from the document.

Knowledge Indexed & Ready

Key Insight: At present, Dataverse is the only supported storage method for image-based document search in Copilot Studio.
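
To make the background steps above more concrete, here is a minimal Python sketch of chunking, vectorizing, and retrieving text. It is an illustration only and does not show how Copilot Studio or Dataverse is implemented: the word-count vectors stand in for real embeddings, the function names (chunk_text, vectorize, build_index, search) are hypothetical, and the sample text is made up rather than taken from the deck plans.

```python
import math
import re
from collections import Counter

# Hypothetical sample text; NOT taken from the real deck plans.
SAMPLE = (
    "Deck A has the casino and the theater. "
    "Deck B has the pool and the sports court. "
    "Deck C has the spa and the fitness center."
)


def chunk_text(text: str) -> list:
    """Chunking: split the text into sentence-sized pieces."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def vectorize(text: str) -> Counter:
    """Toy stand-in for an embedding: a lowercase word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(text: str) -> list:
    """Indexing: store each chunk next to its vector."""
    return [(chunk, vectorize(chunk)) for chunk in chunk_text(text)]


def search(index: list, query: str) -> str:
    """Retrieval: return the chunk most similar to the query."""
    query_vec = vectorize(query)
    return max(index, key=lambda item: cosine(item[1], query_vec))[0]


index = build_index(SAMPLE)
print(search(index, "Where is the casino?"))
```

With this made-up sample, the query about the casino returns the "Deck A" sentence, because that chunk shares the most words with the question. A production system would use learned embeddings, and for image content some form of image understanding, rather than word counts.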

"Rotterdam" example of the images stored in the uploaded file

Querying Image Data in Copilot Studio

With the deck plans uploaded and indexed, the next step is interacting with the AI agent to extract information. The built-in test canvas can be used to simulate a conversation with Copilot Studio.

Step 1: Asking Questions About the Document

To test Copilot Studio’s ability to analyze images and structured data, you can ask:

👉 “What deck is the casino on the Rotterdam?”

  • The AI system retrieves the relevant section of the document and correctly identifies the deck where the casino is located.
  • The response is generated using Microsoft’s generative orchestration layer, which enables intelligent data retrieval and contextual awareness.

Step 2: Maintaining Conversational Context

One of the powerful features of Copilot Studio’s AI orchestration is context retention. Instead of requiring users to restate previous questions, the system can infer context and continue the conversation naturally.

For example:

👉 “How do you get to the sports court from there?”

  • The AI understands that “there” refers to the casino’s deck.
  • It provides instructions on how to navigate from the casino to the sports court.

Similarly, when asked:

👉 “What is the closest bar to the Seaview Pool?”

  • The AI retains context and retrieves the nearest bar’s location without needing additional details.

This contextual awareness makes Copilot Studio well suited to real-time wayfinding and navigation, because users can interact naturally without repeating details in every query. Instead of restarting the conversation with each new question, the AI maintains an understanding of previously discussed locations, which makes interactions smoother and more efficient. This capability is particularly valuable in dynamic environments such as airports, shopping malls, office buildings, and cruise ships, where users may need step-by-step directions or location-based information while on the move.
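
The idea behind context retention can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Copilot Studio's orchestration logic: the ConversationContext class and its resolve method are hypothetical names, and the rule used here (replace "there" with the most recently mentioned known place) is far simpler than what a generative model does.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ConversationContext:
    """Hypothetical helper that remembers the last place a user mentioned."""

    known_places: Tuple[str, ...]
    last_place: Optional[str] = None

    def resolve(self, question: str) -> str:
        """Rewrite 'there' using the remembered place, then update memory."""
        resolved = question
        if self.last_place is not None:
            remembered = self.last_place
            resolved = re.sub(
                r"\bthere\b",
                lambda _match: f"the {remembered}",
                question,
                flags=re.IGNORECASE,
            )

        # Remember whichever known place appears last in the ORIGINAL question.
        lowered = question.lower()
        mentions = [(lowered.rfind(p), p) for p in self.known_places if p in lowered]
        if mentions:
            self.last_place = max(mentions)[1]
        return resolved


ctx = ConversationContext(known_places=("casino", "sports court", "seaview pool"))
first = ctx.resolve("What deck is the casino on the Rotterdam?")
second = ctx.resolve("How do you get to the sports court from there?")
print(second)
```

After the first question the helper remembers "casino", so the second question is rewritten to ask for the route "from the casino" before it would be passed on to retrieval.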

Advanced Queries with Copilot Studio

With structured data indexed, the system can also answer more specific queries, such as:

👉 “Is room VC 6154 a handicapped-accessible room?”

  • The AI checks the legend within the deck plan to determine the accessibility features of that specific room.

👉 “What deck is that room on?”

  • Since the system has already identified room VC 6154, it understands the reference and provides the correct deck number without requiring the user to repeat details.
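
Conceptually, answering these two questions amounts to a legend lookup followed by a follow-up query on the same record. The sketch below shows that idea with plain Python data structures. Everything in it is an assumption made for illustration: the room IDs, deck numbers, and legend symbols are invented placeholders (not values read from the actual deck plans), and Copilot Studio does not expose its extracted data in this form.

```python
from typing import NamedTuple


class Room(NamedTuple):
    deck: int
    symbols: frozenset


# Hypothetical legend: symbol -> meaning (placeholder values only).
LEGEND = {
    "A": "wheelchair accessible",
    "S": "sofa bed",
    "C": "connecting rooms",
}

# Hypothetical rooms; the IDs and decks are invented for this example.
ROOMS = {
    "R-101": Room(deck=1, symbols=frozenset({"A", "S"})),
    "R-202": Room(deck=2, symbols=frozenset({"C"})),
}


def describe_room(room_id: str) -> list:
    """Translate a room's legend symbols into readable features."""
    return sorted(LEGEND[symbol] for symbol in ROOMS[room_id].symbols)


def is_accessible(room_id: str) -> bool:
    """Check the legend for the accessibility feature."""
    return "wheelchair accessible" in describe_room(room_id)


room_id = "R-101"  # the room the conversation is currently about
print(is_accessible(room_id))   # first question: accessibility
print(ROOMS[room_id].deck)      # follow-up: "What deck is that room on?"
```

Keeping room_id between the two lookups plays the same role as the agent remembering which room "that room" refers to.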

Your agent’s ability to interpret structured data within images and documents enables powerful search and retrieval capabilities for businesses, allowing users to extract precise information without manually scanning through files. By leveraging AI-driven indexing and contextual understanding, Copilot Studio can analyze floor plans, schematics, blueprints, and even product catalogs, making it easier to navigate complex datasets. This functionality is especially valuable in industries like real estate, logistics, healthcare, and retail, where quick access to visual and textual data can streamline operations, enhance decision-making, and improve customer experiences.

Current Limitations and Future Possibilities

While this feature is a breakthrough for AI-powered image and document search, there are some limitations:

  • Currently, only files uploaded directly into Dataverse support image-based search.
  • Not all image types are fully indexed: some complex floor plans may not be fully recognized.
  • Metadata integration is lacking: additional labeling options could improve AI-driven search results.

However, future updates may enhance Copilot Studio’s ability to:

  • Index images from more sources (e.g., SharePoint, OneDrive).
  • Provide more detailed spatial reasoning and route mapping.
  • Support additional file formats and metadata tagging for better indexing.

As Copilot Studio evolves, businesses will have more advanced AI-powered document retrieval and wayfinding capabilities.

Our Take

Copilot Studio’s new ability to analyze images and documents is a game-changer for AI-powered search and navigation. By leveraging Dataverse storage, vectorized indexing, and generative AI orchestration, users can now ask highly specific questions and get accurate answers from structured image data.

At Digital Bricks, we specialize in creating intelligent, industry-specific AI agents that leverage Copilot Studio to deliver highly efficient, context-aware solutions. Our team of AI and automation experts works with organizations across retail, logistics, real estate, finance, and more to develop tailored AI-powered search and navigation tools. We design scalable, intelligent agents that optimize customer interactions, streamline employee workflows, and drive operational efficiency.

With a deep understanding of Copilot Studio’s advanced AI capabilities, we can help your business integrate AI-driven navigation, document search, and real-time contextual assistance into your existing systems. We take Copilot Studio to the next level by integrating it with APIs and Robotic Process Automation (RPA) to connect agents with external data sources. This enables businesses to extend Copilot’s capabilities beyond static document searches, allowing real-time data retrieval, automation, and seamless interaction with third-party systems.

Want to transform your business with AI-powered Copilot agents? Get in touch with Digital Bricks today at info@digitalbricks.ai