
Bulk Query & Extractor

Run the same analysis across many documents at once to compare data points and extract structured information from large document sets. Extractor mode adds automated entity identification and multi-field data extraction.

Written by Christa Jagnanan
Updated over 2 months ago

What is Bulk Query and when should I use it?

Bulk Query lets you run the same questions across multiple documents at once. It is well suited to:

  • Comparing terms across multiple contracts or agreements

  • Extracting specific data points from numerous CIMs (confidential information memoranda)

  • Analyzing consistent metrics across portfolio companies

  • Creating comparison matrices for investment opportunities

  • Running due diligence questions across an entire data room

Bulk Query vs. Chats - Which should I use?

  • Use Bulk Query when: You need structured, comparable answers across more than 5 documents

  • Use Chats when: You need conversational analysis or have 5 or fewer documents

  • Pro tip: The same threshold applies to CIM comparisons: use Chats for 5 or fewer, Bulk Query for more than 5

Creating and Running Bulk Queries

Step-by-Step Setup

  1. Navigate to the Deals menu and select your Deal

  2. Click "Bulk Query" from the Deal menu bar

  3. Click "+ New Bulk Query"

  4. Name your query descriptively (e.g., "Q4 2025 Portfolio EBITDA Analysis")

  5. Set privacy:

    • Deal Team: All team members can view and access results

    • Private: Only you can see the query and results

Document Selection Best Practices

Adding Documents:

  1. Click "+ Add Documents" in the documents section

  2. Select documents using one of these methods:

    • Individual selection: Click specific documents

    • Tag filtering: Use tags to select document groups

    • Select all: Choose all documents in the deal

Common Issue - "My documents aren't showing"

  • Ensure documents have "Ready" status (not still processing)

  • Check that you have access permissions for the documents

  • Refresh your browser if recently uploaded documents don't appear

Setting Up Questions Effectively

Adding Questions:

  1. Click "+ Add Questions"

  2. Type your first question in column A

  3. Add additional questions in subsequent columns (B, C, D, etc.)

  4. Use the Prompt Library (button at bottom) for pre-tested questions

Question Writing Tips:

  • Be specific: "What is the 2024 EBITDA?" instead of "What are the financials?"

  • One concept per question for cleaner results

  • Use consistent terminology across all questions

  • Avoid compound questions that might confuse the AI

Common Issue - "Questions aren't returning good results"

  • Check your Source Token Limit (see Advanced Configuration below)

  • Ensure questions are specific enough for the AI to locate information

  • Try rephrasing using exact terminology from your documents

Running and Managing Queries

Execution Process:

  1. Review selected documents and questions

  2. Click the "Run" button (top left of screen)

  3. Monitor progress in the status bar

  4. Processing time depends on:

    • Number of documents

    • Number of questions

    • Token limit settings

    • Model selected

Accessing Results:

  • Click your query name in the left navigation panel

  • Results appear in a grid format

  • Each cell shows the answer for that document/question combination

  • Empty cells indicate no relevant information found

Common Issue - "My bulk query is taking forever"

  • Large document sets (50+) may take 10-15 minutes

  • Reduce token limits for faster processing

  • Consider breaking into smaller batches for very large queries

  • Check system status for any platform-wide delays
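The batching suggestion above can be sketched in a few lines of Python. This only illustrates how a document list might be split up before you build the queries. The `split_into_batches` helper, the batch size of 25, and the file names are all assumptions for illustration, not platform features.

```python
from typing import List


def split_into_batches(documents: List[str], batch_size: int) -> List[List[str]]:
    """Split a list of document names into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [documents[i:i + batch_size] for i in range(0, len(documents), batch_size)]


# Hypothetical data room with 60 documents.
documents = [f"CIM_{n:02d}.pdf" for n in range(1, 61)]
batches = split_into_batches(documents, batch_size=25)

for index, batch in enumerate(batches, start=1):
    print(f"Batch {index}: {len(batch)} documents ({batch[0]} to {batch[-1]})")
```

Each batch would then go into its own Bulk Query with the same set of questions, so the exported results can be combined afterwards.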

Managing Duplicate Documents and Questions

You can remove duplicate documents and questions directly in the Bulk Query table.

To remove a duplicate document, right-click on the document column for the relevant entry. A context menu will appear presenting the option to remove the document.

The same process applies to duplicate questions: right-clicking on the question column for the applicable row will surface the same removal option.

Removing duplicates before you run an analysis keeps the results precise and free of redundant data.

Advanced Configuration

AI Model Settings:

  1. Click the Settings gear icon (top left)

  2. Configure these critical parameters:

Source Token Limit (Most Important Setting):

  • What it does: Controls how much document content the AI analyzes

  • Default: 5,000 tokens

  • Recommendations:

    • Simple data extraction: 2,000-3,000 tokens

    • Complex analysis: 5,000-8,000 tokens

    • Comprehensive review: 10,000+ tokens

  • Reference: 1 page ≈ 400-800 tokens
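As a rough aid for picking a Source Token Limit, the page-to-token reference above can be turned into simple arithmetic. The sketch below is a back-of-the-envelope estimate only. The function names are hypothetical, and it assumes the limit corresponds directly to page content, which this article does not confirm.

```python
from typing import Tuple

# Range taken from the reference figure above: 1 page is roughly 400-800 tokens.
TOKENS_PER_PAGE_LOW = 400
TOKENS_PER_PAGE_HIGH = 800


def estimate_tokens(pages: int) -> Tuple[int, int]:
    """Return a (low, high) token estimate for a given number of pages."""
    if pages < 0:
        raise ValueError("pages must not be negative")
    return pages * TOKENS_PER_PAGE_LOW, pages * TOKENS_PER_PAGE_HIGH


def pages_covered(token_limit: int) -> Tuple[float, float]:
    """Return the (fewest, most) pages a token limit is likely to cover."""
    return token_limit / TOKENS_PER_PAGE_HIGH, token_limit / TOKENS_PER_PAGE_LOW


low, high = estimate_tokens(10)
print(f"10 pages: about {low} to {high} tokens")

fewest, most = pages_covered(5000)
print(f"5,000 tokens: about {fewest} to {most} pages")
```

By this estimate, the default of 5,000 tokens corresponds to somewhere between 6 and 12 pages of content, which helps explain why long documents may need a higher limit.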

Model Selection:

  • Choose based on complexity needs

  • GPT-5 for numerical/financial data

  • Claude V4 for nuanced text analysis

  • Test with a small batch first to optimize selection

Common Issue - "Results are incomplete or missing information"

  • Increase Source Token Limit - this is the #1 fix

  • Ensure questions match document terminology

  • Check if information exists in tables (may need table extraction)

  • Verify document quality (clear scans, not password-protected)

Exporting and Sharing Results

Export Process:

  1. Click the "Export" button (top right)

  2. The system generates an Excel file with:

    • All questions as column headers

    • Documents as rows

    • Answers in corresponding cells
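To illustrate the layout described above, the following Python sketch works on a small in-memory sample shaped like the export, with questions as column headers and documents as rows. It assumes the Excel file has been saved as CSV and that the first column is named `Document`. Both are assumptions, and the sample values are made up. The sketch lists the document/question pairs that came back empty, which are candidates for a rerun with a rephrased question or a higher Source Token Limit.

```python
import csv
import io

# Hypothetical sample that mirrors the export layout (not real data).
SAMPLE_EXPORT = """Document,What is the 2024 EBITDA?,Where is the company headquartered?
Alpha CIM.pdf,$12.4M,Chicago
Beta CIM.pdf,,Austin
Gamma CIM.pdf,$8.1M,
"""


def find_empty_cells(csv_text, document_column="Document"):
    """Return (document, question) pairs whose answer cell is blank."""
    reader = csv.DictReader(io.StringIO(csv_text))
    empty = []
    for row in reader:
        for question, answer in row.items():
            if question == document_column:
                continue  # Skip the column that holds the document name.
            if not (answer or "").strip():
                empty.append((row[document_column], question))
    return empty


empty_cells = find_empty_cells(SAMPLE_EXPORT)
for document, question in empty_cells:
    print(f"No answer for {document!r}: {question}")
```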

Extractor Feature - Advanced Data Mining

What is Extractor? A specialized Bulk Query mode that automatically identifies entities (like companies, investments, or people) and extracts multiple data points for each entity across your documents.

When to Use Extractor vs. Standard Bulk Query:

  • Use Extractor when: You need to identify AND extract data for multiple entities (e.g., find all portfolio companies and get their CEO, revenue, and location)

  • Use Standard when: You know exactly what you're looking for and where

Setting Up Extractor:

  1. When creating a new query, select "Extractor" instead of the standard query type

  2. Define your entity type (e.g., "Portfolio Company", "Investment", "Management Team")

  3. List data points to extract (e.g., "CEO Name", "2024 Revenue", "Headquarters")

  4. Select documents

  5. Run extraction

Extractor Results:

  • Presents a structured table with:

    • Rows: Each identified entity

    • Columns: Your specified data points

    • Cells: Extracted information

  • Export-ready for immediate analysis

Common Issue - "Extractor missed some entities"

  • Be more specific with entity definition

  • Increase token limits to capture more context

  • Check if entities use different names/aliases in documents

  • Consider running multiple extractions with different entity definitions
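If you run several extractions and the same entity appears under different names, the exported tables can be reconciled outside the platform. The Python sketch below is one possible approach, not a platform feature. The `Entity` column name, the alias map, and the sample rows are all hypothetical.

```python
from typing import Dict, List

# Hypothetical alias map: each known variant points to one canonical entity name.
ALIASES = {
    "acme": "Acme Holdings",
    "acme holdings": "Acme Holdings",
    "acme holdings, inc.": "Acme Holdings",
    "northwind": "Northwind Partners",
}


def canonical_name(name: str) -> str:
    """Map a name to its canonical form, ignoring case and extra spaces."""
    key = " ".join(name.lower().split())
    return ALIASES.get(key, name.strip())


def merge_runs(runs: List[List[Dict[str, str]]]) -> Dict[str, Dict[str, str]]:
    """Merge rows from several extraction runs, keyed by canonical entity name.

    The first non-empty value seen for each data point is kept.
    """
    merged: Dict[str, Dict[str, str]] = {}
    for rows in runs:
        for row in rows:
            record = merged.setdefault(canonical_name(row["Entity"]), {})
            for field, value in row.items():
                if field != "Entity" and value and field not in record:
                    record[field] = value
    return merged


# Hypothetical rows from two extraction runs with different entity definitions.
run_one = [{"Entity": "Acme", "CEO Name": "J. Doe", "Headquarters": ""}]
run_two = [
    {"Entity": "ACME Holdings, Inc.", "CEO Name": "", "Headquarters": "Boston"},
    {"Entity": "Northwind", "CEO Name": "A. Smith", "Headquarters": "Denver"},
]

merged = merge_runs([run_one, run_two])
for entity, fields in merged.items():
    print(entity, fields)
```

Keeping the first non-empty value is a deliberate simplification. If two runs disagree on a data point, review the conflict manually rather than relying on run order.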
