
Preparing Spreadsheet Data

Achieve maximum-value outputs with high-quality inputs.


If there is one golden rule when using automated analytics and AI tools, it is this: The cleaner your inputs, the better your results.

Aether can retrieve large amounts of data and transform it into relevant, well-structured narratives. With complex spreadsheets, however, it works best when the input is clear and well-organised, much as a new colleague needs context and clarity before they can contribute effectively.

This guide shows you how to prepare spreadsheet data so Aether can understand, analyse, and apply it correctly.

Spreadsheet pre-upload checklist

Ask yourself: would a new starter immediately understand this file? If not, refine before uploading.

  • ✅ Headers and titles are clear and descriptive

  • ✅ Each worksheet contains one table with one header row

  • ✅ Columns are consistent in data type

  • ✅ Tables are “tall” (more rows, fewer columns) rather than “wide”

  • ✅ Units of measurement are clear and fully written out

Aether upload capabilities:
• Supports files up to 20MB
• Up to 100 tables per file
• Tables of up to 10,000 cells (e.g. 10 columns x 1,000 rows)
• Handles many sheets per file
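
The limits above can be checked before uploading. The following is a minimal sketch, not part of Aether itself: the function name `check_limits`, the list-of-rows representation of a table, reading "20MB" as 20 x 1024 x 1024 bytes, and counting the header row towards the cell total are all assumptions made for illustration.

```python
# Limits taken from the capability list above. Everything else here
# (names, data layout, MiB interpretation) is an illustrative assumption.
MAX_FILE_BYTES = 20 * 1024 * 1024
MAX_TABLES_PER_FILE = 100
MAX_CELLS_PER_TABLE = 10_000


def check_limits(file_bytes, tables):
    """Return a list of problems; an empty list means within limits.

    `tables` maps a table name to a list of rows (header row included).
    """
    problems = []
    if file_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {file_bytes} bytes, over the 20MB limit")
    if len(tables) > MAX_TABLES_PER_FILE:
        problems.append(
            f"{len(tables)} tables, over the limit of {MAX_TABLES_PER_FILE}"
        )
    for name, rows in tables.items():
        cells = sum(len(row) for row in rows)
        if cells > MAX_CELLS_PER_TABLE:
            problems.append(
                f"table '{name}' has {cells} cells, over {MAX_CELLS_PER_TABLE}"
            )
    return problems


# A table of 10 columns x 1,001 rows is just over the cell limit.
too_big = {"Sales": [["x"] * 10 for _ in range(1001)]}
print(check_limits(1_000_000, too_big))
```

To check a real file, its size in bytes could be read with `os.path.getsize(path)` from the standard library.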


Correct vs. Incorrect Example

Good (tall, tidy) table example

Monthly Widget Sales 2024

This table shows the number of widgets sold to each customer each month in 2024, with revenue in NZD.

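Customer Name | Month    | Product  | Quantity Sold | Revenue ($NZD)
--------------+----------+----------+---------------+---------------
Alice Smith   | January  | Widget A | 3             | 60
Alice Smith   | February | Widget A | 4             | 80
Bob Jones     | March    | Widget B | 2             | 40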

Bad (wide, ambiguous) table example

Rev FY24

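Customer Name | Jan Qty | Jan Val | Feb Qty | Feb Val | Mar Qty | Mar Val
--------------+---------+---------+---------+---------+---------+--------
Alice Smith   | 3       | 60      | 4       | 80      |         |
Bob Jones     |         |         |         |         | 2       | 40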

❌ Why this fails:

  • Abbreviated headers (“Qty,” “Val”) are unclear.

  • Wide format makes the table hard to interpret or reuse.

  • The title “Rev FY24” gives little context.


Detailed Best Practices

1. Add Clear Titles and Descriptions

Aether looks to titles and notes to understand what a table represents. Without them, the data may be misinterpreted.

  • Use: Above each table (or in a nearby note), add:

    • Title, e.g. “Monthly Widget Sales 2024”

    • Description, e.g. “This table shows the number of widgets sold to each customer each month in 2024, with revenue in NZD.”

  • Don’t: Use labels like "Table 1" or "Report Data".

  • Where should these go?

    • Preferred: Place the title in the first row above the table, and the description just beneath or in a “Notes” column/row.

    • Acceptable alternative: Use a separate sheet (“About” or “Notes”) summarising all tables.

2. Use Descriptive Headers

Headers are the first signals Aether reads. Ambiguous codes or duplicates reduce accuracy.

  • Use: Full, descriptive words (Product Name, Order Date, Quantity Sold), not codes or abbreviations.

  • Don’t: Use headers like PN, OD, Qty, or have duplicate column names.
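
A header row can be screened for these problems before uploading. The sketch below is one possible check, not an Aether feature: the function `review_headers` and the `SUSPECT_ABBREVIATIONS` list are hypothetical, and the list should be extended with the codes used in your own files.

```python
from collections import Counter

# Hypothetical list of abbreviations to flag; adjust for your own data.
SUSPECT_ABBREVIATIONS = {"pn", "od", "qty", "val", "amt"}


def review_headers(headers):
    """Return warnings for blank, duplicate, or abbreviated headers."""
    warnings = []
    cleaned = [header.strip() for header in headers]
    counts = Counter(header.lower() for header in cleaned if header)
    for position, header in enumerate(cleaned, start=1):
        if not header:
            warnings.append(f"column {position}: header is blank")
            continue
        if counts[header.lower()] > 1:
            warnings.append(f"column {position}: duplicate header '{header}'")
        words = header.lower().replace("_", " ").split()
        if any(word in SUSPECT_ABBREVIATIONS for word in words):
            warnings.append(f"column {position}: '{header}' looks abbreviated")
    return warnings


for warning in review_headers(["Customer Name", "Jan Qty", "Jan Val"]):
    print(warning)
```

Headers such as Product Name, Order Date, and Quantity Sold pass this check without warnings.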

3. Tall, Not Wide

Long-form (“tall”) data = one row per observation. This makes it easier for Aether to aggregate, compare, and interpret trends.

  • Tall example: Month and Quantity Sold are columns, with multiple rows per customer.

  • Wide example: months spread across columns → harder to process and scale.
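
To illustrate the difference, the wide "Rev FY24" table from the earlier example can be reshaped into tall form. This is a minimal sketch assuming each row is a dictionary keyed by header and that the month columns follow the "Jan Qty" / "Jan Val" naming pattern; the function name `wide_to_tall` is made up for this example. The wide table has no Product column, so the tall output cannot include one either.

```python
# Maps the abbreviated month prefixes used in the wide table to full names.
MONTH_NAMES = {"Jan": "January", "Feb": "February", "Mar": "March"}


def wide_to_tall(wide_rows):
    """Turn one row per customer into one row per customer per month."""
    tall_rows = []
    for row in wide_rows:
        for short, full in MONTH_NAMES.items():
            quantity = row.get(f"{short} Qty", "")
            revenue = row.get(f"{short} Val", "")
            if quantity == "" and revenue == "":
                continue  # nothing recorded for this customer in this month
            tall_rows.append({
                "Customer Name": row["Customer Name"],
                "Month": full,
                "Quantity Sold": quantity,
                "Revenue ($NZD)": revenue,
            })
    return tall_rows


wide = [
    {"Customer Name": "Alice Smith",
     "Jan Qty": 3, "Jan Val": 60, "Feb Qty": 4, "Feb Val": 80},
    {"Customer Name": "Bob Jones", "Mar Qty": 2, "Mar Val": 40},
]
tall = wide_to_tall(wide)
for row in tall:
    print(row)
```

The result has three rows (Alice Smith for January and February, Bob Jones for March), matching the layout of the good example.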

4. One Table, One Header

AI tools expect a single, simple table per sheet, with one clear header row. Multiple header rows (multi-index), sub-headers, or complex formatting make it hard for AI to determine the structure, boundaries, and meaning of your data.

  • Use: Each table should be self-contained with just one row of headers, followed by data rows.

  • Don’t: Stack multiple tables vertically or horizontally on the same sheet, use merged header cells, or create “report-style” layouts.

5. Clear Table Boundaries and Ranges

Tables must have clear, unambiguous boundaries so that both people and automated tools can reliably identify where a table begins and ends. This prevents accidental data merges or omissions.

  • Use Excel’s native Insert Table feature to formally define your table’s region.

  • If more than one table must share a sheet, always leave at least one full empty row and column between them.

  • Never place tables directly adjacent or with only a single blank cell as a separator.

  • Remove stray or floating data from outside your main table.

  • Use:

    • Sheet1 → Table 1

    • Sheet2 → Table 2

  • Don’t: Place two unrelated tables in the same worksheet or stack one beneath the other.
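
One way to spot vertically stacked tables is to look for runs of non-empty rows separated by fully empty rows. The sketch below assumes the sheet has already been read into a list of rows (for example with the `csv` module); the function name `find_row_blocks` is hypothetical, and the check does not detect tables placed side by side.

```python
def find_row_blocks(grid):
    """Return 1-indexed (first_row, last_row) pairs for each run of non-empty rows."""
    blocks = []
    start = None
    for number, row in enumerate(grid, start=1):
        is_empty = all(cell is None or str(cell).strip() == "" for cell in row)
        if not is_empty and start is None:
            start = number  # a new block begins here
        elif is_empty and start is not None:
            blocks.append((start, number - 1))  # the block ended on the previous row
            start = None
    if start is not None:
        blocks.append((start, len(grid)))
    return blocks


sheet = [
    ["Customer Name", "Quantity Sold"],
    ["Alice Smith", 3],
    ["", ""],
    ["Product", "Stock Level"],
    ["Widget A", 12],
]
blocks = find_row_blocks(sheet)
if len(blocks) > 1:
    print(f"Found {len(blocks)} separate blocks; consider one table per sheet.")
```

More than one block suggests the sheet holds stacked tables, or a title or note separated from its table by a blank row, and is worth reviewing by hand.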

6. Summarised and Contextualised, Not Raw Data Dumps

Clean, summarised data, containing only the columns needed for the intended analysis, is more effective and less likely to cause errors or confusion.

  • Use: Only columns that are required, with human-readable values and summarised where appropriate (e.g. monthly totals instead of individual transactions, unless all are required).

  • Don’t: Export all system columns, cryptic flags, or intermediate calculations unless they’re necessary.
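
As an illustration of summarising before upload, individual transactions can be rolled up into monthly totals. This is a sketch under stated assumptions: the transaction rows and their dates are invented for the example, and the function name `monthly_totals` is hypothetical.

```python
from collections import defaultdict
from datetime import date


def monthly_totals(transactions):
    """Sum quantity and revenue per customer per month (YYYY-MM)."""
    totals = defaultdict(lambda: {"Quantity Sold": 0, "Revenue ($NZD)": 0})
    for transaction in transactions:
        month = transaction["Order Date"].strftime("%Y-%m")
        key = (transaction["Customer Name"], month)
        totals[key]["Quantity Sold"] += transaction["Quantity Sold"]
        totals[key]["Revenue ($NZD)"] += transaction["Revenue ($NZD)"]
    return [
        {"Customer Name": customer, "Month": month, **sums}
        for (customer, month), sums in sorted(totals.items())
    ]


# Invented transaction-level data, for illustration only.
transactions = [
    {"Customer Name": "Alice Smith", "Order Date": date(2024, 1, 5),
     "Quantity Sold": 1, "Revenue ($NZD)": 20},
    {"Customer Name": "Alice Smith", "Order Date": date(2024, 1, 19),
     "Quantity Sold": 2, "Revenue ($NZD)": 40},
    {"Customer Name": "Alice Smith", "Order Date": date(2024, 2, 2),
     "Quantity Sold": 4, "Revenue ($NZD)": 80},
]
summary = monthly_totals(transactions)
for row in summary:
    print(row)
```

Three transaction rows become two summary rows, one per customer per month, which is usually enough for trend analysis.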

7. Consistent, Precise Units and Column Definitions

Column names like Units, Volume, or Value are too vague. Always specify what the measure is: are “units” the number of sales, items in stock, or something else? Are “volumes” litres, tonnes, or boxes?

  • Use: Quantity Sold (units), Revenue ($NZD), Order Volume (litres)

  • Don’t: Units, Volume, Value, Amount

8. Consistent Data Types

Columns should contain only one data type: all numbers, all text, or all dates. Avoid mixing numbers with units (5kg, 10), or text with numbers (Paid, 12, Pending). Use ISO date formats for clarity.

  • Use: 2025-06-30 for dates, 23.5 for numbers, and In Stock or Out of Stock for status.

  • Don’t: 30th June, twenty, 12kg
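
A column can be checked for mixed types with a small classifier. The sketch below is an assumption about how such a check might look, not a description of how Aether parses data; the function names `classify` and `column_types` are made up, and blank or NA cells are treated as missing rather than as a type.

```python
import re
from datetime import date

# Matches the shape YYYY-MM-DD; the calendar check happens separately below.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def classify(value):
    """Label a cell as 'missing', 'date', 'number', or 'text'."""
    text = str(value).strip()
    if text == "" or text == "NA":
        return "missing"
    if ISO_DATE.match(text):
        try:
            date.fromisoformat(text)
            return "date"
        except ValueError:
            return "text"  # right shape, but not a real calendar date
    try:
        float(text)
        return "number"
    except ValueError:
        return "text"


def column_types(values):
    """Return the set of types found in a column, ignoring missing cells."""
    return {classify(value) for value in values} - {"missing"}


print(column_types(["23.5", "10", ""]))
print(column_types(["5kg", "10"]))
```

A consistent column yields a single type; a column such as 5kg, 10 yields both text and number and needs cleaning.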

9. Use ISO Dates and Handle Missing Data Consistently

Consistent date formats and missing value codes reduce ambiguity and improve parsing reliability. Dates should use the format YYYY-MM-DD and missing data should be either blank or marked as NA, never as “-”, “.”, or other placeholders.
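
Both rules can be applied with a short clean-up pass. The following is a minimal sketch: the list of input date formats and the set of placeholders are assumptions about what a file might contain, the function names are hypothetical, and the day-first reading of dates such as 30/06/2025 must be confirmed against your own data.

```python
from datetime import datetime

# Assumed input formats; list only those that really occur in your file.
# "%d/%m/%Y" reads 30/06/2025 as day first, which is an assumption.
KNOWN_DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y"]

# Assumed placeholders that should be treated as missing.
MISSING_PLACEHOLDERS = {"", "-", ".", "n/a", "na", "null", "none"}


def normalise_missing(value):
    """Replace any known placeholder with NA; return other values stripped."""
    text = str(value).strip()
    return "NA" if text.lower() in MISSING_PLACEHOLDERS else text


def to_iso_date(value):
    """Convert a date in a known format to YYYY-MM-DD, or raise ValueError."""
    text = str(value).strip()
    for fmt in KNOWN_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {value!r}")


print(to_iso_date("30/06/2025"))
print(normalise_missing("-"))
```

Raising an error on unrecognised dates, rather than guessing, keeps ambiguous values visible so they can be fixed at the source.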

10. Simple Formatting

AI tools ignore formatting and need all context to be present in the plain text or column structure.

  • Use: Plain, unmerged, unhidden rows and columns.

  • Don’t: Use merged cells, formatting-only signals (e.g., colours, bold, or cell borders to convey meanings like 'high/low' or 'total'), or comments to carry critical information.

11. Provide Sufficient Context: Would a Stranger Understand?

If someone outside your team opened this file, would they know what every sheet, table, and column means? Review your file before uploading: try opening it on another computer or ask a colleague to check it.

  • Use: Add a ReadMe or Notes sheet describing the file and data structure. Provide a separate “dictionary” sheet mapping columns to definitions, or include a “data dictionary” at the start or end.

  • Don’t: Assume people (or AI) will know what a code or abbreviation means.
