🎯 What You'll Learn

  • Identify and explain all six categories of data types Power BI supports in 2026
  • Distinguish structured, semi-structured, and unstructured data with real-world examples
  • Describe the roles of Data Lakes, Data Warehouses, and the Lakehouse hybrid
  • Explain what Microsoft Fabric and OneLake are and why they matter for reporting
  • Apply the right storage and connection strategy (Import vs. DirectLake) based on architecture

📋 Before You Begin

  • Basic familiarity with what Power BI is (you've seen a dashboard before)
  • Understanding of what a spreadsheet / table looks like (rows & columns)
  • No coding knowledge required: this is a conceptual overview

Part 1: The Power BI Data Universe

In 2026, Power BI has evolved into a broad data hub. It no longer just handles spreadsheets: it can work with everything from satellite imagery to AI-generated vectors. Here's every data category it supports.

① Structured Data: The Foundation

Structured data is highly organised and lives in rows and columns. It is what most users work with most of the time, and it comes from sources like SQL Server, Excel, and Snowflake.

Numbers. Power BI supports three numeric subtypes:
  • Decimal Number: floating-point values like 1.23 or 3.14
  • Fixed Decimal (Currency): always 4 decimal places, ideal for financial data
  • Whole Number: integers only, e.g. 100, 42

Dates and times. Critical for time-series analysis. Includes:
  • Date: calendar date only (day, month, year)
  • Time: hours, minutes, seconds
  • DateTime: combined date and time
  • DateTimeZone: datetime with timezone offset, essential for global teams

Text and logic:
  • Text (String): standard alphanumeric data like names, IDs, descriptions
  • Boolean: True/False values used for logic gates and filters (e.g. "Is Active?")
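
The difference between Decimal Number and Fixed Decimal is easiest to see with a little arithmetic. The sketch below is not Power BI code. It uses Python's standard decimal module as a stand-in, assuming Decimal Number behaves like a binary floating-point value and Fixed Decimal like a number held to exactly four decimal places. The helper name to_fixed_decimal is hypothetical, and the rounding rule is simply Python's default.

```python
from decimal import Decimal

FOUR_PLACES = Decimal("0.0001")

def to_fixed_decimal(value: str) -> Decimal:
    """Round a numeric string to exactly four decimal places."""
    return Decimal(value).quantize(FOUR_PLACES)

# Floating point: tiny binary rounding errors can show up in sums.
float_total = 0.1 + 0.2

# Fixed four-decimal arithmetic: the same sum is exact.
fixed_total = to_fixed_decimal("0.1") + to_fixed_decimal("0.2")

print(float_total == 0.3)  # False
print(fixed_total)         # 0.3000
```

This is why a fixed-decimal type is the usual choice for money: totals add up exactly to the fourth decimal place.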

② Semi-Structured Data: Web & APIs

Semi-structured data doesn't fit a rigid table, but has tags or markers to separate elements. This is the language of the web and modern applications.

{ } JSON

JavaScript Object Notation. Power BI can "flatten" complex JSON hierarchies from APIs like GitHub or Salesforce into usable tables.

</> XML

Extensible Markup Language. Common in legacy enterprise systems. Same flattening capability as JSON.

📄 PDF Data

Power BI's PDF connector extracts tables from PDFs by identifying headers and rows. Scanned, image-only documents typically need an OCR step first.

💡 How Flattening Works: A nested JSON object like {"user": {"name": "Raj", "city": "Delhi"}} becomes two flat columns, user.name and user.city, in Power BI's query editor.
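
To make the flattening idea concrete, here is a minimal Python sketch of the same transformation. It is an illustration only, not how Power BI implements it: the function name flatten is hypothetical, and it handles nested objects but not lists.

```python
import json

def flatten(obj: dict, parent_key: str = "", sep: str = ".") -> dict:
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in obj.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the column prefix along.
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat

record = json.loads('{"user": {"name": "Raj", "city": "Delhi"}}')
row = flatten(record)
print(row)  # {'user.name': 'Raj', 'user.city': 'Delhi'}
```

Each key in the result corresponds to one column of the flattened table.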

③ Unstructured Data: The AI Frontier

Unstructured data has no pre-defined model. In 2026, Power BI handles this via Azure Cognitive Services (since renamed Azure AI services) and Fabric AI.

You can store images directly in a Power BI dataset. Vision AI then automatically tags those images, for example classifying warehouse photos as "Truck" vs. "Shelf". OCR (Optical Character Recognition) can read text out of photos, converting them into searchable data.

  1. Upload image to Blob Storage / OneDrive
     Binary file stored as a column in your dataset
  2. Azure Vision AI processes the image
     Returns tags: ["warehouse", "forklift", "pallet"]
  3. Tags become filterable columns in Power BI
     You can now slice your data by image content
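
Step 3 can be pictured as turning each image's tag list into one True/False column per tag. The Python sketch below shows that reshaping under stated assumptions: the tag results are hand-written placeholders rather than output from a real vision service, and tags_to_columns is a hypothetical helper.

```python
def tags_to_columns(images: dict[str, list[str]]) -> list[dict]:
    """Turn per-image tag lists into rows with one True/False column per tag."""
    all_tags = sorted({tag for tags in images.values() for tag in tags})
    rows = []
    for name, tags in images.items():
        row = {"image": name}
        for tag in all_tags:
            row[tag] = tag in tags
        rows.append(row)
    return rows

# Hypothetical tagging results; a real vision service would supply these.
tagged = {
    "dock_01.jpg": ["warehouse", "forklift", "pallet"],
    "yard_07.jpg": ["truck", "warehouse"],
}
rows = tags_to_columns(tagged)

# "Slicing" by image content is now an ordinary column filter.
forklift_images = [r["image"] for r in rows if r["forklift"]]
print(forklift_images)  # ['dock_01.jpg']
```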

For emails, support tickets, or customer reviews, Power BI uses Sentiment Analysis to convert a block of text into a numerical "Happiness Score" between 0 and 1.
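
Once text has a score between 0 and 1, it is common to bucket that score into labels a report can filter on. The sketch below assumes cut-offs of 0.4 and 0.6, which are illustrative choices rather than values defined by Power BI or Azure. The sample scores mirror the examples in this section.

```python
def sentiment_label(score: float, low: float = 0.4, high: float = 0.6) -> str:
    """Map a 0-1 sentiment score to a coarse label using assumed cut-offs."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < low:
        return "Negative"
    if score > high:
        return "Positive"
    return "Neutral"

reviews = [
    ("Excellent service, very happy!", 0.93),
    ("Delivery was late again", 0.41),
    ("Worst experience ever, never again", 0.04),
]
labels = [sentiment_label(score) for _, score in reviews]
print(labels)  # ['Positive', 'Neutral', 'Negative']
```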

Customer Review | Sentiment Score
"Excellent service, very happy!" | 0.93 😊
"Delivery was late again" | 0.41 😐
"Worst experience ever, never again" | 0.04 😠

④ Spatial & Geographic Data

Power BI treats location data as a first-class citizen, offering multiple spatial formats.

  • Latitude / Longitude: precise coordinate pairs like 28.6139° N, 77.2090° E (New Delhi). Used for point-based maps, such as plotting store locations or delivery points.
  • Shapes (KML, Shapefile, GeoJSON): beyond simple points, these handle complex boundaries like state borders, school districts, or custom sales territories. Rendered via the Azure Maps visual in Power BI.
  • Coordinate systems: different countries use different coordinate systems (WGS84, NAD83, etc.). Power BI supports various systems to ensure global data aligns correctly on a single map.
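
As a small illustration of what a point looks like in one of these formats, the Python sketch below builds a GeoJSON Feature for the New Delhi coordinates mentioned above. The function name point_feature is hypothetical. One detail worth noticing is that GeoJSON stores coordinates as longitude first, then latitude, which is the reverse of how they are usually spoken.

```python
import json

def point_feature(name: str, latitude: float, longitude: float) -> dict:
    """Build a GeoJSON Feature for a single point (WGS84 coordinates)."""
    if not -90 <= latitude <= 90 or not -180 <= longitude <= 180:
        raise ValueError("latitude/longitude out of range")
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON orders coordinates as [longitude, latitude].
            "coordinates": [longitude, latitude],
        },
        "properties": {"name": name},
    }

new_delhi = point_feature("New Delhi", 28.6139, 77.2090)
print(json.dumps(new_delhi["geometry"]))
```

Swapped latitude and longitude is one of the most common reasons points land in the wrong place on a map, so the range check is a cheap safeguard.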

⑤ Vector Data: The LLM/AI Engine

With Generative AI, Vector Data has become critical. Vectors are numerical representations of meaning, called embeddings.

🔍 Keyword Search (Old Way)

Search for "Profit" → only finds rows containing the exact word "Profit"

🧠 Semantic Search (Vector Way)

Ask "How is our financial health?" → finds data about revenue, margins, costs, losses

Vectors live in a Vector Store inside Microsoft Fabric. Power BI queries this store to surface AI-powered insights based on meaning, not just keywords. External stores like Pinecone are also supported.
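
Searching by meaning usually comes down to comparing vectors, and cosine similarity is one common way to do that. The Python sketch below uses tiny hand-made three-number vectors as stand-ins for real embeddings, which typically have hundreds of dimensions and come from a model. All values and document titles are hypothetical.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made 3-dimensional "embeddings" (hypothetical values).
documents = {
    "Quarterly revenue and margins": [0.9, 0.8, 0.1],
    "Office party photos": [0.1, 0.0, 0.9],
    "Cost and loss report": [0.8, 0.9, 0.2],
}
query = [0.85, 0.85, 0.1]  # stands in for "How is our financial health?"

ranked = sorted(
    documents,
    key=lambda title: cosine_similarity(query, documents[title]),
    reverse=True,
)
print(ranked[-1])  # Office party photos
```

Neither finance document contains the words "financial health", yet both rank above the unrelated one because their vectors point in a similar direction to the query.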

⑥ Streaming & Real-Time Data

Not all data is "at rest." Some data is "in flight", arriving continuously in real time.

⚡ Push API

Data is pushed directly into Power BI at sub-second intervals. Perfect for stock prices, IoT sensors, or factory machines.

📡 PubNub

PubNub is a third-party real-time messaging service. Power BI connects to its live event streams, so dashboards update automatically with no "Refresh" button needed.

🔍 Kusto (ADX)

Azure Data Explorer, queried with the Kusto Query Language (KQL), is optimised for high-volume log analytics and telemetry in near real time.
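
To give a feel for what "pushing" data means, the Python sketch below builds the request that a sensor script might send to the Power BI REST API's push-rows endpoint. It only constructs the URL and JSON body and sends nothing. The dataset ID is a placeholder, the table name and column names are hypothetical, and a real call would also need an authenticated POST.

```python
import json
from datetime import datetime, timezone

# Hypothetical placeholders; real values come from your Power BI workspace.
DATASET_ID = "<dataset-id>"
TABLE_NAME = "SensorReadings"

def build_push_request(readings: list[dict]) -> tuple[str, str]:
    """Return the (url, json_body) pair for a push-rows request."""
    url = (
        "https://api.powerbi.com/v1.0/myorg/datasets/"
        f"{DATASET_ID}/tables/{TABLE_NAME}/rows"
    )
    # The API expects the new rows under a top-level "rows" key.
    body = json.dumps({"rows": readings})
    return url, body

reading = {
    "sensor": "line-3-temp",
    "value": 71.4,
    "timestamp": datetime(2026, 1, 15, 9, 30, tzinfo=timezone.utc).isoformat(),
}
url, body = build_push_request([reading])
print(url)
print(body)
```

Each call appends rows to the table, and visuals built on that table pick up the new values without a scheduled refresh.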

📊 Part 1 Summary: Data Coverage

Category | Data Types | Primary Source
Structured | Integer, Currency, Date, Text, Boolean | SQL, Excel, Snowflake
Semi-Structured | JSON, XML, PDF | APIs, Web Pages
Unstructured | Binary, Image, Long-form Text | Blob Storage, OneDrive
Spatial | Latitude, KML, Shapefile, GeoJSON | GIS Systems, Azure Maps
Vector / AI | Embeddings, Vector Stores | Microsoft Fabric, Pinecone
Streaming | Push API, Event Stream | PubNub, Kusto, IoT Hub

Part 2: The Modern Data Ecosystem

To excel at data visualisation, you must understand where data lives, how it is organised, and how it flows into your reports. The terminology has shifted from "spreadsheets" to complex "architectures."

⑦ Data Lakes vs. Warehouses vs. Lakehouse

Every visualisation starts with storage. Here's how the three main storage paradigms compare:

Storage Type | What It Holds | Structure | Best For
🏞 Data Lake | Everything raw: structured, semi-structured, unstructured | None (schema-on-read) | Storing data now, figuring out meaning later
🏛 Data Warehouse | Cleaned, processed, organised data | Rigid (schema-on-write) | Specific business reporting (e.g. Finance)
🏗 Data Lakehouse | Both raw and processed data | Flexible hybrid | Modern analytics at scale, with the strengths of both

🎯 The Key Insight: A Data Lake is like a pantry full of raw ingredients. A Data Warehouse is like a plated meal. A Lakehouse is a restaurant kitchen that keeps both.
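
The terms schema-on-read and schema-on-write in the table describe when structure gets enforced. The Python sketch below is a toy model of that difference, not a real storage engine: the class names, the two-column schema, and the sample records are all hypothetical.

```python
import json

REQUIRED = {"order_id": int, "amount": float}  # hypothetical warehouse schema

class Warehouse:
    """Schema-on-write: rows are validated before they are stored."""
    def __init__(self):
        self.rows = []

    def write(self, row: dict) -> None:
        for column, expected in REQUIRED.items():
            if not isinstance(row.get(column), expected):
                raise ValueError(f"bad or missing column: {column}")
        self.rows.append({c: row[c] for c in REQUIRED})

class Lake:
    """Schema-on-read: anything is stored; structure is applied when reading."""
    def __init__(self):
        self.blobs = []

    def write(self, raw: str) -> None:
        self.blobs.append(raw)

    def read_orders(self) -> list[dict]:
        orders = []
        for raw in self.blobs:
            try:
                record = json.loads(raw)
            except json.JSONDecodeError:
                continue  # not JSON (e.g. free text); skip for this query
            if isinstance(record, dict) and "order_id" in record:
                orders.append(record)
        return orders

lake = Lake()
lake.write('{"order_id": 1, "amount": 19.99, "note": "gift"}')
lake.write("free-text customer complaint")
print(len(lake.blobs), len(lake.read_orders()))  # 2 1

warehouse = Warehouse()
warehouse.write({"order_id": 1, "amount": 19.99})
```

The lake accepts both records and decides what counts as an order only when queried. The warehouse refuses anything that does not match its schema at write time.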

⑧ Microsoft Fabric: The All-in-One Platform

Microsoft Fabric is the biggest shift in the ecosystem. It's an end-to-end SaaS platform that unifies data engineering, warehousing, science, and visualisation under one roof.

☁️ OneLake

Think "OneDrive for data." All your organisation's data, regardless of team, lives in a single unified lake. One copy, governed centrally.

⚡ DirectLake Mode

A breakthrough (spelled "Direct Lake" in Microsoft's documentation): Power BI reads data directly from the lake without importing it, enabling lightning-fast reports on billions of rows.

Why This Matters for You:
  1. Old workflow (Import Mode)
     Load data → compress into memory → visualise. Slow for huge datasets. Needs scheduled refresh.
  2. New workflow (DirectLake)
     Query OneLake directly → visualise instantly. No import. No stale data. Near real-time.
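
The practical difference between the two workflows is when the data gets copied. The Python sketch below is a deliberately simplified model of that idea: a plain list stands in for a table in the lake, and the class names are hypothetical. It does not reflect how either mode is implemented internally.

```python
class ImportModel:
    """Holds a snapshot copied from the source at refresh time."""
    def __init__(self, source: list[int]):
        self._source = source
        self._snapshot: list[int] = []
        self.refresh()

    def refresh(self) -> None:
        self._snapshot = list(self._source)  # copy the data into the model

    def total(self) -> int:
        return sum(self._snapshot)

class DirectModel:
    """Reads the source at query time; nothing is copied ahead of time."""
    def __init__(self, source: list[int]):
        self._source = source

    def total(self) -> int:
        return sum(self._source)

sales = [100, 250]           # stands in for a table in the lake
imported = ImportModel(sales)
direct = DirectModel(sales)

sales.append(50)             # new data lands in the lake
print(imported.total(), direct.total())  # 350 400

imported.refresh()           # the scheduled refresh catches up
print(imported.total())      # 400
```

Until the refresh runs, the imported model reports the old total, which is the "stale data" problem the import workflow has to manage with schedules.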

⑨ Domains, Workspaces & Data Mesh

As organisations grow, they need to divide data logically to avoid a "Data Swamp." Here's the three-tier hierarchy:

  • Domains: high-level groupings by business function, such as Sales, HR, or Finance. A Domain establishes business ownership, meaning who is responsible for the quality and governance of that data.
  • Workspaces: these live inside or across Domains. They're collaborative project folders where developers build reports, dashboards, and semantic models. Think of them as shared Google Drive folders for your Power BI assets.
  • Data Mesh: the strategy of letting different departments (Domains) own their own data products, rather than routing everything through one central IT team. This enables scalability, accountability, and faster delivery.

Concept | Simple Definition | Role in Visualisation
Fabric | The entire platform | The environment where you build everything
OneLake | Central storage | Where your raw materials are kept
Domain | Business department | Defines who owns and validates the data
Workspace | Project folder | Where you collaborate with teammates
Lakehouse | Structured data lake | Provides clean data for your charts
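
One way to picture how Domains and Workspaces relate is as a small nested data structure. The Python sketch below is only a mental model: the class and field names are hypothetical, and it assumes each workspace belongs to a single domain, which is a simplification of the arrangement described above.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Workspace:
    name: str
    items: list[str] = field(default_factory=list)  # reports, models, etc.

@dataclass
class Domain:
    name: str
    owner: str  # team accountable for data quality
    workspaces: list[Workspace] = field(default_factory=list)

def find_owner(domains: list[Domain], workspace_name: str) -> Optional[str]:
    """Return the owning team for a workspace, or None if it is unassigned."""
    for domain in domains:
        if any(ws.name == workspace_name for ws in domain.workspaces):
            return domain.owner
    return None

domains = [
    Domain("Sales", "Sales Analytics team",
           [Workspace("Regional Pipeline", ["Pipeline report"])]),
    Domain("Finance", "Finance Data team", [Workspace("Month-End Close")]),
]
print(find_owner(domains, "Month-End Close"))  # Finance Data team
```

Being able to answer "who owns this?" for any report is the accountability that the Data Mesh strategy is meant to provide.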

⑩ Practice Quiz: Test Your Knowledge

10 questions covering both parts. At least 6 test practical application.

๐Ÿ† Key Takeaways

  • Power BI handles 6 data categories: Structured, Semi-structured, Unstructured, Spatial, Vector, and Streaming.
  • In Import mode, the VertiPaq in-memory engine compresses tabular data column by column for fast visualisation.
  • A Data Lake stores everything raw; a Data Warehouse stores clean, structured data; a Lakehouse combines both.
  • Microsoft Fabric with OneLake is the modern unified platform replacing fragmented data stacks.
  • DirectLake Mode lets Power BI query billions of rows in OneLake without an import step, keeping reports near real-time.
  • Data Mesh is a strategy that gives domain teams ownership of their own data products.