Connecting to Data Sources
Learn to connect Power BI to a broad set of data sources, authenticate securely, and choose the right connectivity mode for your needs.
Connecting to Cloud Data Sources — Power BI (But Make It Cloudy)
"Cloud data is like a coffee shop: always buzzing, sometimes encrypted, and everyone expects it to be open when they arrive."
You're already comfortable grabbing spreadsheets from your desktop and chatting with on-prem SQL like an old pal (see: Connecting to Excel/CSV Files and Relational Databases). Now we level up: the cloud — a place with elastic compute, mysterious permissions, and connectors for days. This lesson shows how to connect Power BI Desktop and the Power BI Service to cloud data sources, pick the right authentication & mode, and avoid the classic potholes that make refreshes cry.
Why this matters (aka, the elevator pitch)
- Cloud sources are where most enterprise data lives now: Azure, Google, SaaS apps (Salesforce, Dynamics), data lakes, Databricks, BigQuery.
- Connecting correctly = faster reports, secure access, and fewer midnight emergency refreshes.
- Choosing the right connector and mode (Import vs DirectQuery vs Live Connection) shapes performance, cost, and governance.
Quick map: Which cloud sources we’ll cover
- Azure family: Azure SQL Database, Azure Synapse Analytics, Azure Blob Storage, Azure Data Lake Storage Gen2, Azure Table Storage
- Big Data / Compute: Azure Databricks, Google BigQuery, Snowflake
- SaaS / APIs: Salesforce, Google Analytics, Dynamics 365, SharePoint Online, REST APIs
- Files in cloud: OneDrive for Business, SharePoint Online, Amazon S3
Authentication flavors (aka, the part that confuses everyone)
- OAuth2: Common for SaaS and Microsoft cloud connectors. You sign in interactively. Used by SharePoint Online, OneDrive, Google Analytics.
- Basic / SQL Auth: Username/password — still around for some SQL endpoints.
- Account Key / SAS: For blob/storage (Azure Storage Account Key or SAS token).
- Service Principal / Managed Identity: Recommended for automation/enterprise deployments (Power BI Service using a service principal to connect to Azure resources).
- API Key / Bearer Token: REST APIs, some SaaS providers.
Pro tip: In the Power BI Service, you can use OAuth or a service principal to avoid interactive logins for scheduled refresh.
Import vs DirectQuery vs Live
- Import — Data is loaded into the PBIX; fastest visual performance, requires scheduled refreshes. Best for smaller to medium datasets.
- DirectQuery — Queries the data source at runtime. Great for very large datasets and near-real-time needs, but visuals impact source query performance and have modelling limitations.
- Live Connection — Connects to semantic models (Analysis Services / Power BI datasets). You can’t edit the model in Desktop.
Ask yourself: Do you need sub-minute freshness? Is the dataset >10GB? Do you want complex transformations in Power Query? Those answers tell you which mode.
Connector table: at-a-glance
| Connector | Common Auth | Import / DirectQuery | When to use | Quick tip |
|---|---|---|---|---|
| Azure SQL DB | SQL Auth / Azure AD | Import + DirectQuery | Relational cloud db | Ensure firewall allows client IP or enable Allow Azure Services |
| Azure Data Lake Gen2 | OAuth / Service Principal | Import | Big data files (Parquet/CSV) | Use hierarchical namespace and folder-level permissions |
| Azure Blob Storage | Account Key / SAS / OAuth | Import | Semi-structured files | Prefer ADLS Gen2 for analytics workloads |
| Databricks | Personal Access Token | Import / DirectQuery (via Spark) | ETL + analytics on big data | Use cluster SQL endpoints for DirectQuery |
| Google BigQuery | OAuth | Import / DirectQuery | Massive analytics datasets | Use query folding to offload work to BigQuery |
| Salesforce | OAuth | Import | CRM reporting | Beware API limits; schedule refresh accordingly |
| REST API | API Key / OAuth | Import | Custom SaaS or public data | Use pagination and caching; transform JSON in Power Query |
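The pagination tip in the REST API row can be sketched in Power Query M with List.Generate. Everything specific here is an assumption for illustration: the endpoint URL, the `page` query parameter, and an `items` list in each JSON response.

```
let
    // Hypothetical paged API: GET https://api.example.com/v1/data?page=N
    BaseUrl = "https://api.example.com/v1/data",
    GetPage = (page as number) =>
        Json.Document(
            Web.Contents(BaseUrl, [Query = [page = Number.ToText(page)]])
        ),
    // Keep requesting pages until the API returns an empty "items" list
    Pages = List.Generate(
        () => [page = 1, data = GetPage(1)],
        each not List.IsEmpty([data][items]),
        each [page = [page] + 1, data = GetPage([page] + 1)],
        each [data][items]
    ),
    AllItems = List.Combine(Pages)
in
    AllItems
```

Real APIs vary (cursor tokens, `Link` headers, total-count fields), so adapt the stop condition to whatever your provider actually returns.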
Real-world examples: a REST API and Azure Blob (simple M snippets)
Use Power BI Desktop: Get Data > Azure > Azure Blob Storage. You can also do a REST call for a JSON API.
Example: calling a REST API with bearer token in Power Query M:
let
    // Hardcoded token shown for illustration only. For scheduled refresh,
    // store secrets in the data source credentials, not in the query text.
    token = "YOUR_TOKEN",
    Source = Json.Document(
        Web.Contents(
            "https://api.example.com/v1/data",
            [Headers = [Authorization = "Bearer " & token, Accept = "application/json"]]
        )
    )
in
    Source
For Azure Blob using AccountKey, the connector handles auth; you’ll navigate blobs and select files like local files.
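That navigation can also be written directly in M. A minimal sketch, assuming a placeholder storage account `mystorageacct` with a `sales` container holding a CSV blob (all names hypothetical):

```
let
    // Power BI prompts for the Account Key or SAS the first time you connect
    Source = AzureStorage.Blobs("mystorageacct"),
    // Drill into one container, then pick a single blob by name
    SalesContainer = Source{[Name = "sales"]}[Data],
    TargetBlob = Table.SelectRows(SalesContainer, each [Name] = "2024/orders.csv"),
    // Parse the blob's binary content as UTF-8 CSV and promote headers
    Csv = Csv.Document(TargetBlob{0}[Content], [Delimiter = ",", Encoding = 65001]),
    Promoted = Table.PromoteHeaders(Csv)
in
    Promoted
```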
Common gotchas (and how to avoid being That Person in IT Slack)
- Firewall blocks: Azure SQL requires you to configure the server firewall to allow Power BI Service IPs or enable ‘Allow Azure services’. For Desktop, add your client IP.
- Authentication mismatch: OAuth credentials you entered in Desktop don't travel with the published PBIX. After publishing, re-enter credentials in the dataset's data source settings in the Power BI Service, or use a service principal for unattended refresh.
- API rate limits: SaaS connectors may throttle you — schedule staggered refresh times and cache where possible.
- Too much data in Import mode: If a PBIX ballooned, consider aggregations, incremental refresh, or DirectQuery.
- Query folding failure: Heavy Power Query steps may prevent query folding; push transformations to the source (e.g., SQL/BigQuery) to improve performance.
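As a sketch of folding-friendly design against a SQL source (server, database, and table names are placeholders), keep simple filters and column selections early, since those steps typically fold back into a single native query:

```
let
    Source = Sql.Database("myserver.database.windows.net", "SalesDb"),
    Orders = Source{[Schema = "dbo", Item = "Orders"]}[Data],
    // These steps usually fold into the generated SQL's WHERE / SELECT list
    Recent = Table.SelectRows(Orders, each [OrderDate] >= #date(2024, 1, 1)),
    Slim = Table.SelectColumns(Recent, {"OrderId", "OrderDate", "Amount"})
    // Right-click a step and choose "View Native Query" to confirm folding;
    // steps like Table.Buffer or custom M functions tend to break it
in
    Slim
```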
When to use a Gateway
If your setup mixes on-premises and cloud data, install and configure the On-premises data gateway so the Power BI Service can reach the private resources. If a cloud resource sits behind a VNET, you'll need a gateway that can reach that network. Purely public cloud endpoints usually don't require a gateway at all.
Short checklist before you click Load
- Choose the right connector (Azure-specific vs generic REST).
- Pick Import vs DirectQuery based on size & freshness needs.
- Confirm authentication method for Power BI Service (Service Principal for production).
- Validate query folding and consider pushing heavy logic to the source.
- Plan refresh schedule and watch API limits.
Closing: TL;DR + parting zinger
- Cloud sources are powerful — but you must pick the right connector, the right auth, and the right mode.
- Use service principals and managed identities for production automation; use DirectQuery for huge datasets and Import for fast UX.
- Keep an eye on firewall rules, API throttles, and query folding — these are the usual suspects when dashboards fail.
Connect smart, authenticate cleanly, and schedule wisely. Your future self (and your manager) will thank you.
Go try it: connect to an Azure Data Lake folder, pull a Parquet file, and build a tiny report. If Power Query folds the work back to the cloud engine, do a celebratory dance. If not — do a debugging dance.
Version note: This builds on your earlier steps connecting to files and relational databases, assuming you're comfortable with Power Query basics and model design. Next up: Performance tuning and incremental refresh for cloud datasets.