Near-real-time replication at near-zero ingestion cost. No ETL pipelines to orchestrate. No ingestion compute to pay for. The pitch for Microsoft Fabric mirroring is genuinely compelling - and for a well-defined class of use cases, it delivers.
The question worth asking before you commit is not whether mirroring works. It is what kind of data layer mirroring produces, and whether that matches what your architecture actually requires.
Mirroring produces a current-state replica - a continuously updated copy of the source's present state, in Delta/Parquet format, in OneLake. It is not a historical record, a replayable event stream, or an immutable raw layer. For some architectures, that distinction is irrelevant. For others - particularly those where replayability, historical fidelity, or audit immutability matter - it is critical. This guide works through both sides.
This is a depth guide for data engineers and architects evaluating mirroring for production use. It is not a getting-started tutorial - the official documentation covers that ground. It covers connector-specific behaviour, architectural trade-offs, operational limits, and edge cases that surface in production. The appendix consolidates capability support across all ten connectors into a single reference matrix - including the gaps where Microsoft has not published documentation - so you can evaluate your specific source without cross-referencing ten separate limitations pages.
The three types of mirroring
The word "mirroring" covers three architecturally distinct mechanisms. Choosing the wrong type is the first mistake.
| Type | Data movement | Requires | Best for |
|---|---|---|---|
| Database mirroring | Replicated to OneLake as Delta/Parquet | Native connector for your source | Low-latency analytics without querying the source directly |
| Metadata mirroring | Stays in source; OneLake shortcuts created | Azure Databricks or Dremio | Unified Fabric experience over data already in a supported catalog |
| Open mirroring | You write files; Fabric merges them into Delta | Any system that can write Parquet or delimited text to OneLake | No native connector exists, or you need full control over change tracking |
1. Database mirroring - data moves
The source database's change log (CDC, Change Tracking, or Fabric's Change Event Stream) is continuously read, and changes land as Delta/Parquet files in OneLake. The data lives in Fabric. Queries hit OneLake, not the source.
Use this when you want low-latency analytics without hitting the production database, and your source is one of the ten supported connectors: Azure SQL Database, Azure SQL Managed Instance, SQL Server, PostgreSQL, MySQL, Oracle, Snowflake, Cosmos DB, Google BigQuery, or SAP. One important exception: SAP is not a direct connection. It routes through SAP Datasphere as an intermediary, which must be separately licensed and configured.
One special case: Fabric SQL database - Microsoft's own managed SQL database within Fabric - is mirrored to OneLake automatically, with no setup, no connector configuration, and no source-side changes required. If your operational workload runs on Fabric SQL database, the analytics layer comes for free.
2. Metadata mirroring - data stays put
The source data is not copied. Instead, OneLake shortcuts are created that point to the source. Querying the mirrored item queries the source system in real time - query performance depends on the source, not OneLake.
Use this when you want the Fabric SQL analytics experience over data that already lives in a supported catalog: Azure Databricks or Dremio. A caveat applies to both: neither vendor documents the integration from their own side - it is covered only in Microsoft's documentation. Dremio carries an additional flag: Microsoft's own page marks it as Preview while the mirroring overview table does not. Treat both connectors as early-stage and validate thoroughly before committing.
3. Open mirroring - you write the data
You deliver files (Parquet or delimited text) to a OneLake landing zone in a prescribed format. Fabric's replication engine picks them up and merges them into Delta tables. No Microsoft connector required - any system that can write files can feed it.
Use this when no native connector exists for your source, or when you need full control over what is replicated and how changes are tracked. It has the steepest setup curve of the three types but the widest applicability. The partner ecosystem lists pre-built integrations for common systems.
Each connector is its own evaluation
The three-type taxonomy understates the variation within database mirroring. Each connector uses a different change capture mechanism, listed per source in the CDC requirements table under "Operational limits and gotchas".
Each has different data type support, different DDL handling, different source-side prerequisites, and different maturity. Nothing generalises cleanly across connectors unless the documentation explicitly says it does. A confirmed behaviour on Azure SQL Database tells you nothing about BigQuery.
The table below summarises connector maturity and documentation coverage. The gap count is the number of cells marked "-" in the appendix matrix, each one a behaviour Microsoft has not documented. Azure SQL Database and SQL Server score 4 each due to undocumented DDL edge cases. BigQuery's 9 gaps include basic operational questions like whether ADD COLUMN is handled at all.
| Connector | Maturity | Documentation gaps |
|---|---|---|
| Azure SQL Database | Mature | 4 |
| Azure SQL MI | Mature | 2 |
| SQL Server | Mature | 4 |
| PostgreSQL | Mature | 1 |
| MySQL | Established | 3 |
| Cosmos DB | Established | 1 |
| Snowflake | Established | 6 |
| Oracle | Sparse | 3 |
| BigQuery | Sparse | 9 |
| SAP | Sparse | 7 |
What mirroring delivers
Where mirroring operates within its intended scope, it delivers well.
Ingestion compute is genuinely free. The replication engine runs off-capacity, outside your Fabric capacity's CU budget - documented and confirmed by independent benchmarking. On an F64, practitioners report background CU usage around 3–4% for half a billion rows across 50 tables. Storage is free up to 1 TB per capacity unit. The advantage is not raw CU savings over a carefully built pipeline - it is continuous near-real-time replication at near-zero ingestion cost, a cadence that would be unworkable via scheduled pipelines on any reasonable SKU. The cost model section below covers the full breakdown, including caveats.
The Open Mirroring ingestion engine scales irrespective of capacity SKU. The ingestion engine that processes Open Mirroring landing zone files is an off-capacity service run by Microsoft. It does not matter whether you are on an F2 or an F256 - ingestion throughput is the same. One Microsoft engineer demonstrated 1.2 billion rows per minute ingestion on an F2 in a published benchmark. Two caveats: the write-side compute - generating and uploading files - is not reflected in Fabric capacity usage, and the conditions were artificial. The "free" part is Fabric's ingestion engine; whatever writes files to the landing zone bears its own cost.
Mirroring eliminates analytical query pressure on the source. For organisations with access only to a production source, a mirrored database offloads all reporting traffic to OneLake. The SQL analytics endpoint provides a T-SQL-queryable replica; Direct Lake provides a Power BI-queryable one. Neither touches the source at query time.
This works best with Azure SQL Database (Change Feed) and SQL Server 2025 (Change Event Stream), where source-side replication overhead is minimal. SQL Server 2016–2022 with CDC does add overhead - capture jobs and log space - which partially offsets the protection benefit, though analytical query traffic is still eliminated. One hard constraint: mirroring requires the primary of an Always On availability group. Connecting to a readable secondary is not supported. If your organisation already routes reporting traffic to a secondary, mirroring does not help for that path.
Cross-database queries work out of the box. The SQL analytics endpoint for each mirrored database supports cross-database queries using three-part naming, across multiple mirrored databases in the same workspace. No additional configuration required.
CI/CD is supported (GA), with one manual step — and likely a full initial load. Mirrored databases can be committed to Git and deployed through Fabric deployment pipelines. The database definition is stored as a {name}.MirroredDatabase folder containing mirroring.json. The caveat: mirroring does not start automatically after a deployment pipeline runs. You must manually start replication in each target workspace.
Deployment pipelines deploy the item definition — connection settings and table selection — not the underlying Delta data in OneLake. The target workspace receives a configured but empty mirrored database item with no existing replication state. Starting mirroring after deployment triggers a full initial load from current source state — not an incremental pickup from where the source workspace left off. This is confirmed behaviour: Microsoft's documentation states that restarting mirroring "results in all data being replicated from the start" and that "each time you stop and start, the entire table is fetched again" (confirmed in the Azure SQL MI FAQ and consistent with documented reseed behaviour across other connectors). For large databases, this means the target workspace has no data until the initial seed completes — factor this into promotion planning.
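The manual start step can in principle be scripted as a post-deployment task. The sketch below only builds the HTTP request. It assumes the Fabric REST API exposes a `startMirroring` operation on mirrored database items at the path shown, so verify the path against the current API reference before relying on it. The workspace ID, item ID, and bearer token are placeholders, and acquiring the Entra ID token is out of scope.

```python
import urllib.request

# Assumed base URL and route; check both against the current Fabric REST API reference.
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_start_mirroring_request(
    workspace_id: str, mirrored_db_id: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) the POST that asks Fabric to start replication."""
    url = (
        f"{FABRIC_API}/workspaces/{workspace_id}"
        f"/mirroredDatabases/{mirrored_db_id}/startMirroring"
    )
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the operation is identified by the route alone
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Sending the request with `urllib.request.urlopen(...)` in the target workspace would begin the full initial load described above, so schedule it with the seed duration in mind.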
The downstream payoff: Direct Lake integration. Because mirrored data lands in OneLake as Delta tables, Power BI Semantic Models can connect via Direct Lake mode - reading Parquet files directly from OneLake without importing data or querying the SQL analytics endpoint at query time. As new changes land from the source, reports reflect them without a scheduled refresh. For large datasets, this is substantially faster than DirectQuery and avoids Import mode's storage cost.
The clearest downstream win: source-to-semantic with no transform layer. Not every reporting use case requires transforms. Operational dashboards, self-service analytics on source tables that are useful as-is, metrics that don't require heavy cross-domain joins - these can go from mirrored Delta tables directly to a Direct Lake Semantic Model. Source → mirrored database → Direct Lake Semantic Model: near-real-time, near-zero capacity cost, no Spark jobs or pipeline runs.
For use cases that fit this pattern, mirroring delivers on its promise. If your instinct is to add transform layers on top of mirrored data, ask first whether they serve a genuine reporting requirement or carry over a pipeline pattern that mirroring has made redundant.
Latency - what "near real time" actually means
The mirroring overview describes replication as "near real-time". No specific latency SLA is published.
Microsoft describes replication latency as 15–60 seconds. For CDC-based connectors (Azure SQL, SQL Server, PostgreSQL, MySQL, Cosmos DB), practitioners report typical latency in the 60–90 second range under normal load - the lower end of Microsoft's stated range appears to reflect optimal conditions rather than typical deployments. Latency can rise further under heavy source write pressure. BigQuery uses a different change capture mechanism (Storage Write API) and may have different latency characteristics; reliable published figures are not available.
There are two distinct latency layers the documentation does not make explicit:
- Delta table latency - time from source change to the Delta table in OneLake reflecting it. This is what the replication engine controls.
- SQL analytics endpoint latency - time from Delta table update to queries via the SQL endpoint seeing new data. The SQL endpoint always lags the Delta layer.
For applications requiring the freshest possible data, read directly from the Delta layer via Spark rather than via the SQL analytics endpoint.
CDC semantics and late-arriving data. For CDC-based mirroring, watermarks are largely irrelevant - and that is an advantage. CDC captures changes in commit order at the source, regardless of business timestamps on the rows. A batch job inserting six-month-old records into the source today will be replicated when those rows commit; no high-water mark is needed to detect them. The edge case to watch: CDC captures changes at commit time, not statement time. A transaction running for two hours before committing appears in the mirror as a burst at commit rather than spread across the transaction duration.
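The commit-order behaviour can be illustrated with a toy simulation. This is a sketch of the ordering rule only, not of any connector's internals; the `Txn` type and the sample transactions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    name: str
    start_min: int   # minute the transaction began at the source
    commit_min: int  # minute it committed
    rows: int


def mirror_arrivals(txns):
    """Return (commit_minute, name, rows) in the order a commit-ordered feed emits them."""
    ordered = sorted(txns, key=lambda t: t.commit_min)
    return [(t.commit_min, t.name, t.rows) for t in ordered]


txns = [
    Txn("long_batch", start_min=0, commit_min=120, rows=500_000),
    Txn("order_insert", start_min=30, commit_min=30, rows=1),
    Txn("backfill_old_rows", start_min=60, commit_min=61, rows=10_000),
]

for arrival in mirror_arrivals(txns):
    print(arrival)
```

The two-hour batch surfaces last and all at once, even though it started first. The backfill of old business-dated rows arrives as soon as it commits, with no watermark involved.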
A published benchmark by a Microsoft engineer under optimal append-only conditions (5 × 32-core machines writing aggressively to an F2 capacity) measured Delta table lag at 11.3 seconds and SQL endpoint availability at 30–60 seconds after file upload. These are best-case figures under a specific workload configuration - the author explicitly describes them as "empirical results from hacking around tunables" rather than a reproducible benchmark. Read them as the best the system can achieve (a floor on latency), not as typical deployment expectations. The benchmark covers appends only - no updates or deletes.
Cost model
One disclosure worth making explicit: the Fabric mirroring overview states that compute and storage are free, with no exceptions noted. In practice, Google BigQuery charges for CDC compute, Storage Write API, and BigQuery storage, and SAP charges SAP Datasphere Premium Outbound Integration pricing. These are charged by the source-side provider, not Microsoft, but they are real costs that belong in your model.
| Cost component | Charged? | Notes |
|---|---|---|
| Replication compute | No | Off-capacity; free for all mirroring types |
| OneLake storage for mirrored data | No (up to limit) | 1 TB free per capacity unit (e.g. 64 TB free on F64); excess charged at standard OneLake rates |
| OneLake write transactions (landing zone writes) | Yes | Applies to Open Mirroring; files < 4 MB cost one transaction each — see below |
| Query compute (SQL endpoint, Spark, Power BI) | Yes | Charged at standard Fabric capacity rates |
| Source-side costs (BigQuery, SAP, Snowflake) | Yes (varies) | Charged by the source provider, not Microsoft |
What "free ingestion" means in practice. The replication engine runs off-capacity - confirmed by independent benchmarking and a Microsoft PM. On an F64, practitioners have reported background CU usage around 3–4% for half a billion rows across 50 tables. Storage is free at 1 TB per capacity unit; an F64 gives 64 TB. Off-capacity does not mean capacity-independent, however: the capacity must be active, and replication does not run while it is paused (see reseed behaviour under operational limits).
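The storage allowance is simple arithmetic. A minimal sketch, assuming the F-SKU number equals the capacity unit count (F64 = 64 CU) and that the allowance is exactly 1 TB per capacity unit; the function names are illustrative.

```python
def free_storage_tb(sku: str) -> int:
    """Free mirrored-storage allowance in TB, assuming 1 TB per capacity unit."""
    if not sku.upper().startswith("F"):
        raise ValueError(f"expected an F-SKU such as 'F64', got {sku!r}")
    return int(sku[1:])


def billable_storage_tb(sku: str, mirrored_tb: float) -> float:
    """Mirrored storage beyond the free allowance, billed at standard OneLake rates."""
    return max(0.0, mirrored_tb - free_storage_tb(sku))


print(free_storage_tb("F64"))           # 64
print(billable_storage_tb("F2", 5.0))   # 3.0
```

Small SKUs are where the allowance bites: an F2 with 5 TB of mirrored data pays for 3 TB, while the same data on an F64 is fully covered.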
To contextualise against a pipeline-based alternative: a well-optimised incremental pipeline (parameterised loads from multiple SQL sources, landing as Parquet and merging into Delta) can reach a similar absolute CU figure. But that pipeline runs once a day. Pipeline activities and Spark notebooks are background operations in Fabric's CU model, smoothed across a 24-hour window. Running the same pipeline five times a day compresses five times the background CU consumption into the same budget - already saturating an F16 at that cadence. At sub-minute frequency it becomes unworkable on any reasonable SKU.
Mirroring sidesteps this entirely: continuous replication at near-zero background CU cost. For teams running daily incremental pipelines, adopting mirroring is not primarily a cost trade-off - it is a step change in data freshness at no additional ingestion cost.
For a typical database mirroring setup on Azure SQL, SQL Server, PostgreSQL, MySQL, or Cosmos DB queried at modest frequency, Fabric cost is close to zero beyond storage. The cost grows with query scale: ingestion compute is eliminated, not the cost of reading or transforming data.
Open Mirroring write transactions. The ingestion compute is free, but writing files to the landing zone consumes OneLake write transactions, which are charged. One transaction per file smaller than 4 MB; one per 4 MB block for larger files. At 100–150 KB per file, writing one million files costs one million transactions - roughly 27 to 40 times more than batching those records into 4 MB files. A community report documented consuming over 3.7 million CU-seconds on an F64 (more than half the capacity's daily budget) from a single initial load of approximately 1.5 million small Parquet files; a Microsoft engineer confirmed this as expected behaviour, not a bug. If you control the writer, target file sizes of 4 MB or larger and batch aggressively for initial loads. Very large numbers of small files can also cause OOM conditions in the mirroring backend - a Microsoft engineer acknowledged this as a known issue as of late 2024; verify current status before pushing millions of small files.
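The transaction rule can be checked with a few lines of arithmetic. This sketch treats 4 MB as 4 MiB and a capacity's daily budget as CU × 86,400 seconds; both are assumptions, and it counts transactions only, not the CU price per transaction.

```python
import math

BLOCK_BYTES = 4 * 1024 * 1024  # the 4 MB block, taken as 4 MiB here (assumption)


def write_transactions(file_bytes: int) -> int:
    """One transaction for a file under 4 MB, otherwise one per 4 MB block."""
    return max(1, math.ceil(file_bytes / BLOCK_BYTES))


def daily_cu_seconds(capacity_units: int) -> int:
    """CU-seconds available per day on a capacity of the given size."""
    return capacity_units * 24 * 60 * 60


file_count = 1_000_000
file_bytes = 125 * 1024  # midpoint of the 100-150 KB range

small_files = file_count * write_transactions(file_bytes)
batched = math.ceil(file_count * file_bytes / BLOCK_BYTES)  # same bytes in 4 MB files

print(small_files, batched, round(small_files / batched, 1))
print(round(3_700_000 / daily_cu_seconds(64), 2))  # share of an F64's daily budget
```

Under these assumptions the unbatched load issues about 33 times as many write transactions as the batched one, and 3.7 million CU-seconds is about two thirds of an F64's daily budget.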
Mirroring is a current-state replica
This is the most important architectural property to understand before committing.
Mirroring produces a continuously updated copy of what the source looks like right now. That is a legitimate and valuable thing - operational reporting, query offload, Direct Lake integration all rest on it. But it is not a historical record, and it cannot be used as one.
Stop/restart reseeds from current state. If you stop mirroring and restart it - or delete and recreate the mirrored item - Fabric reseeds from the source's current state. You do not get a replay of historical changes; you get today's data. This behaviour is documented across all SQL-family sources and Cosmos DB. A practitioner on the community forums described stop-and-restart as "generally a bad idea, since the Parquet files that make up the table are no longer kept."
Deletes are applied to the mirror. When a row is deleted at the source, it is deleted from the Delta table in OneLake. It does not persist as a tombstone or soft-delete record. Delta time travel provides a short recovery window (1 day for mirrored databases created after mid-June 2025; 7 days for older ones by default; configurable via the portal or API) - but this is not a rebuild window for most organisations.
PostgreSQL TRUNCATE is not replicated. If the source table is truncated, the mirror retains the pre-truncation data. The mirror silently diverges from the source - it is not even a faithful current-state replica for PostgreSQL workloads that use TRUNCATE.
Cosmos DB TTL-deleted rows are not replicated. Rows deleted via Cosmos DB's TTL mechanism do not appear as deletes in the mirror. Those rows remain in the mirrored table indefinitely.
Mirroring and the raw layer question
Whether mirroring is an appropriate implementation of a raw or bronze ingestion layer is one of the more consequential architectural questions it raises. The answer is not universal - it depends on what your raw layer is actually for.
The term "raw layer" - also called bronze, landing, or ingestion depending on organisational convention - describes different things in different teams. Three distinct definitions are worth separating before forming an opinion:
Definition 1 - Current-state replica. A continuously updated copy of the source's present state, queryable as a table. This is what mirroring provides. It faithfully reflects current source data and is available immediately via Spark, the SQL analytics endpoint, or Power BI Direct Lake.
Definition 2 - Change event stream. A sequence of all source changes (inserts, updates, deletes) in commit order, preserved indefinitely as a queryable or replayable record. CDC platforms - Debezium, Azure Data Factory CDC, event streaming pipelines - produce this. Mirroring uses CDC-equivalent mechanisms under the hood to keep the replica current, but it exposes only the resulting state, not the raw change events. The Delta change data feed — an opt-in paid extended capability, now billed at standard Fabric capacity rates — moves somewhat closer to this: it captures inserts, updates, and deletes incrementally and is available across all mirroring sources including Oracle, Snowflake, Azure SQL, and Open Mirroring. It is still a derived feature rather than the raw underlying change stream, but it is a meaningful step toward incremental change processing without a separate CDC pipeline.
Definition 3 - Immutable append-only store. A write-once, never-delete collection of extracted records, typically with extraction timestamps. Source deletes and updates do not remove historical records; they are appended as new versions alongside originals. This is what "bronze" means in recovery-oriented architectures - a layer from which any downstream state can be reconstructed at any point in time. Mirroring explicitly does not provide this: source deletes are applied to the mirror, and stop/restart reseeds from current state.
When mirroring is sufficient as an ingestion layer
- The use case is operational reporting on current source state
- Historical reconstruction is not a requirement - downstream models do not need to be reprocessed from raw on schema changes or logic corrections
- The source system is the authoritative system of record and can serve as a recovery point when needed
- Delta time travel's short retention window covers the organisation's recovery or audit window
- Downstream processing is idempotent or designed to tolerate reseed events
When mirroring is insufficient
- Downstream processing must be replayable: changes to transform logic, data models, or business rules require re-processing historical source data from scratch
- Source deletes must be preserved in the analytical layer for audit, compliance, or slowly-changing dimension tracking
- An immutable historical record is a compliance or contractual requirement
- The source system cannot be relied upon as a recovery point - operational databases often compact transaction logs, purge historical data, or are themselves subject to failure scenarios
- The architecture requires point-in-time reconstruction of the source state at an arbitrary past date
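The two checklists above reduce to a small decision function. This is a sketch for structuring the conversation with stakeholders, not a substitute for it; the field names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class RawLayerRequirements:
    needs_replayable_processing: bool
    must_preserve_source_deletes: bool
    needs_immutable_record: bool
    source_is_reliable_recovery_point: bool
    needs_point_in_time_reconstruction: bool


def mirroring_gaps(req: RawLayerRequirements) -> list[str]:
    """Return the reasons mirroring alone falls short; an empty list means it suffices."""
    gaps = []
    if req.needs_replayable_processing:
        gaps.append("downstream processing must be replayable from historical data")
    if req.must_preserve_source_deletes:
        gaps.append("source deletes are applied to the mirror, not preserved")
    if req.needs_immutable_record:
        gaps.append("the mirror is mutable and reseeds from current state")
    if not req.source_is_reliable_recovery_point:
        gaps.append("the source cannot serve as the recovery point")
    if req.needs_point_in_time_reconstruction:
        gaps.append("arbitrary past states cannot be reconstructed")
    return gaps


operational_reporting = RawLayerRequirements(False, False, False, True, False)
print(mirroring_gaps(operational_reporting))  # []
```

Any non-empty result points to the parallel capture layer rather than to abandoning mirroring.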
The architectural response
These are not mutually exclusive scenarios. Using mirroring for current-state operational reporting does not prevent a separate, lightweight append-only capture layer from handling historical fidelity in parallel. A pipeline that appends extracted records to a separate OneLake path - without transformation, without schema enforcement, just faithful extraction - provides the raw archive that mirroring cannot. The two run independently: mirroring handles the real-time replica; the capture layer handles the immutable record. For many organisations, building this alongside mirroring is less work than it sounds, because a raw capture layer has no transformation requirements.
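A capture layer of this kind can be very small. The sketch below writes each extraction batch as a new JSON Lines file stamped with its extraction time and refuses to overwrite existing files. It uses a local directory as a stand-in for a OneLake path, and the folder layout and `_extracted_at` field are illustrative choices rather than a prescribed format.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_extract(archive_root: Path, table: str, records: list[dict]) -> Path:
    """Write one extraction batch as a new JSON Lines file; never overwrite."""
    extracted_at = datetime.now(timezone.utc)
    folder = archive_root / table / extracted_at.strftime("%Y/%m/%d")
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / f"{extracted_at.strftime('%H%M%S%f')}.jsonl"
    # Mode "x" raises FileExistsError if the file exists, which enforces write-once.
    with target.open("x", encoding="utf-8") as fh:
        for record in records:
            line = {"_extracted_at": extracted_at.isoformat(), "record": record}
            # default=str keeps dates and decimals as text instead of failing.
            fh.write(json.dumps(line, default=str) + "\n")
    return target
```

Because nothing is transformed or merged, a row later deleted at the source simply stops appearing in new batches while every earlier copy remains on disk.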
The framing to avoid is "mirroring eliminates ingestion infrastructure." What it eliminates is orchestrated incremental pipeline infrastructure for current-state reporting. That is a meaningful and real reduction in complexity. It is not a replacement for a raw archive if your architecture requires one.
A note on disaster recovery
A mirrored database is a derived asset. It can be recreated from the source at any time by deleting and re-mirroring. This makes it a query offload layer, not a backup:
- A mirrored database is not a recovery target if the source is lost or corrupted
- Mass deletion or corruption at the source will eventually be reflected in the mirror
- "We have mirroring enabled" is not equivalent to "we have a copy of the data" for recovery purposes
None of this is a flaw in mirroring. It is the correct framing of what the tool is. Architectures that treat the mirror as a backup are making a design error, not an engineering one.
What actually replicates - the data type story
Unsupported data types are the most common unexpected blocker when setting up mirroring. The critical point the documentation does not emphasise up front: unsupported types do not always block the table. Impact depends on where the unsupported type appears.
| Scenario | Impact |
|---|---|
| Unsupported type in a regular column | Column is silently excluded; the rest of the table mirrors |
| Unsupported type in a primary key or clustered index column | Entire table is blocked |
| Unsupported table feature in use (Always Encrypted, in-memory, etc.) | Entire table is blocked |
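The three scenarios can be turned into a pre-flight check over source metadata. In this sketch the unsupported-type and blocking-feature sets are hypothetical examples, not any connector's real list, and connector-specific exceptions (types that block the whole table wherever they appear) are deliberately not modelled.

```python
from dataclasses import dataclass

# Hypothetical examples only; populate from your connector's limitations page.
UNSUPPORTED_TYPES = {"geometry", "xml"}
BLOCKING_FEATURES = {"always_encrypted", "in_memory"}


@dataclass
class Column:
    name: str
    data_type: str
    in_key: bool = False  # part of the primary key or clustered index


def mirroring_impact(columns, table_features=frozenset()):
    """Return (status, offenders): status is 'blocked', 'partial', or 'full'."""
    blocking = sorted(set(table_features) & BLOCKING_FEATURES)
    if blocking:
        return "blocked", blocking
    unsupported = [c for c in columns if c.data_type.lower() in UNSUPPORTED_TYPES]
    key_offenders = [c.name for c in unsupported if c.in_key]
    if key_offenders:
        return "blocked", key_offenders
    if unsupported:
        # The table mirrors, but these columns are silently dropped.
        return "partial", [c.name for c in unsupported]
    return "full", []


print(mirroring_impact([Column("id", "int", in_key=True), Column("shape", "geometry")]))
```

Running a check like this across a schema before setup surfaces the silent column drops, which are the easiest outcome to miss.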
For SQL-family sources, the full blocklist is documented across the individual limitations pages (Azure SQL Database, SQL MI, SQL Server, PostgreSQL, MySQL). Key highlights:
json and vector columns block entire tables for Azure SQL Database. This is a table-level exclusion, not a column skip - the table cannot be mirrored at all. hierarchyid, datetime2(7) as a primary key, and datetimeoffset(7) as a primary key have the same effect.
PostgreSQL silently excludes a long list of column types, including all geometric types, all network address types, all range types, json, jsonb, xml, and interval. None block the table - they disappear from the replica without warning. The PostgreSQL limitations page documents the full list.
Oracle uses an allowlist, not a blocklist. Only fifteen specific types are supported. Everything else - including CLOB, BLOB, XMLTYPE, and SDO_GEOMETRY - is implicitly excluded.
Oracle NUMBER without explicit precision or scale causes hard failures. The common Oracle pattern col NUMBER rather than col NUMBER(10,2) causes the error Invalid Decimal Precision or Scale. Precision: 38, Scale: 127, blocking the entire table. There is no workaround within mirroring - it does not support custom SELECT queries that could add an explicit cast. Changing a production Oracle column type is typically not feasible, meaning affected tables must be handled outside mirroring via pipelines or copy jobs.
Columns with spaces or special characters are supported via Delta column mapping. Previously a replication blocker, these column names are now handled through Delta's column mapping feature and replicate correctly.
Source schema hierarchy is preserved. The source database's schema structure (e.g. dbo, sales, hr) is maintained in the mirrored database and is consistent across the SQL analytics endpoint, Spark, and semantic models.
DDL changes behave differently across sources. Azure SQL picks up ADD COLUMN automatically. PostgreSQL requires a stop-and-restart of replication for any schema change - and stop/restart reseeds from current state (see the raw layer section above). MySQL replication is disrupted by DDL changes. SQL MI does not support ALTER COLUMN or RENAME COLUMN while a table is being mirrored; those operations are blocked outright. Check the limitations page for your specific source before planning any schema migration on a mirrored table.
Operational limits and gotchas
Use a service principal or shared service account - never an individual. A mirrored database is permanently tied to the user who created it. There is no ownership transfer mechanism. If the owner leaves the organisation, the item must be deleted and recreated - which means a reseed from current state. The connection is tied to the owner's Entra ID authentication: if their token expires due to MFA re-prompt, device compliance failure, or account deactivation, replication stops. A Microsoft MVP documented (Jul 2024) that workspace and tenant admins cannot access the connection unless the original owner has shared it first. Create all mirrored database items using a service principal or a shared service account, and share the connection with an admin group immediately after creation.
Note: workspace managed identity cannot create or own mirrored database items per the workspace identity documentation.
Private link support is partial - and most connectors are blocked. When Fabric's "Block Public Internet Access" tenant setting is enabled, most database mirroring connectors are unsupported: active mirrored databases enter a paused state and mirroring cannot be started.
| Status | Connectors |
|---|---|
| Supported | Open Mirroring, Azure Cosmos DB, Azure SQL Managed Instance, SQL Server 2025 |
| Blocked | Azure SQL Database, PostgreSQL, MySQL, Oracle, Snowflake, BigQuery, SAP |
On-premises data gateways (required for Oracle, and for SQL Server in many on-premises deployments) fail to register when private link is enabled; VNet data gateways work as a substitute but require separate provisioning. If private link is enabled or planned, verify connector support before building a mirroring architecture around a connector that will not function in that network configuration.
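For an estate with several sources, the status table above can be encoded as a lookup and run against the planned connector list. The sets are transcribed from that table and will go stale as Microsoft ships support, so treat them as a snapshot to re-verify.

```python
# Transcribed from the status table above; re-verify before relying on it.
PRIVATE_LINK_SUPPORTED = {
    "Open Mirroring", "Azure Cosmos DB", "Azure SQL Managed Instance", "SQL Server 2025",
}
PRIVATE_LINK_BLOCKED = {
    "Azure SQL Database", "PostgreSQL", "MySQL", "Oracle", "Snowflake", "BigQuery", "SAP",
}


def private_link_status(connector: str) -> str:
    """Return 'supported', 'blocked', or 'unknown' for a connector name."""
    if connector in PRIVATE_LINK_SUPPORTED:
        return "supported"
    if connector in PRIVATE_LINK_BLOCKED:
        return "blocked"
    return "unknown"


def at_risk(connectors):
    """Connectors that need review before enabling Block Public Internet Access."""
    return sorted(c for c in connectors if private_link_status(c) != "supported")


print(at_risk(["Azure SQL Database", "Azure Cosmos DB", "Snowflake"]))
```

Anything returned as unknown deserves the same scrutiny as a blocked connector until the documentation says otherwise.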
Gateway support for sources behind firewalls (distinct from the Block Internet scenario). Separately from the private link constraint above, Azure SQL Database and Snowflake now support replication through On-Premises Data Gateway and VNet Data Gateway for sources that are behind a firewall but where the tenant-level "Block Public Internet Access" setting is not enabled. This expands gateway support beyond Oracle and SQL Server, which were previously the only connectors that used a gateway path. Azure SQL Managed Instance gateway support has since shipped — VNet data gateway or on-premises data gateway can be used when the instance is not publicly accessible, per the SQL MI FAQ.
CDC requirements vary significantly by source.
| Source | Mechanism used | Source-side requirement |
|---|---|---|
| Azure SQL Database | SQL Change Feed | None — no CDC required |
| Azure SQL MI | SQL Change Feed (newer) / CDC (SQL 2022 policy) | Depends on update policy |
| SQL Server 2016–2022 | CDC | CDC must be enabled on the source database |
| SQL Server 2025 | Change Event Stream | CDC must not be enabled — incompatible |
| PostgreSQL | Logical replication | wal_level = logical required |
| MySQL | Binary log replication | Binary logging required |
| Oracle | LogMiner | Archive log mode and supplemental logging required |
| Snowflake | Snowflake Streams | Change tracking on tables |
| Cosmos DB | Cosmos DB change feed | Built-in — no configuration required |
| BigQuery | BigQuery CDC (Storage Write API) | CDC must be enabled per table; Google charges for CDC compute and Storage Write API |
| SAP | SAP Datasphere change capture | Handled by SAP Datasphere — no direct source-side config; Datasphere must be separately licensed |
The SQL Server 2016–2022 CDC requirement carries the most operational risk. CDC writes captured changes to dedicated tables in the source database, consumes transaction log space, and requires active management to prevent log fill. On a busy SQL Server, DBAs often resist enabling it. If the transaction log fills and is truncated before mirroring can process it, the mirror reseeds from current state rather than resuming incrementally. Azure SQL Database and SQL Server 2025 use newer architectures (Change Feed and Change Event Stream respectively) that carry meaningfully lower source-side overhead. Where there is a choice of source version, the newer architecture is worth preferring.
Availability group secondaries are not supported for SQL Server. Fabric Mirroring for SQL Server requires connection to the primary of an Always On availability group. Connecting to a secondary - even a designated readable secondary - is not supported. Organisations that deliberately route external connections to the secondary for load isolation will find this a hard blocker.
Deletion vectors break some Python Delta readers. When rows are deleted from a source and replicated to Fabric, the Delta table uses deletion vectors - marking deleted rows without rewriting Parquet files. Spark handles this correctly, as does DuckDB 1.2 and above. Polars (via delta-rs) does not: it raises DeltaProtocolError: The table has set these reader features: {'deletionVectors'} but these are not yet supported. This was confirmed by community reports and acknowledged by Microsoft engineers in May 2025. Reads succeed until the first delete is replicated, then fail. Use Spark or DuckDB for Delta reads against mirrored tables.
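A reader that cannot handle deletion vectors can be guarded by inspecting the table's transaction log first. The sketch below scans the JSON commit files for the most recent `protocol` action and checks its `readerFeatures` list. It assumes the table is reachable as a local or mounted filesystem path, and it ignores Parquet checkpoints, so a table whose protocol action survives only in a checkpoint would be misreported.

```python
import json
from pathlib import Path


def uses_deletion_vectors(table_path) -> bool:
    """Return True if the latest protocol action in the JSON log lists deletionVectors."""
    log_dir = Path(table_path) / "_delta_log"
    reader_features = []
    # Commit files are zero-padded version numbers, so lexical order is commit order.
    for commit in sorted(log_dir.glob("*.json")):
        with commit.open(encoding="utf-8") as fh:
            for line in fh:
                if not line.strip():
                    continue
                action = json.loads(line)
                if "protocol" in action:
                    # A later protocol action replaces the earlier one.
                    reader_features = action["protocol"].get("readerFeatures") or []
    return "deletionVectors" in reader_features
```

A pipeline could call this before choosing a reader and fall back to Spark or DuckDB when it returns True.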
Reseed behaviour and automatic recovery. If Fabric capacity is paused for an extended period and the source database's transaction log is truncated before mirroring can resume, a reseed from current source state is triggered. For Azure SQL Database and Azure SQL Managed Instance, automatic reseed is enabled by default - mirroring reinitialises automatically rather than staying broken. SQL Server 2025 supports it but it is disabled by default. Two reseed triggers exist: table-level (DDL changes, truncate, rename) and database-level (transaction log exceeds a configured threshold). In all cases, reseed reinitialises from current source state - the improvement is operational resilience, not historical fidelity.
Structured replication logs via Workspace Monitoring. Enable Workspace Monitoring and replication events are written to an Eventhouse KQL database automatically. The MirroredDatabaseTableExecution table records ProcessedRows, ProcessedBytes, ReplicatorBatchLatency, OperationStartTime, OperationEndTime, and ErrorMessage per operation - enough to query replication history, measure lag over time, and surface failures. Workspace Monitoring is not enabled by default and is charged at standard Eventhouse rates.
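Once those rows have been exported from the Eventhouse, summarising them takes very little code. The sketch below is an illustration under assumptions: each row is a plain dict keyed by the column names listed above, timestamps are ISO-8601 strings with an explicit offset, and ErrorMessage is empty or missing on success. It measures operation duration from the start and end times. That is not the same thing as replication lag, and ReplicatorBatchLatency is deliberately left out because its unit is not stated here.

```python
from datetime import datetime


def summarise_operations(rows: list[dict]) -> dict:
    """Summarise exported MirroredDatabaseTableExecution rows (assumed dict shape)."""
    durations = []
    failures = []
    total_rows = 0
    for row in rows:
        start = datetime.fromisoformat(row["OperationStartTime"])
        end = datetime.fromisoformat(row["OperationEndTime"])
        durations.append((end - start).total_seconds())
        total_rows += row.get("ProcessedRows") or 0
        # Any non-empty ErrorMessage is treated as a failed operation.
        if row.get("ErrorMessage"):
            failures.append(row)
    return {
        "operations": len(rows),
        "processed_rows": total_rows,
        "max_duration_seconds": max(durations, default=0.0),
        "failures": failures,
    }
```

The same aggregation could be written directly in KQL against the monitoring database; the Python form is shown only to keep the examples in one language.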
Minor configuration limits. Maximum table count per mirrored database is 1,000 - sources above this threshold must be split across multiple items. Every mirrored database automatically creates a paired SQL analytics endpoint that cannot be deleted or disabled independently. For MySQL, only one database per server instance can be mirrored, and tables cannot be added or removed after initial setup.
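For a source above the table limit, the split across items can be planned up front. A minimal sketch, assuming you already have the list of table names and that a simple in-order split is acceptable (the function name and default are illustrative only):

```python
def plan_mirrored_items(tables: list[str], max_tables: int = 1000) -> list[list[str]]:
    """Split table names into groups that each fit within one mirrored database item."""
    if max_tables < 1:
        raise ValueError("max_tables must be at least 1")
    return [tables[i:i + max_tables] for i in range(0, len(tables), max_tables)]
```

In practice you would probably group by schema or subject area rather than by position, so that related tables land in the same item.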
Security and governance
Row-level security (RLS), column-level security (CLS), and dynamic data masking configured in the source system are not propagated to the mirrored database. This is documented for all SQL-family sources and applies universally. A user restricted from seeing certain rows or columns in the source will have no such restrictions applied when querying the mirrored replica via the SQL analytics endpoint or Spark, unless those controls are manually re-implemented at the Fabric layer.
This is an architectural consequence of the data moving to a different system, not a mirroring bug. It is a significant governance consideration for organisations with row-level security in source databases. Any security model that relies on source-database RLS must be rebuilt entirely in Fabric.
The documentation on RLS behaviour at the SQL analytics endpoint, cross-tenant sharing scenarios, and the interaction between workspace permissions and item-level permissions is thin. Before deploying mirroring for data with regulatory or contractual access controls, test the security model explicitly - do not infer from the source database's configuration.
Choosing your path
The evaluation is two-dimensional. Your source constrains what is technically possible; your requirements determine whether what is possible is actually suitable. Most practitioners inherit their source - they do not choose it. Start there.
Phase 1: Source assessment
1. Does your source have a native connector, and how mature is it?
| Connector tier | Sources | Guidance |
|---|---|---|
| Mature | Azure SQL Database, Azure SQL MI, SQL Server, PostgreSQL | Well-documented, production-validated; proceed to step 2 |
| Established with known limits | MySQL, Cosmos DB, Snowflake | Documented with specific constraints; review appendix before proceeding |
| Sparse documentation | Oracle, BigQuery, SAP, Dremio | Significant gaps in published behaviour; treat - cells in appendix as unknowns requiring direct testing; budget validation time before committing |
| No native connector | Anything else | Jump to Phase 1, step 4 |
2. Are there hard blockers for your specific source?
Check these before going further. Any one of them may end the native mirroring evaluation:
- Oracle: NUMBER columns without explicit precision/scale cause table-level failures; On-Premises Data Gateway required in all environments, including cloud-hosted Oracle
- MySQL: Burstable compute tier unsupported; only one database per server can be mirrored; tables cannot be added or removed after initial setup
- SQL Server: Availability group secondaries not supported - connection must be to the primary
- Any source: If your Fabric tenant has "Block Public Internet Access" enabled, check the private link support list - most connectors will enter a paused state
3. Can you get source-side configuration approved?
Configuration requirements vary significantly by source and typically require DBA or infrastructure team approval:
| Source | Requirement | Overhead |
|---|---|---|
| Azure SQL Database | None - Change Feed is built-in | Zero |
| SQL Server 2025 | None - Change Event Stream is built-in | Zero |
| Cosmos DB | None - change feed is built-in | Zero |
| SQL Server 2016–2022 | CDC must be enabled on the database | Moderate - log space, active management |
| PostgreSQL | wal_level = logical required | Low - server restart may be required |
| MySQL | Binary logging required | Low |
| Oracle | Archive log mode + supplemental logging + LogMiner access | High - significant DBA engagement typically required |
| BigQuery | CDC must be enabled per table | Low - but Google charges for CDC compute and Storage Write API |
| SAP | Handled by SAP Datasphere | High - Datasphere must be separately licensed and configured |
If approval is uncertain or refused, native mirroring is blocked regardless of requirements. Go to step 4.
4. If no native connector, or source-side config not approvable: can you use Open Mirroring?
- Can you read changes from the source using existing mechanisms - Change Tracking, RowVersion, date-modified timestamps, triggers?
- Can you run a process that writes Parquet files to OneLake?
- Yes to both → Open Mirroring is your path. No source-side CDC configuration required.
- No → Traditional pipeline or CDC stack. Mirroring in any form is not available for this source.
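The first question above, whether changes can be read from the source, can be answered with something as simple as comparing two keyed snapshots. The sketch below assumes each snapshot is a mapping of primary key to a RowVersion-style value (a hypothetical shape chosen for illustration) and classifies every difference as an insert, update, or delete. How those changes are then encoded into Parquet files for the Open Mirroring landing zone is not shown; follow the official format documentation for that step.

```python
def detect_changes(previous: dict[int, int], current: dict[int, int]) -> list[tuple[str, int]]:
    """Compare two {primary_key: row_version} snapshots and list the changes."""
    changes: list[tuple[str, int]] = []
    for key, version in current.items():
        if key not in previous:
            changes.append(("insert", key))
        elif version != previous[key]:
            # Same key, different version: the row was modified.
            changes.append(("update", key))
    for key in previous:
        if key not in current:
            changes.append(("delete", key))
    return changes
```

A watermark on a date-modified column is cheaper than a full snapshot comparison but cannot detect hard deletes on its own, which is one reason to prefer Change Tracking or a key-level comparison when deletes matter.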
Phase 2: Requirements fit
Only reach this phase if Phase 1 confirmed a viable source. These questions determine whether what is technically available is actually suitable.
5. What is the reporting use case?
- Operational reporting on current source state, minimal or no transforms → source-to-semantic with Direct Lake; this is mirroring's clearest win. Proceed.
- Reporting that requires transforms (cleaning, joins, aggregations) → mirroring handles ingestion; plan compute budget for transform layers separately. Near-real-time end-to-end requires Spark Structured Streaming or high-frequency micro-batches - neither is free. Decide whether that compute cost is justified before committing.
6. Do you need a replayable raw layer?
- Yes, and unwilling or unable to maintain one alongside the mirror → mirroring alone is insufficient. Use a traditional pipeline, or use mirroring for current-state reporting while maintaining a separate raw capture layer - see the raw layer section above.
- No, or able to maintain a raw capture layer separately → proceed.
7. Do you need source-level RLS or column security to propagate automatically?
- Yes → mirroring is the wrong tool regardless of source. Security policies are not propagated; they must be rebuilt at the Fabric layer.
- No → proceed.
8. Do you need a guaranteed, SLA-backed latency?
- Yes → mirroring has no published SLA. Use a scheduled pipeline for predictable, measurable timing.
- Near-real-time is sufficient, SLA not required → mirroring is a fit.
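Questions 6 to 8 reduce to a short checklist that can be written down as code, which is sometimes useful when evaluating many sources at once. This is only a restatement of the rules above under hypothetical names; question 5 is left out because it affects cost rather than suitability.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    needs_replayable_raw_layer: bool
    can_maintain_separate_raw_layer: bool
    needs_source_security_propagation: bool
    needs_latency_sla: bool


def phase2_blockers(req: Requirements) -> list[str]:
    """Return the reasons mirroring alone does not fit; an empty list means proceed."""
    blockers = []
    if req.needs_replayable_raw_layer and not req.can_maintain_separate_raw_layer:
        blockers.append("replayable raw layer: mirroring alone is insufficient")
    if req.needs_source_security_propagation:
        blockers.append("source security is not propagated; rebuild it in Fabric")
    if req.needs_latency_sla:
        blockers.append("no published latency SLA; use a scheduled pipeline")
    return blockers
```

An empty result does not mean the source is viable; that is decided in Phase 1.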
Summary comparison
| | Native mirroring | Open Mirroring | Traditional pipeline |
|---|---|---|---|
| Native connector required | Yes | No | No |
| Source-side config required | Yes (varies by source) | No | Depends on approach |
| Near-real-time freshness | Strong fit | Strong fit | Harder to achieve |
| Source-to-semantic, no transforms | Ideal | Viable | Overkill |
| Transform layers (clean, join, aggregate) | Ingestion only; transforms extra | Ingestion only; transforms extra | Full stack |
| Replayable raw layer | Not provided | Not provided | Built-in |
| Immutable append-only history | Not provided | Not provided | Built-in if designed for it |
| Source RLS / column security propagated | No | No | Depends |
| Private link (Block Internet enabled) | Most connectors unsupported | Supported | Supported |
| Guaranteed latency SLA | No | No | Yes (schedulable) |
| Engineering overhead | Low | Medium | High |
| Ingestion cost | Free | Free (write transactions charged) | Consumes capacity |
Verdict
Mirroring is one of the more genuinely compelling things in the Fabric toolkit - continuous ingestion at near-zero cost is a meaningful offer, and for a large class of operational reporting use cases it works exactly as described. The qualification is precision: it solves a specific problem, and the more clearly that problem is defined before committing, the less likely its limits are to surface in production.
Where it fits well:
- near-real-time operational reporting where the mirror reflects current source state
- replacing DirectQuery with Direct Lake on large datasets
- reducing analytical query pressure on production sources
- Open Mirroring as a flexible ingestion mechanism for sources without a native connector
Where it does not fit:
- architectures that require historical replay or reprocessable ingestion
- workloads where append-only, idempotent semantics matter
- organisations with compliance requirements for immutable historical records
- sources with Oracle NUMBER columns that lack explicit precision/scale, MySQL on Burstable compute, or SQL Server availability group secondaries
- any scenario where source security policies must be automatically enforced in Fabric
On the raw layer question. Whether mirroring is an appropriate "bronze" or raw layer comes down to what that layer needs to guarantee. If it needs to be a current-state replica - a query target for operational reporting - mirroring is an excellent implementation of it. If it needs to be an immutable historical store, a replayable event stream, or a disaster recovery asset, mirroring is not a substitute. These two roles are not in competition; many production architectures will benefit from both running in parallel, with mirroring handling the real-time operational layer and a separate lightweight capture handling the historical record.
Documentation maturity as a risk signal. The - cells in the appendix matrix are not editorial omissions - they are behaviours Microsoft has not published documentation for. If the vendor of a feature cannot tell you whether TRUNCATE is replicated, or how a DDL change is handled, you will find out in production. The connectors with the most - entries - BigQuery, SAP, Oracle, Dremio - are also generally the less mature ones. Factor testing time into your evaluation, and do not assume that absence of a documented limitation means the limitation does not exist.
What to watch. The extended capabilities (Delta change data feed, view mirroring) are now billed at standard Fabric capacity rates; billing was enabled as of March 2026. These are opt-in features; charges apply only when enabled and only for actual processing activity. If you built on these capabilities when they were free, factor the ongoing cost into your capacity budget. The newer change-capture architectures (Change Feed on Azure SQL Database, Change Event Stream on SQL Server 2025) carry meaningfully lower source-side overhead than CDC-based mirroring on SQL Server 2016–2022 - prefer the newer architecture where there is a choice of source version.
Appendix: Capability limitations by source
The tables below cover feature and capability support across all ten mirroring sources. They are compiled from the official Microsoft documentation limitations pages for each source, linked in the column key below.
A significant number of cells are marked -. These are not gaps in my research - in every case I checked the official Microsoft documentation and found no published guidance. For several sources, including BigQuery, SAP, and Oracle, fundamental operational questions go unanswered: does the connector handle DDL changes? Is TRUNCATE replicated? Can you mirror partitioned tables? Microsoft simply has not documented the answers. The volume of - entries is itself a signal: the less a connector is documented, the less mature it should be assumed to be. For any - cell that matters to your use case, test the behaviour yourself before you rely on it in production - you will not find the answer in the docs.
Column key
| Abbrev | Source | Limitations page |
|---|---|---|
| SQL DB | Azure SQL Database | link |
| SQL MI | Azure SQL Managed Instance | link |
| SQL Svr | SQL Server (on-premises) | link |
| PG | Azure Database for PostgreSQL | link |
| MySQL | Azure Database for MySQL | link |
| Oracle | Oracle (via On-Premises Data Gateway) | link |
| SF | Snowflake | link |
| CDB | Azure Cosmos DB | link |
| BQ | Google BigQuery | link |
| SAP | SAP (via SAP Datasphere + ADLS Gen2) | link |
Legend
| Symbol | Meaning |
|---|---|
| No (TABLE) | Not supported; table is entirely blocked from mirroring |
| No | Not supported; feature does not apply or is explicitly excluded |
| ✓ | Supported |
| N/A | Concept does not exist in this source |
| - | Microsoft has not published documentation for this combination; behaviour is unknown without direct testing |
Table and feature support
| Limitation | SQL DB | SQL MI | SQL Svr | PG | MySQL | Oracle | SF | CDB | BQ | SAP |
|---|---|---|---|---|---|---|---|---|---|---|
| Primary key required? | No [a] | No [b] | No (2025) / Yes (2016–22) | No | Yes | No [c] | No | N/A | No | - |
| Max tables per mirrored DB | 1,000 | 1,000 | 1,000 | 1,000 | 1,000 | 1,000 | N/A | N/A | N/A | N/A |
| Views replicated? | No | No | No | No | No | No | No [d] | N/A | - | - |
| Partitioned tables | - | - | - | No (TABLE) | - | ✓ | - | N/A | - | - |
| Materialized views | No | No | No | No (TABLE) | N/A | - | - | N/A | - | - |
| External tables | No | No | No | No (TABLE) | N/A | N/A | No | N/A | - | N/A |
| In-memory OLTP tables | No (TABLE) | No (TABLE) | No (TABLE) | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Always Encrypted tables | No (TABLE) | No (TABLE) | No (TABLE) | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Temporal history tables | No (TABLE) | No (TABLE) | No (TABLE) | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Graph tables | No (TABLE) | No (TABLE) | No (TABLE) | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Clustered columnstore index | No (TABLE) | No (TABLE) | No (TABLE) | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
DDL and schema change behaviour
| Operation | SQL DB | SQL MI | SQL Svr | PG | MySQL | Oracle | SF | CDB | BQ | SAP |
|---|---|---|---|---|---|---|---|---|---|---|
| Add column | Auto | Auto | Auto (2025) / Error (2016–22) | Stop/restart | Disrupts replication | ✓ (partial) | Delayed | ✓ | - | - |
| Rename column | - | Blocked | - | Stop/restart | Disrupts replication | ✓ | - | ✓ (old + new col retained) | - | - |
| Change column type | - | Blocked | - | Stop/restart | Disrupts replication | Blocked | - | Compatible types only | - | - |
| Alter / add primary key | Blocked | Blocked | Blocked | - | - | N/A | - | N/A | N/A | N/A |
| TRUNCATE replicated? | - | - | - | No | - | - | - | N/A | - | N/A |
Security and connectivity
| Limitation | SQL DB | SQL MI | SQL Svr | PG | MySQL | Oracle | SF | CDB | BQ | SAP |
|---|---|---|---|---|---|---|---|---|---|---|
| RLS propagated from source | No | No | No | No | No | N/A | N/A | N/A | N/A | N/A |
| Column / OLS propagated | No | No | No | No | N/A | N/A | N/A | N/A | N/A | N/A |
| Dynamic data masking propagated | No | No | No | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| On-Premises Data Gateway required | No (optional) | Depends [e] | Always | No (optional) | No (optional) | Always | No (optional) | No [f] | Always | N/A [g] |
| Source-side costs during mirroring | No | No | No | No | No | No | Yes (compute) | No [h] | Yes (CDC compute) | Yes (Datasphere pricing) |
| Burstable compute tier supported | ✓ | ✓ | N/A | No | No | N/A | N/A | N/A | N/A | N/A |
Item and configuration limits
| Limitation | SQL DB | SQL MI | SQL Svr | PG | MySQL | Oracle | SF | CDB | BQ | SAP |
|---|---|---|---|---|---|---|---|---|---|---|
| Multiple databases per connection | ✓ | ✓ | ✓ | ✓ | No [i] | ✓ | ✓ | ✓ | ✓ | N/A |
| Can mirror to multiple workspaces | No | No | No | No | N/A | - | ✓ | - | - | - |
| Ownership change supported | No | No | No | No | No | No | No | No | No | No |
| Can change source after setup | No | No | No | No | No | No | No | No | No | No |
Notes
[a] No primary key required as of April 2025. Existing keyless tables that were excluded before this date must be manually re-added to the mirror.
[b] No primary key required as of May 2025. Same re-add caveat applies.
[c] Oracle supports tables without a primary key if a unique index exists. Tables with neither a primary key nor a unique index cannot be mirrored.
[d] View mirroring is available for Snowflake only, as a paid extended capability, billed at standard Fabric capacity rates as of March 2026. The feature must be enabled via API as the UX toggle is temporarily unavailable.
[e] SQL MI on the SQL Server 2022 update policy requires a data gateway. Always-up-to-date and 2025 update policies do not.
[f] Cosmos DB uses Network ACL Bypass, so no gateway is required even for accounts on VNets or private endpoints.
[g] SAP mirroring routes data through SAP Datasphere into ADLS Gen2. Fabric connects to the ADLS Gen2 container, not to SAP directly. The gateway question does not apply in the same sense.
[h] Cosmos DB Data Explorer queries initiated from within the Fabric experience consume RUs from the source Cosmos DB account.
[i] Only one MySQL database per server instance can be mirrored. Tables cannot be added or removed after the initial mirror configuration is set up.
All factual claims in this post are linked to their source. Claims sourced from community reports are identified as such and may not reflect the current state of the product.
Brad Coles is a Senior Consultant and Data Engineering Capability Lead at Synechron Australia, specialising in Microsoft Fabric and modern data platform engineering. Connect on LinkedIn.