The Egress Tax: how cloud providers engineer data gravity
Free ingress, cheap storage, 18x egress markup. The asymmetry is intentional.
TL;DR: Cloud egress is priced at 18-24x wholesale cost to create switching costs, not to recover bandwidth expenses. But internet egress is only part of the story: the cross-AZ and cross-region charges baked into every HA deployment and pipeline run are continuous and often larger. The hyperscalers' offer to waive egress only applies if you fully close your account. Design for locality, negotiate egress into contracts, and budget for intra-cloud transfer as a line item.
The economics of data gravity
Move 1 TB of data into AWS: free. Move it out: $90. The actual cost of that bandwidth to the provider? Roughly $0.005/GB, based on wholesale transit pricing [1].
That's an 18x markup. Not 2x, not 5x. Eighteen times the actual cost.
I don't think this is price gouging in the traditional sense. It's something more deliberate: egress fees are designed to create switching costs. The "roach motel" model of cloud economics: data checks in but doesn't check out.
What data gravity means
The concept of "data gravity" was coined by Dave McCrory in 2010 [2]. The basic insight: data attracts applications, services, and more data. The larger your dataset, the harder it becomes to move, not because of technical limitations but because of the ecosystem that forms around it.
In my mental model, teams hit a "gravity threshold" somewhere between 10 TB and 100 TB. Below that, migration is annoying but feasible. Above it, the conversation shifts from "should we move?" to "can we afford to move?"
The math of switching costs
Example: 100 TB analytics workload
At list prices, storing 100 TB in S3 Standard costs about $2,300/month and moving it out costs about $9,000, so the egress cost equals roughly 4 months of storage. That's your "tax" for leaving. And that's just the transfer fee: add migration engineering, testing, and risk, and the real switching cost is multiples higher. Even at volume discount tiers ($0.05/GB at 150 TB+ on AWS [3]), the markup over wholesale is still 10x.
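The arithmetic behind the 100 TB example can be checked with a few lines of Python. This is a minimal sketch that uses only the rates quoted in this post; the flat per-GB egress rate (no tiering) and decimal units (1 TB = 1,000 GB) are simplifying assumptions, and `switching_tax` is a hypothetical helper name.

```python
# Rates quoted in the post (USD). Decimal units are an assumption.
GB_PER_TB = 1_000
S3_STANDARD_PER_GB_MONTH = 0.023  # S3 Standard storage
EGRESS_PER_GB = 0.09              # AWS internet egress, list rate
EGRESS_DISCOUNT_PER_GB = 0.05     # AWS volume tier quoted in the post
WHOLESALE_PER_GB = 0.005          # wholesale transit estimate


def switching_tax(dataset_tb: float) -> dict:
    """Flat-rate egress bill and its equivalent in months of storage."""
    gb = dataset_tb * GB_PER_TB
    monthly_storage = gb * S3_STANDARD_PER_GB_MONTH
    egress = gb * EGRESS_PER_GB
    return {
        "monthly_storage_usd": monthly_storage,
        "egress_usd": egress,
        "months_of_storage": egress / monthly_storage,
    }


result = switching_tax(100)
print({k: round(v, 2) for k, v in result.items()})
print("list markup:", round(EGRESS_PER_GB / WHOLESALE_PER_GB, 1))
print("discount markup:", round(EGRESS_DISCOUNT_PER_GB / WHOLESALE_PER_GB, 1))
```

Because the storage and egress rates are both per GB, the "months of storage" ratio is the same at any dataset size under this flat-rate assumption; what grows with the dataset is the absolute bill.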
The "roach motel" business model
How free ingress creates lock-in
The cloud acquisition model works like this:
Stage 1: Land (free ingress + cheap storage)
"Try us out! Data transfer is free!"
Storage is cheap: $0.023/GB/month for S3 Standard [4]
Low barrier to entry, easy first step
Stage 2: Expand (add compute, services, integrations)
Now you're running queries, building pipelines
Add Lambda, Athena, SageMaker
Interconnections multiply
Stage 3: Lock (data gravity + egress costs)
50 TB and growing
15 services connected
Egress would cost $4,500+
Too expensive to leave
Free ingress is a marketing expense. Cheap storage is a retention mechanism. Expensive egress is a switching cost.
Example: A migration that won't happen
Scenario: A mid-size company considers migrating 500 TB from Redshift on AWS to BigQuery on GCP.
The first calculation:
"A 20% savings! Let's migrate!"
The real calculation:
Break-even: $125,000 / $17,000 = 7.4 months, if nothing goes wrong during migration.
The customer stays. Data gravity wins. And "next year" never comes, because by then the dataset has grown and the math is even worse.
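The break-even arithmetic in this scenario is simple enough to script, which makes it easy to rerun as the inputs change. A minimal sketch: `break_even_months` is a hypothetical helper, the $125,000 and $17,000 figures come from the scenario, and the 50% overrun is an illustrative assumption, not a number from the original analysis.

```python
def break_even_months(switching_cost_usd: float, monthly_savings_usd: float) -> float:
    """Months of savings needed to repay a one-time switching cost."""
    if monthly_savings_usd <= 0:
        raise ValueError("monthly savings must be positive to ever break even")
    return switching_cost_usd / monthly_savings_usd


# Figures from the scenario.
months = break_even_months(125_000, 17_000)
print(f"{months:.1f} months")  # 7.4 months

# Hypothetical sensitivity check: the migration overruns its budget by 50%.
overrun = break_even_months(125_000 * 1.5, 17_000)
print(f"{overrun:.1f} months with a 50% overrun")
```

The second call shows why "if nothing goes wrong" matters: any overrun in migration cost pushes the break-even point out proportionally.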
It's not just the hyperscalers. Analytics vendors inherit or create their own gravity. Snowflake's storage sits on the underlying cloud provider (S3, Azure Blob, GCS) [5], so leaving Snowflake still triggers cloud egress. Databricks avoids direct egress exposure by keeping data in the customer's account, but Unity Catalog creates its own form of gravity: governance metadata is harder to migrate than raw data. And as platforms like ClickHouse Cloud mature, they add transfer-based charges that look a lot like the hyperscaler playbook [6].
Multi-cloud multiplies egress
Why multi-cloud costs more than single-cloud
The promise of multi-cloud: "Avoid lock-in by spreading across clouds."
The reality: multi-cloud multiplies egress costs.
Example: Cross-cloud analytics pipeline
Data lands in AWS S3
↓ ($0.09/GB egress)
Processing in GCP BigQuery
↓ ($0.12/GB egress)
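Visualization in Azure Power BI
Every hop between clouds triggers egress. For a workload processing 10 TB/day, the AWS-to-GCP hop costs $900/day and the GCP-to-Azure hop $1,200/day, assuming the full volume crosses each one.
That's $63,000/month on top of compute and storage costs, purely from egress.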
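The $63,000/month figure can be reproduced from the two per-GB rates in the pipeline. A rough sketch, assuming the full 10 TB/day crosses every hop, a 30-day month, and decimal units; `HOPS` and `monthly_pipeline_egress` are hypothetical names.

```python
GB_PER_TB = 1_000  # decimal units, an assumption

# (source, destination, egress rate in USD/GB) as quoted in the pipeline.
HOPS = [
    ("AWS S3", "GCP BigQuery", 0.09),
    ("GCP BigQuery", "Azure Power BI", 0.12),
]


def monthly_pipeline_egress(tb_per_day: float, days: int = 30) -> float:
    """Egress bill when the full daily volume crosses every hop."""
    gb_per_day = tb_per_day * GB_PER_TB
    daily = sum(gb_per_day * rate for _, _, rate in HOPS)
    return daily * days


print(round(monthly_pipeline_egress(10)))  # 63000
```

If the second hop only carries aggregated results rather than the full volume, its share shrinks accordingly, which is the case for aggregating before moving data.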
Multi-cloud can work if you minimize cross-cloud data movement: separate workloads by region or by function (transactional on Cloud A, analytics on Cloud B) and batch-sync once daily instead of streaming. Active-active replication across clouds is the expensive extreme, justified only for critical availability requirements.
The transfer costs inside your cloud
Most egress discussions focus on internet egress: data leaving the cloud entirely. But the transfer costs that actually surprise teams are the ones inside the cloud. Cross-AZ and cross-region transfers are billed quietly, buried in line items that most engineers never see until someone audits the bill.
Cross-AZ: the invisible tax on high availability
AWS and GCP charge for data that crosses availability zone boundaries; Azure does not. Since multi-AZ deployments are the default for production workloads, this cost is effectively baked into any serious architecture on the first two.
Azure's free cross-AZ transfer looks like a clear win, but there's a significant catch: as of early 2026, roughly a third of Azure's public regions still don't support availability zones at all [10]. That includes GA regions like North Central US, Canada East, UK West, West US, and Australia Southeast. If your workload runs in one of these regions, "multi-AZ" isn't an option. Your HA strategy requires cross-region replication, which costs $0.02-0.08/GB [11], the same range as AWS and GCP. Azure's free cross-AZ advantage only applies if you're in a region that actually has AZs.
On AWS and GCP, the $0.01/GB cross-AZ rate sounds trivial until you calculate the volume. The catch is that AWS charges $0.01/GB in each direction for cross-AZ traffic between services like EC2, RDS, and Redshift [12], making the effective cost $0.02/GB round-trip. Notably, S3 transfers within the same region are free regardless of AZ, so services reading from S3 (including Redshift managed storage) don't incur this charge.
The services that do generate cross-AZ costs are the ones that communicate directly: RDS replication, EC2-to-RDS queries, load balancers distributing across AZs, and NAT gateways.
Example: Analytics pipeline with cross-AZ overhead
Consider a typical setup: an RDS instance in multi-AZ with an ETL process running on EC2 in a different AZ.
RDS multi-AZ replication is the main cost driver: it synchronously replicates every write to the standby in another AZ, charged at $0.01/GB in each direction. The downstream S3 and Redshift steps are free because S3 is a regional service.
That's $4,800/year for a modest workload. Scale the database to 2 TB/day of writes and RDS replication alone hits $1,200/month, or $14,400/year in transfer costs that don't appear in any compute or storage line item.
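The scaled-up figure follows directly from the per-direction rate. A minimal sketch that takes the billing model described in this section at face value ($0.01/GB, charged in both directions) and assumes a 30-day month and decimal units; `cross_az_monthly` is a hypothetical helper.

```python
CROSS_AZ_PER_GB_EACH_WAY = 0.01  # AWS cross-AZ rate cited in this section
GB_PER_TB = 1_000                # decimal units, an assumption


def cross_az_monthly(tb_per_day: float, days: int = 30, directions: int = 2) -> float:
    """Monthly cross-AZ transfer cost when each GB is billed per direction."""
    gb = tb_per_day * GB_PER_TB * days
    return gb * CROSS_AZ_PER_GB_EACH_WAY * directions


monthly = cross_az_monthly(2)  # 2 TB/day of writes
print(round(monthly), round(monthly * 12))  # 1200 14400
```

Setting `directions=1` models traffic that is billed on one side only; which services bill which way is worth confirming against the current pricing page before relying on the output.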
Cross-region: the compliance and DR multiplier
Cross-region transfer costs apply to disaster recovery, compliance-driven replication, and serving global users from regional data stores. On AWS, cross-region is double the cross-AZ rate. On GCP and Azure, the range is wider.
If you're replicating a 100 TB data lake from US to EU for GDPR compliance, that's a one-time $2,000 transfer cost on AWS, plus ongoing replication costs for new data. A daily sync of 500 GB of changes costs $10/day, or $3,650/year, just for the transfer.
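Putting the one-time and recurring parts of the replication example together gives a first-year figure. A sketch under the assumption of a flat $0.02/GB AWS cross-region rate (the rate implied by the numbers in the example) and decimal units; the function name is hypothetical.

```python
AWS_CROSS_REGION_PER_GB = 0.02  # rate implied by the example's figures
GB_PER_TB = 1_000               # decimal units, an assumption


def first_year_replication_cost(initial_tb: float, daily_change_gb: float) -> dict:
    """One-time seed transfer plus 365 days of incremental sync."""
    one_time = initial_tb * GB_PER_TB * AWS_CROSS_REGION_PER_GB
    per_day = daily_change_gb * AWS_CROSS_REGION_PER_GB
    return {
        "one_time_usd": one_time,
        "per_day_usd": per_day,
        "first_year_usd": one_time + per_day * 365,
    }


costs = first_year_replication_cost(initial_tb=100, daily_change_gb=500)
print({k: round(v) for k, v in costs.items()})
```

In this example the recurring sync overtakes the initial seed cost within the first year, and it keeps accruing every year after.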
Why this matters more than internet egress
Here's the thing most people miss: internet egress is a tax you pay occasionally, when migrating or serving external users. Cross-AZ and cross-region transfer costs are taxes you pay continuously, on every read, every replication, every pipeline run. They compound. For analytics workloads with multi-AZ HA and cross-region DR, intra-cloud transfer can easily exceed internet egress in aggregate.
Reducing the tax
Strategy 1: Minimize data movement by design
The cheapest data transfer is the one that doesn't happen. This applies to cross-AZ and cross-region transfers just as much as internet egress.
Design principles:
Process data where it lives
Aggregate before moving
Cache at edges
Batch instead of stream when possible
Example transformation:
If your analytics pipeline can work with pre-aggregated summaries, do the aggregation where the data lives.
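As an illustration of "aggregate before moving", the sketch below rolls up synthetic raw events into per-day, per-customer summaries and compares the serialized sizes. Everything here is hypothetical: the event schema, the seed, and the use of JSON byte length as a stand-in for transfer volume.

```python
import json
import random
from collections import defaultdict

random.seed(0)  # deterministic synthetic data

# Hypothetical raw events: one record per page view.
raw_events = [
    {
        "day": f"2025-01-{random.randint(1, 7):02d}",
        "customer": f"c{random.randint(1, 20)}",
        "bytes_served": random.randint(100, 5_000),
    }
    for _ in range(10_000)
]


def aggregate(events):
    """Roll events up to one row per (day, customer)."""
    totals = defaultdict(lambda: {"views": 0, "bytes_served": 0})
    for e in events:
        key = (e["day"], e["customer"])
        totals[key]["views"] += 1
        totals[key]["bytes_served"] += e["bytes_served"]
    return [
        {"day": day, "customer": customer, **totals[(day, customer)]}
        for day, customer in sorted(totals)
    ]


summary = aggregate(raw_events)
raw_size = len(json.dumps(raw_events).encode())
summary_size = len(json.dumps(summary).encode())
print(f"raw: {raw_size:,} bytes, summary: {summary_size:,} bytes")
print(f"reduction: {raw_size / summary_size:.0f}x")
```

The reduction factor depends entirely on how much detail the downstream consumer needs; the point is that the rollup runs next to the data and only the summary crosses a billed boundary.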
Strategy 2: Negotiate egress into contracts
If you're spending $1M+ annually, negotiate egress relief into your enterprise agreement: committed egress credits, reduced rates tied to spend commitment, or free egress for specific use cases (backup, DR, compliance). GCP's more aggressive egress pricing is a useful leverage point against AWS and Azure.
Strategy 3: Use physical transfer for bulk migrations
For migrations over 100 TB, physical transfer devices can beat network transfer on both cost and time. AWS Snowball Edge [16], GCP Transfer Appliance [17], and Azure Data Box [18] all ship storage devices to your datacenter, letting you move data without touching egress pricing at all. Check each provider's current device specs; capacities and product lines change frequently.
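To judge when a shipped device wins on time, it helps to estimate how long the network route would take. A rough sketch: the link speeds and the 80% sustained utilization are assumptions for illustration, and `network_transfer_days` is a hypothetical helper. Device turnaround times vary by provider and are deliberately not modeled.

```python
def network_transfer_days(dataset_tb: float, link_gbps: float, utilization: float = 0.8) -> float:
    """Days needed to push a dataset over a link at a sustained utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    bits = dataset_tb * 1e12 * 8                      # decimal TB to bits
    seconds = bits / (link_gbps * 1e9 * utilization)  # effective throughput
    return seconds / 86_400


for gbps in (1, 10):
    days = network_transfer_days(100, gbps)
    print(f"100 TB over {gbps} Gbps: {days:.1f} days")
```

Compare the result against the provider's quoted shipping and ingest turnaround for the device you are considering.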
Strategy 4: Calculate true TCO including transfer costs
Most platform comparisons only account for compute and storage. Your TCO model should also include internet egress, cross-AZ transfer (how many AZs does the architecture span?), cross-region transfer (do you replicate for DR or compliance?), and one-time switching costs. These are often invisible until the first bill arrives.
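One way to keep transfer costs from staying invisible is to give them their own fields in the TCO model. The sketch below is a hypothetical structure, not a standard model; every dollar figure in the usage example is a placeholder to be replaced with numbers from your own bill.

```python
from dataclasses import dataclass


@dataclass
class MonthlyCosts:
    compute: float
    storage: float
    internet_egress: float = 0.0
    cross_az: float = 0.0
    cross_region: float = 0.0

    def transfer(self) -> float:
        """All data-transfer charges, internet and intra-cloud."""
        return self.internet_egress + self.cross_az + self.cross_region

    def total(self) -> float:
        return self.compute + self.storage + self.transfer()


def tco(costs: MonthlyCosts, months: int, one_time_switching: float = 0.0) -> float:
    """Total cost over a horizon, including any one-time switching cost."""
    return costs.total() * months + one_time_switching


# Placeholder figures for illustration only.
platform = MonthlyCosts(compute=40_000, storage=2_300,
                        internet_egress=900, cross_az=1_200, cross_region=300)
print(f"36-month TCO: ${tco(platform, months=36):,.0f}")
print(f"transfer share of monthly bill: {platform.transfer() / platform.total():.1%}")
```

Comparing two platforms then means building one `MonthlyCosts` per candidate and adding the switching cost only to the one you would have to migrate to.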
The pressure on egress pricing
Regulatory pressure
EU Digital Markets Act [19]:
Designates major cloud providers as "gatekeepers"
Requires data portability and interoperability
Could force egress price reductions in EU
US regulatory interest:
FTC published a cloud market study in 2024 identifying egress fees and switching costs as competition concerns
No enforcement action yet, but the issue is on the regulatory radar
The "free to leave" offer that almost nobody can use
By March 2024, all three hyperscalers had announced they would waive egress fees for customers migrating away. Headlines declared the end of egress lock-in.
The fine print: AWS issues retroactive credits (not a real-time waiver), requires support approval, and gives you 60 days to complete the migration. The offer targets customers who are switching away entirely. Partial repatriation, moving some workloads back on-prem while keeping your account active, doesn't qualify.
How many companies fully close their cloud accounts? Almost none. The typical enterprise has dozens of services, hundreds of integrations, and compliance dependencies that make a clean break practically impossible.
The most prominent company to actually do it is 37signals, makers of Basecamp and HEY. Their CTO, DHH, has documented the entire exit publicly: $3.2M/year in cloud spend reduced to well under $1M, with projected savings over $10M across five years. They migrated 6 petabytes of S3 data to on-prem Pure Storage, and AWS honored the waiver, crediting roughly $250,000 in egress fees. Even DHH noted that getting the credits approved "took a while."
But 37signals runs a handful of Rails applications with a simple architecture, a CTO who made cloud exit a personal crusade, and their own data center space. For a typical enterprise, full account closure isn't a realistic option, which is exactly the point. The waiver addresses the nuclear option of complete departure while leaving day-to-day egress pricing unchanged. The announcements were a response to the EU Data Act (which mandates zero switching fees by January 2027), not a genuine change in the economics of data gravity.
Competitive pressure from new entrants
Cloudflare R2:
S3-compatible storage
Zero egress fees
Clear competitive attack on data gravity
Oracle Cloud:
$0.0085/GB egress after a free 10 TB/month tier
Targeting migrations from AWS
Wasabi:
No egress fees
Hot storage priced below the hyperscalers' cold tiers
The incumbents have responded with expanded free tiers and volume discounts, but core pricing remains high. Competition is pressuring the edges, not the center.
Open formats don't solve location gravity
Open file and table formats (Parquet, Iceberg, Delta Lake) reduce format lock-in, but your Iceberg tables are still in S3. Moving them to GCS still triggers egress. Format portability is not location portability.
What I'd tell a data team today
I don't think cloud providers are villains for charging egress. They're rational actors optimizing for retention in a market with high customer lifetime value. Understanding the game lets you play it strategically.
The key things I'd want any data team to internalize:
Egress pricing is strategic, not cost-based. An 18-24x markup over wholesale bandwidth tells you this isn't about cost recovery.
Data gravity is engineered. Free ingress, cheap storage, expensive egress. The asymmetry is intentional.
Internet egress is only part of the transfer cost story. Cross-AZ and cross-region charges are continuous and often larger in aggregate for analytics workloads.
Multi-cloud multiplies all of these costs. If you're going multi-cloud to avoid lock-in, model the transfer costs first.
Design for locality. Process data where it lives, aggregate before moving, co-locate compute and storage in the same AZ when possible.
If I were advising a data team making platform decisions today, I'd say: build transfer costs into your TCO model from day one, not as an afterthought. Negotiate egress relief into enterprise agreements. And budget for cross-AZ costs as a line item, because they will surprise you if you don't.
The egress tax is real. The intra-cloud transfer tax is real and less visible. Plan for both.
Footnotes
This post is part of the Business of Analytics series, examining vendor incentives across the data stack to help practitioners make informed technology decisions.
Data Gravity: In the Clouds - Dave McCrory, 2010. Original concept definition.
2024 Internet Transit Pricing - DrPeering. Wholesale bandwidth cost analysis.
ClickHouse Cloud Pricing Changes - ClickHouse, 2024-2025. Evolution from zero egress to consumption-based egress.
Cloudflare R2 Pricing - Cloudflare. Zero egress object storage.
AWS Data Transfer Pricing - AWS, December 2024.
Google Cloud Network Pricing - Google Cloud, December 2024.
Azure Bandwidth Pricing - Microsoft Azure, December 2024.
List of Azure regions - Microsoft Learn, February 2026. Of ~57 public regions, roughly 38 support availability zones; the remainder, including several non-restricted GA regions, do not.
Amazon S3 Pricing - AWS. $0.023/GB/month for S3 Standard, first 50 TB, US East.
Snowflake Architecture Overview - Snowflake Documentation. Storage layer uses cloud provider object storage (S3, Azure Blob, GCS).
AWS Snowball Edge - AWS. Physical data transfer devices.
Transfer Appliance - Google Cloud. Physical data transfer devices.
Azure Data Box - Microsoft Azure. Physical data transfer devices.
Digital Markets Act - European Commission. Regulation (EU) 2022/1925, effective May 2023.
Examining the Impact of Cloud Computing on Competition - FTC, October 2024. Identifies egress fees and switching costs as barriers to competition.
Oracle Cloud Networking Pricing - Oracle. $0.0085/GB egress after 10 TB/month free.
Wasabi Pricing - Wasabi. No egress fees on hot storage.
Free Data Transfer Out to Internet When Moving Out of AWS - AWS Blog, March 2024. Google Cloud and Azure made similar announcements the same quarter.
Our Cloud-Exit Savings Will Now Top Ten Million Over Five Years - DHH, 2024. See also It's Five Grand a Day to Miss Our S3 Exit, March 2025.



