
Industry research
Scope: US
Companies: 31
What does the data warehousing market landscape look like in the US?
Customer contracts primarily feature usage-based pricing models, driven by fees for compute processing and data storage. To illustrate, Snowflake’s consumption model bills compute in credits and storage on a per-TB-per-month basis (Snowflake, December 2025). However, this model exposes players to revenue declines when customers optimize their spending during economic stress, as CFOs may mandate cloud cost reductions by shrinking data retention or pausing workloads. As a result, some platforms have shifted toward hybrid pricing, layering in incremental revenue from data transfer fees, subscription tiers for advanced security and governance features, and paid support or professional services. For example, Databricks supplements its Databricks unit (DBU)-based pricing with add-ons for security, support, training and professional services, which contribute additional recurring and project-based revenue (Databricks, April 2024). In commercial terms, contract structures typically combine usage-based billing with 1- to 3-year subscription commitments. This is exemplified by SingleStore’s Helios service on AWS Marketplace, which offers contract terms of 12, 24 or 36 months for the managed database offering (AWS, December 2025). On the cost side, players are exposed to third-party public cloud charges for compute, storage and networking, which compress gross margins as customers optimize usage and as pricing pressure outpaces reductions in underlying hyperscaler fees. For example, Teradata reports ~$223m in public cloud fees (~15% of product sales) within its 2024 cost of sales, which highlights the scale of hyperscaler infrastructure costs embedded in its cloud-delivered offering (Teradata, February 2025).
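The consumption mechanics described above can be illustrated with a minimal sketch. It assumes a bill made up of compute credits plus storage charged per TB-month, with an optional committed monthly minimum standing in for a subscription commitment. The rates, the `UsageRates` type, the `monthly_bill` function and the minimum-commitment rule are all hypothetical and are not taken from any vendor's rate card or contract terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRates:
    """Hypothetical list prices; not taken from any vendor's rate card."""
    price_per_credit: float    # USD per compute credit consumed
    price_per_tb_month: float  # USD per TB stored for one month


def monthly_bill(credits_used: float, avg_tb_stored: float,
                 rates: UsageRates, committed_minimum: float = 0.0) -> float:
    """Return the monthly charge under a simple consumption model.

    The committed minimum is an illustrative assumption: the customer
    pays whichever is higher, metered usage or the commitment.
    """
    if credits_used < 0 or avg_tb_stored < 0:
        raise ValueError("usage figures must be non-negative")
    usage_charge = (credits_used * rates.price_per_credit
                    + avg_tb_stored * rates.price_per_tb_month)
    return max(usage_charge, committed_minimum)


# Illustrative rates only (assumed values).
rates = UsageRates(price_per_credit=3.0, price_per_tb_month=23.0)

# Normal month versus a month in which workloads are paused and
# retention is trimmed, as in the cost-optimization scenario above.
normal = monthly_bill(credits_used=1000, avg_tb_stored=50, rates=rates)
optimized = monthly_bill(credits_used=400, avg_tb_stored=30, rates=rates)
print(f"normal: ${normal:,.0f}, optimized: ${optimized:,.0f}")
```

Under these assumed inputs, the optimized month bills less than half of the normal month, which is the revenue sensitivity that hybrid pricing and multi-year commitments are meant to dampen. Passing a non-zero `committed_minimum` shows how a commitment would floor the charge.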
What is the level of investor activity in the US's data warehousing industry?
Sponsor-led interest has been significant, with ~93% of identified assets being investor-backed (December 2025). Investor appetite is concentrated among VC funds that view data warehousing platforms as critical enablers of AI and streaming workloads, with strong usage-based economics, high-margin potential and durable customer lock-in. For instance, Databricks raised ~$10bn in a 2024 funding round at a valuation of ~$100bn, which illustrates the scale of capital flowing into modern data platforms. Investors are attracted to (i) the rapid expansion of AI programs that push enterprises to modernize data platforms, (ii) the rising adoption of real-time analytics and streaming workloads that boost demand for low-latency architectures, and (iii) the expansion of global compute and storage capacity that supports scaling of cloud-delivered analytics platforms. However, deterring factors for sponsors include (i) rising cloud infrastructure costs that compress margins and weaken returns on modern analytics projects, (ii) the persistent reliance on legacy data estates, which slows migration to modern architectures and increases integration workload, and (iii) shortages of skilled data and AI talent that raise wage levels and limit delivery capacity for complex deployments.
What are the key ESG considerations in the US's data warehousing industry?
ESG topics cover environmental and governance concerns. Environmental risks arise from high energy demand in data centers and substantial water use for cooling across cloud infrastructure. Players mitigate these risks through energy-efficient cloud setups, low-overhead architectures and water-saving cooling methods. Governance risks stem from data breaches, cloud misconfigurations and limited oversight of AI-driven workloads. To address these issues, players strengthen security controls, centralize data governance and improve transparency around how data and AI systems operate.

Technavio (February 2025) estimates that the global data warehousing market generated ~$34.9bn in revenue in 2024 and forecasts it to reach ~$67.2bn by 2029 (+14.0% CAGR 2024-2029)
The global data lakehouse market is projected to grow from ~$13.9bn in 2025 to ~$74.0bn by 2033 (+23.2% CAGR 2025-2033), with North America holding ~35% of the market share in 2024 (LinkedIn, September 2025)
Rapid expansion of AI initiatives is pushing enterprises to modernize core data platforms, creating demand for years to come. Surveys show that ~88% of senior executives plan to increase AI-related budgets over the next 12 months, with the average projected AI investment over that period at ~$130m. Data and analytics are emerging as a top spending priority, accounting for ~20% of overall IT budgets, which supports the sustained migration toward modern data platforms (KPMG, September 2025; PwC, May 2025)
Rising prioritization of real-time analytics and streaming data boosts demand for platforms that support low-latency workloads across warehouse-centric, lakehouse and real-time architectures. To illustrate, ~68% of IT leaders expect investment in data streaming technology to increase over the next two years, which supports sustained demand for low-latency analytics platforms (Confluent, May 2025)
Sustained investment in data center capacity, driven by the demands of AI and high-performance computing, is expanding the global compute and storage base. This expansion provides ample resources for advanced analytics workloads and supports the long-term growth of cloud-delivered data warehousing. To illustrate, data centers around the globe are estimated to receive a capital investment of ~$6.7tn by 2030 to support the rising demand for computing power (RCR Wireless, September 2025; McKinsey & Company, April 2025)
Rising cloud infrastructure costs for compute, storage and data transfer continue to compress margins for data warehousing businesses, potentially delaying or diminishing the expected financial return from modern analytics projects. According to a Flexera survey, cloud expenditure rose by ~28% in 2025 and ~84% of cloud decision makers struggle to manage these costs, which signals persistent pressure on budgets and platform profitability (Flexera, March 2025)
Legacy and fragmented customer data estates keep providers tied up in complex integrations and migrations. This complexity is expected to increase delivery effort and elongate sales cycles, ultimately slowing the adoption of modern data architectures. According to industry surveys, ~62% of US organizations continue to depend on legacy platforms and ~63% report lacking AI-ready data practices, with forecasts that most AI projects without such foundations will be abandoned through 2026 (Saritasa, August 2025; Gartner, February 2025; Back End News, February 2025)
Persistent shortages of skilled data engineers, AI specialists and streaming architects raise wage costs and limit the capacity of data warehousing businesses to deliver complex projects at scale. This is exemplified by an AI talent gap that exceeds ~4m workers, along with forecasts that ~50% of AI roles may remain unfilled by 2027, highlighting sustained pressure on hiring capacity and labor costs (Comrise, September 2025; Bain & Company, March 2025)
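The growth rates quoted for the data warehousing and data lakehouse markets above can be sanity-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The short sketch below applies it to the start and end values cited in the text. The function name `cagr` is an illustrative choice, and the inputs are the rounded figures from the text, so small rounding differences are to be expected.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` periods."""
    if start_value <= 0 or end_value <= 0:
        raise ValueError("values must be positive")
    if years <= 0:
        raise ValueError("years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures as cited in the text (USD bn); periods are end year minus start year.
warehousing = cagr(34.9, 67.2, 2029 - 2024)  # data warehousing, 2024-2029
lakehouse = cagr(13.9, 74.0, 2033 - 2025)    # data lakehouse, 2025-2033

print(f"Data warehousing CAGR: {warehousing:.1%}")
print(f"Data lakehouse CAGR:   {lakehouse:.1%}")
```

Both results come out at roughly 14% and 23% respectively, in line with the quoted growth rates.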
With the full report, you’ll gain access to:
Detailed assessments of the market outlook
Insights from C-suite industry executives
A clear overview of all active investors in the industry
An in-depth look into 31 private companies, including financials, ownership details and more
A view on all 191 deals in the industry
ESG assessments with highlighted ESG outperformers







