A cloud data warehouse is a service that collects, organizes and often stores data that organizations use for activities such as data analytics and monitoring.
The market for cloud data warehouses has grown in recent years, as organizations move to take advantage of cloud economics and reduce their own physical data center footprints.
Cloud data warehouses typically include a database or pointers to a collection of databases, where the production data is collected. The second core element of many modern cloud data warehouses is some form of integrated query engine that enables users to search and analyze the data.
Below are seven of the major cloud data warehouses.
Amazon Redshift
A key differentiator for Redshift is its Spectrum feature, which lets organizations query data directly in the Amazon S3 cloud storage service, reducing the time and cost it takes to get started.
Redshift's performance benefits from AWS infrastructure and a massively parallel processing (MPP) data warehouse architecture that distributes queries and data analysis across nodes.
For data that is outside of S3 or an existing data lake, Redshift can integrate with AWS Glue, an extract, transform, load (ETL) service, to bring data into the data warehouse.
Data warehouse storage and operations are secured with AWS network isolation policies and tools including virtual private cloud (VPC).
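To make the parallel processing idea mentioned above more concrete, here is a minimal sketch of the scatter-gather pattern that MPP warehouses rely on in general. It is not Redshift code: the rows, the round-robin distribution and the function names are all hypothetical, and Python threads stand in for separate compute nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sales rows as (region, amount); not taken from any real dataset.
ROWS = [
    ("east", 120), ("west", 80), ("east", 30),
    ("north", 50), ("west", 20), ("north", 10),
]


def partition(rows, n_slices):
    """Distribute rows round-robin across n_slices worker partitions."""
    slices = [[] for _ in range(n_slices)]
    for i, row in enumerate(rows):
        slices[i % n_slices].append(row)
    return slices


def local_sum(rows):
    """Worker step: partial SUM(amount) GROUP BY region over one slice."""
    partial = {}
    for region, amount in rows:
        partial[region] = partial.get(region, 0) + amount
    return partial


def merge(partials):
    """Coordinator step: combine the partial results into the final answer."""
    total = {}
    for partial in partials:
        for region, amount in partial.items():
            total[region] = total.get(region, 0) + amount
    return total


def parallel_group_sum(rows, n_slices=3):
    """Scatter the rows, aggregate each slice in parallel, then gather."""
    slices = partition(rows, n_slices)
    with ThreadPoolExecutor(max_workers=n_slices) as pool:
        partials = list(pool.map(local_sum, slices))
    return merge(partials)


result = parallel_group_sum(ROWS)
print(result)
```

The point of the pattern is that each worker only touches its own slice, so adding workers shortens the scan, while the coordinator only has to merge small partial results.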
Google BigQuery
As a fully managed cloud service, setup of the data warehouse and resource provisioning are all handled by Google, using serverless technologies.
The ability to easily query data with either SQL or Open Database Connectivity (ODBC) is a key value of BigQuery, enabling users to use existing tools and skills.
Logical data warehousing capabilities in BigQuery let users connect with other data sources, including databases and even spreadsheets, to analyze data.
Integration with BigQuery ML is a key differentiator, bringing the worlds of data warehousing and machine learning (ML) together. With BigQuery ML, machine learning models can be trained on data that is already in the data warehouse.
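The idea of training a model where the data already lives can be illustrated with a small stand-in. The sketch below does not use BigQuery or BigQuery ML: it assumes an in-memory SQLite database as a substitute warehouse, a hypothetical `ad_spend` table, and a simple least-squares line fitted entirely from SQL aggregates, so that no raw rows leave the database.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a warehouse (an assumption;
# this is not BigQuery and does not use BigQuery ML syntax).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ad_spend (spend REAL, revenue REAL)")
conn.executemany(
    "INSERT INTO ad_spend VALUES (?, ?)",
    [(1, 3), (2, 5), (3, 7), (4, 9)],  # hypothetical training data
)


def fit_linear(connection):
    """Fit revenue = slope * spend + intercept using only SQL aggregates."""
    n, sx, sy, sxy, sxx = connection.execute(
        """
        SELECT COUNT(*), SUM(spend), SUM(revenue),
               SUM(spend * revenue), SUM(spend * spend)
        FROM ad_spend
        """
    ).fetchone()
    # Closed-form ordinary least squares for a single feature.
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


def predict(model, spend):
    """Apply the fitted line to a new spend value."""
    slope, intercept = model
    return slope * spend + intercept


model = fit_linear(conn)
print(model, predict(model, 5))
```

Only five aggregate values cross the database boundary here, which is the same motivation behind in-warehouse machine learning: the heavy pass over the data stays next to the storage.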
IBM Db2 Warehouse
Db2 Warehouse integrates the Db2 in-memory, columnar database engine, which can be a big benefit for organizations looking for a data warehouse that includes a high-performance database.
The Apache Spark engine is also integrated with Db2, so users can run both SQL and Spark queries against the data warehouse to derive insights.
Db2 Warehouse benefits from IBM's Netezza technology, with advanced data lookup capabilities.
Cloud deployment can be done in either IBM Cloud or AWS, and there is also an on-premises version of Db2 Warehouse, which can be useful for organizations that have hybrid cloud deployment needs.
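Why a columnar engine, as mentioned above, suits analytic queries can be shown with a toy comparison. This is a generic sketch and not Db2 code: the order rows and helper names are hypothetical, and plain Python lists stand in for on-disk or in-memory column segments.

```python
# Hypothetical order rows in a row-oriented layout: one record per order.
ROWS = [
    {"order_id": 1, "region": "east", "amount": 120},
    {"order_id": 2, "region": "west", "amount": 80},
    {"order_id": 3, "region": "east", "amount": 40},
]


def to_columns(rows):
    """Pivot row records into a column-oriented layout: one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def avg_row_store(rows, column):
    """Row layout: every full record is visited to read a single field."""
    return sum(row[column] for row in rows) / len(rows)


def avg_column_store(columns, column):
    """Column layout: only the one contiguous column is read."""
    values = columns[column]
    return sum(values) / len(values)


COLUMNS = to_columns(ROWS)
print(avg_row_store(ROWS, "amount"), avg_column_store(COLUMNS, "amount"))
```

Both functions return the same average, but the column-oriented version never touches `order_id` or `region`, which is the access pattern analytic aggregates benefit from on wide tables.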
Microsoft Azure SQL Data Warehouse
Microsoft released a major update for Azure SQL Data Warehouse in July 2019, the Gen2 update, which provides more SQL Server features and advanced security options.
Dynamic Data Masking (DDM) provides granular security control, hiding sensitive data on the fly as queries are made.
Existing Microsoft users will likely find the most benefit from Azure SQL Data Warehouse, thanks to multiple integrations across the Microsoft Azure public cloud and, more importantly, with SQL Server.
In contrast to simply running SQL Server on-premises, Microsoft has built on a massively parallel processing (MPP) architecture that can enable users to run more than a hundred concurrent queries.
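The Dynamic Data Masking behavior described earlier in this section, where sensitive values are hidden at query time rather than changed in storage, can be sketched as follows. This is an illustration of the concept only, not Azure's implementation: the masking functions, the policy dictionary and the `privileged` flag are hypothetical, and the mask formats are not meant to match the ones Azure produces.

```python
def mask_email(value):
    """Keep the first character and the domain, hide the rest of the local part."""
    local, _, domain = value.partition("@")
    return f"{local[:1]}***@{domain}" if domain else "***"


def mask_partial(value, keep_last=4, pad="X"):
    """Hide everything except the last keep_last characters."""
    hidden = max(len(value) - keep_last, 0)
    return pad * hidden + value[hidden:]


# Hypothetical masking policy: column name -> masking function.
MASKING_POLICY = {"email": mask_email, "card_number": mask_partial}

# Hypothetical stored data; masking never modifies it.
CUSTOMERS = [
    {"name": "Ada", "email": "ada@example.com", "card_number": "4111111111111111"},
]


def run_query(rows, columns, privileged=False):
    """Return the requested columns, masking sensitive ones at read time."""
    result = []
    for row in rows:
        out = {}
        for col in columns:
            value = row[col]
            if not privileged and col in MASKING_POLICY:
                value = MASKING_POLICY[col](value)
            out[col] = value
        result.append(out)
    return result


print(run_query(CUSTOMERS, ["name", "email", "card_number"]))
print(run_query(CUSTOMERS, ["name", "email"], privileged=True))
```

Because the mask is applied on the way out, the same stored row can be shown in full to a privileged caller and in masked form to everyone else.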
Oracle Autonomous Data Warehouse
A key differentiator for Oracle is that the Autonomous Data Warehouse runs as an optimized cloud service on Oracle's Exadata hardware systems, which have been purpose-built for Oracle Database.
The service integrates a web-based notebook and reporting services for sharing data analysis and enabling easy collaboration.
While Oracle's own namesake database is supported, users can also migrate data from other databases and clouds, including Amazon Redshift, as well as on-premises object data stores.
Oracle's SQL Developer is another key feature, integrating a data loading wizard as well as a database development environment.
SAP Data Warehouse Cloud
SAP's HANA cloud services and database are at the core of Data Warehouse Cloud, supplemented by best practices for data governance and integrated with a SQL query engine.
A key differentiator for the platform is the integration of pre-built business templates that can help solve common data warehouse and analytics use cases for specific industries and lines of business.
For existing SAP users, the integration with other SAP applications means easier access to on-premises as well as cloud data sets.
Snowflake
Snowflake's columnar database engine can handle both structured and semi-structured data such as JSON and XML.
The decoupled Snowflake architecture allows for compute and storage to scale separately, with data storage provided on the user's cloud provider of choice.
The system creates what Snowflake refers to as a virtual data warehouse, where different workloads share the same data but can run independently.
Analytics queries are made via standard SQL, with integration with both the R and Python programming languages.
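Handling semi-structured data, as noted at the start of this section, largely comes down to projecting nested documents into flat rows without requiring a fixed schema up front. The sketch below shows that idea in plain Python. It does not use Snowflake or its SQL syntax: the event documents, the dotted-path convention and the helper names are all hypothetical.

```python
import json

# Hypothetical raw JSON events with differing shapes, as they might be ingested.
RAW_EVENTS = [
    '{"user": {"id": 1, "country": "DE"}, "event": "login"}',
    '{"user": {"id": 2}, "event": "purchase", "items": [{"sku": "A1", "qty": 2}]}',
]


def get_path(document, path, default=None):
    """Walk a dotted path such as 'user.country'; return default if a step is missing."""
    current = document
    for key in path.split("."):
        if isinstance(current, dict) and key in current:
            current = current[key]
        else:
            return default
    return current


def select(raw_rows, paths):
    """Project each JSON document onto the requested paths, like columns in a query."""
    documents = (json.loads(raw) for raw in raw_rows)
    return [tuple(get_path(doc, path) for path in paths) for doc in documents]


rows = select(RAW_EVENTS, ["user.id", "user.country", "event"])
print(rows)
```

Missing fields come back as `None` rather than raising an error, which mirrors how schema-on-read systems typically tolerate documents that do not all share the same structure.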