What Is DWS?
Data Warehouse Service (DWS) is an online data processing database that runs on public cloud infrastructure and provides a scalable, fully managed, out-of-the-box analytic database service, freeing you from database management and monitoring. It is a cloud-native service based on GaussDB, Huawei's converged data warehouse, and is fully compatible with the ANSI SQL 99 and SQL 2003 standards, as well as the PostgreSQL and Oracle ecosystems. DWS provides competitive solutions for PB-level big data analysis in various industries.
DWS employs a shared-nothing architecture and a massively parallel processing (MPP) engine. It consists of numerous independent logical nodes that do not share system resources such as CPUs, memory, or storage. In this architecture, service data is distributed across the nodes, and each data analysis task is executed on the node nearest to the data it needs. Processing the data in parallel across many nodes enables quick response.
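The idea behind shared-nothing, data-local execution can be illustrated with a minimal Python sketch. It is not how DWS is implemented: the node count, the CRC32-based placement rule, and the function names (`node_for`, `distribute`, `local_sum`, `coordinate`) are all assumptions made for illustration, and the nodes are visited one after another here, whereas a real MPP engine would run them in parallel.

```python
import zlib
from collections import defaultdict

NUM_NODES = 3  # hypothetical cluster size, chosen only for this sketch


def node_for(key, num_nodes=NUM_NODES):
    """Pick the node that owns a row by hashing its distribution key."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes


def distribute(rows, key_field, num_nodes=NUM_NODES):
    """Place each row on exactly one node; nodes share nothing."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row[key_field], num_nodes)].append(row)
    return nodes


def local_sum(node_rows, group_field, value_field):
    """Work done on one node, using only the rows stored on that node."""
    partial = defaultdict(int)
    for row in node_rows:
        partial[row[group_field]] += row[value_field]
    return dict(partial)


def coordinate(nodes, group_field, value_field):
    """Send the same subtask to every node and merge the partial results."""
    total = defaultdict(int)
    for node_rows in nodes:  # sequential here; parallel in a real MPP engine
        partials = local_sum(node_rows, group_field, value_field)
        for group, subtotal in partials.items():
            total[group] += subtotal
    return dict(total)


rows = [
    {"order_id": 1, "region": "north", "amount": 10},
    {"order_id": 2, "region": "south", "amount": 5},
    {"order_id": 3, "region": "north", "amount": 30},
    {"order_id": 4, "region": "south", "amount": 20},
]

nodes = distribute(rows, "order_id")
result = coordinate(nodes, "region", "amount")
print(result)
```

Because each row lives on exactly one node, the merged totals do not depend on how the rows happen to be spread across the nodes. Only the small partial results travel to the coordinating step, not the raw rows.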
- Application layer
Data loading tools, Extract-Transform-Load (ETL) tools, Business Intelligence (BI) tools, and data mining and analysis tools can be integrated with DWS through standard interfaces. DWS is compatible with PostgreSQL, and its SQL syntax has been adapted for compatibility with Oracle, MySQL, and Teradata, so applications can be migrated to DWS with few changes.
Applications can connect to DWS through the standard JDBC 4.0 and ODBC 3.5 interfaces.
- DWS (MPP cluster)
A data warehouse cluster contains nodes of the same flavor in the same subnet. These nodes jointly provide services. DataNodes (DNs) in a cluster store data on disks. The coordinator node (CN) receives access requests from clients, divides each task into several smaller ones, and assigns them to the DNs for execution. The CN then returns the execution results to the client.
- Automatic data backup
Cluster snapshots can be automatically backed up to Object Storage Service (OBS), an EB-level object storage service. Backups can be scheduled periodically during off-peak hours so that data can be recovered after an exception occurs.
A snapshot is a complete backup that records point-in-time configuration data and service data of a data warehouse cluster.
- Tool chain
DWS provides the parallel data loading tool General Data Service (GDS), the SQL syntax migration tool DSC, and the SQL development tool Data Studio. You can use the management console for cluster O&M and monitoring.
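Besides JDBC, ODBC, and the tools above, the PostgreSQL compatibility described under the application layer suggests that a generic PostgreSQL client driver may also be able to connect. The following is a minimal sketch under that assumption, using the third-party Python driver psycopg2. The host, port, database name, and credentials are placeholders rather than values from this document, and `build_dsn` and `fetch_version` are hypothetical helper names.

```python
def build_dsn(host, port, dbname, user, password):
    """Build a libpq-style key=value connection string.

    Assumes the values contain no spaces or quotes; such values
    would need quoting, which this sketch does not handle.
    """
    parts = {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": user,
        "password": password,
    }
    return " ".join(f"{key}={value}" for key, value in parts.items())


def fetch_version(dsn):
    """Open a connection, run one query, and close the connection."""
    import psycopg2  # third-party driver, assumed to be installed

    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT version()")
            return cur.fetchone()[0]
    finally:
        conn.close()


# Placeholder endpoint and credentials; replace with your cluster's values.
dsn = build_dsn("dws.example.com", 8000, "exampledb", "dbadmin", "secret")
print(dsn)
```

Calling `fetch_version(dsn)` needs a reachable cluster and valid credentials, so it is left out of the example run. Check the cluster's connection information on the management console for the actual address and port.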