All Products
Search
Document Center

Hologres:Build a real-time data warehouse analytics dashboard

Last Updated:Nov 28, 2024

This topic describes how to connect Hologres to Realtime Compute for Apache Flink and DataV to build a real-time data warehouse analytics dashboard.

Prerequisites

  • Hologres is activated, a Hologres instance is purchased, and a development tool is connected to the Hologres instance. For more information, see Connect to HoloWeb and perform queries.

  • Realtime Compute for Apache Flink is activated. For more information, see Activate Realtime Compute for Apache Flink.

    Note

    You must make sure that Realtime Compute for Apache Flink and Hologres are activated in the same region and use the same virtual private cloud (VPC) and vSwitch.

  • DataV is activated. For more information, see Activate DataV.

Background information

Hologres is an interactive analytics service developed by Alibaba Cloud. Based on the built-in HoloHub API, Hologres connects to Realtime Compute for Apache Flink to support real-time data write and query in high concurrency. Hologres can respond to queries within seconds.

Hologres is compatible with PostgreSQL. You can connect Hologres to a business intelligence (BI) tool to analyze queried data and display data analytics results in a visualized manner.

In this example, an e-commerce store is used as an example to show how to build a dashboard that displays operation metrics in real time. The metrics include the number of unique visitors (UVs) to the store, number of UVs to each product, sales amount in each city, and statistics on hot-selling products.

The following figure shows how to use Hologres to build a dashboard that displays operation metrics in real time.a

  • Source data is collected and written to Realtime Compute for Apache Flink in real time for cleansing and aggregation.

  • Data processed by Realtime Compute for Apache Flink is written to Hologres for interactive queries.

  • Hologres is connected to DataV to display query results in a dashboard in real time.

Procedure

  1. Collect source data.

    Use DataHub or service logs to collect source data. DataHub is a streaming data processing service.

    To simplify the process, source data is generated in Realtime Compute for Apache Flink in this example. For more information, see Step 3 in this section.

  2. Create a table in Hologres to receive data.

    Use HoloWeb to create a table to receive data. Make sure that the table contains the same fields of the same data types as the source table. For more information, see Connect to HoloWeb and perform queries. The following SQL statements provide an example:

    BEGIN;
    CREATE TABLE public.order_details (
    "user_id" int8,
    "user_name" text,
    "item_id" int8,
    "item_name" text,
    "price" numeric(38,2),
    "province" text,
    "city" text,
    "ip" text,
    "longitude" text,
    "latitude" text,
    "sale_timestamp" timestamptz NOT NULL
    );
    CALL SET_TABLE_PROPERTY('public.order_details','orientation', 'column');
    CALL SET_TABLE_PROPERTY('public.order_details','clustering_key', 'sale_timestamp:asc');
    CALL SET_TABLE_PROPERTY('public.order_details','segment_key', 'sale_timestamp');
    CALL SET_TABLE_PROPERTY('public.order_details','bitmap_columns', 'user_name,item_name,province,city,ip,longitude,latitude');
    CALL SET_TABLE_PROPERTY('public.order_details','dictionary_encoding_columns','user_name:auto,item_name:auto,province:auto,city:auto,ip:auto,longitude:auto,latitude:auto');
    CALL SET_TABLE_PROPERTY('public.order_details','time_to_live_in_seconds', '3153600000');
    CALL SET_TABLE_PROPERTY('public.order_details','distribution_key', 'user_id');
    CALL SET_TABLE_PROPERTY('public.order_details','storage_format', 'orc');
    COMMIT;
  3. Log on to the development console of Realtime Compute for Apache Flink and upload the ordergen JAR file of the desired custom connector. For more information, see Upload and use a custom connector.

  4. Use Realtime Compute for Apache Flink to cleanse data.

    Log on to the development console of Realtime Compute for Apache Flink, create a deployment to cleanse and aggregate collected data in the data source, and then call the HoloHub API to write the processed data to Hologres in real time. For more information, see Develop an SQL draft. The following SQL statements provide an example:

    CREATE TEMPORARY TABLE source_table (
    user_id BIGINT,
    user_name VARCHAR,
    item_id BIGINT,
    item_name VARCHAR,
    price numeric (38, 2),
    province VARCHAR,
    city VARCHAR,
    longitude VARCHAR,
    latitude VARCHAR,
    ip VARCHAR,
    sale_timestamp TIMESTAMP
     )
    WITH ('connector' = 'ordergen');
    
    CREATE TEMPORARY TABLE hologres_sink (
    user_id BIGINT,
    user_name VARCHAR,
    item_id BIGINT,
    item_name VARCHAR,
    price numeric (38, 2),
    province VARCHAR,
    city VARCHAR,
    longitude VARCHAR,
    latitude VARCHAR,
    ip VARCHAR,
    sale_timestamp TIMESTAMP
     )
    WITH (
    'connector' = 'hologres',
    'dbname' = '<holo_db>',
    'tablename' = '<receive_table>',
    'username' = '<uid>',
    'password' = '<pid>',
    'endpoint' = '<host>'
     );
    
    INSERT INTO hologres_sink
    SELECT user_id,
     user_name,
     item_id,
     item_name,
     price,
     province,
     city,
     longitude,
     latitude,
     ip,
     sale_timestamp
    FROM
    source_table;

    The following table describes the parameters in the SQL statements.

    Parameter

    Description

    holo_db

    The name of the Hologres database that you want to connect.

    receive_table

    The name of the Hologres table used to receive data. In this example, this parameter is set to public.order_details.

    uid

    The AccessKey ID of your Alibaba Cloud account.

    pid

    The AccessKey secret of your Alibaba Cloud account.

    host

    The VPC endpoint of the Hologres instance. To obtain the VPC endpoint, perform the following steps: Log on to the Hologres console. In the left-side navigation pane, click Instances. On the Instances page, find the desired Hologres instance and click the instance name. In the Network Information section of the Instance Details page, view the VPC endpoint of the Hologres instance.

  5. Start the deployment on the Deployments page of the development console of Realtime Compute for Apache Flink. For more information, see Start a deployment.image

  6. Query data in Hologres in real time.

    Execute SELECT statements to query data that is written to Hologres in real time from different dimensions. The following SQL statements provide an example:

    SELECT SUM(price) AS "GMV" FROM order_details ;
    
    SELECT COUNT(DISTINCT user_id) AS "UV" FROM order_details ;
    
    SELECT city AS "City", COUNT(DISTINCT user_id) AS "Number of customers who purchased products" FROM order_details GROUP BY "City" ORDER BY "Number of customers who purchased products" DESC limit 100;
    
    SELECT item_name AS "Product", SUM(price) AS "Sales amount" FROM order_details GROUP BY "Product" ORDER BY "Sales amount" DESC limit 100;
    
    SELECT to_char(sale_timestamp, 'MM-DD') AS "Date", SUM(price) AS "GMV" FROM order_details GROUP BY "Date" ORDER BY "GMV" DESC limit 100;                
  7. Create a dashboard in DataV to display query results obtained from Hologres.

    To create a dashboard in DataV to display query results obtained from Hologres, perform the following steps:

    1. Add a data source.

      1. Log on to the DataV console. On the homepage of the DataV console, click Data Sources. On the Data Sources tab, click Add Source.

      2. In the Add Data dialog box, configure the parameters to add a Hologres data source.

      3. After the parameters are configured, click OK.

    2. Create a dashboard to display real-time data.

      Select the widgets that you want to display on the dashboard and configure a data source for each widget based on your business requirements. For more information, see Overview.

      In this example, the column chart, image carousel, basic flat map, and ticker board are selected. The following steps describe how to configure a ticker board.

      1. Configure information about a data source.

      2. Configure the border, font, and color for the ticker board.

    3. Decorate the dashboard and present query results obtained from Hologres on the dashboard.

      After you configure the widgets and their data sources, you can decorate the dashboard based on your business requirements.

      • The total number of UVs to each product and the sales amount in top cities are displayed on the left side in real time.

      • The map in the middle highlights the location of each transaction order and refreshes the total sales amount in real time. The total number of UVs to the store is displayed on top of the map.

      • The sales percentage and sales ranking of each product are displayed on the right side in real time.