Wiki source code of AWS Redshift Refresh
Last modified by Martijn Woudstra on 2022/06/13 14:56
Show last authors
| author | version | line-number | content |
|---|---|---|---|
| 1 | {{container}}{{container layoutStyle="columns"}}((( | ||
| 2 | |||
| 3 | Below you will find a document describing the migration path to add the AWS Redshift Refresh functionality to a data pipeline solution you have previously built. | ||
| 4 | If you want to implement a new data pipeline, import it from the store, guaranteeing that this functionality will be included. | ||
| 5 | |||
| 6 | Should you have any questions, please get in touch with academy@emagiz.com. | ||
| 7 | |||
| 8 | * Last update: March 1st, 2022 | ||
| 9 | * Required reading time: 6 minutes | ||
| 10 | |||
| 11 | == 1. Prerequisites == | ||
| 12 | |||
| 13 | * Basic knowledge of the eMagiz platform | ||
| 14 | * Understanding of Data pipelining concepts | ||
| 15 | * A existing Data pipeline solution within your eMagiz project. | ||
| 16 | |||
| 17 | == 2. Key concepts == | ||
| 18 | |||
| 19 | * This functionality automatically refreshes the materialized view in Redshift | ||
| 20 | * By making eMagiz refresh it automatically, the data in your dashboards are kept up to date | ||
| 21 | |||
| 22 | == 3. Migration Path * Root Cloud to eMagiz Cloud == | ||
| 23 | |||
| 24 | Below you will find a document describing the migration path to add the AWS Redshift Refresh functionality to a data pipeline solution you have previously built. | ||
| 25 | If you want to implement a new data pipeline, import it from the store, guaranteeing that this functionality will be included. | ||
| 26 | |||
| 27 | === 3.1 Remove unnecessary components === | ||
| 28 | |||
| 29 | First, we will delete components that have become obsolete as of late. The parts you can remove from the flow are: | ||
| 30 | |||
| 31 | * support.bus-connection-plain | ||
| 32 | * support.bus-connection-caching | ||
| 33 | |||
| 34 | Furthermore, you could remove the following debug components as every interesting step is already monitored and can therefore be tracked without the help of the debugger: | ||
| 35 | |||
| 36 | * global channel interceptor | ||
| 37 | * activate.debug-bridge | ||
| 38 | * send.debug | ||
| 39 | * entry.channel.debug-queue | ||
| 40 | * debugBridgeChannel | ||
| 41 | |||
| 42 | === 3.2 Add new components to refresh AWS Redshift === | ||
| 43 | |||
| 44 | AWS Redshift offers you the option to create a materialized view. A materialized view is a table that combines information from several source tables into one view. | ||
| 45 | To keep updating this materialized view when a source table is updated eMagiz has created functionality to refresh the materialized view. | ||
| 46 | |||
| 47 | To make sure that your existing data pipeline will function in the same way you should execute the following steps: | ||
| 48 | |||
| 49 | * Add a channel called job-execution | ||
| 50 | * Add a channel called dummy | ||
| 51 | * Add a channel called job-completed | ||
| 52 | * Add a channel called job-completed-delay | ||
| 53 | * Add a channel called log | ||
| 54 | * Add a job execution listener gateway and configure it as follows | ||
| 55 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-job-execution-listener.png]] | ||
| 56 | * Add a standard filter and configure it as follows (the expression checks whether the Job is already finished) -> payload.status == T(org.springframework.batch.core.BatchStatus).COMPLETED | ||
| 57 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-execution-status.png]] | ||
| 58 | * Add a standard activator and configure it as follows (the expression puts the thread to sleep for 60 seconds) -> T(java.lang.Thread).sleep(60000) ?: 'AWS Materialized view refreshed.' | ||
| 59 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-delay.png]] | ||
| 60 | * Add a JDBC outbound channel adapter and configure it as follows (query can be found below) | ||
| 61 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-jdbc-outbound.png]] | ||
| 62 | * Add a logging channel adapter and configure it as follows | ||
| 63 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-log.png]] | ||
| 64 | * Make sure that the job-completed-delay channel is wiretapped with the help of the log channel | ||
| 65 | |||
| 66 | === 3.3 Query you need for refresh === | ||
| 67 | |||
| 68 | The following query is needed to refresh the AWS materialized view. First, make sure that the name of the materialized view is filled in as a property value. | ||
| 69 | |||
| 70 | REFRESH MATERIALIZED VIEW ${dp.jdbc.distinctive-name.tablename}\_mv; | ||
| 71 | COMMIT; | ||
| 72 | |||
| 73 | === 3.4 Intermediate Result === | ||
| 74 | |||
| 75 | The result should look something like this: | ||
| 76 | |||
| 77 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-job-dashboard-cleanup-result-part-one.png]] | ||
| 78 | |||
| 79 | === 3.5 Add listener to job === | ||
| 80 | |||
| 81 | If you only add the listener section in the flow, you would have a solution that only partly works. | ||
| 82 | To make sure the complete chain of events starts working in unison, you need to tell the job which listeners need to listen to the execution of the job. | ||
| 83 | |||
| 84 | To do so, open the component called support. job.{technicalnameofmessagetype}. On the advanced tab, you need to add two listeners: | ||
| 85 | |||
| 86 | * A Gateway Listener -> Select the job execution listener that you have added in step 2 | ||
| 87 | * A Exception Listener -> This one makes sure that when an error occurs in AWS Redshift, the job status will be Failed instead of Completed. | ||
| 88 | |||
| 89 | The configuration should look as follows: | ||
| 90 | |||
| 91 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-add-job-listeners.png]] | ||
| 92 | |||
| 93 | === 3.6 End Result === | ||
| 94 | |||
| 95 | The result of all this should look something like this: | ||
| 96 | |||
| 97 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-job-dashboard-cleanup-end-result.png]] | ||
| 98 | |||
| 99 | ===== Practice ===== | ||
| 100 | |||
| 101 | == 4. Key takeaways == | ||
| 102 | |||
| 103 | * This functionality automatically refreshes the materialized view in Redshift | ||
| 104 | * By making eMagiz refresh it automatically, the data in your dashboards are kept up to date | ||
| 105 | |||
| 106 | )))((({{toc/}}))){{/container}}{{/container}} |