Qlik World 2021 Notes - Bring Your Data Together | Data Integration

I attended Qlik’s online conference, Qlik World 2021. Three days of presentations showcased current and upcoming features of their software, with heavy emphasis on their newest addition, the Qlik Data Integration platform.

Qlik is leaning into a vision they call Active Intelligence, where instead of waiting for analysis, data loaded into their software triggers actions and provides insights in real time.

Below are my notes and takeaways from a few of the more interesting sessions.

Automation Demo

Qlik purchased Blendr.io, an iPaaS (Integration Platform as a Service)
automation will be available later this year in Qlik Cloud
over 500 built-in connectors for application-to-application integration
automation is created using a drag-and-drop interface
some functions: create loops, add conditions, schedule integration, implement listeners (via “webhooks”)
enables task chaining:
- e.g., wait for reload to complete before triggering another reload
- e.g., fetch viz image, store in Dropbox, send to Slack channel
- e.g., capture image of chart and send to email
- e.g., get leads from HubSpot and add to Google Sheet (can use condition to grab only leads from a certain company)
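
To make the first chaining example concrete, here is a minimal Python sketch of "wait for one reload to finish before starting the next." It is not Qlik's or Blendr.io's API: `FakeReload`, `wait_until`, and `run_chain` are hypothetical names I made up, and the reload job is simulated in memory so the example runs on its own.

```python
import time
from typing import Callable, List, Tuple


def wait_until(is_done: Callable[[], bool],
               poll_seconds: float = 0.0,
               max_polls: int = 100) -> bool:
    """Poll is_done() until it returns True or max_polls is used up."""
    for _ in range(max_polls):
        if is_done():
            return True
        time.sleep(poll_seconds)
    return False


Step = Tuple[str, Callable[[], None], Callable[[], bool]]


def run_chain(steps: List[Step]) -> List[str]:
    """Start each step only after the previous one reports completion."""
    completed = []
    for name, start, is_done in steps:
        start()
        if not wait_until(is_done):
            raise TimeoutError(f"step {name!r} did not finish")
        completed.append(name)
    return completed


class FakeReload:
    """Hypothetical stand-in for a reload job (not a real Qlik object).

    It reports completion after a fixed number of status checks.
    """

    def __init__(self, checks_needed: int) -> None:
        self.checks_left = checks_needed
        self.started = False

    def start(self) -> None:
        self.started = True

    def is_done(self) -> bool:
        if not self.started:
            return False
        self.checks_left -= 1
        return self.checks_left <= 0


if __name__ == "__main__":
    sales = FakeReload(checks_needed=3)
    finance = FakeReload(checks_needed=1)
    order = run_chain([
        ("sales_reload", sales.start, sales.is_done),
        ("finance_reload", finance.start, finance.is_done),
    ])
    print(order)
```

In a drag-and-drop tool the polling loop would presumably be replaced by a listener or webhook, but the ordering guarantee is the same idea.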

Takeaway: IMO the most exciting upcoming feature from Qlik. This combines data integration with software integration and looks like it will be able to accomplish a lot of what programmers currently must hand-code.

Case Study: Skechers

Skechers uses Qlik Data Integration to enable real-time data analysis
Qlik Replicate features CDC (Change Data Capture), which pulls incremental updates from a data source by reading transaction logs
Replicate does full load first, then automatically switches over to CDC
Skechers’ data sources: Informix, MariaDB, SQL Server
push data to Confluent Cloud Kafka
Spark streaming jobs extract from Kafka topics and persist data in Databricks Delta tables
use Talend to identify, validate, normalize data, then load into Snowflake
Qlik Replicate exposes an API, which can be used to build a dashboard
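
The "full load first, then switch to CDC" pattern from the notes above can be sketched in a few lines of Python. This is a toy in-memory model under my own assumptions, not how Qlik Replicate is implemented: the log entry format, `full_load`, and `apply_changes` are all hypothetical, and the log is assumed to be ordered by sequence number.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]
# Hypothetical log entry: (sequence number, operation, primary key, row or None)
LogEntry = Tuple[int, str, int, Optional[Row]]


def full_load(source_table: Dict[int, Row]) -> Dict[int, Row]:
    """Initial load: copy every row from the source as it exists right now."""
    return {pk: dict(row) for pk, row in source_table.items()}


def apply_changes(target: Dict[int, Row],
                  log: List[LogEntry],
                  last_seq: int) -> int:
    """Incremental phase: replay only log entries newer than last_seq.

    Returns the new high-water mark so the next run can resume from it.
    """
    for seq, op, pk, row in log:
        if seq <= last_seq:
            continue  # already reflected in the target
        if op in ("insert", "update"):
            target[pk] = dict(row or {})
        elif op == "delete":
            target.pop(pk, None)
        last_seq = seq
    return last_seq


if __name__ == "__main__":
    # Source state at the time of the full load (log position 2).
    source = {1: {"sku": "A", "qty": 5}, 2: {"sku": "B", "qty": 9}}
    target = full_load(source)

    log = [
        (1, "insert", 1, {"sku": "A", "qty": 5}),
        (2, "insert", 2, {"sku": "B", "qty": 9}),
        (3, "update", 1, {"sku": "A", "qty": 4}),
        (4, "delete", 2, None),
        (5, "insert", 3, {"sku": "C", "qty": 7}),
    ]
    position = apply_changes(target, log, last_seq=2)
    print(position, target)
```

The point of reading the log rather than re-querying the tables is that only entries after the saved position are touched, which is what makes near real-time updates cheap.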

Takeaway: With its adoption by large companies, Qlik has established itself as a major player in the data integration space. It’s worth noting that Skechers also uses Talend in their solution. I wish they had explained why they chose Qlik for the load and Talend for the transform.

Moving Qlik Sense to Qlik Sense Enterprise SaaS

Qlik Sense Enterprise SaaS is the cloud-based version of Qlik Sense
Qlik Data Transfer enables pushing data from on-prem server to Qlik SaaS
can query data sources or copy file on a schedule
how it works: turns data into QVD (Qlik data file), which is then pushed to Qlik Sense Enterprise SaaS
Qlik Data Transfer is not a replacement for Qlik Data Integration
if currently running Qlik Sense Enterprise on-prem, can build everything there, then push to Qlik Sense Enterprise SaaS (this setup is called “Multi-Cloud”)
also can install Qlik Replicate on-prem, then push to cloud storage (e.g., Amazon S3)

Takeaway: I see the Qlik Data Transfer tool as a stopgap while customers migrate to the SaaS version. My overall impression is that Qlik is subtly encouraging customers to move from on-prem installations to SaaS. One clear benefit of Qlik SaaS is automatic updates. For anyone who’s been through the version upgrade process, that’s a reasonably strong incentive.

Case Study: Raymond James

Top 5 complaints from business:

1. Constantly having to go between systems to do research
2. Takes too long to get data feeds from IT
3. Difficult to get historical data for trending
4. Want direct access to data
5. Why can’t we have data all in one place?

companies typically have many independent business systems: on-prem, cloud, 3rd-party data feeds etc.
common problem: business and IT not on same page with data
- IT doesn’t have business knowledge to know what business needs
- business knows what they need, but doesn’t understand data structure
- the result is that many Excel spreadsheets get created and spread around
- for more sophisticated needs, departments create their own Access databases so they can create reports
- all of this requires hiring more people
Raymond James data sources: 25+ SQL db’s, 10+ Oracle db’s, several cloud apps, 100+ Excel files, SharePoint lists and views, departmental Access databases, daily text files
Raymond James brought all this data into Qlik as QVDs, rather than creating many extracts that each took only a slice of the data
also extracted the data as CSV files for users who weren’t yet ready to use Qlik to analyze it
common transformations: standardizing field names, uppercasing, preserving leading zeroes, trimming leading/trailing spaces
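
As a rough illustration of those transformations, here is a small Python sketch. The presenters did not show code, so everything here is my own assumption: the snake_case naming convention, the choice to uppercase text values, and the function names are all hypothetical. Leading zeroes survive simply because identifiers stay strings and are never cast to numbers.

```python
import re
from typing import Dict


def clean_field_name(name: str) -> str:
    """Standardize a field name to snake_case (a hypothetical convention)."""
    return re.sub(r"[^0-9A-Za-z]+", "_", name.strip()).strip("_").lower()


def clean_value(value: object) -> object:
    """Trim surrounding spaces and uppercase text; leave other types alone.

    Values such as account numbers stay strings, so "00123" keeps its
    leading zeroes (int("00123") would silently turn it into 123).
    """
    if isinstance(value, str):
        return value.strip().upper()
    return value


def clean_record(record: Dict[str, object]) -> Dict[str, object]:
    """Apply both cleanups to every field of one record."""
    return {clean_field_name(k): clean_value(v) for k, v in record.items()}


if __name__ == "__main__":
    raw = {" Account No ": " 00123 ", "Branch Name": "st. petersburg ", "Balance": 250.75}
    print(clean_record(raw))
    # {'account_no': '00123', 'branch_name': 'ST. PETERSBURG', 'balance': 250.75}
```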

Takeaway: This presentation did a thorough job of outlining the most common pain points companies face, including the tendency of departments to create their own Excel files and Access databases. An interesting aspect of their solution was the choice to extract data in parallel as QVD and CSV files to enable consuming the data with other tools in addition to Qlik.

Case Study: Schneider Electric

main pain point: the custom solution they developed was missing records
realized 8x faster dev time using Qlik Data Integration and AWS DMS (Database Migration Service)
raw data transferred, lands in Amazon S3
create three levels of data:
1. “bronze” data: copy of raw data, but converted to Apache Parquet; added timestamps, quality checks
2. “silver” data: organized and unified by object type, plus more quality checks
3. “gold” data: analytics-ready data for biz
use Spark with Databricks for transformation
then load into Amazon Redshift (for high speed analytical queries)
use Tableau for viz, Databricks for data scientists
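
The bronze/silver/gold layering above is easier to picture with a toy example. This is a plain-Python sketch under my own assumptions, not Schneider Electric's pipeline (theirs uses Parquet, Spark, and Databricks): the record fields (`id`, `type`, `amount`), the specific quality check, and the per-type total in the gold layer are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

Record = Dict[str, object]


def to_bronze(raw_records: List[Record], loaded_at: str) -> List[Record]:
    """Bronze: copy of the raw data plus a load timestamp and a quality flag."""
    bronze = []
    for rec in raw_records:
        row = dict(rec)
        row["_loaded_at"] = loaded_at
        # Hypothetical quality check: a row needs both an id and a type.
        row["_valid"] = rec.get("id") is not None and rec.get("type") is not None
        bronze.append(row)
    return bronze


def to_silver(bronze: List[Record]) -> Dict[str, List[Record]]:
    """Silver: valid rows grouped by object type, latest row per id wins."""
    by_type: Dict[str, Dict[object, Record]] = defaultdict(dict)
    for row in bronze:
        if row["_valid"]:
            by_type[str(row["type"])][row["id"]] = row
    return {obj_type: list(rows.values()) for obj_type, rows in by_type.items()}


def to_gold(silver: Dict[str, List[Record]]) -> Dict[str, float]:
    """Gold: an analytics-ready summary, here total amount per object type."""
    return {
        obj_type: sum(float(r.get("amount", 0)) for r in rows)
        for obj_type, rows in silver.items()
    }


if __name__ == "__main__":
    raw = [
        {"id": 1, "type": "order", "amount": 100},
        {"id": 1, "type": "order", "amount": 120},   # later version of the same order
        {"id": 2, "type": "invoice", "amount": 50},
        {"id": None, "type": "order", "amount": 999},  # fails the quality check
    ]
    bronze = to_bronze(raw, loaded_at="2021-05-12T00:00:00Z")
    silver = to_silver(bronze)
    print(to_gold(silver))
    # {'order': 120.0, 'invoice': 50.0}
```

Keeping the bronze copy untouched apart from added metadata means the later layers can always be rebuilt if a quality rule changes.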

Takeaway: The most helpful part of this presentation was hearing how much dev time they saved by using a data integration tool instead of hand-coding. I was surprised to learn their custom solution was missing records, which is yet another reason to favor an established tool.