Marketing Data Lakes: Why Most Marketers Need a Tool That Does More
Enterprise marketing teams have been tackling the problem of storing large volumes of marketing data by using data lakes and data warehouses. What's the difference? While both are places to consolidate and store marketing data, the term "data lake" describes a more flexible solution that can store unstructured data from any data source, while data warehouses usually require data to be organized before it is stored.
So, for example, if a marketing team wants to consolidate their Google Ads, Facebook Ads, and display ads data, all of which come with different metrics, they would need to decide how that data should be organized in terms of columns and metrics by using a separate tool, usually an ETL (Extract, Transform, Load) tool, before storing it in a data warehouse. A data lake, by contrast, allows the data to be stored in whatever form it comes in and organized later.
Data lakes and data warehouses can be extremely useful but there are two common drawbacks with both for marketers:
- Using them requires database programming skills to access the data.
- Analytics and distribution of that data need to be done using a separate tool.
Both of these issues require a great deal of time-consuming work when you're running multiple campaigns for many clients or brands. Using separate tools to handle different parts of the marketing lifecycle is unnecessarily complex and having to rely on a data team to provide the data you need can be a cumbersome and frustrating process.
At TapClicks, we built a solution specifically for marketers to act as a marketing data lake (or data warehouse) and handle the entire data management lifecycle in one scalable data operations platform from planning and execution stages through to analysis and distribution.
In our experience, a good marketing data management process has four essential features, one of which is storing data in a data lake. Specifically, an effective marketing data management solution should:
- Easily collect data from any sales and marketing source, automatically.
- Store all your data in one fully managed data lake, accessible to marketers with no coding experience.
- Transform unstructured data and do advanced calculations on that data so you can use it for meaningful analytics.
- Distribute your data wherever you want, via a powerful reporting function or to another platform.
In this post, we'll consider the differences between marketing data lakes and data warehouses, including their challenges. Then, we'll discuss how TapClicks meets each of the four essential features we outlined above.
What Is a Marketing Data Lake vs. a Data Warehouse?
A marketing data lake is a cloud-based solution that stores all your company's unstructured cross-channel marketing data. Popular data lake providers include Google Cloud and Amazon S3.
Traditionally, marketers have used data warehouses (for example, hosted on AWS or Microsoft Azure) to store the ever-increasing volumes of data they have to handle (e.g. customer data, social media and Google Analytics data). There are many use cases for data warehouses (whether on-premise or on a cloud platform), but their drawback is that the data needs to be structured before it is loaded, unlike a data lake (more on this below).
By structured we mean you need to decide how you want to organize the schema of each data set at the outset and use a third-party tool to do this for you, such as an ETL (Extract, Transform, Load) tool. By schema we mean the different data fields and data types from your data sets, for example, Impressions and CTR for Google Analytics data, etc.
Data warehouses are also limited in the number of data platforms they integrate with, meaning that you may have to spend extra time extracting some of your data manually from these sources.
This is where marketing data lakes have the edge. The advantage of a data lake over a data warehouse is that data from any source can be stored and, importantly, the data can be unstructured (i.e., in its raw form). This means you can store all your data in a data lake and organize it later, when you want to do something with that data (for example, segmentation or data analytics).
(Aside: it's important to have clear data governance in place when a data lake is part of your ecosystem. If you don't, new data added from many different sources and stored in silos will quickly become disorganized and affect your data quality.)
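To make the difference concrete, here is a minimal Python sketch of the two approaches described above: a warehouse-style load that rejects records that do not fit a fixed schema, and a lake-style store that keeps records as they arrive and applies structure only when reading. The record fields, column names, and function names are all hypothetical and chosen for illustration; they do not represent any particular vendor's product.

```python
import json

# Hypothetical raw exports from two ad platforms, each with its own field names.
raw_records = [
    {"source": "google_ads", "campaign": "Summer Sale", "impressions": 1200, "clicks": 60},
    {"source": "facebook_ads", "campaign_name": "Summer Sale", "imp": 800, "link_clicks": 20},
]

# Warehouse style: the schema is fixed up front, so every record must fit it.
WAREHOUSE_COLUMNS = ("source", "campaign", "impressions", "clicks")


def load_into_warehouse(record):
    """Return a row tuple, or raise if the record does not match the schema."""
    missing = [col for col in WAREHOUSE_COLUMNS if col not in record]
    if missing:
        raise ValueError(f"record from {record['source']} is missing columns: {missing}")
    return tuple(record[col] for col in WAREHOUSE_COLUMNS)


# Lake style: store each record exactly as it arrived.
def store_in_lake(lake, record):
    lake.append(json.dumps(record))


# Structure is decided only at read time, when a question is asked of the data.
def read_impressions(lake):
    total = 0
    for line in lake:
        rec = json.loads(line)
        total += rec.get("impressions", rec.get("imp", 0))
    return total
```

In this sketch the Facebook Ads record would fail the warehouse load until something (typically an ETL step) renames its fields, whereas the lake accepts both records and leaves the decision about field names to the reader.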
The 2 Common Drawbacks of Data Lakes and Data Warehouses
While data lakes and data warehouses have their benefits, for marketers there are two common challenges in addition to the ones we just mentioned:
#1: They Are Not Marketer-Friendly
Whether you store your marketing data in a data lake or a data warehouse, you'll be reliant on IT expertise to access that data. Typically, a data team with SQL and data science skills will do the work for you, extracting the data sets you need and doing the advanced analytics on that data, usually with a sophisticated business intelligence tool.
This means if you receive an ad hoc request from a client for a particular data set or report, you cannot easily collect and aggregate that data yourself. You have to fit it in with the workload and timeframe of the data scientists.
#2: You Need to Use Separate Analytics and Reporting Tools
The second challenge of using data lakes and data warehouses is that they help with only one part of the data management process. They're useful for storing large volumes of data, and in the case of data lakes, they can store all your data and let you organize it later, but that's where their remit usually ends. To analyze and report on your data to clients and stakeholders you need to add in separate solutions to the data architecture.
For example, data is often imported into a data visualization tool (e.g. Tableau) or even a spreadsheet, where it can be plotted and added to a presentation to share with others. This makes the process unnecessarily complicated and time-consuming.
In contrast, our solution, TapClicks, meets the two common challenges of data lakes and data warehouses:
- Marketers can import and extract data from the data lake without database programming skills or calling on IT to help.
- Marketers have the option of doing data analysis and even creating powerful, scalable dashboards and reports right within TapClicks without going to an external analysis tool. (But TapClicks is also flexible enough to export data to third-party analysis tools like Tableau or even Google Sheets and be used only as a data lake.)
Let's walk through how TapClicks meets the essential criteria we set out above.
Feature #1: TapClicks Automatically Collects Data from Almost All Marketing Data Sources
Like most data lakes (but not all data warehouses), TapClicks allows you to connect to any data source (e.g., CRM such as Salesforce) via our Smart Connector tool. To date, we have built connections with over 6,000 data sources, including proprietary and offline data sources.
We also have hundreds of pre-built API-based data integrations including all the marketing platforms you would expect (e.g. Facebook Ads, Twitter Ads, etc.) as well as many lesser known ones (e.g., Genius Monkey and Tiger Pistol).
An advantage of the TapClicks solution is that for many data sources we can also extract up to 12 months of historical data. This can be stored alongside all the new data that's being pulled through in near real-time and updated automatically every day or whenever you choose to refresh it.
Another benefit of using TapClicks is that our team manages all the API connections. So, if a connection breaks, your developers don't need to get involved because we will take care of it for you.
There is no need for a third-party tool (such as an ETL tool, as with a data warehouse) because TapClicks extracts the data in whatever form it's in (structured or unstructured) and pulls it straight into your TapClicks data lake, which we'll discuss next.
Feature #2: TapClicks Acts as Your Marketing Data Lake: No Programming Skills Required
Data lakes can act as dumping grounds for all your unstructured data, and data warehouses do too (albeit after the data has been structured; see #3). Not only does TapClicks collect all your data (#1) in whatever form it comes in, transform it and allow you to create custom metrics (#3, below), it also stores it in your own TapClicks data lake or data warehouse, whichever term you want to use.
The TapClicks data lake is fully managed by our team and the key benefit is that there are no data engineering skills required to access the data. Your TapClicks data lake acts as a central hub allowing you to do everything you want within the TapClicks platform, including pushing the data out to other locations (see #4).
Your data is stored safely forever and can be accessed by any marketer at any time with no IT skills.
Feature #3: Marketers Can Transform Data and Do Complex Calculations (Structured and Unstructured)
As we've mentioned, the benefit of data lakes over data warehouses is that they can store unstructured data whereas data warehouses typically require the data to be structured (transformed or organized differently so you can use it in visualizations) by an ETL tool, for example. This means you can store any data in a data lake and leave the decision-making about what you're going to do with that data until later, albeit with the risk of it becoming unwieldy big data stored in silos.
Transforming Unstructured Data
Unstructured data is data in its raw form. Data sets come in many different formats, from simple differences such as date formatting or extra columns to more complex differences in hierarchy. For example, Facebook Ads data is organized by campaign level, ad group level, individual ad, and so on. Each data source is different, and TapClicks transfers these hierarchies to the TapClicks platform exactly as they are, just like a data lake.
If you want to standardize the names of some metrics so they are comparable across different channels, you can do this in the TapClicks platform. For example, if different channels use different terms for impression, such as "imp", "hit" or "view", with TapClicks you can label them with one umbrella term (e.g. "impression") to ensure the attribution is uniform. This allows you to look at one clear data set across many channels or even clients.
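As a rough illustration of what this kind of renaming involves, the Python sketch below maps channel-specific metric names onto one umbrella term. It is not how TapClicks implements the feature; the alias table, the function name `standardize_metrics`, and the sample row are assumptions made up for this example.

```python
# Hypothetical alias table: channel-specific names on the left,
# the umbrella term they should be reported under on the right.
METRIC_ALIASES = {
    "imp": "impression",
    "hit": "impression",
    "view": "impression",
}


def standardize_metrics(row, aliases=METRIC_ALIASES):
    """Return a copy of `row` with aliased metric names replaced by the umbrella term."""
    standardized = {}
    for name, value in row.items():
        canonical = aliases.get(name.lower(), name)
        if canonical in standardized:
            # Two fields in one row map to the same term; refuse to guess which wins.
            raise ValueError(f"multiple fields map to {canonical!r}")
        standardized[canonical] = value
    return standardized
```

For instance, `standardize_metrics({"channel": "display", "imp": 500})` yields a row keyed by "impression", so rows from different channels can be compared or summed on the same column.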
As well as transforming data, TapClicks allows you to do more advanced calculations by normalizing data sets. For example, if you are running a campaign across different channels such as PPC, SEO and programmatic advertising, you will want to look at how that campaign performed across all the channels. You will also want to see how the campaign impacted your sales.
Typically, even this simple analysis (sales of a particular campaign across multiple channels) requires the tedious work of exporting the sales metrics from each campaign to a spreadsheet, making sure date ranges and other details are consistent, manually adding up sales to get total numbers, and comparing across channels. But with TapClicks, you can define the campaign name (for example, "Summer Sale Campaign"), and every campaign across any channel that matches your definition criteria will be associated with this campaign. This way, you can easily ask TapClicks for the total sales of the Summer Sale Campaign, how those sales differed by channel, cost per sale by channel, average cost per sale, and more, with no manual spreadsheet work required.
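The arithmetic behind that kind of roll-up can be sketched in a few lines of Python. This is only an illustration under assumed data: the rows, the field names, and the simple "name contains the pattern" matching rule are hypothetical and are not TapClicks' actual matching criteria.

```python
from collections import defaultdict

# Hypothetical per-campaign rows pulled from several channels.
rows = [
    {"channel": "ppc", "campaign": "Summer Sale Campaign - Search", "sales": 40, "cost": 800.0},
    {"channel": "programmatic", "campaign": "summer sale campaign display", "sales": 10, "cost": 500.0},
    {"channel": "ppc", "campaign": "Brand Terms", "sales": 25, "cost": 250.0},
    {"channel": "seo", "campaign": "Summer Sale Campaign landing page", "sales": 30, "cost": 0.0},
]


def summarize_campaign(rows, pattern="summer sale campaign"):
    """Group every row whose campaign name contains `pattern` and total it by channel."""
    by_channel = defaultdict(lambda: {"sales": 0, "cost": 0.0})
    for row in rows:
        if pattern.lower() in row["campaign"].lower():
            by_channel[row["channel"]]["sales"] += row["sales"]
            by_channel[row["channel"]]["cost"] += row["cost"]

    total_sales = sum(v["sales"] for v in by_channel.values())
    total_cost = sum(v["cost"] for v in by_channel.values())
    return {
        "total_sales": total_sales,
        "sales_by_channel": {ch: v["sales"] for ch, v in by_channel.items()},
        # Cost per sale is undefined for a channel with no sales, so report None.
        "cost_per_sale_by_channel": {
            ch: (v["cost"] / v["sales"] if v["sales"] else None)
            for ch, v in by_channel.items()
        },
        "average_cost_per_sale": total_cost / total_sales if total_sales else None,
    }
```

With the sample rows, the "Brand Terms" row is excluded because its name does not match, and the remaining three rows are totaled across PPC, programmatic and SEO.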
These are not functions that a data lake or data warehouse could perform without a third-party tool and data science skills. But with TapClicks, you can do everything in one place and, as with everything in the TapClicks solution, you only have to set up your calculation metrics once and they'll be ready for you to use over and over.
Feature #4: Distribute Data to Stakeholders via TapClicks Reports or to Third-Party Platforms
As we said, a typical data-lake-based marketing workflow requires a third-party solution (e.g. Tableau or Google Sheets) to create visualizations and reports to send to clients and stakeholders.
TapClicks can absolutely be used in that way (as a data lake that pushes data to Tableau, Google Sheets, or other BI or analysis tools), but with TapClicks you also have the option of creating visualizations and reports within the TapClicks platform. In fact, our visualization and reporting features are something we've been known for over the years, and we've designed them with enterprise-level scalability in mind.
Create Visualizations via TapClicks Dashboards
With TapClicks you can create meaningful stories with your data for clients and stakeholders. You can choose from our ready-made dashboard templates and configure them to include whichever metrics (we call these widgets) you set up (#3). You can then visualize your data however you wish (e.g. graph, bar chart, pie chart, etc.).
The beauty of TapClicks for enterprise-sized agencies is that we made our solution so powerful you can use your dashboards as templates that can be scaled across hundreds of clients automatically. Plus, because each widget and dashboard is customizable, you can customize metrics for each data source and, by simply using the filter feature, apply it to hundreds of clients. You can read more about our marketing dashboards in this post.
TapClicks Automatically Populates PowerPoint-style Reports with Up-to-Date Data
TapClicks also has its own powerful reporting solution, ReportStudio, where you can create presentation report templates with whatever plots and graphs you want and schedule them to be distributed to the stakeholders you select.
As with the dashboards, you can choose from our white label templates and customize the reports to include whatever data you wish. You can set permissions so that each audience sees only the data that's relevant to them and set up the reports to be sent out on a schedule (weekly, monthly, etc.). The reports will automatically be populated with the latest (near real-time) data without any need for third-party tools or data programmers.
Distribute Data to Any Other Platform
Because TapClicks manages the data for the entire marketing lifecycle, not only can you pull data from any source you wish, store it in your own TapClicks data lake, transform and do calculations on that data and create visualizations and reports, you can also deliver data in any form you like (structured or unstructured) to any other destination.
For example, TapClicks has a specific data integration with Google Sheets. This means that data can be pushed out to Google Sheets directly without a need to use the TapClicks reporting facility. You can read more about this here.
TapClicks will also distribute data via its open source API to other reporting platforms such as Tableau.
And if you already use a specific data lake or data warehouse, you can use TapClicks to collect and send data to those too using the same scheduling methodology as ReportStudio.
Data lakes are becoming essential for marketers who need to store increasing volumes of data. However, marketers benefit most from a solution that can handle the entire data management process, including data collection, data transformation, visualization and reporting, as well as act as a data lake.
Using a marketer-friendly platform such as TapClicks that can do everything for you automatically without the need for data programming skills provides a flexible solution that saves a significant amount of time.