A Dual Approach to Data Storage

Understanding Data Lakes and Data Warehouses

Data Warehouse

A data warehouse is a central repository of integrated data from one or more disparate sources. It enables businesses to store historical data from the software applications used by different departments in one place. Data is selected and extracted according to a target need and, before being loaded into the data warehouse, is cleansed to ensure data quality.

The data is then transformed, structured, and categorised before being stored in the output zone of the data warehouse, called a data mart. Managers and other business professionals can use data stored in the data mart to create performance reports, conduct online analytical processing (OLAP), and support decision-making.
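The extract, cleanse, transform, and store steps described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the sample rows, the table names `orders` and `sales_by_region`, and the helper functions are all hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

# Hypothetical raw rows extracted from departmental systems (illustrative only).
RAW_ORDERS = [
    {"order_id": "1001", "region": " north ", "amount": "250.00"},
    {"order_id": "1002", "region": "South", "amount": "99.50"},
    {"order_id": "1002", "region": "South", "amount": "99.50"},  # duplicate record
    {"order_id": "1003", "region": "north", "amount": ""},       # missing amount
]


def cleanse(rows):
    """Drop duplicates and rows with a missing amount; normalise types and text."""
    seen = set()
    clean = []
    for row in rows:
        if not row["amount"] or row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        clean.append({
            "order_id": int(row["order_id"]),
            "region": row["region"].strip().lower(),
            "amount": float(row["amount"]),
        })
    return clean


def load_mart(conn, rows):
    """Store the cleansed rows, then build a summary table for reporting."""
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (:order_id, :region, :amount)", rows)
    conn.execute(
        "CREATE TABLE sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"
    )
    conn.commit()


def build_sales_mart(raw_rows):
    """Run the whole pipeline and return the report rows a manager would query."""
    conn = sqlite3.connect(":memory:")
    load_mart(conn, cleanse(raw_rows))
    return conn.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    ).fetchall()
```

Calling `build_sales_mart(RAW_ORDERS)` returns one total per region. The point of the sketch is that the structure is decided before the data is stored, so reporting queries stay simple.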

Data Lake

Data lakes differ from data warehouses in that they allow data to be stored in its natural form, such as CSV, logs, XML, JSON, email, PDF, image, audio, and video. Businesses can use data lakes to hold raw data without having to define a need in advance. Instead, the data can be stored in case it proves useful in the future.
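As a rough sketch of what storing data in its natural form can look like, the function below writes a file into a lake directory byte for byte, without parsing or reshaping it. The folder layout (source system, then arrival date) and the name `land_raw` are assumptions made for illustration, not a standard.

```python
from datetime import date
from pathlib import Path


def land_raw(lake_root, source, filename, payload, day=None):
    """Write payload bytes unchanged under <lake_root>/<source>/<YYYY-MM-DD>/."""
    day = day or date.today()
    target_dir = Path(lake_root) / source / day.isoformat()
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    # No parsing, validation, or schema: the file is kept exactly as received.
    target.write_bytes(payload)
    return target
```

The same call works for a JSON export, a PDF, or an audio recording, because nothing in it depends on the file's content.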

This is a significant advantage because businesses may not know the value of the trends or patterns hidden in raw data until they bring in data scientists to conduct data mining. Data lakes therefore facilitate the discovery of trends and patterns that can, for example, produce a more accurate forecast or identify sales opportunities.

Comparison

Raw data cannot organise itself, so constructing and maintaining a data warehouse requires additional resources. These costs are usually offset by the benefits brought to users. This pay-and-get relationship is less clear-cut in the case of data lakes. Because data lake data is collected without a precise goal, it is usually left in its raw format to save costs. Schemas are written to extract the useful information only when a specific need is identified or when the business embarks on data mining.
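Writing a schema only when a need is identified is often called schema-on-read. The sketch below assumes raw events were stored as one JSON document per line; the event fields, `CALL_SCHEMA`, and `read_with_schema` are hypothetical names chosen for the example.

```python
import json

# Hypothetical raw events, stored exactly as they arrived (one JSON document per line).
RAW_EVENTS = "\n".join([
    '{"customer": "A17", "channel": "phone", "duration_s": 340, "notes": "billing query"}',
    '{"customer": "B02", "channel": "email", "subject": "refund"}',
    '{"customer": "A17", "channel": "phone", "duration_s": 95}',
])

# A schema written only once a need arises: field name -> (type, default).
CALL_SCHEMA = {"customer": (str, None), "duration_s": (int, 0)}


def read_with_schema(raw_text, schema, where=None):
    """Parse raw lines and project each record onto the schema at read time."""
    rows = []
    for line in raw_text.splitlines():
        record = json.loads(line)
        if where and not where(record):
            continue
        rows.append({
            field: cast(record[field]) if field in record else default
            for field, (cast, default) in schema.items()
        })
    return rows


# Usage: extract call durations only now that someone has asked for them.
phone_calls = read_with_schema(
    RAW_EVENTS, CALL_SCHEMA, where=lambda r: r.get("channel") == "phone"
)
```

The raw lines are never modified, so a different schema can be applied to the same data later when a new question comes up.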

The result is that data is kept in a more intact form. For example, customer service data may include recordings of customer calls, which preserve the voice and tone of the dialogue. A data lake would capture these recordings in their entirety.

In contrast, a data warehouse may capture only the key points of the same exchange. The completeness of the raw data in a data lake gives the business richer opportunities for analysis than the data stored in a data warehouse.

Conclusion

Contrary to popular belief, data lakes are not a successor to data warehouses: they are distinct approaches to storing data that respond to different business needs and complement each other. While data lakes are indispensable for data mining, data warehouses support daily operations and ongoing monitoring.

Successful strategies will seek to incorporate both forms of storage to optimise use of one of the most valuable business assets: data.

Henry Fung

IMS is a global digital transformation agency that helps companies adopt profitable digital initiatives that drive business growth. We offer end-to-end solutions from software solutions to marketing automation.
