  • ankitrathi

Data Lakes in Modern Data Architecture


Data Lake (DL): a storage repository that holds a vast amount of raw data in its native format until it is needed

DWH Vs DL


Data Lakes will be central to the modern data architecture because of these features:

  1. Agility: ability to convert data >> information >> action

  2. Insight: ability to give business insights

  3. Scalability: ability to accommodate data growth

All data is welcome:

  1. Stores all types of data: structured, semi-structured & unstructured

  2. Stores raw data in its original form for an extended period of time

  3. Uses various tools to correlate, enrich & query the data for insights

  4. Provides democratized access via a single unified view across the enterprise
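
The points above are often summarized as "schema on read": data lands unchanged and a structure is applied only when someone queries it. The following is a minimal sketch of that idea, not a reference implementation. The `raw_zone` list, the `read_purchases` function and the field names are hypothetical and stand in for real lake storage and a real query engine.

```python
import json

# Hypothetical raw zone: every record is kept exactly as it arrived.
raw_zone = [
    '{"user": "a1", "amount": "19.99", "note": "gift"}',  # semi-structured JSON
    'a2,5.00',                                            # structured CSV row
    'free-text complaint from user a3',                   # unstructured text
]

def read_purchases(raw_records):
    """Apply a purchase schema at read time; skip records that do not fit it."""
    for record in raw_records:
        # First try to interpret the record as a JSON document.
        try:
            doc = json.loads(record)
            yield {"user": doc["user"], "amount": float(doc["amount"])}
            continue
        except (KeyError, TypeError, ValueError):
            pass  # not JSON, or JSON without the expected fields
        # Fall back to a two-column CSV row: user,amount.
        parts = record.split(",")
        if len(parts) == 2:
            try:
                yield {"user": parts[0], "amount": float(parts[1])}
            except ValueError:
                pass  # second column is not a number

purchases = list(read_purchases(raw_zone))
```

The raw records are never modified, so a different reader with a different schema (for example, one that mines the free-text record) can be written later without re-ingesting anything.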

Traditional Data Architecture

Sources >> ETL >> EDW >> Data Discovery/Analytics/BI

Modern Data Architecture

Streaming/Unstructured/Various Sources >> Data Lake (Derived/Discovery Sandbox) >> EDW >> Data Science/Data Discovery/Analytics/BI


Data Lake Challenges & Complications

In Building:

  1. Rate of change in data sources

  2. Skill gap in the industry

  3. Complexity involved in accommodating different data sources

In Managing:

  1. Ingestion of different data sources

  2. Lack of visibility into future requirements

  3. Privacy & compliance requirements

In Delivering:

  1. Quality Issues with data

  2. Reliance on IT

  3. Reusability of data

Approach for Data Lakes

Enable the Data Lake

  1. Ingest the data

  2. Organize the data

  3. Register in Catalog
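
The three steps above can be sketched with nothing more than the local file system. This is an illustration under stated assumptions: the `raw/<source>/<yyyy>/<mm>/<dd>/` layout, the `catalog.jsonl` file and the `ingest` function are hypothetical choices, not something the original post prescribes.

```python
import hashlib
import json
import tempfile
from datetime import date
from pathlib import Path

def ingest(lake_root: Path, source: str, name: str, payload: bytes, day: date) -> dict:
    """Store payload unchanged in a date-partitioned path and record it in a catalog."""
    # Organize: partition by source and ingestion date (assumed layout).
    target = lake_root / "raw" / source / f"{day:%Y/%m/%d}" / name
    target.parent.mkdir(parents=True, exist_ok=True)

    # Ingest: the bytes are written as-is, with no transformation.
    target.write_bytes(payload)

    # Register: append one JSON line per file to a simple catalog.
    entry = {
        "source": source,
        "path": target.relative_to(lake_root).as_posix(),
        "bytes": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "ingested_on": day.isoformat(),
    }
    with (lake_root / "catalog.jsonl").open("a", encoding="utf-8") as catalog:
        catalog.write(json.dumps(entry) + "\n")
    return entry

# Usage: a temporary directory stands in for the lake.
lake = Path(tempfile.mkdtemp())
entry = ingest(lake, "crm", "contacts.csv", b"id,name\n1,Ada\n", date(2017, 2, 22))
```

Keeping a checksum and size in the catalog entry makes it possible to detect later whether a raw file was altered after ingestion.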

Govern the data in the lake

  1. Cleanse the data

  2. Secure the data

  3. Operationalize the data
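
As a rough illustration of the first two governance steps, the sketch below cleanses records and then pseudonymizes a sensitive field. The record shape, the `SENSITIVE_FIELDS` set and the salted-hash approach are assumptions made for the example; a real deployment would follow its own data quality rules and security policy, and operationalizing the data (scheduling, monitoring) is not covered here.

```python
import hashlib
from typing import Optional

SENSITIVE_FIELDS = {"email"}  # assumed set of fields to pseudonymize

def cleanse(record: dict) -> Optional[dict]:
    """Trim whitespace from string values and reject records without an id."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    return cleaned if cleaned.get("id") else None

def secure(record: dict, salt: str) -> dict:
    """Replace sensitive values with a salted SHA-256 digest."""
    return {
        k: hashlib.sha256((salt + str(v)).encode("utf-8")).hexdigest()
        if k in SENSITIVE_FIELDS else v
        for k, v in record.items()
    }

raw = [
    {"id": " 1 ", "email": "ada@example.com "},
    {"id": "", "email": "nobody@example.com"},  # no id: dropped by cleanse
]
governed = [secure(c, salt="demo-salt") for c in map(cleanse, raw) if c is not None]
```

Hashing keeps the field usable as a join key while hiding the original value, which is one common trade-off between privacy and reusability.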

Engage with business

  1. Discover the data

  2. Enrich the platform

  3. Provision the data sources
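
Discovery usually starts with a search over the catalog. The snippet below is a small sketch of that step only; the catalog entries, their `tags` and `owner` fields, and the `discover` function are hypothetical.

```python
# Hypothetical catalog entries; a real catalog would be loaded from a metadata store.
catalog = [
    {"path": "raw/crm/contacts.csv", "tags": ["customer", "pii"], "owner": "sales"},
    {"path": "raw/web/clicks.json", "tags": ["clickstream"], "owner": "marketing"},
]

def discover(entries, keyword):
    """Return entries whose path or tags contain the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        e for e in entries
        if needle in e["path"].lower() or any(needle in t.lower() for t in e["tags"])
    ]

hits = discover(catalog, "customer")
```

The `owner` field gives a business user someone to contact before requesting access to a data set.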

Data Lake Reference Architecture


Data Lake Management Platform

  1. Unified Data Management

  2. Managed Ingestion

  3. Data Reliability

  4. Data Visibility

  5. Data Privacy & Security

Getting Started


References

Building a Modern Data Architecture

DWH Vs DL

Thank you for reading my post. I regularly write about Data & Technology on LinkedIn & Medium. If you would like to read my future posts then simply ‘Connect’ or ‘Follow’. Also feel free to connect on Slideshare.

Originally published at https://www.linkedin.com/today/posts/ankitrathi on February 22, 2017.

#BigData


©  2020  Ankit Rathi