โก๏ธ This architecture enables customers to build data analytics pipelines using a Modern Data Analytics approach to derive insights from the data.
1) Data is collected from multiple data sources across the enterprise, SaaS applications, edge devices, logs, streaming media, and social networks.
2) Based on the type of the data source, AWS Database Migration Service, AWS DataSync, Amazon Kinesis, Amazon Managed Streaming for Apache Kafka, AWS IoT Core, and Amazon AppFlow are used to ingest the data into a Data Lake in AWS.
3) AWS Data Exchange is used for integrating third-party data into the Data Lake.
4) AWS Lake Formation is used to build the scalable data lake, and Amazon S3 is used as the data lake storage.
5) AWS Lake Formation is also used to enable unified governance to centrally manage the security, access control, and audit trails.
6) AWS Glue and AWS Glue DataBrew are used to catalog, transform, enrich, move, and replicate data across multiple data stores and the data lake.
7) Amazon Kinesis Data Analytics is used to transform and analyze streaming data in real time.
8) Amazon QuickSight provides machine learning-powered business intelligence.
9) Amazon SageMaker and AWS AI services can be used to build, train and deploy machine learning models, and add intelligence to your applications.
10) Amazon Redshift is used as a Cloud Data Warehouse. 11) Amazon EMR provides the cloud big data platform for processing vast amounts of data using open source tools.
12) Amazon SageMaker and AWS AI services can be used to build, train and deploy machine learning models, and add intelligence to your applications.
13) Amazon Redshift Spectrum and Amazon Athena enable interactive querying, analyzing, and processing capabilities.
You can download the Implementation Guide from this link ๐๐ https://d1.awsstatic.com/architecture-diagrams/ArchitectureDiagrams/modern-data-analytics-using-lake-house-ra.pdf?did=wp_card&trk=wp_card
I hope you will find it useful for your organization! ๐ค