Azure Data Engineering Full Stack
Azure Data Engineering
Azure Data Engineering Full Stack prepares learners to manage, process, and analyze massive datasets using Microsoft Azure. The course covers Azure Data Factory, Databricks, Synapse Analytics, and cloud data architecture, enabling professionals to become skilled data engineers.
Contact for More Information
+91 96660 64406
Course Curriculum
Day 1
- What is Big Data Analytics
- Data Analytics Platform
- Storage
- Compute
- Data Processing Paradigms
- Monolithic Computing
- Distributed Computing
Day 2
- Distributed Computing Frameworks
- Hadoop MapReduce
- Apache Spark
- Big Data Analytics: Data Lakes
- Tightly Coupled Data Lake
- Loosely Coupled Data Lake
Day 3
- Big Data File Formats
- Row Storage Format
- Columnar Storage Format
- Scalability
- Scale-Up (Vertical Scalability)
- Scale-Out (Horizontal Scalability)
Day 4: Introduction to Azure Databricks
- Core Databricks Concepts
- Workspace
- Notebooks
- Library
- Folder
- Repos
- Data
- Compute
- Workflows
Day 5: Introducing Spark Fundamentals
- What is Apache Spark
- Why Choose Apache Spark
- What are the Spark use cases
Day 6: Spark Architecture
- Spark Components
- Spark Driver
- SparkSession
- Cluster Manager
- Spark Executors
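To tie these components together, here is a minimal, hypothetical sketch of creating a SparkSession in local mode (on Databricks the session is already provided as `spark`, and the cluster manager and executors are handled by the platform):

```python
from pyspark.sql import SparkSession

# The driver program builds a SparkSession; the cluster manager then allocates executors.
# `local[4]` is an assumption for a local sketch: 4 worker threads on one machine.
spark = (
    SparkSession.builder
    .appName("spark-architecture-demo")   # name shown in the Spark UI
    .master("local[4]")
    .getOrCreate()
)

print(spark.version)
```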
Day 7: Create Databricks Workspace
- Workspace Assets
Day 8: Creating Spark Cluster
- All-Purpose Cluster
- Single Node Cluster
- Multi Node Cluster
Day 9: Databricks - Internal Storage
- Databricks File System (DBFS)
- Uploading Files to DBFS
Day 10: DBUTILS Module
- Interaction with DBFS
- %fs Magic Command
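As a rough illustration of Day 10, the snippet below drives DBFS from a notebook with `dbutils.fs`; the paths are placeholders, and `%fs ls /FileStore/tables` is the magic-command equivalent of the first call:

```python
# List files previously uploaded to DBFS (placeholder path)
for f in dbutils.fs.ls("dbfs:/FileStore/tables"):
    print(f.name, f.size)

# Create a folder and copy a file within DBFS
dbutils.fs.mkdirs("dbfs:/tmp/demo")
dbutils.fs.cp("dbfs:/FileStore/tables/sales.csv", "dbfs:/tmp/demo/sales.csv")
```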
Day 11: Spark Data APIs
- RDD (Resilient Distributed Dataset)
- DataFrame
- Dataset
Day 12: Create Data Frame
- Using Python Collection
- Converting RDD to DataFrame
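A minimal sketch of both approaches from Day 12, assuming the Databricks-provided `spark` session; the names and values are made up:

```python
from pyspark.sql import Row

# From a Python collection
people = [("Asha", 31), ("Ravi", 28)]
df = spark.createDataFrame(people, schema=["name", "age"])

# Converting an RDD of Row objects to a DataFrame
rdd = spark.sparkContext.parallelize([Row(name="Meena", age=45)])
df_from_rdd = rdd.toDF()

df.show()
```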
Day 13: Reading CSV data with Apache Spark
- Inferred Schema
- Explicit Schema
- Parsing Modes
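A hedged sketch of Day 13's options, assuming a notebook where `spark` exists; the file path and column names are placeholders:

```python
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

# Inferred schema: Spark scans the file and guesses column types
df_inferred = (spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("dbfs:/FileStore/tables/sales.csv"))

# Explicit schema plus a parsing mode (PERMISSIVE, DROPMALFORMED or FAILFAST)
schema = StructType([
    StructField("order_id", IntegerType(), True),
    StructField("region",   StringType(),  True),
])
df_explicit = (spark.read
    .schema(schema)
    .option("header", "true")
    .option("mode", "DROPMALFORMED")   # drop rows that do not match the schema
    .csv("dbfs:/FileStore/tables/sales.csv"))
```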
Day 14: Reading JSON data with Apache Spark
- SingleLine JSON
- Multiline JSON
- Complex JSON
- explode() Function
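A small sketch of multiline JSON and explode(); the path, `order_id` column, and `items` array column are assumptions:

```python
from pyspark.sql.functions import col, explode

# Multiline (and possibly nested) JSON
df = spark.read.option("multiLine", "true").json("dbfs:/FileStore/tables/orders.json")

# explode() produces one output row per element of an array column
flat = df.select(col("order_id"), explode(col("items")).alias("item"))
flat.show(truncate=False)
```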
Day 15: Reading XML Data with Apache Spark
- Install Spark-xml Library
- User Defined Schema
- DDL String Approach
- StructType() with StructFields()
Day 16: Reading Excel File With Apache Spark
- Single Sheet Reading
- Multiple Sheet Reading Using a List Object
Day 17: Reading Excel File With Apache Spark
- Multiple Excel Sheets with Same Structure
- Multiple Excel Sheets with Different Structures
Day 18: Reading Parquet Data with Apache Spark
- Uploading Parquet Data
- View the Data in the DataFrame
- View the Schema of the DataFrame
- Limitations of the Parquet Format
- Schema Evolution
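A minimal sketch for Day 18, with a placeholder folder of Parquet files whose schema has changed over time:

```python
# mergeSchema reconciles column differences across older and newer Parquet files
df = spark.read.option("mergeSchema", "true").parquet("dbfs:/tmp/sales_parquet/")

df.printSchema()   # view the (merged) schema of the DataFrame
df.show(5)         # view the data
```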
Day 19: Introduction to Delta Lake
- Delta Lake Features
- Delta Lake Components
Day 20: Delta Lake Features
- DML Operations
- Time Travel Operations
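A hedged sketch of DML and time travel on a Delta table; the path, columns, and predicates are placeholders:

```python
from delta.tables import DeltaTable

tbl = DeltaTable.forPath(spark, "dbfs:/tmp/delta/customers")

# DML operations
tbl.update(condition="country = 'IN'", set={"region": "'APAC'"})
tbl.delete("is_active = false")

# Time travel: read an earlier version of the same table
old = (spark.read.format("delta")
    .option("versionAsOf", 0)
    .load("dbfs:/tmp/delta/customers"))
```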
Day 21: Delta Lake Features
- Schema Validation and Enforcement
- Schema Evolution
Day 22: Access Data from Azure Blob Storage
- Account Access Key
- Windows Azure Storage Blob driver (WASB)
- Read Operations
- Write Operation
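A sketch of account-key access through the WASB driver; the storage account, container, secret scope, and file names are placeholders:

```python
# Supply the account access key (ideally from a secret scope rather than plain text)
spark.conf.set(
    "fs.azure.account.key.<storage-account>.blob.core.windows.net",
    dbutils.secrets.get(scope="demo-scope", key="storage-key"))

# Read from and write back to the container over wasbs://
df = spark.read.csv(
    "wasbs://<container>@<storage-account>.blob.core.windows.net/raw/sales.csv",
    header=True)
df.write.mode("overwrite").parquet(
    "wasbs://<container>@<storage-account>.blob.core.windows.net/curated/sales/")
```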
Day 23: Access Data from Azure Data Lake Gen2
- Azure Service Principal
- Azure Blob Filesystem driver (ABFS)
- Read Operations
- Write Operation
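A sketch of service-principal (OAuth) access through the ABFS driver; every identifier below is a placeholder:

```python
account = "<storage-account>"
base = f"{account}.dfs.core.windows.net"

spark.conf.set(f"fs.azure.account.auth.type.{base}", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{base}",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{base}", "<application-id>")
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{base}",
               dbutils.secrets.get(scope="demo-scope", key="sp-secret"))
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{base}",
               "https://login.microsoftonline.com/<tenant-id>/oauth2/token")

df = spark.read.parquet(f"abfss://<container>@{base}/raw/")
```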
Day 24: Access Data from Azure Data Lake Gen2
- Shared access signatures (SAS)
- Azure Blob Filesystem driver (ABFS)
- Read Operations
- Write Operation
Day 25: Access Data from Azure SQL Database
- Configure a connection to SQL server
Day 26: Access Data from Synapse Dedicated SQL Pool
- Configure storage account access key
- Read data from an Azure Synapse table
- Write Data to Azure Synapse table
Day 27: Access Data from Snowflake
- Reading Data
- Writing Data
Day 28: Create Mount Point to Azure Cloud Storages
- Azure Blob Storage
- Azure Data Lake Storage
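A hedged sketch of mounting an ADLS Gen2 container with a service principal; the secret scope, IDs, and mount point are placeholders:

```python
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret":
        dbutils.secrets.get(scope="demo-scope", key="sp-secret"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://<container>@<storage-account>.dfs.core.windows.net/",
    mount_point="/mnt/datalake",
    extra_configs=configs)

display(dbutils.fs.ls("/mnt/datalake"))
```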
Day 29: Introduction to Spark SQL Module
- Hive Metastore
- Spark Catalog
Day 30: Spark SQL - Create Global Managed Tables
- DataFrame API
- SQL API
Day 31: Spark SQL - Create Global Unmanaged Tables
- DataFrame API
- SQL API
Day 32: Spark SQL - Create Views
- Temporary Views
- Global Temporary Views
- DataFrame API
- SQL API
- Dropping Views
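A minimal sketch of Day 32, assuming an existing DataFrame `df`:

```python
# Session-scoped vs. global temporary views (DataFrame API)
df.createOrReplaceTempView("sales_tmp")
df.createOrReplaceGlobalTempView("sales_gtmp")

# SQL API: global temporary views live under the global_temp database
spark.sql("SELECT COUNT(*) FROM sales_tmp").show()
spark.sql("SELECT COUNT(*) FROM global_temp.sales_gtmp").show()

# Dropping views
spark.catalog.dropTempView("sales_tmp")
spark.catalog.dropGlobalTempView("sales_gtmp")
```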
Day 33: Spark Batch Processing
- Reading Batch Data
- Writing Batch Data
Day 34: Spark Structured Streaming API
- Reading Streaming Data
- Writing Streaming Data
- Checkpoint Location
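A hedged sketch of a streaming read and write with a checkpoint; the schema, columns, and paths are placeholders:

```python
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

# Streaming file sources require an explicit schema
schema = StructType([
    StructField("order_id", StringType(), True),
    StructField("amount",   DoubleType(), True),
])

stream_df = spark.readStream.schema(schema).json("dbfs:/tmp/stream/input/")

query = (stream_df.writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", "dbfs:/tmp/stream/_checkpoint")  # lets the query resume where it left off
    .start("dbfs:/tmp/stream/output/"))
```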
Day 35: Spark Structured Streaming API - outputModes
- Append
- Complete
- Update
Day 36: Spark Structured Streaming API - Triggers
- Unspecified Trigger (Default Behavior)
- trigger(availableNow = True)
- trigger(processingTime = "n minutes")
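The same write can be re-run under the different trigger settings from Day 36 (continuing the placeholder `stream_df` from the Day 34 sketch):

```python
(stream_df.writeStream
    .format("delta")
    .option("checkpointLocation", "dbfs:/tmp/stream/_checkpoint_triggered")
    .trigger(availableNow=True)                # process everything available, then stop
    # .trigger(processingTime="5 minutes")     # or: start a micro-batch every 5 minutes
    # omitting .trigger() gives the default: micro-batches as soon as the previous one finishes
    .start("dbfs:/tmp/stream/output_triggered/"))
```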
Day 37: Spark Structured Streaming API
- Data Processing
- Joins
- Aggregation
Day 38: Code Modularity of Notebooks
- %run Magic Command
Day 39: dbutils.notebook Utility
- run()
- exit()
Day 40: Widgets - Types of Widgets
- text
- dropdown
- multiselect
- combobox
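A quick sketch of the four widget types; the names, defaults, and choices are made up:

```python
dbutils.widgets.text("load_date", "2024-01-01", "Load Date")
dbutils.widgets.dropdown("env", "dev", ["dev", "test", "prod"], "Environment")
dbutils.widgets.multiselect("regions", "APAC", ["APAC", "EMEA", "AMER"], "Regions")
dbutils.widgets.combobox("source", "sales", ["sales", "orders"], "Source")

# Read a widget value inside the notebook
print(dbutils.widgets.get("env"))
```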
Day 41: Parameterization of Notebooks
- History Load
- Incremental Load
Day 42: Trigger Notebook from Data Factory Pipeline
- Notebook Parameters
Day 43: Databricks Workflow
- Orchestration of Tasks
Day 44: Databricks Workflow
- Job Trigger
Day 45: Delta Lake Implementation
- SCD Type 0 Dimension
Day 46: Delta Lake Implementation
- SCD Type 1 Dimension
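A hedged sketch of an SCD Type 1 upsert with a Delta MERGE; the dimension path, key, columns, and the incoming `updates_df` DataFrame are assumptions:

```python
from delta.tables import DeltaTable

dim = DeltaTable.forPath(spark, "dbfs:/tmp/delta/dim_customer")

# Type 1: overwrite changed attributes in place, insert brand-new keys
(dim.alias("t")
    .merge(updates_df.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedUpdate(set={"email": "s.email", "city": "s.city"})
    .whenNotMatchedInsertAll()
    .execute())
```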
Day 47: Delta Lake Implementation
- SCD Type 2 Dimension
Day 48: Delta Lake Implementation
- SCD Type 3 Dimension
Day 49: PySpark Performance Optimization
- cache()
- persist()
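A small sketch assuming an existing DataFrame `df`:

```python
from pyspark import StorageLevel

df.cache()                          # caches with the default storage level (memory, spilling to disk)
df.count()                          # an action materializes the cache
df.unpersist()                      # release it before choosing a different level

df.persist(StorageLevel.DISK_ONLY)  # persist() lets you pick the storage level explicitly
df.count()
```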
Day 50: PySpark Performance Optimization
- repartition()
- coalesce()
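A sketch assuming an existing DataFrame `df`; the partition counts are arbitrary:

```python
# repartition() performs a full shuffle to the requested number of partitions
df_wide = df.repartition(16)

# coalesce() only merges existing partitions (no full shuffle), so it is the cheaper
# way to reduce the partition count, e.g. before writing out fewer files
df_narrow = df_wide.coalesce(4)

print(df_narrow.rdd.getNumPartitions())
```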
Day 51: PySpark Performance Optimization
- Column Predicate Pushdown
- partitionBy()
Day 52: PySpark Performance Optimization
- bucketBy()
Day 53: PySpark Performance Optimization
- Broadcast Join
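A one-line sketch assuming a large `fact_df` and a small `dim_df` sharing a `customer_id` key:

```python
from pyspark.sql.functions import broadcast

# Broadcasting the small table ships it to every executor, avoiding a shuffle of the large table
joined = fact_df.join(broadcast(dim_df), on="customer_id", how="left")
```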
Day 54: Delta Lake - Performance Optimization
- OPTIMIZE
- ZORDER
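A sketch using a placeholder Delta table name and column:

```python
# Compact small files and co-locate rows by a frequently filtered column
spark.sql("OPTIMIZE sales_delta ZORDER BY (customer_id)")
```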
Day 55: Delta Lake - Performance Optimization
- Delta Cache
Day 57: Delta Lake - Performance Optimization
- Partitioning
- Liquid Clustering
Day 58: Databricks Unity Catalog
- Metastore
- Catalog
- Schema
- Tables
- Volumes
- Views
Day 59: Databricks Unity Catalog
- Managed Tables
- External Tables
Day 60: Databricks Unity Catalog
- Managed Volumes
- External Volumes
Day 61: Databricks - Auto Loader
- Auto Loader file detection modes
- Directory Listing mode
- File Notification mode
- Schema Evolution with Auto Loader
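A hedged sketch of Auto Loader in directory-listing mode with schema tracking; the paths and options reflect assumptions, not a fixed recipe:

```python
stream = (spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "dbfs:/tmp/autoloader/_schema")  # where the inferred schema is tracked
    .load("abfss://<container>@<storage-account>.dfs.core.windows.net/landing/"))

(stream.writeStream
    .option("checkpointLocation", "dbfs:/tmp/autoloader/_checkpoint")
    .option("mergeSchema", "true")     # let new columns evolve into the Delta target
    .trigger(availableNow=True)
    .start("dbfs:/tmp/autoloader/bronze"))
```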
Day 62: Delta Live Tables
- Simple Declarative SQL & Python APIs
- Automated Pipeline Creation
- Data Quality Checks
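A minimal sketch of the declarative Python API with one data-quality expectation; the table name, source path, and constraint are placeholders:

```python
import dlt

@dlt.table(comment="Bronze orders ingested from cloud storage")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")  # drop rows that fail the check
def bronze_orders():
    return spark.read.format("json").load("dbfs:/tmp/landing/orders/")
```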