The content presented here is sourced directly from the Udemy platform.
Updated on August 13th, 2023
Skills and Knowledge Acquired:
This course teaches learners to manipulate data in Apache Spark using the DataFrame API and SQL. Learners will gain an understanding of how Apache Spark runs on a cluster with multiple nodes, and of how to write and run Apache Spark code using Databricks. They will also learn to read data from and write data to the Databricks File System (DBFS); select, rename, and manipulate columns; filter, drop, and aggregate rows; join DataFrames; create UDFs and use them with the DataFrame API or Spark SQL; and write DataFrames to external storage systems. Finally, learners will gain an understanding of the elements of the Apache Spark execution hierarchy: jobs, stages, and tasks.
Contribution to Professional Growth:
This course on Databricks and Apache Spark (versions 2.4 and 3.0.0) introduces the Apache Spark framework and the Databricks platform. It covers the fundamentals of Apache Spark, how to use the DataFrame API and SQL for data manipulation tasks, how Apache Spark runs on a cluster with multiple nodes, and how to write and run Apache Spark code using Databricks. With this grounding, professionals can develop more efficient and effective Big Data processing applications and keep their Apache Spark and Databricks skills current, both of which contribute to their professional growth.
Suitability for Further Education:
This course is suitable preparation for further education in Apache Spark and Databricks. It covers the fundamentals of both, including how to write Spark applications using Scala and SQL, how to read data from and write data to the Databricks File System (DBFS), and how to use the DataFrame API and SQL to perform data manipulation tasks. The course also covers the elements of the Apache Spark execution hierarchy: jobs, stages, and tasks.
Course Syllabus
Setup
Introduction to Databricks and Apache Spark
The DataFrame API: Basics
The DataFrame API: Transforming Data
Spark SQL & SQL Fundamentals
Working with different types of data
Data Sources