❗The content presented here is sourced directly from the Coursera platform. For comprehensive course details, including enrollment information, click the 'Go to class' link on our website.
Updated on July 14th, 2023
This course covers manipulating large-scale data distributed over a cluster using functional concepts, an approach now common across many industries. It introduces Apache Spark, a fast, in-memory distributed collections framework written in Scala, and shows how the data-parallel paradigm extends to the distributed case. Through hands-on examples in Spark and Scala, participants learn how to address issues that arise from distribution, such as latency and network communication, in order to improve performance. By the end of the course, learners will be able to read data, manipulate it using Spark and Scala, express data analysis algorithms in a functional style, and avoid unnecessary shuffles and recomputation in Spark. At least one year of programming experience is recommended. Proficiency in Java or C# is ideal, but experience with other languages such as C/C++, Python, JavaScript, or Ruby is also sufficient. Join now and expand your skills in big data analysis with Scala and Spark.