The content presented here is sourced directly from the Udemy platform.
Updated on July 27th, 2023
This course introduces data cleaning and preprocessing in Python. It covers common data problems: missing values, noise values (univariate outliers), multivariate outliers, and duplicated records. It also covers improving data quality through standardization and normalization, as well as handling categorical features. Each concept comes with a theoretical explanation, a mathematical evaluation, and code. Lectures are organized into sections and numbered so that the first number is the section and the second is the lecture within that section. All code is written in Python using Jupyter Notebook.
Data cleaning is essential for building reliable machine learning models. If the problems in the data are not resolved first, even the most efficient models will produce unreliable results. The course aims to provide the skills needed to carry out useful analysis on business data acquired from multiple online sources.
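To make the issues named above concrete, here is a minimal sketch of duplicate removal, missing-value detection and imputation, and univariate outlier flagging. It is not taken from the course material: the toy records, the column meanings, the median imputation, and the 1.5 * IQR rule are all assumptions chosen for illustration. It uses only the Python standard library, although in practice a library such as pandas is commonly used for this kind of work.

```python
import statistics

# Hypothetical toy records: (customer_id, monthly_spend).
# None marks a missing value.
rows = [
    (1, 120.0),
    (2, None),
    (3, 95.0),
    (3, 95.0),    # exact duplicate of the previous row
    (4, 110.0),
    (5, 105.0),
    (6, 2500.0),  # suspiciously large value
]

# 1. Remove exact duplicates while preserving the original order.
deduped = list(dict.fromkeys(rows))

# 2. Detect missing values.
missing_ids = [cid for cid, spend in deduped if spend is None]

# 3. Impute missing values with the median of the observed values
#    (one possible strategy; the median is less sensitive to outliers
#    than the mean).
observed = [spend for _, spend in deduped if spend is not None]
median_spend = statistics.median(observed)
imputed = [
    (cid, median_spend if spend is None else spend)
    for cid, spend in deduped
]

# 4. Flag univariate outliers with the 1.5 * IQR rule.
values = [spend for _, spend in imputed]
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outlier_ids = [cid for cid, spend in imputed if spend < low or spend > high]

print("missing:", missing_ids)
print("outliers:", outlier_ids)
```

Whether a flagged value should be dropped, capped, or kept depends on the data and the analysis goal; the sketch only identifies candidates.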
Course Syllabus
Introduction
Detecting Missing and Noise Values (Univariate Outliers)
Handling Missing and Noise Values (Univariate Outliers)
Multivariate Outliers
Anomalies in Textual Data
Structuring Textual Documents
Feature Scaling (Normalization)
Handling Categorical Features
Machine Learning Overview
Data Acquisition
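As a rough illustration of two syllabus topics, feature scaling and handling categorical features, the sketch below shows min-max normalization, z-score standardization, and one-hot encoding. The sample values and variable names are hypothetical, and the choice of one-hot encoding is an assumption; the course may use other encodings or library helpers.

```python
import statistics

# Hypothetical numeric feature and categorical feature.
ages = [20.0, 30.0, 40.0, 50.0, 60.0]
colors = ["red", "green", "red", "blue"]

# Min-max normalization: rescale values to the range [0, 1].
lo, hi = min(ages), max(ages)
normalized = [(a - lo) / (hi - lo) for a in ages]

# Standardization (z-score): subtract the mean, then divide by the
# population standard deviation, giving mean 0 and standard deviation 1.
mean = statistics.fmean(ages)
std = statistics.pstdev(ages)
standardized = [(a - mean) / std for a in ages]

# One-hot encoding: one binary column per distinct category,
# with columns ordered alphabetically.
categories = sorted(set(colors))
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colors]

print(normalized)
print(categories)
print(one_hot)
```

Normalization bounds the values to a fixed range, while standardization centers them without bounding them; which one fits better depends on the model and on how the feature is distributed.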