Classroom + App Based Learning
Faculty Guidance through App
Industry-Experienced Faculty
35 Mn Learners Worldwide
The program starts with the basics of Python programming and covers the essential programming knowledge required for data analysis in Python. It then moves on to working with data in Python and applying machine learning algorithms to analyze and visualize that data.
Modules Covered:
1. Case Study on Online Credit Card Fraud Detection
Industry: Banking, Telemarketing
Description: In this case study we will classify the outbound calls of a bank to predict whether a call will result in a credit application, using popular classification methods: Gradient Boosting, Naïve Bayes, Generalized Linear Model, and Random Forest. We will compare the performance of these methods using various performance and cost metrics, for example precision, recall, F1-score, and the Receiver Operating Characteristic (ROC).
Dataset: We will use data related to the direct marketing campaigns (phone calls) of a Portuguese banking institution. The dataset has 45211 records across 17 attributes, ordered by date (from May 2008 to November 2010).
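As a rough illustration of the comparison metrics named above, the sketch below computes precision, recall, F1-score, and the area under the ROC curve in plain Python. The labels, predictions, scores, and model names are made up for illustration and are not taken from the bank dataset; in practice a library such as scikit-learn would normally be used instead.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true, scores, positive=1):
    """Area under the ROC curve, computed as the share of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels: 1 = the call led to an application, 0 = it did not.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical hard predictions from two models being compared.
predictions = {
    "model_a": [1, 0, 1, 0, 0, 1, 1, 0],
    "model_b": [1, 1, 1, 1, 0, 1, 0, 0],
}
results = {name: classification_metrics(y_true, y_pred)
           for name, y_pred in predictions.items()}

# Hypothetical predicted probabilities from one model, used for ROC AUC.
scores_a = [0.9, 0.2, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1]
auc_a = roc_auc(y_true, scores_a)

for name, metrics in results.items():
    print(name, metrics)
print("model_a AUC:", auc_a)
```

Reporting several metrics side by side matters here because a model with higher recall (more applications caught) can still have lower precision (more wasted calls), so no single number settles the comparison.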
3. Case Study on Forecasting River Flow using Time Series Models
Industry: Natural Resource Management
Description: We will look at various techniques for handling, analyzing, and building models for time series data. We will use the autoregressive moving average (ARMA) model and its generalization, the autoregressive integrated moving average (ARIMA) model, to forecast future values from time series data.
Dataset: The datasets for this case study come from a web archive of monthly river flows, in which all the time series data is in chronological order (reading across). The river flow measurements are in cubic meters per second.
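To make the "integrated" part of ARIMA concrete, here is a deliberately simplified sketch: the series is differenced once, a first-order autoregressive model is fitted to the differences by least squares, and the forecasts are summed back to the original scale. This corresponds roughly to an ARIMA(1,1,0) model with a constant term. The flow values are invented for illustration, the function names are hypothetical, and a real analysis would more likely rely on a library such as statsmodels, whose estimation method differs from this one.

```python
def difference(series):
    """First-order differencing: the 'I' step in ARIMA."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(values):
    """Least-squares fit of values[t] = c + phi * values[t-1]."""
    x, y = values[:-1], values[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast(series, steps):
    """Forecast `steps` future levels by predicting differences
    with the AR(1) fit and accumulating them onto the last level."""
    diffs = difference(series)
    c, phi = fit_ar1(diffs)
    level, diff = series[-1], diffs[-1]
    predictions = []
    for _ in range(steps):
        diff = c + phi * diff   # next predicted change
        level = level + diff    # undo the differencing
        predictions.append(level)
    return predictions


# Hypothetical monthly flows in cubic meters per second (not real data).
flows = [10.0, 14.0, 17.0, 19.5, 21.75, 23.875]
c, phi = fit_ar1(difference(flows))
next_two = forecast(flows, steps=2)
print("c =", c, "phi =", phi)
print("forecast:", next_two)
```

The example series was constructed so that its differences follow an exact AR(1) pattern, which keeps the fit easy to check by hand; real river flows are noisy and usually seasonal, so the fit would only be approximate.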
4. Case Study on Price Distribution Analysis of Sacramento's Houses
Industry: Real Estate, Sales
Description: In this case study we will process real estate transaction data for houses sold in Sacramento by imputing missing observations and then normalizing and standardizing the features. Next, we will investigate the relationships between the features of interest by calculating the Pearson, Kendall, and Spearman correlations. Lastly, we will visualize the distributions of interesting features by creating, displaying, and saving histograms.
Dataset: The dataset consists of 985 real estate sales transactions that took place in the Sacramento area over a period of five consecutive days.
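The preprocessing and correlation steps described above can be sketched in plain Python as follows. The feature names and values are hypothetical and are not drawn from the Sacramento data, mean imputation is just one of several possible imputation choices, and the Kendall function is the simple tau-a variant without tie correction. Libraries such as pandas provide these operations directly.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace missing entries (None) with the mean of the known ones."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def normalize(values):
    """Min-max scaling to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Z-scores: zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall tau-a: (concordant - discordant) pairs / all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical features: floor area (one value missing) and sale price.
sq_ft = impute_mean([1000, 1500, None, 2000, 2500])
price = [100000, 160000, 180000, 210000, 300000]

sq_ft_norm = normalize(sq_ft)
price_std = standardize(price)

print("Pearson: ", pearson(sq_ft, price))
print("Spearman:", spearman(sq_ft, price))
print("Kendall: ", kendall(sq_ft, price))
```

Pearson measures linear association, while Spearman and Kendall depend only on the ordering of the values, which is why they can differ on the same pair of features.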
Find out what our Alumni have to say.