# Artificial Intelligence and Machine Learning

by Sopan Shewale on

Schedule

Start Date:

End Date:

Instructor

Sopan Shewale
Technical Evangelist
https://www.linkedin.com/in/sopanshewale/
Full Stack Developer with more than 25 years of experience

Course Objective

  1. Define Artificial Intelligence (AI) and Machine Learning (ML) concepts
  2. Explain the mathematics required to use AI and ML in industry
  3. Introduce the Python frameworks and libraries used in AI and ML
  4. Introduce Machine Learning concepts and processes
  5. Introduce Neural Network concepts
  6. Introduce Computer Vision so that students can work on face detection, other object detection, and object tracking projects

Course Outcomes

  1. Students understand AI and ML concepts
  2. They get hands-on experience in handling unstructured as well as structured data
  3. They are able to use Python libraries meant for AI & ML and develop useful tools to solve problems
  4. They are able to track and identify various objects by developing Machine Learning models
  5. They are able to use APIs or libraries available on the internet to solve AI & ML problems

Module 1: Foundations of AI and ML

  1. Introduction to AI and ML
    1. Definition and historical perspective
    2. Overview of AI and ML applications
  2. Mathematics for Machine Learning
    1. Linear algebra
    2. Probability and statistics
    3. Calculus
  3. Programming Basics for ML
    1. Introduction to Python
      1. Loops
      2. Functions
      3. Classes & Modules
    2. Libraries for scientific computing (NumPy, Pandas)
  4. Reading and Writing Data using Python Libraries (NumPy & Pandas) (Practical)
    1. Read and write CSV and Microsoft Excel files
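As a minimal sketch of the practical in item 4, the snippet below writes a small table to CSV and reads it back with Pandas. The column names and values are made up for illustration. Excel files follow the same pattern through `DataFrame.to_excel` and `pandas.read_excel`, which need an extra engine such as openpyxl installed.

```python
import io

import pandas as pd

# Hypothetical sample table; the columns are illustrative only.
students = pd.DataFrame(
    {"name": ["Asha", "Ravi", "Meera"], "score": [82.5, 74.0, 91.0]}
)

# Write to CSV. An in-memory buffer stands in for a file path here;
# passing a path such as "students.csv" would write to disk instead.
buffer = io.StringIO()
students.to_csv(buffer, index=False)

# Read the CSV text back into a new DataFrame.
buffer.seek(0)
loaded = pd.read_csv(buffer)

print(loaded)
print("Mean score:", loaded["score"].mean())
```

Passing `index=False` keeps the row index out of the file, so the table read back has the same two columns as the one written.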

Module 2: Machine Learning Basics

  1. Supervised Learning
    1. Linear Regression
    2. Logistic Regression
    3. Decision Trees and Random Forests
  2. Unsupervised Learning
    1. Clustering (K-Means, Hierarchical)
    2. Dimensionality Reduction (PCA)
  3. Reinforcement Learning

Module 3: Advanced Machine Learning - Deep Learning

  1. Neural networks basics
    1. Representation
    2. Computing Output
    3. Vectorizing Examples
    4. Supervised Learning with Neural networks
  2. Activation functions
    1. Non-Linear Activation function
    2. Derivatives of Activation functions
  3. Shallow Neural Networks
    1. Forward Propagation
    2. Backward Propagation
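To make the forward and backward propagation topics in this module concrete, here is a rough NumPy sketch of a shallow network with one tanh hidden layer and a sigmoid output. The toy data, layer sizes, learning rate and function names are all assumptions chosen for illustration, not part of the course material.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(n_in, n_hidden, rng):
    """Small random weights break the symmetry between hidden units."""
    return {
        "W1": rng.normal(scale=0.1, size=(n_hidden, n_in)),
        "b1": np.zeros((n_hidden, 1)),
        "W2": rng.normal(scale=0.1, size=(1, n_hidden)),
        "b2": np.zeros((1, 1)),
    }


def forward(params, X):
    """Forward propagation: tanh hidden layer, then sigmoid output."""
    A1 = np.tanh(params["W1"] @ X + params["b1"])
    A2 = sigmoid(params["W2"] @ A1 + params["b2"])
    return A1, A2


def cross_entropy(A2, Y):
    A2 = np.clip(A2, 1e-12, 1 - 1e-12)  # guard against log(0)
    return float(-np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)))


def backward(params, X, Y, A1, A2):
    """Backward propagation of the cross-entropy loss."""
    m = X.shape[1]
    dZ2 = A2 - Y  # combined derivative of sigmoid and cross-entropy
    dZ1 = (params["W2"].T @ dZ2) * (1 - A1 ** 2)  # tanh'(z) = 1 - tanh(z)^2
    return {
        "W1": dZ1 @ X.T / m,
        "b1": dZ1.sum(axis=1, keepdims=True) / m,
        "W2": dZ2 @ A1.T / m,
        "b2": dZ2.sum(axis=1, keepdims=True) / m,
    }


# Toy data: 4 features, 8 examples stored as columns (vectorized layout).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Y = (X.sum(axis=0, keepdims=True) > 0).astype(float)

params = init_params(n_in=4, n_hidden=3, rng=rng)
initial_loss = cross_entropy(forward(params, X)[1], Y)

# Plain gradient descent with an arbitrary learning rate of 0.5.
for _ in range(500):
    A1, A2 = forward(params, X)
    grads = backward(params, X, Y, A1, A2)
    for name in params:
        params[name] -= 0.5 * grads[name]

final_loss = cross_entropy(forward(params, X)[1], Y)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

Storing examples as columns lets one matrix product process the whole batch, which is the point of the "Vectorizing Examples" topic.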

Module 4: Practical Applications

  1. Natural Language Processing
    1. Text classification
    2. Named Entity Recognition (NER)
  2. Computer Vision
    1. Object detection

Module 5: Capstone Project

Students work on a hands-on project applying AI and ML techniques to solve a real-world problem. This can involve data acquisition, preprocessing, model training, and evaluation. Possible projects include:

  1. Face Detection with Deep Learning
  2. Predicting CO2 Emission Footprint Using AI through Machine Learning
  3. Movie Recommendations with Movielens Dataset
  4. Stock Price Predictions
  5. Human Activity Recognition with Smartphones

Problem Sets:


[1] Foundations of AI & ML - use of the Pandas and NumPy Python libraries

Data Source: https://dravate.com/assets/courses/machinelearning/adult.data.csv

(Original source: https://archive.ics.uci.edu/dataset/2/adult)
The data has the following features:
age: continuous;

workclass: Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked;

fnlwgt: continuous;

education: Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool;

education-num: continuous;

marital-status: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse;

occupation: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces;

relationship: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried;

race: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black;

sex: Female, Male;

capital-gain: continuous.

capital-loss: continuous.

hours-per-week: continuous.

native-country: United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands;

salary: >50K, <=50K.

Use Pandas to answer the following questions:

  1. How many men and women (sex feature) are represented in this dataset?
  2. What is the average age (age feature) of women?
  3. What is the percentage of German citizens (native-country feature)?
  4. What are the mean and standard deviation of age for those who earn more than 50K per year (salary feature) and for those who earn 50K or less per year?
  5. Is it true that people who earn more than 50K have at least a high school education?
  6. Display age statistics for each race (race feature) and each gender (sex feature). Use groupby() and describe(). Find the maximum age of men of Amer-Indian-Eskimo race.
  7. Among whom is the proportion of those who earn a lot (>50K) greater: married or single men (marital-status feature)? Consider as married those who have a marital-status starting with Married (Married-civ-spouse, Married-spouse-absent or Married-AF-spouse); the rest are considered bachelors.
  8. What is the maximum number of hours a person works per week (hours-per-week feature)? How many people work such a number of hours, and what is the percentage of those who earn a lot (>50K) among them?
  9. Compute the average time of work (hours-per-week) for those who earn a little and a lot (salary) for each country (native-country). What will these be for Japan?
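The Pandas patterns these questions rely on can be sketched on a tiny invented table. The rows below only reuse three of the column names listed above, so the numbers printed say nothing about the real adult.data.csv file. Loading the file itself with `pandas.read_csv` is left out because its header layout is not stated here.

```python
import pandas as pd

# Invented rows that reuse the column names from the feature list.
toy = pd.DataFrame(
    {
        "age": [25, 38, 52, 41, 30, 47],
        "sex": ["Female", "Male", "Male", "Female", "Male", "Female"],
        "salary": ["<=50K", ">50K", ">50K", "<=50K", "<=50K", ">50K"],
    }
)

# Counting categories (the pattern behind question 1).
counts = toy["sex"].value_counts()

# Boolean filtering plus an aggregate (the pattern behind question 2).
mean_female_age = toy.loc[toy["sex"] == "Female", "age"].mean()

# Share of rows matching a condition, as a percentage (question 3).
pct_high = (toy["salary"] == ">50K").mean() * 100

# Grouped statistics (the pattern behind question 4).
age_by_salary = toy.groupby("salary")["age"].agg(["mean", "std"])

print(counts)
print("Mean age of women:", round(mean_female_age, 2))
print("Percent earning >50K:", pct_high)
print(age_by_salary)
```

The mean of a boolean Series is the fraction of `True` values, which is why multiplying it by 100 gives a percentage directly.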

[2] Supervised Learning:

Data Source: https://dravate.com/assets/courses/machinelearning/placement-data-class.csv

University campus recruitment is a strategy for sourcing, engaging and hiring young talent. Our dataset revolves around the placement season of a business school. It covers various factors that affect whether candidates get hired, such as work experience and exam percentage, and it also contains the recruitment status and remuneration details.

  • Do an exploratory analysis of the Recruitment dataset
  • Do a visualization analysis of the Recruitment dataset
  • Prediction: predict whether a student got placed or not using classification models
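One possible shape for the prediction task is sketched below with logistic regression written in plain NumPy. Everything about the data is an assumption: the two features, the rule that generates the label and the hyperparameters are invented, because the real column names of placement-data-class.csv are not listed here. In practice a library classifier would likely replace the hand-written training loop.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200

# Synthetic stand-in for the placement data (invented features and rule).
exam_pct = rng.uniform(40, 95, size=n)
work_exp = rng.integers(0, 2, size=n)  # 0 = no experience, 1 = some
score = 0.12 * (exam_pct - 65) + 1.5 * work_exp - 0.5
placed = (rng.random(n) < 1 / (1 + np.exp(-score))).astype(float)

# Hold out the last 25% of rows for testing.
split = int(0.75 * n)
mean, std = exam_pct[:split].mean(), exam_pct[:split].std()


def design_matrix(pct, exp):
    """Intercept column, standardized exam percentage, work experience."""
    return np.column_stack([np.ones(len(pct)), (pct - mean) / std, exp])


X_train = design_matrix(exam_pct[:split], work_exp[:split])
X_test = design_matrix(exam_pct[split:], work_exp[split:])
y_train, y_test = placed[:split], placed[split:]

# Fit logistic regression with batch gradient descent.
w = np.zeros(X_train.shape[1])
for _ in range(2000):
    p = 1 / (1 + np.exp(-(X_train @ w)))
    w -= 0.1 * X_train.T @ (p - y_train) / len(y_train)

# Classify held-out rows with a 0.5 probability threshold.
pred = (1 / (1 + np.exp(-(X_test @ w))) >= 0.5).astype(float)
accuracy = float((pred == y_test).mean())
print(f"Held-out accuracy: {accuracy:.2f}")
```

The standardization statistics come from the training rows only, so no information from the held-out rows leaks into the fit.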

[3] Supervised Learning

Data Source: https://dravate.com/assets/courses/machinelearning/iris_flower.csv

Create a model that classifies the different species of the Iris flower.

[4] Unsupervised Learning

Data Source:

  1. https://dravate.com/assets/courses/machinelearning/random.csv
  2. https://dravate.com/assets/courses/machinelearning/cluster.csv

K-means Clustering

In this assignment students are expected to apply k-means clustering to the two provided data sets. The first data set, “cluster.csv”, contains the data that needs to be clustered. The second data set, “random.csv”, contains randomly distributed reference data whose value ranges match those of the “cluster.csv” data.

You need to perform the following experiments.

  1. Cluster the cluster.csv data using “simplekmeans” under the cluster tab, with k set to 1, 2, 3, 4, 5 respectively. The cluster mode should be set to use the training set. With each clustering run, you can obtain the “within cluster sum of squared error” for that run in the output.
  2. For each k value, run k-means five times, each time with a different seed value (please use 1, 2, 3, 4, 5). Among the five runs, choose the one that gives the lowest “within cluster sum of squared error” and record the value. (Note that for k=1 no clustering takes place, so recording the value once is enough.) This step should give you a set of values, denoted WCSE1, WCSE2, WCSE3, WCSE4, WCSE5, for the cluster.csv data.
  3. Plot the values of WCSEi for i = 1, 2, 3, 4, 5. Can you tell the number of clusters from this plot?
  4. Repeat steps 1 and 2 for the “random.csv” data and obtain the “within cluster sum of squared error” for k = 1, 2, 3, 4, 5 respectively. Let’s denote them by WCSE’1, WCSE’2, WCSE’3, WCSE’4, WCSE’5.
  5. Plot the values of WCSEi/WCSE’i for i = 1, 2, 3, 4, 5. From this plot, how many clusters do you think this data contains?
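The experiment above names a “simplekmeans” option in a separate tool. For readers who want to see the same procedure in Python, the sketch below swaps in a hand-written NumPy version of Lloyd's k-means algorithm, so its numbers will not match that tool's output. The two arrays are synthetic stand-ins for cluster.csv and random.csv, and the function names are hypothetical.

```python
import numpy as np


def kmeans_wcss(data, k, seed, max_iter=100):
    """Run Lloyd's k-means once; return the within-cluster sum of squares."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its points; an empty cluster
        # keeps its previous center.
        new_centers = np.array(
            [
                data[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ]
        )
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    return float(((data - centers[labels]) ** 2).sum())


def best_wcss(data, k, seeds=(1, 2, 3, 4, 5)):
    """Lowest error over the five seeded runs, as in step 2."""
    return min(kmeans_wcss(data, k, seed) for seed in seeds)


# Synthetic stand-in for cluster.csv: three Gaussian blobs in 2-D.
rng = np.random.default_rng(0)
clustered = np.vstack(
    [rng.normal(loc=c, scale=0.5, size=(50, 2)) for c in [(0, 0), (5, 5), (0, 5)]]
)
# Synthetic stand-in for random.csv: uniform over the same value ranges.
reference = rng.uniform(
    clustered.min(axis=0), clustered.max(axis=0), size=clustered.shape
)

wcss = {k: best_wcss(clustered, k) for k in range(1, 6)}
wcss_ref = {k: best_wcss(reference, k) for k in range(1, 6)}

for k in range(1, 6):
    print(k, round(wcss[k], 1), round(wcss_ref[k], 1), round(wcss[k] / wcss_ref[k], 3))
```

The printed ratio column corresponds to step 5. Dividing by the error on the uniform reference corrects for the fact that the error shrinks as k grows even when the data has no cluster structure.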

HAC clustering

Create by hand the clustering dendrogram for the following sample of ten points in one dimension. Sample = (-1.8, -1.7, -0.3, 0.1, 0.2, 0.4, 1.6, 1.7, 1.9, 2.0)

  • Using single link
  • Using complete link

The figure below shows example dendrograms for the three points (1, 2, 4) with single and complete link.

[Figure: example dendrograms for the points (1, 2, 4) with single link and complete link]
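The merge order behind the three-point example can also be traced in code. This is a small pure-Python sketch for one-dimensional points; the function name and the output format are hypothetical. Each merge is reported with the distance at which it happens, which is the height of the corresponding join in a dendrogram.

```python
def agglomerate(points, linkage):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    clusters = [(p,) for p in sorted(points)]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                gaps = [abs(a - b) for a in clusters[i] for b in clusters[j]]
                # Single link uses the closest pair, complete link the farthest.
                d = min(gaps) if linkage == "single" else max(gaps)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        rest = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters = sorted(rest + [merged])
    return merges


print(agglomerate([1, 2, 4], "single"))
# [((1,), (2,), 1), ((1, 2), (4,), 2)]
print(agglomerate([1, 2, 4], "complete"))
# [((1,), (2,), 1), ((1, 2), (4,), 3)]
```

Both linkages merge 1 and 2 first at distance 1. They differ only in the height of the final join: 2 for single link (the gap between 2 and 4) and 3 for complete link (the gap between 1 and 4).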

[5] Reinforcement Learning

A few examples are:

  1. Traffic analysis and real-time road processing by video segmentation and frame-by-frame image processing
  2. CCTV cameras for traffic and crowd analytics

These examples can involve using Python libraries such as OpenCV to detect crowds or identify vehicles.

[6] Neural Networks - I

Predict the Burned Area of Forest Fire with Neural Networks.

Data Source: https://dravate.com/assets/courses/machinelearning/forestfires.csv

About Dataset:

  • month : Month of the year: 'jan' to 'dec'
  • day : Day of the week: 'mon' to 'sun'
  • FFMC : Fine Fuel Moisture Code index from the FWI system: 18.7 to 96.20
  • DMC : Duff Moisture Code index from the FWI system: 1.1 to 291.3
  • DC : Drought Code index from the FWI system: 7.9 to 860.6
  • ISI : Initial Spread Index from the FWI system: 0.0 to 56.10
  • temp : Temperature in Celsius degrees: 2.2 to 33.30
  • RH : Relative humidity in percentage: 15.0 to 100
  • wind : Wind speed in km/h: 0.40 to 9.40
  • rain : Outside rain in mm/m2 : 0.0 to 6.4
  • area : The burned area of the forest (in ha): 0.00 to 1090.84
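Before a neural network can use these columns, the two text columns need a numeric encoding, and the numeric inputs usually benefit from a common scale. The sketch below shows one common way to do that with Pandas. It is an assumption about preprocessing, not a step prescribed by the problem set, and the three rows are invented to fit the column list above rather than taken from forestfires.csv.

```python
import pandas as pd

# Three invented rows that follow the documented columns and ranges.
fires = pd.DataFrame(
    {
        "month": ["mar", "aug", "sep"],
        "day": ["fri", "sun", "mon"],
        "FFMC": [86.2, 92.1, 90.5],
        "DMC": [26.2, 142.4, 110.3],
        "DC": [94.3, 601.4, 692.6],
        "ISI": [5.1, 8.4, 7.0],
        "temp": [8.2, 22.8, 19.5],
        "RH": [51.0, 40.0, 45.0],
        "wind": [6.7, 3.6, 2.2],
        "rain": [0.0, 0.0, 0.0],
        "area": [0.0, 12.5, 3.1],
    }
)

# Separate the prediction target from the inputs.
target = fires["area"]
inputs = fires.drop(columns="area")

# One-hot encode the categorical month and day columns.
features = pd.get_dummies(inputs, columns=["month", "day"], dtype=float)

# Standardize the numeric inputs to zero mean and unit variance.
numeric = ["FFMC", "DMC", "DC", "ISI", "temp", "RH", "wind", "rain"]
means = features[numeric].mean()
stds = features[numeric].std(ddof=0).replace(0, 1.0)  # constant columns stay 0
features[numeric] = (features[numeric] - means) / stds

print(features.columns.tolist())
print(features.shape)
```

With real data, the means and standard deviations should be computed on the training rows only and then reused for validation and test rows.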

[7] Neural Networks - II

Predicting Turbine Energy Yield (TEY) using Ambient Variables as Features.

The dataset contains 36733 instances of 11 sensor measures aggregated over one hour (by means of average or sum) from a gas turbine.

The dataset includes gas turbine parameters (such as Turbine Inlet Temperature and Compressor Discharge Pressure) in addition to the ambient variables.

Data Source: https://dravate.com/assets/courses/machinelearning/gas_turbines.csv

