by Sopan Shewale on
Schedule
Start Date:
End Date:
Students work on a hands-on project applying AI and ML techniques to solve a real-world problem. This can involve data acquisition, preprocessing, model training, and evaluation. The possible projects are as follows:
[1] Foundation of AI & ML - use of the Pandas and NumPy Python libraries
Data Source: https://dravate.com/assets/courses/machinelearning/adult.data.csv
(The original source : https://archive.ics.uci.edu/dataset/2/adult )
The data has the following features:
age: continuous;
workclass: Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked;
fnlwgt: continuous;
education: Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool;
education-num: continuous;
marital-status: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse;
occupation: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces;
relationship: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried;
race: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black;
sex: Female, Male;
capital-gain: continuous.
capital-loss: continuous.
hours-per-week: continuous.
native-country: United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands;
salary: >50K, <=50K.
Use Pandas to answer the following questions:
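As a starting point for these questions, the data first has to be loaded into a DataFrame. The sketch below is one possible way to do that. It assumes the CSV follows the original UCI layout (no header row, a space after each comma, "?" for missing values); if the hosted file has a header row, `has_header=True` would apply instead. The `load_adult` helper and the three sample rows are made up for illustration, and the three queries at the end are examples only.

```python
import io

import pandas as pd

# Column names taken from the feature list above; "salary" is the target.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "salary",
]


def load_adult(source, has_header=False):
    """Read the adult data from a path, URL, or file-like object (hypothetical helper)."""
    return pd.read_csv(
        source,
        names=None if has_header else COLUMNS,
        header=0 if has_header else None,
        skipinitialspace=True,  # assumption: fields are separated by ", "
        na_values="?",          # assumption: "?" marks a missing value
    )


# Made-up rows in the assumed layout, so the sketch runs without a download.
SAMPLE = """\
39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
52, Private, 209642, Masters, 14, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 45, United-States, >50K
28, ?, 338409, Bachelors, 13, Married-civ-spouse, Prof-specialty, Wife, Black, Female, 0, 0, 40, Cuba, <=50K
"""

df = load_adult(io.StringIO(SAMPLE))

# Example queries of the kind the project asks for.
salary_counts = df["salary"].value_counts()         # rows per salary class
mean_age_by_sex = df.groupby("sex")["age"].mean()   # average age per group
share_high = (df["salary"] == ">50K").mean()        # fraction earning >50K
```

For the real file, the same helper could be called with the Data Source URL or a local path in place of the `io.StringIO` buffer.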
[2] Supervised Learning:
Data Source: https://dravate.com/assets/courses/machinelearning/placement-data-class.csv
University campus recruitment is a strategy for sourcing, engaging, and hiring young talent. This dataset covers the placement season of a business school. It records factors that affect whether a candidate is hired, such as work experience and exam percentages, along with the final recruitment status and remuneration details.
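A placement-status classifier could be approached roughly as follows. This is a minimal sketch, not the expected solution: the column names (`workex`, `degree_p`, `mba_p`, `status`) and the label values are assumptions about the CSV, the data is synthetic, and logistic regression is just one reasonable baseline.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in for the placement data; column names are assumptions.
df = pd.DataFrame({
    "workex": rng.choice(["Yes", "No"], size=n),
    "degree_p": rng.uniform(50, 90, size=n),
    "mba_p": rng.uniform(50, 80, size=n),
})

# Made-up rule so that the toy labels have some learnable structure.
score = (
    0.1 * (df["degree_p"] - 70)
    + 1.0 * (df["workex"] == "Yes")
    + rng.normal(0, 0.5, size=n)
)
df["status"] = np.where(score > 0, "Placed", "Not Placed")

# One-hot encode the categorical column and build a 0/1 target.
X = pd.get_dummies(df.drop(columns="status"), drop_first=True, dtype=float)
y = (df["status"] == "Placed").astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)  # fraction of correct test predictions
```

With the real file, `df` would come from `pd.read_csv` on the Data Source above, and the feature list would be adjusted to whatever columns it actually contains.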
[3] Supervised Learning
Data Source: https://dravate.com/assets/courses/machinelearning/iris_flower.csv
Create a model that can classify the different species of the Iris flower.
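One possible baseline is a k-nearest-neighbours classifier. The sketch below uses the copy of the Iris data bundled with scikit-learn as a stand-in, because the column layout of `iris_flower.csv` is not shown here; the choice of model, `n_neighbors=5`, and the 70/30 split are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Bundled Iris data: 150 samples, 4 measurements, 3 species.
iris = load_iris()

# Hold out 30% of the samples, keeping the species balanced in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42, stratify=iris.target
)

clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)

predictions = clf.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Test accuracy: {accuracy:.2f}")
```

Other classifiers (decision trees, logistic regression, support vector machines) could be swapped in with the same `fit` / `predict` calls and compared on the same split.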
[4] Unsupervised Learning
Data Source:
K-means Clustering
In this assignment, students are expected to apply k-means clustering to the two provided datasets. The first, "cluster.csv", contains the data to be clustered. The second, "random.csv", contains randomly distributed reference data whose value ranges match those of the "cluster.csv" data.
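The idea of comparing clustered data against a uniform reference can be sketched as below. This is an illustration under stated assumptions: the two arrays are synthetic stand-ins for "cluster.csv" and "random.csv", `k = 3` is arbitrary, and the deterministic farthest-point seeding in `init_farthest` is a convenience chosen here, not something the assignment prescribes.

```python
import numpy as np


def init_farthest(X, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from the centroids chosen so far (illustrative seeding choice)."""
    centroids = [X[0]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(X[d2.argmax()])
    return np.array(centroids, dtype=float)


def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm; returns (centroids, labels, within-cluster sum of squares)."""
    centroids = init_farthest(X, k)
    for _ in range(n_iter):
        # Squared distance from every point to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each centroid to the mean of its members (keep it if empty).
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    wcss = float(d2[np.arange(len(X)), labels].sum())
    return centroids, labels, wcss


rng = np.random.default_rng(0)

# Stand-in for "cluster.csv": three well-separated Gaussian blobs.
blob_centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
clustered = np.vstack(
    [c + rng.normal(0.0, 0.5, size=(100, 2)) for c in blob_centers]
)

# Stand-in for "random.csv": uniform data over the same value ranges.
reference = rng.uniform(
    clustered.min(axis=0), clustered.max(axis=0), size=clustered.shape
)

_, labels_clustered, wcss_clustered = kmeans(clustered, k=3)
_, labels_reference, wcss_reference = kmeans(reference, k=3)
print(f"WCSS clustered: {wcss_clustered:.1f}, reference: {wcss_reference:.1f}")
```

A much lower within-cluster sum of squares on the real data than on the reference suggests genuine cluster structure; repeating the comparison over several values of `k` is a common way to choose the number of clusters.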
You need to perform the following experiments.
HAC clustering
Create by hand the clustering dendrogram for the following sample of ten points in one dimension: Sample = (-1.8, -1.7, -0.3, 0.1, 0.2, 0.4, 1.6, 1.7, 1.9, 2.0).
Example dendrograms for the three points (1, 2, 4) with single and complete linkage are shown below.
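The merge order and merge heights for the three-point example can also be checked in code. The function below is a small illustrative sketch (the name `hac_1d` and its return format are invented here): at every step it merges the two clusters with the smallest linkage distance, where single linkage uses the closest pair of points and complete linkage the farthest pair.

```python
from itertools import combinations


def hac_1d(points, linkage="single"):
    """Agglomerative clustering of 1-D points.

    Returns the merge history as a list of
    (left_cluster, right_cluster, merge_distance) tuples.
    """
    if linkage not in ("single", "complete"):
        raise ValueError("linkage must be 'single' or 'complete'")
    # Single link: closest pair between clusters; complete link: farthest pair.
    pick = min if linkage == "single" else max

    clusters = [(p,) for p in sorted(points)]
    merges = []
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = pick(abs(x - y) for x in clusters[i] for y in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        left, right = clusters[i], clusters[j]
        # Replace the two merged clusters by their union.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(tuple(sorted(left + right)))
        merges.append((left, right, d))
    return merges


single = hac_1d((1, 2, 4), linkage="single")
complete = hac_1d((1, 2, 4), linkage="complete")

print([d for _, _, d in single])    # merge heights: [1, 2]
print([d for _, _, d in complete])  # merge heights: [1, 3]
```

In both cases 1 and 2 merge first at height 1; the point 4 then joins at height 2 under single linkage (distance to 2) and at height 3 under complete linkage (distance to 1). The same function could be used to check a hand-drawn dendrogram for the ten-point sample.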
[5] Reinforcement Machine Learning
A few examples are:
These examples can involve the use of Python libraries such as OpenCV to detect crowds or identify vehicles.
[6] Neural Networks - I
Predict the burned area of forest fires with neural networks.
Data Source: https://dravate.com/assets/courses/machinelearning/forestfires.csv
About Dataset:
[7] Neural Networks - II
Predict Turbine Energy Yield (TEY) using ambient variables as features.
The dataset contains 36733 instances of 11 sensor measures aggregated over one hour (by means of average or sum) from a gas turbine.
The dataset includes gas turbine parameters (such as turbine inlet temperature and compressor discharge pressure) in addition to the ambient variables.
Data Source: https://dravate.com/assets/courses/machinelearning/gas_turbines.csv
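A neural-network regressor for this kind of task might be set up along the following lines. Everything here is a hedged sketch: the three input columns stand in for ambient temperature, pressure, and humidity, the target formula is invented so the example has something to learn, and the network size and the use of scikit-learn's `MLPRegressor` are illustrative choices. The actual column names in `gas_turbines.csv` should be checked before adapting it.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 1000

# Synthetic stand-ins for three ambient variables (assumed ranges).
X = np.column_stack([
    rng.uniform(0, 35, n),       # ambient temperature
    rng.uniform(990, 1030, n),   # ambient pressure
    rng.uniform(30, 100, n),     # ambient humidity
])
# Made-up target; this is NOT real turbine behaviour.
y = 150 - 1.5 * X[:, 0] + 0.2 * (X[:, 1] - 1010) + rng.normal(0, 2, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scale the inputs and the target: neural networks train far more reliably
# when both are roughly zero-mean and unit-variance.
network = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0),
)
model = TransformedTargetRegressor(regressor=network, transformer=StandardScaler())
model.fit(X_train, y_train)

r2 = model.score(X_test, y_test)  # coefficient of determination on held-out data
print(f"Test R^2: {r2:.3f}")
```

The same structure (scale, fit, score on a held-out split) would carry over to the forest-fire project, with the burned area as the target.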