Introduction to Machine Learning

Topics covered:

1. What is ML? (ML definition)

2. Some Use-Cases of ML

3. Types of ML algorithms

4. Steps Involved in ML

1. What is ML? (ML definition)

Theoretical Definition: Machine Learning is a field of computer science that uses statistical techniques to give computer systems the ability to "learn" from data, without being explicitly programmed.

Put simply, Machine Learning is a way to make computers perform various tasks efficiently without step-by-step human instructions.

This means we give a machine the ability not only to carry out mathematical computations but also to learn from the given data, find the patterns in it, and capture the logic behind them, so that it can produce good results not only for the given data but also for new, unseen data.

Key terms:

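The data on which an ML model is trained, that is, the data from which it learns the logic or computational pattern, is called training data.

The data for which we predict the output using the trained ML model is called test data. It is kept separate from the training data so that we can check how well the model handles data it has not seen.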
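To make the two terms concrete, here is a minimal sketch of splitting a dataset into training and test data. The function name train_test_split, the 25% test fraction, and the toy (hours studied, exam score) data are assumptions chosen for illustration, not something prescribed by any particular library.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical data: (hours studied, exam score) pairs.
data = [(hours, 10 * hours + 5) for hours in range(1, 21)]
train, test = train_test_split(data)
print(len(train), "training rows,", len(test), "test rows")
```

The model only ever sees the training rows while learning; the test rows are held back to measure how well it generalizes.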

2. Some Use-Cases of ML:

Retail: Inventory Management and Customer Segmentation
Social Media: Sentiment Analysis
E-commerce: Recommender Systems
Logistics: Demand Forecasting and Route Optimization
Banking: Fraud Detection
And many other fields.

3. Types of Machine Learning Models:

1. Supervised Learning:

Supervised learning is where you have input variables (X = {x1, x2, x3, ..., xn}) and an output variable (y), and you use an algorithm to learn the mapping function from the input to the output, which can be written as f(X) = y.

Supervised learning can be divided into two parts based on the type of the output (y):

If the output y is numerical, we use Regression algorithms.

If the output y is categorical, we use Classification algorithms.

Types of data:

Numerical: Here the values are numbers with a natural order, and arithmetic on them is meaningful.

1. Continuous data (e.g., the price of a phone)

2. Discrete data (e.g., the number of apps on a phone)

Categorical:

1. Nominal: Here there is no hierarchy or ordering between the values (e.g., male and female).

2. Ordinal: Here we are concerned with the ordering of the values (e.g., the battery performance of a phone: best > good > bad).
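The difference between nominal and ordinal data matters when the values are turned into numbers for a model. The sketch below shows one common convention: one-hot vectors for nominal values (so that no order is implied) and integer ranks for ordinal values. The helper names and the phone color example are hypothetical.

```python
def one_hot(value, categories):
    """Encode a nominal value as a 0/1 vector with no implied order."""
    return [1 if value == category else 0 for category in categories]

def ordinal_code(value, ordered_levels):
    """Encode an ordinal value as its rank in a worst-to-best list."""
    return ordered_levels.index(value)

phone_colors = ["black", "blue", "red"]    # nominal: no ordering
battery_levels = ["bad", "good", "best"]   # ordinal: bad < good < best

print(one_hot("blue", phone_colors))
print(ordinal_code("best", battery_levels))
```

Using integer ranks for a nominal feature would wrongly suggest that, say, red is "greater than" black, which is why the two kinds of data are usually encoded differently.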

1.1. Regression:

This is a type of problem where we need to predict a continuous response value (e.g., any number that can range from -infinity to +infinity).

Some examples are:

1. What is the price of a house in a specific city?

2. What is the value of a stock?

3. How many total runs will be scored in a cricket match?

And so on.

Regression Types

Linear Regression
Logistic Regression (despite its name, this is normally used for classification, not for predicting continuous values)
Polynomial Regression
Stepwise Regression
Ridge Regression
Lasso Regression
ElasticNet Regression
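As an illustration of the simplest entry in this list, here is a minimal sketch of linear regression with one input, fitted by ordinary least squares. The house sizes and prices are made-up numbers that lie exactly on a line, and fit_line is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up training data: house size vs. price, generated as price = 2 * size + 50.
sizes = [50, 70, 90, 110]
prices = [150, 190, 230, 270]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 100 + intercept   # prediction for an unseen size
print(slope, intercept, predicted_price)
```

Real data never falls exactly on a line, so in practice the fitted line only approximates the training points.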

1.2. Classification:

This is a type of problem where we predict a categorical response value, that is, where the data can be separated into specific "classes" (e.g., predicting whether a product has been purchased, which is classified as yes or no).

Some examples are:

1. Is this mail spam or not?

2. Will it rain today or not?

3. Is this picture a cat or not?

And so on.

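There are four main types of classification tasks that you may encounter:

Binary Classification:

Binary classification refers to classification tasks that have exactly two class labels (for example, spam or not spam).

Popular algorithms that can be used for binary classification include: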

Logistic Regression
k-Nearest Neighbors
Decision Trees
Support Vector Machine
Naive Bayes
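To show what one of these algorithms looks like, here is a minimal sketch of k-Nearest Neighbors applied to a binary problem: a new point gets the label that is most common among its k closest training points. The two features (number of links, number of exclamation marks) and all the data values are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical features per mail: (number of links, number of exclamation marks).
mails = [(0, 0), (1, 0), (0, 1), (5, 6), (6, 5), (7, 7)]
labels = ["not spam", "not spam", "not spam", "spam", "spam", "spam"]

print(knn_predict(mails, labels, (6, 6)))
print(knn_predict(mails, labels, (1, 1)))
```

An odd k is a common choice for two classes because it avoids tied votes.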

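Multi-Class Classification:

Multi-class classification refers to classification tasks that have more than two class labels, where exactly one label is predicted for each example.

Popular algorithms that can be used for multi-class classification include:

k-Nearest Neighbors
Decision Trees
Naive Bayes
Random Forest
Gradient Boosting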

Algorithms that are designed for binary classification can be adapted for multi-class problems by fitting several binary classification models. There are two common strategies:

One-vs-Rest: Fit one binary classification model for each class vs. all other classes.

One-vs-One: Fit one binary classification model for each pair of classes.

Binary classification algorithms that can use these strategies for multi-class classification include:

Logistic Regression
Support Vector Machine
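The one-vs-rest strategy can be sketched in a few lines. To keep the example short and dependency-free, the binary model used here is a simple perceptron rather than logistic regression or an SVM; that substitution, the function names, and the three made-up clusters of points are all assumptions for illustration.

```python
def train_perceptron(points, targets, max_epochs=1000):
    """Train a binary linear model; targets are +1 or -1. Returns (weights, bias)."""
    weights = [0.0] * len(points[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, t in zip(points, targets):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if t * score <= 0:                       # misclassified: nudge the line
                weights = [w + t * xi for w, xi in zip(weights, x)]
                bias += t
                mistakes += 1
        if mistakes == 0:                            # every training point is correct
            break
    return weights, bias

def fit_one_vs_rest(points, labels):
    """Fit one binary model per class: that class (+1) vs. all others (-1)."""
    models = {}
    for cls in sorted(set(labels)):
        targets = [1 if label == cls else -1 for label in labels]
        models[cls] = train_perceptron(points, targets)
    return models

def predict_one_vs_rest(models, x):
    """Pick the class whose binary model gives the highest score."""
    def score(model):
        weights, bias = model
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return max(models, key=lambda cls: score(models[cls]))

# Three made-up, well-separated classes in two dimensions.
points = [(0, 0), (1, 0), (0, 1), (5, 0), (6, 0), (5, 1), (0, 5), (0, 6), (1, 5)]
labels = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]

models = fit_one_vs_rest(points, labels)
print(predict_one_vs_rest(models, (5.3, 0.3)))
```

One-vs-one would instead train a model for each pair (A vs. B, A vs. C, B vs. C) and let the pairwise winners vote.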

Multi-Label Classification:

Unlike binary classification and multi-class classification, where a single class label is predicted for each example, Multi-Label Classification refers to those classification tasks that have two or more class labels, where one or more class labels may be predicted for each example.

Classification algorithms used for binary or multi-class classification cannot be used directly for multi-label classification. Specialized versions of standard classification algorithms can be used, so-called multi-label versions of the algorithms, including:

Multi-label Decision Trees
Multi-label Random Forests
Multi-label Gradient Boosting
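What makes a task multi-label is the shape of the target: each example carries a set of labels instead of a single one. A common way to represent this is one 0/1 column per label, as in the sketch below. The photo tags and the helper name to_indicator_rows are hypothetical.

```python
def to_indicator_rows(label_sets, all_labels):
    """Turn each example's set of labels into a 0/1 row, one column per label."""
    return [
        [1 if label in labels else 0 for label in all_labels]
        for labels in label_sets
    ]

all_labels = ["cat", "dog", "person"]
# Hypothetical tags for three photos; a photo can have several tags or none.
photo_tags = [{"cat"}, {"dog", "person"}, set()]

targets = to_indicator_rows(photo_tags, all_labels)
for row in targets:
    print(row)
```

A multi-label model then predicts a whole row at once rather than a single class.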

Imbalanced Classification:

Imbalanced classification refers to classification tasks where the examples are distributed unequally across the classes, for example when fraudulent transactions are far rarer than legitimate ones.

These problems are typically modeled as binary classification tasks, although they may require specialized techniques, such as cost-sensitive versions of standard algorithms:

Cost-sensitive Logistic Regression
Cost-sensitive Decision Trees
Cost-sensitive Support Vector Machines
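Cost-sensitive methods need a cost, or weight, per class. One widely used heuristic, sketched below, makes each class weight inversely proportional to how often the class occurs, so that mistakes on the rare class count for more. The formula n_samples / (n_classes * class_count), the function name, and the 95/5 split are assumptions chosen for illustration.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Made-up, heavily imbalanced labels: 95 legitimate transactions, 5 fraudulent.
labels = ["legit"] * 95 + ["fraud"] * 5

weights = balanced_class_weights(labels)
print(weights)
```

With these weights, an error on a fraud example costs the model far more than an error on a legitimate one, which counteracts the temptation to always predict the majority class.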

2. Unsupervised Learning:

Here the training data does not include targets (outputs). We do not tell the system what the right answer is; it has to discover the structure in the data by itself.

Unsupervised learning can be divided into two parts:

2.1. Clustering

2.2. Association

2.1. Clustering:

Clustering is a type of problem where we group similar things together.

Some examples are:

1. Given a set of news articles, cluster them into different types of news.

2. Given a set of tweets, cluster them based on their content.

3. Given a set of images, cluster them by the objects they show.

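Clustering Types

Hierarchical clustering
K-means clustering

The following unsupervised techniques are often listed alongside clustering, but they perform dimensionality reduction rather than grouping:

Principal Component Analysis
Singular Value Decomposition
Independent Component Analysis

Note that K-NN (k-nearest neighbors) is a supervised algorithm, not a clustering method; it is easily confused with K-means because of the similar name.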
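K-means, listed above, can be sketched as a loop that alternates two steps: assign every point to its nearest centroid, then move each centroid to the mean of its points. In this minimal version the starting centroids are passed in by hand to keep the result repeatable; that choice, the function name, and the toy points are assumptions for illustration.

```python
import math

def kmeans(points, centroids, max_iters=100):
    """Lloyd's algorithm: returns (final centroids, clusters of points)."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iters):
        # Step 1: assign each point to the nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 2: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]        # keep an empty cluster's centroid
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:          # nothing moved: converged
            break
        centroids = new_centroids
    return centroids, clusters

# Two made-up groups of points, with one starting centroid picked from each.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

Note that no labels are given anywhere: the groups emerge purely from the distances between points.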

2.2. Association:

Association rule mining is used in unsupervised scenarios to discover interesting patterns. For example, you could mine the transaction data of a grocery store for frequent patterns and association rules, such as which items tend to be bought together.
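Two standard measures used in association rule mining are support (how often an itemset appears) and confidence (how often the rule holds when its left-hand side appears). The sketch below computes both for a made-up set of grocery baskets; the function names and the data are assumptions for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent in basket | antecedent in basket)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Made-up grocery baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

print("support(bread, milk):", support(baskets, {"bread", "milk"}))
print("confidence(bread -> milk):", confidence(baskets, {"bread"}, {"milk"}))
```

Here bread and milk appear together in 3 of the 5 baskets, and 3 of the 4 baskets containing bread also contain milk.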

3. Reinforcement Learning:

Reinforcement learning aims at using observations gathered from interaction with the environment to take actions that maximize the reward or minimize the risk. The reinforcement learning algorithm (called the agent) continuously learns from the environment in an iterative fashion.

Some examples are:

1. Self-driving cars

2. Game playing (e.g., AlphaGo)
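The agent, environment, and reward loop can be sketched with tabular Q-learning, one well-known reinforcement learning algorithm. The environment here is a made-up corridor of five cells where the agent starts at cell 0 and is rewarded for reaching cell 4; the corridor, the hyperparameter values, and all names are assumptions for illustration.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = [-1, +1]    # action index 0 = move left, 1 = move right

def step(state, action_index):
    """Environment: move within the corridor; reward 1.0 on reaching the goal."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table q[state][action] of expected discounted rewards."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:              # explore: random action
                action = rng.randrange(2)
            else:                                   # exploit: best known action
                best = max(q[state])
                action = rng.choice([a for a in range(2) if q[state][a] == best])
            next_state, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train()
greedy_policy = ["left" if row[0] > row[1] else "right" for row in q_table[:-1]]
print(greedy_policy)
```

Nobody tells the agent that moving right is correct; it discovers this only from the rewards it collects while interacting with the environment.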

If you are new to this area, there are 10 top algorithms that you should know.

These are as follows,

You can learn each of them from any good book, and to get started with ML you can take Andrew Ng's "Machine Learning" course on Coursera.

4. Steps involved in Machine learning are as follows:

Each step in the Machine Learning process has its own specific role. You can study them by reading books or online articles.

Thank you...


IntoWebDataScience
