What is Unsupervised Learning?
Unsupervised learning is frequently used in exploratory data analysis to uncover underlying patterns in data. Unlike supervised learning, it does not use labelled data, in which each input has a matching output, and instead works from the attributes of the data alone. Because the purpose of the algorithm is to uncover relationships within the data and group data points based on the inputs alone, target outputs play no part in unsupervised learning.
Source: https://miro.medium.com/max/1400/1*iNIixlUFwhJvLQjlilI7eQ.jpeg
Why Unsupervised Learning?
- Unsupervised algorithms find features that can be used for categorization and find all kinds of unknown patterns in data.
- It can run in real time, so all the input data is analyzed in the presence of the learner.
- Collecting unlabelled data is much easier than collecting labelled data.
Unsupervised learning is further divided into clustering and association rule learning:
Association
Association rule learning looks for interesting relationships between the variables in a dataset: it finds how one data item depends on another and maps those dependencies so that they can be put to profitable use.
- Apriori algorithm
- Eclat algorithm
- FP-Growth algorithm
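To make the idea of association mining concrete, here is a minimal sketch in the spirit of the Apriori algorithm, using only the Python standard library. The function name `frequent_itemsets`, the `min_support` parameter and the shopping baskets are all made up for illustration. It counts how often each itemset occurs and only grows itemsets whose smaller subsets are already frequent.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count the transactions that contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-itemsets, then prune any candidate
        # that has an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical shopping baskets, invented for this example.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]

result = frequent_itemsets(baskets, min_support=3)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With these baskets, every single item and every pair is frequent, while the full three-item set appears in only two baskets and is dropped. Real implementations add rule generation (confidence, lift) on top of this counting step.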
Clustering
Clustering is the division of a population or set of data points into different groups so that data points in the same group are more similar to each other than to data points in other groups.
- Hierarchical clustering
- K-means clustering
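As an illustration of clustering, the following is a minimal k-means sketch written with the standard library only. The names `kmeans` and `sq_dist`, the random initialization from the data points, and the toy 2-D points are assumptions of this example, not a reference implementation. It alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Group points into k clusters; returns (centroids, clusters)."""
    rng = random.Random(seed)
    # Assumed initialization: pick k distinct data points as starting centroids.
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments no longer change
        centroids = new_centroids
    return centroids, clusters


# Toy data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, sorted(cluster))
```

No labels are supplied anywhere: the grouping comes purely from distances between the points. On less cleanly separated data, the result can depend on the initial centroids, which is why libraries usually run several initializations.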
Advantages
- The chance of human error is limited.
- It creates distinct spectral classes.
- It is generally simple and quick to run.
Disadvantages
- It does not take into account spatial linkages in the data and does not necessarily represent features on the ground.
- Interpreting the resulting spectral classes can take time.
Applications
- Clustering automatically splits the data into groups based on their similarities.
- Anomaly detection can discover unusual data points in a dataset. It can be used to track down fraudulent transactions.
- Association mining can identify sets of items that often occur together in a dataset.
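The anomaly detection application listed above can be sketched with a simple median-based rule. This is one of many possible approaches, and everything specific here is an assumption of the example: the function name `find_anomalies`, the transaction amounts, and the conventional choice of a modified z-score with a threshold of 3.5.

```python
import statistics


def find_anomalies(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    median = statistics.median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread to compare against
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]


# Hypothetical transaction amounts with one unusually large payment.
amounts = [25.0, 30.5, 27.0, 22.0, 31.0, 28.5, 26.0, 950.0, 29.0, 24.5]
print(find_anomalies(amounts))
```

The rule needs no examples of known fraud. It only asks which points sit far from the bulk of the data, so a flagged value is a candidate for review rather than proof of fraud.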