Monday, June 1, 2020

Machine Learning: Building Clustering Algorithms


Clustering is a widely used Machine Learning (ML) technique. It is an unsupervised ML approach: it learns patterns from input data without labeled training examples, and it can also process high-dimensional data. This makes clustering a common choice for a wide variety of ML problems. Machine Learning and clustering are explained in a blog from Zero Incident Framework (ZIF), a digital service desk AI software.
ZIF is an award-winning tool developed by GAVS Technologies for AIOps management, AI-automated root cause analysis, AI data analytics monitoring, and many more such applications. Some excerpts from the blog follow.
What is Clustering and how does it work?
Clustering is finding groups of objects (data) such that objects in the same group will be similar (related) to one another and different from (unrelated to) objects in other groups.
Clustering works on the concept of similarity/dissimilarity between data points. The higher the similarity between data points, the more likely they are to belong to the same cluster; the higher the dissimilarity, the more likely they are to be placed in different clusters.
The blog also covers how a clustering algorithm can be built, how a dissimilarity matrix is built, and the properties of a distance matrix.
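To make the idea of a dissimilarity matrix concrete, here is a minimal Python sketch. It assumes Euclidean distance as the dissimilarity measure (the excerpt does not say which measure the blog uses), and the function names and sample points are hypothetical, chosen only for illustration. The check at the end covers the standard properties of a distance matrix: non-negative entries, a zero diagonal, symmetry, and the triangle inequality.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def dissimilarity_matrix(points):
    """Pairwise matrix: entry [i][j] is the distance between point i and point j."""
    n = len(points)
    return [[euclidean(points[i], points[j]) for j in range(n)] for i in range(n)]

def is_valid_distance_matrix(d, tol=1e-9):
    """Check zero diagonal, non-negativity, symmetry, and the triangle inequality."""
    n = len(d)
    for i in range(n):
        if abs(d[i][i]) > tol:
            return False  # a point must be at distance 0 from itself
        for j in range(n):
            if d[i][j] < 0 or abs(d[i][j] - d[j][i]) > tol:
                return False  # negative or asymmetric entry
            for k in range(n):
                if d[i][j] > d[i][k] + d[k][j] + tol:
                    return False  # triangle inequality violated
    return True

# Hypothetical sample data: three 2-D points
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
matrix = dissimilarity_matrix(points)
valid = is_valid_distance_matrix(matrix)
```

Small entries in the matrix mark pairs of similar points, which are candidates for the same cluster; large entries mark pairs that are likely to end up in different clusters.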
Considerations for the selection of clustering algorithms:
Before selecting a clustering algorithm, the following considerations need to be evaluated to identify the right algorithm for the given problem. Some of them are:
1. Partition criteria: Single-level vs hierarchical partitioning
2. Separation of clusters: Exclusive (one data point belongs to only one cluster) vs non-exclusive (one data point can belong to more than one cluster)
3. Similarity measures: Distance-based vs connectivity-based
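The difference between exclusive and non-exclusive separation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the blog: the cluster centers are given in advance, exclusive assignment picks the single nearest center, and non-exclusive assignment accepts every center within a hypothetical `radius` threshold. All names and values here are made up for the example.

```python
import math

def distance(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def exclusive_assignment(point, centers):
    """Exclusive: return the index of the single nearest center."""
    return min(range(len(centers)), key=lambda i: distance(point, centers[i]))

def non_exclusive_assignment(point, centers, radius):
    """Non-exclusive: return the indices of all centers within `radius` (assumed rule)."""
    return [i for i, c in enumerate(centers) if distance(point, c) <= radius]

# Hypothetical cluster centers and a point lying between them
centers = [(0.0, 0.0), (4.0, 0.0)]
point = (1.5, 0.0)

hard = exclusive_assignment(point, centers)                  # one cluster index
soft = non_exclusive_assignment(point, centers, radius=3.0)  # possibly several indices
```

With these values the point is 1.5 from the first center and 2.5 from the second, so the exclusive rule places it only in the first cluster while the non-exclusive rule places it in both.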
Clustering is broadly used in two ways: as an ML tool to gain insight into data, and as a pre-processing or intermediate step for other classes of algorithms. Read the original blog to know more.
