Statistics and Research Methodology for Managerial Decisions: Cluster Analysis Essay Example

Published: October 20, 2017

Introduction: Statistics and Research Methodology for Managerial Decisions Assignment - Cluster Analysis

Cluster analysis is the process of grouping similar objects, entities, or people. It is commonly used in research and other tasks to categorize similar groups. Cluster analysis resembles factor analysis, especially when factor analysis is applied to individuals (Q-analysis) rather than variables. Cluster analysis starts with a heterogeneous group and identifies homogeneous subgroups based on specific characteristics of the objects.

To measure similarity, data are gathered on each of K features for all N entities that are to be divided into groups. Unlike multiple discriminant analysis, cluster analysis does not have predefined groups. The main goal of cluster analysis is to determine the number of distinct groups that exist and to define their composition. It analyzes a sample of objects and does not predict relationships.


Terminologies in cluster analysis

  1. Agglomeration schedule

This is a table that indicates which clusters or objects are combined at each stage, and it is read from top to bottom. The table begins with the first two instances combined and also states the distance coefficients and the stage at which each cluster first appears. The distance coefficients are important in determining the number of clusters for the data.

  2. Cluster centroid

This refers to the mean values of the variables under consideration across all instances in a particular cluster. Each cluster has a different centroid for each variable.

  3. Cluster membership

This is the cluster to which each instance belongs. It is needed to perform ANOVA on the data and to assess the significance of the groups.

  4. Cluster center

These are the starting points in non-hierarchical clustering. The clusters are formed around these centers, which are therefore referred to as seeds.

  5. Dendrogram

This is used more frequently than the agglomeration schedule when interpreting results because it is easier to read. The dendrogram is a graphical summary of the cluster solution. The best solution is the one where the horizontal distance in the graph is largest, although choosing it can be a subjective process.

  6. Icicle diagram

It shows how instances are combined into clusters at each stage of the analysis.

  7. Similarity/distance coefficient matrix

This matrix contains the pairwise distances (or similarities) between the instances.

Measuring methodology
Let the K features be measured by K variables $x_1, x_2, x_3, \dots, x_K$. Measuring similarity between objects is complex because, in most cases, the data are measured in different units or scales in their original form. To solve this problem, each variable is standardized by subtracting its mean and then dividing by its standard deviation. This converts it into a pure, unit-free number.

The similarity between two objects, i and j, can be represented by the distance $D_{ij}$, which is calculated as

$$D_{ij} = (x_{i1} - x_{j1})^2 + (x_{i2} - x_{j2})^2 + \dots + (x_{iK} - x_{jK})^2$$

The smaller $D_{ij}$ is, the more similar the two objects are. To form groups mathematically, it is important to have a criterion that can evaluate different groupings and determine the optimal number of objects in each group. The methodology uses the distances among objects within and between groups to achieve this.
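As a rough illustration of the two steps above, the sketch below standardizes each variable and then computes the squared distance between pairs of objects. The data values, the function names, and the choice of the population standard deviation (`pstdev`) are assumptions made for this example, not part of the original text.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Return z-scores column by column: (value - mean) / standard deviation.

    Assumes no column is constant (a zero standard deviation would divide by zero).
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]

def squared_distance(a, b):
    """D_ij: sum of squared differences over all K variables."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical data: three objects measured on K = 2 variables
# (income in currency units, age in years). The values are invented.
data = [
    [20000.0, 25.0],
    [30000.0, 35.0],
    [40000.0, 30.0],
]

z = standardize(data)
d_12 = squared_distance(z[0], z[1])
d_13 = squared_distance(z[0], z[2])
d_23 = squared_distance(z[1], z[2])
print(d_12, d_13, d_23)
```

With these invented values, objects 2 and 3 end up closest to each other once both variables are on the same scale.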

The distance (similarity) matrix among three objects is shown below:

Distance or similarity matrix

S.no   O1   O2   O3
1      5    2    8
2      2    5    6
3      8    6    5

Cluster 1   Cluster 2   Distance within the two clusters   Distance between the two clusters   Total distance among the objects
1           2 and 3     6                                  10 (=8+2)                           16
2           1 and 3     8                                  8 (=6+2)                            16
3           1 and 2     2                                  14 (=6+8)                           16
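The three-object example can be checked with a short script. This is only a sketch: it takes the pairwise distances as d(1,2) = 2, d(1,3) = 8, and d(2,3) = 6, and the helper name `evaluate` is hypothetical. For each way of pairing two objects and leaving the third alone, it reports the distance within the pair and the distance between the pair and the remaining object.

```python
from itertools import combinations

# Pairwise distances taken from the three-object example:
# d(1,2) = 2, d(1,3) = 8, d(2,3) = 6.
dist = {frozenset({1, 2}): 2, frozenset({1, 3}): 8, frozenset({2, 3}): 6}
objects = [1, 2, 3]

def evaluate(pair):
    """Group `pair` together and leave the third object on its own.

    Returns (within, between): the distance inside the pair, and the sum of
    distances from the lone object to each member of the pair.
    """
    within = dist[frozenset(pair)]
    between = sum(d for key, d in dist.items() if key != frozenset(pair))
    return within, between

results = {pair: evaluate(pair) for pair in combinations(objects, 2)}
for pair, (within, between) in results.items():
    print(pair, "within:", within, "between:", between, "total:", within + between)

# The grouping with the smallest within-cluster distance
best = min(results, key=lambda pair: results[pair][0])
print("tightest pair:", best)
```

Because the total distance is the same for every grouping, minimizing the within-cluster distance is equivalent to maximizing the between-cluster distance; here both criteria pick objects 1 and 2 as the pair.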

Steps to use cluster analysis

  1. First, choose the sample that requires clustering.
  2. Next, clearly define the variables on which the objects, events, or entities will be measured.
  3. Then, calculate the similarities between different entities using techniques like correlation and Euclidean distances.
  4. Select mutually exclusive clusters based on these calculated similarities.
  5. Finally, compare and validate the formed clusters.
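Step 3 mentions correlation and Euclidean distance as similarity techniques. The sketch below, using invented respondent profiles and hypothetical function names, suggests why the choice matters: correlation rewards a similar pattern across variables, while Euclidean distance rewards closeness in level.

```python
from math import sqrt

def euclidean(a, b):
    """Straight-line distance: small when the profiles are close in level."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson(a, b):
    """Pearson correlation: near 1 when the profiles rise and fall together."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

# Hypothetical ratings of three respondents on four items (invented values)
profiles = {
    "A": [1, 2, 3, 4],
    "B": [6, 7, 8, 9],  # same pattern as A, but at a much higher level
    "C": [2, 1, 4, 3],  # close to A in level, but a different pattern
}

r_ab = pearson(profiles["A"], profiles["B"])
r_ac = pearson(profiles["A"], profiles["C"])
e_ab = euclidean(profiles["A"], profiles["B"])
e_ac = euclidean(profiles["A"], profiles["C"])
print("correlation A-B:", r_ab, "A-C:", r_ac)
print("euclidean   A-B:", e_ab, "A-C:", e_ac)
```

In this made-up case, correlation pairs A with B, whereas Euclidean distance pairs A with C, so the similarity measure should be chosen to match what "similar" means for the study.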

Premises of cluster analysis

Cluster analysis depends on distance measures. It assumes that the variables are measured on the same dimensions or units and have comparable means. If different dimensions or units are used for measurement, or if the variables are not similar in nature, the variables with the larger scales dominate the distances and distort the resulting clusters. To address this issue, standardization can be applied to equalize the impact of variables measured on different scales.
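A small sketch of this effect, using invented data: with raw values, income (in the tens of thousands) swamps age, so the nearest neighbour of the first person is decided almost entirely by income. After standardization, age contributes on an equal footing and the nearest neighbour changes. The names and figures are assumptions for illustration only.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Convert each column to z-scores (population standard deviation assumed)."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(index, rows):
    """Index of the row closest to rows[index], excluding itself."""
    others = [i for i in range(len(rows)) if i != index]
    return min(others, key=lambda i: squared_distance(rows[index], rows[i]))

# Hypothetical people: [income, age]. Values are invented.
people = [
    [30000.0, 25.0],  # A
    [30400.0, 60.0],  # B: close to A in income, far in age
    [30800.0, 26.0],  # C: slightly further in income, almost the same age
    [40000.0, 40.0],  # D
]

nearest_raw = nearest(0, people)               # decided by income alone
nearest_std = nearest(0, standardize(people))  # both variables count
print("nearest to A (raw):", nearest_raw)
print("nearest to A (standardized):", nearest_std)
```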

Cluster procedures and methods

There are two main procedures in cluster analysis:

  • Hierarchical
  • Non-hierarchical

The hierarchical method produces a tree-like structure known as a dendrogram. It consists of the following approaches:

  • Agglomerative

This approach begins with each instance treated as an individual cluster, and at each stage the most similar clusters are merged. It culminates in a single cluster.

  • Divisive

This approach begins with a single cluster containing all instances and then divides the instances based on their differences. It concludes with all the instances separated.

The agglomerative method is further divided into:

  1. Linkage methods

     • Single linkage: minimum distance, or nearest neighbour
     • Complete linkage: maximum distance, or furthest neighbour
     • Average linkage: average distance between all pairs of instances in the two clusters

  2. Centroid methods

Distance between the two cluster centroids.

  3. Ward's method

Squared distance from the cluster means.
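The three linkage rules can be sketched in a few lines. The example below is a minimal, unoptimized illustration on invented one-dimensional data; the function names are hypothetical, and the centroid and Ward's methods are not implemented here. It merges the two closest clusters at each stage and records the result as a simple agglomeration schedule.

```python
from itertools import combinations

def linkage_distance(c1, c2, points, method):
    """Distance between two clusters, each given as a list of point indices."""
    pair_dists = [abs(points[i] - points[j]) for i in c1 for j in c2]
    if method == "single":    # nearest neighbour
        return min(pair_dists)
    if method == "complete":  # furthest neighbour
        return max(pair_dists)
    if method == "average":   # mean over all cross-cluster pairs
        return sum(pair_dists) / len(pair_dists)
    raise ValueError(f"unknown method: {method}")

def agglomerate(points, method="single"):
    """Merge the two closest clusters until one remains.

    Returns the schedule as rows of (stage, cluster_a, cluster_b, distance).
    """
    clusters = [[i] for i in range(len(points))]
    schedule = []
    stage = 1
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: linkage_distance(pair[0], pair[1], points, method),
        )
        d = linkage_distance(a, b, points, method)
        schedule.append((stage, tuple(a), tuple(b), d))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
        stage += 1
    return schedule

# Hypothetical one-dimensional measurements (invented values)
points = [1.0, 2.0, 6.0, 8.0, 20.0]

for method in ("single", "complete", "average"):
    print(method)
    for row in agglomerate(points, method):
        print("  ", row)
```

In each schedule the distance coefficient jumps sharply at the last stage, which is the kind of gap an analyst would look for when deciding how many clusters to keep.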

Non-hierarchical method (k-means clustering)

K-means clustering provides more stable clusters than the hierarchical method because it is an iterative process. It needs a pre-specified number of starting points to obtain an initial position; hence, it is best used as a continuation of the hierarchical method.
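A minimal sketch of the iterative idea, assuming the seeds are supplied by the analyst (for instance, taken from a prior hierarchical solution). The data, the function names, and the stopping rule are illustrative assumptions rather than the procedure of any particular package.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(members):
    """Mean of each variable over the cluster's members."""
    return [sum(col) / len(col) for col in zip(*members)]

def k_means(points, seeds, max_iterations=10):
    """Assign points to the nearest center, then recompute centers, repeatedly.

    Stops when memberships no longer change or max_iterations is reached.
    """
    centers = [list(s) for s in seeds]
    membership = None
    for _ in range(max_iterations):
        new_membership = [
            min(range(len(centers)), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_membership == membership:
            break
        membership = new_membership
        for c in range(len(centers)):
            members = [p for p, m in zip(points, membership) if m == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = centroid(members)
    return membership, centers

# Hypothetical observations on two variables (invented values)
points = [
    [1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
    [8.0, 8.0], [9.0, 8.0], [8.0, 9.0],
]
seeds = [points[0], points[3]]  # two pre-specified starting points

membership, centers = k_means(points, seeds)
print("cluster membership:", membership)
print("final cluster centers:", centers)
```

The returned membership list corresponds to the cluster membership described earlier, and the final centers are the cluster centroids.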

Cluster analysis is carried out with computer applications such as SPSS, and in research it is commonly applied to segmentation. In this example, the variables are benefits of the internet: availability of updated information, easy navigation across websites, prompt online ordering and query handling, easy comparison of prices and services from multiple sellers, access to competitive and educational information about products and services, faster gathering of information from vendors and suppliers, reduction in ordering cost and processing time, and reduction in paper flow.

The SPSS procedure for hierarchical cluster analysis is as follows:

  1. Enter the desired variables and the input data.
  2. Go to the "Analyze" menu, choose "Classify", and then choose "Hierarchical Cluster…".
  3. Select a variable and move it to the variable list box by clicking the right arrow button. Repeat this step for the other variables.
  4. Under "Cluster" and "Display", select the options you need.
  5. Click the "Statistics…" button to open its sub-dialog box. Check "Agglomeration schedule", along with any other desired parameters, and click "Continue" to return to the previous dialog box.
  6. Click the "Plots…" button to open its sub-dialog box. Select the desired plots and click "Continue" to return to the previous dialog box.
  7. Click the "Method…" button to open its sub-dialog box. Choose the desired method and click "Continue" to return to the previous dialog box.
  8. In the "Hierarchical Cluster Analysis" dialog box, click "OK" to view the results in the Viewer.

The procedure for K-means cluster analysis is as follows:

  1. Go to the "Analyze" menu, choose "Classify", and then choose "K-Means Cluster…" to open the K-Means Cluster Analysis dialog box.
  2. Choose the variable you want analyzed and move it to the variable list by clicking the right arrow button. Repeat this step for the other variables.
  3. Specify how many clusters you want.
  4. Click the "Iterate…" button to open its sub-dialog box. Select your preferred number of iterations, then click "Continue" to return to the previous dialog box.
  5. Click the "Options…" button to choose the statistics you require, then click "Continue" to return to the previous dialog box.
  6. Click "OK" to see the final output in the Viewer.
