Unsupervised data mining | Business & Finance homework help

Sep 15, 2023

BUA 6315: Business Analytics for Decision Making

Module 6 Assignment Handout: Unsupervised Data Mining

Overview:

In this assignment, you will learn how to apply three unsupervised data mining techniques to two data sets: country-level health and population measures, and social media usage patterns.

Prompt:

For this assignment, you will analyze the three case studies below and address the questions associated with each.

For all cases, first partition each data set into 50% training, 30% validation, and 20% test sets, using 12345 as the default random seed. If a predictor variable's values are in character format, treat that predictor as a categorical variable. Otherwise, treat it as a numerical variable.
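The partitioning and variable-type rules above can be sketched in Python for readers who want to see the logic outside Excel. This is only an illustration, not the Analytic Solver procedure the assignment asks for: the function names `partition` and `is_categorical` are hypothetical, and Python's random number generator will not reproduce the same rows that Analytic Solver selects with seed 12345.

```python
import random

def partition(rows, seed=12345, fractions=(0.5, 0.3, 0.2)):
    """Shuffle rows reproducibly, then split into training, validation, and test sets.

    Assumption: set sizes are rounded to whole rows, and the test set takes
    whatever remains after the training and validation sets are filled.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed gives the same split every run
    n = len(shuffled)
    n_train = round(fractions[0] * n)
    n_valid = round(fractions[1] * n)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test

def is_categorical(values):
    """Character (string) values mark a categorical variable; anything else is numerical."""
    return all(isinstance(v, str) for v in values)

# Example: row indices for a 38-row data set.
train, valid, test = partition(range(38))
print(len(train), len(valid), len(test))  # 19 11 8
```

With 38 rows an exact 50/30/20 split is impossible, so the sketch rounds to 19, 11, and 8 rows. A different tool may round differently.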

Case 1:

For this case, first download the Health Population data set (available in Blackboard).

Next, review the following case study:

The data set Health Population contains country-level health and population measures for 38 countries from the World Bank’s 2000 Health Nutrition and Population Statistics database. For each country, the measures include death rates per 1,000 people (Death Rate, %), health expenditure per capita (Health Expend, in US$), life expectancy at birth (Life Exp, in years), male adult mortality rate per 1,000 male adults (Male Mortality), female adult mortality rate per 1,000 female adults (Female Mortality), annual population growth (Population Growth, in %), female population (Female Pop, in %), male population (Male Pop, in %), total population (Total Pop), size of labor force (Labor Force), births per woman (Fertility Rate), birth rate per 1,000 people (Birth Rate), and gross national income per capita (GNI, in US$).

Then complete the actions below and record your answers in a Microsoft Word document.

Note: For step-by-step instructions on how to use Excel and Analytic Solver to estimate and predict with both clustering methods and how to interpret results, refer to the following videos from the module’s lesson:

● Hierarchical Cluster Analysis – Introduction (5:14)

● Using Analytic Solver to Perform Agglomerative Clustering (4:42)

● Using Analytic Solver to Perform K-Means Clustering (3:15)

Section 1: Hierarchical Clustering:

1. Perform agglomerative clustering to group the 38 countries according to their health measures listed below. Use the Euclidean distance and the average linkage clustering method (Group average linkage) to cluster the data into three clusters. Is data standardization necessary in this case? Explain.

Use the following measures only: Death Rate, Health Expend, Life Exp, Male Mortality, and
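To see how standardization can change the outcome of this kind of analysis, the following minimal Python sketch runs average-linkage agglomerative clustering with Euclidean distance on raw and on z-score standardized values. It is an illustration under stated assumptions, not the Analytic Solver workflow. The six rows are invented numbers and are not taken from the Health Population file, only two measures are used, the function names `standardize` and `average_linkage` are hypothetical, and the z-scores use the population standard deviation, which may differ from what Analytic Solver applies.

```python
import math
from statistics import mean, pstdev

def standardize(rows):
    """Convert each column to z-scores (assumes no column is constant)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]  # population standard deviation (an assumption)
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

def average_linkage(rows, k):
    """Merge the two clusters with the smallest average pairwise Euclidean
    distance, repeating until k clusters remain. Returns lists of row indices."""
    clusters = [[i] for i in range(len(rows))]

    def dist(a, b):
        return mean(math.dist(rows[i], rows[j]) for i in a for j in b)

    while len(clusters) > k:
        p, q = min(
            ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters))),
            key=lambda pq: dist(clusters[pq[0]], clusters[pq[1]]),
        )
        merged = clusters.pop(q)   # q > p, so index p is unaffected by the pop
        clusters[p].extend(merged)
    return sorted(sorted(c) for c in clusters)

# Invented illustration data: [Health Expend in US$, Life Exp in years].
raw = [
    [1000, 50],  # row 0
    [1400, 52],  # row 1
    [1050, 80],  # row 2
    [1450, 82],  # row 3
    [3000, 65],  # row 4
    [3050, 66],  # row 5
]

print(average_linkage(raw, 3))               # [[0, 2], [1, 3], [4, 5]]
print(average_linkage(standardize(raw), 3))  # [[0, 1], [2, 3], [4, 5]]
```

In this toy example the raw distances are dominated by the dollar-valued measure, so rows with very different life expectancy end up in the same cluster. After standardization both measures contribute on a comparable scale and the grouping changes. Whether the same happens with the actual data has to be checked in the tool itself.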
