Data Mining for Business Intelligence (Shmueli) ISDS 474 CSUF Chapter 3 and 4 – Flashcards
44 test answers
question
Give three examples of basic plots.
answer
Line Graphs, Bar Charts, and Scatterplots
question
Give two examples of distribution plots.
answer
Boxplots and Histograms
question
A bar chart helps you determine ____
answer
differences between subgroups
question
A _____ might replace a category with a 1 or 0
answer
dummy variable
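As a rough illustration of the dummy-variable card, here is a minimal Python sketch. The category labels and the helper name `to_dummy` are made up for this example and do not come from the textbook.

```python
# Hypothetical example data: the category labels are illustrative only.
neighborhoods = ["riverside", "inland", "riverside", "inland", "inland"]


def to_dummy(values, positive_label):
    """Return 1 where a value equals positive_label, otherwise 0."""
    return [1 if v == positive_label else 0 for v in values]


is_riverside = to_dummy(neighborhoods, "riverside")
print(is_riverside)  # [1, 0, 1, 0, 0]
```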
question
A _____ displays the relationship between two numerical variables
answer
Scatterplot. For example, as A decreases, B increases
question
Line Graphs, Bar Charts, and Scatterplots are examples of ___ plots
answer
basic
question
___ plots help determine the potential methods and variable transformations
answer
distribution
question
Boxplots and Histograms are examples of ___ plots
answer
distribution
question
___ graphs are best for time series data
answer
line
question
_____ plots are good for prediction tasks, or supervised learning
answer
distribution
question
A histogram shows the distribution of a numerical variable, such as the ____ variable in a prediction task.
answer
Outcome. For instance, the median house value.
question
___ are useful for comparing subgroups
answer
Side-by-side boxplots. For example, comparing the distribution of the outcome variable across two neighborhoods
question
In a boxplot, the top outliers are defined as values above ____
answer
Q3 + 1.5 × (Q3 − Q1), that is, the third quartile plus 1.5 times the interquartile range
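The outlier rule on this card can be sketched in Python as below. The data values are invented for illustration, and the quartiles come from the standard library's `statistics.quantiles` with the "inclusive" method. That method is an assumption: other software may compute quartiles slightly differently and so give a slightly different threshold.

```python
import statistics


def upper_outlier_threshold(values):
    """Return Q3 + 1.5 * (Q3 - Q1) for a list of numbers."""
    # quantiles(n=4) returns the three cut points Q1, median, Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)


# Hypothetical sample with one unusually large value.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
threshold = upper_outlier_threshold(values)
top_outliers = [v for v in values if v > threshold]

print(threshold)     # 14.5
print(top_outliers)  # [50]
```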
question
The wider the box, the greater the ____.
answer
variation
question
____ are graphical displays where color is used to convey information
answer
Heat Maps
question
Heat Maps are used to visualize ___ and ____.
answer
Correlation and Missing Data
question
The correlation coefficient lies between __ and ___.
answer
+1 and -1
question
The closer the correlation is to +1 or -1, the ___ the linear association.
answer
stronger
question
A ____ table for p variables has the SAME number of rows and columns
answer
correlation
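To make the correlation cards concrete, here is a minimal pure-Python sketch that builds a correlation table for p = 3 variables. The column names and values are hypothetical, and the helpers `pearson` and `correlation_table` are written for this example only. The result has p rows and p columns, every entry lies between -1 and +1, and the diagonal is 1.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_table(columns):
    """Return a p x p list of lists of pairwise correlations."""
    names = list(columns)
    return [[pearson(columns[r], columns[c]) for c in names] for r in names]


# Hypothetical data: 3 variables (columns) and 5 records (rows).
data = {
    "rooms": [4, 5, 6, 7, 8],
    "age": [50, 40, 30, 20, 10],
    "price": [10, 13, 15, 18, 20],
}
table = correlation_table(data)

for name, row in zip(data, table):
    print(name, [round(value, 2) for value in row])
```

In this made-up data, rooms and age move in exactly opposite directions, so their correlation is -1.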
question
A ____ table can have a different number of columns (variables) than rows (records)
answer
data
question
How do you build a correlation table that looks like a basic heat map?
answer
In Excel: compute the correlation table with the Data Analysis tool, highlight all of its cells, then apply conditional formatting from the Home tab
question
What are some common methods of preprocessing data for visualization?
answer
Rescaling, aggregation, zooming and panning, and filtering
question
What does rescaling do?
answer
Changing the scale of an axis (for example, to a log scale) can often enhance the plot and illuminate relationships
question
What is filtering?
answer
Removing some of the "noise" from the data to focus attention on certain data
question
What are zooming and panning?
answer
Zooming in on areas of interest and moving across the plot to reveal patterns and outliers, as when zooming in on an area in Google Maps
question
What is aggregation?
answer
Combining records at a coarser level of granularity, for example on a temporal scale (weekly, monthly) or a geographical scale (by zip code)
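A minimal Python sketch of temporal aggregation, assuming hypothetical daily sales records keyed by ISO date strings. The data and the helper name `aggregate_by_month` are illustrative only.

```python
from collections import defaultdict

# Hypothetical daily records: (ISO date string, sales amount).
daily_sales = [
    ("2024-01-05", 100),
    ("2024-01-20", 150),
    ("2024-02-03", 80),
    ("2024-02-17", 120),
    ("2024-02-28", 50),
]


def aggregate_by_month(records):
    """Sum amounts per month, using the 'YYYY-MM' prefix of each date."""
    totals = defaultdict(int)
    for date, amount in records:
        month = date[:7]  # e.g. "2024-01"
        totals[month] += amount
    return dict(totals)


monthly_sales = aggregate_by_month(daily_sales)
print(monthly_sales)  # {'2024-01': 250, '2024-02': 250}
```

The same pattern would apply to geographical aggregation by grouping on a zip code field instead of a month prefix.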
question
What are two ways of deriving new variables?
answer
Binning numerical values and condensing categories
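Both ways of deriving new variables can be sketched in a few lines of Python. The cut points, labels, and category names below are arbitrary choices made for this illustration, not values from the textbook.

```python
def bin_age(age):
    """Bin a numerical age into one of three labeled ranges."""
    if age < 30:
        return "young"
    if age < 60:
        return "middle"
    return "older"


def condense(category, keep, other_label="other"):
    """Keep the listed categories and merge everything else into one."""
    return category if category in keep else other_label


# Hypothetical numerical variable, binned into categories.
ages = [22, 35, 61, 47]
age_bins = [bin_age(a) for a in ages]
print(age_bins)  # ['young', 'middle', 'older', 'middle']

# Hypothetical categorical variable, condensed to fewer categories.
styles = ["ranch", "colonial", "tudor", "ranch", "cabin"]
condensed = [condense(s, {"ranch", "colonial"}) for s in styles]
print(condensed)  # ['ranch', 'colonial', 'other', 'ranch', 'other']
```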
question
______ to a log scale removes crowding and gives a better view of the linear relationship between the two log-scaled variables
answer
Rescaling
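One way to see why log rescaling can expose a linear relationship: in the sketch below, the made-up data follow y = 3x², which is curved and crowded near the origin on the original scale, but on log-log axes the points fall on a straight line with slope 2. The specific function and values are assumptions chosen for illustration.

```python
import math

# Hypothetical data following a power law: y = 3 * x**2.
x = [1, 10, 100, 1000]
y = [3 * v ** 2 for v in x]

# Rescale both variables to a log (base 10) scale.
log_x = [math.log10(v) for v in x]
log_y = [math.log10(v) for v in y]

# Slope between consecutive points on the log-log scale.
slopes = [
    (log_y[i + 1] - log_y[i]) / (log_x[i + 1] - log_x[i])
    for i in range(len(x) - 1)
]
print([round(s, 6) for s in slopes])  # [2.0, 2.0, 2.0]
```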
question
A _______ helps visualize and identify clusters and outliers and detect patterns.
answer
Scatterplot with labeling
question
Scatterplots for ____ can sometimes be ineffective
answer
datasets with a large number of observations
question
What are some ways to keep scatterplots effective with a large number of observations?
answer
Sampling, reducing marker size, breaking the data down into subsets, aggregation, and jittering
question
What is jittering?
answer
Slightly moving each marker by adding a small amount of noise
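A minimal sketch of jittering in Python, assuming uniform noise within a small fixed range. The noise distribution, the `amount` parameter, and the helper name `jitter` are choices made for this example, and the ratings data are hypothetical.

```python
import random


def jitter(values, amount=0.1, seed=None):
    """Add uniform noise in [-amount, +amount] to each value."""
    rng = random.Random(seed)  # a seed makes the result reproducible
    return [v + rng.uniform(-amount, amount) for v in values]


# Hypothetical ratings: many markers would overlap at 1, 2, and 3.
ratings = [1, 1, 1, 2, 2, 3, 3, 3, 3]
jittered = jitter(ratings, amount=0.1, seed=0)

for original, moved in zip(ratings, jittered):
    print(original, round(moved, 3))
```

Each jittered value stays within 0.1 of its original, so the overall pattern is preserved while overlapping markers are spread apart.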
question
___ display actors ("nodes") and the relations between them ("edges")
answer
Network graphs
question
A ____ plot combines multiple scatterplots to show pairwise relationships
answer
Matrix
question
Interactive visualization is often preferred over ___ graphs because all plots are on one screen
answer
Static
question
____ maps are good for hierarchical large-scale data
answer
Tree
question
In ____ plots, the same record is highlighted in each plot
answer
Linked
question
Bar charts, scatterplots, boxplots, histograms, multiple panels, added color, and aggregation methods are examples of visualization for ___
answer
Prediction and classification
question
Line charts, temporal and seasonal aggregations, and zooming and panning are examples of visualization for _____ forecasting
answer
Time series
question
Matrix plots, heat maps, aggregation, zooming and panning, map charts, and parallel coordinate plots are examples of visualization for ___ learning
answer
Unsupervised