Machine Learning Primer – Flashcards
question
definition of ml
answer
A machine is said to learn if it is able to take experience and utilize it such that its performance improves on similar experiences.
question
knowledge representation
answer
The formation of logical structures that assist with turning raw sensory information into a meaningful insight. During the process of knowledge representation, the computer summarizes raw inputs in a model, an explicit description of the structured patterns among data. There are many different types of models, and you may already be familiar with some. Examples include equations, diagrams such as trees and graphs, logical if/else rules, and groupings of data known as clusters.
question
training
answer
the process of fitting a particular model to a dataset
question
learning vs. training
answer
First, learning requires an additional step to generalize the knowledge to future data. Second, the term training more accurately describes the actual process undertaken when a model is fitted to the data. Learning implies a sort of inductive, bottom-up reasoning, whereas training better connotes the fact that the machine learning model is imposed by the human teacher onto the machine student, providing the computer with a structure it attempts to model after.
question
generalization
answer
The term generalization describes the process of turning abstracted knowledge into a form that can be utilized for action.
question
heuristics
answer
Machine learning algorithms generally employ shortcuts that more quickly divide the set of concepts. Toward this end, the algorithm will employ heuristics, or educated guesses about where to find the most important concepts.
question
bias
answer
The heuristics employed by machine learning algorithms also sometimes result in erroneous conclusions. If the conclusions are systematically imprecise, the algorithm is said to have a bias.
question
noise
answer
The failure of models to perfectly generalize is due to the problem of noise, or unexplained variations in data.
question
overfitting
answer
Trying to model the noise in data is the basis of a problem called overfitting. Because noise is unexplainable by definition, attempting to explain the noise will result in erroneous conclusions that do not generalize well to new cases. Attempting to generate theories to explain the noise also results in more complex models that are more likely to ignore the true pattern the learner is trying to identify. A model that seems to perform well during training but does poorly during testing is said to be overfitted to the training dataset, as it does not generalize well.
question
steps to apply machine learning to your data
answer
collecting data, exploring and preparing data, training a model on the data, evaluating model performance, improving model performance
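The five steps can be sketched end to end on a tiny example. Everything here is an assumption made for illustration: the records (hours studied, passed or not), the single-cutoff model, and the helper names `fit_cutoff` and `accuracy`.

```python
# Step 1: collect data (hypothetical records: hours studied, passed exam?).
raw = [(1.0, 0), (2.0, 0), (None, 1), (3.0, 0), (6.0, 1),
       (7.0, 1), (8.0, 1), (4.0, 0), (5.0, 1), (9.0, 1)]

# Step 2: explore and prepare (drop records with missing values, then split).
clean = [(x, y) for x, y in raw if x is not None]
train, test = clean[:6], clean[6:]


# Step 3: train a model (a single cutoff halfway between the class means).
def fit_cutoff(data):
    zeros = [x for x, y in data if y == 0]
    ones = [x for x, y in data if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


# Step 4: evaluate performance on data held out from training.
def accuracy(cutoff, data):
    return sum(1 for x, y in data if (x >= cutoff) == (y == 1)) / len(data)


cutoff = fit_cutoff(train)
test_accuracy = accuracy(cutoff, test)
print(cutoff, test_accuracy)

# Step 5: improve performance, for example by collecting more data or
# trying a different model, and then repeating the evaluation in step 4.
```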
question
____ percent of the effort in machine learning is devoted to data.
answer
80 (often cited statistic)
question
example
answer
An example is literally a single exemplary instance of the underlying concept to be learned; it is one set of data describing the atomic unit of interest for the analysis.
question
unit of observation
answer
The phrase unit of observation is used to describe the units that the examples are measured in.
question
feature
answer
A feature is a characteristic or attribute of an example, which might be useful for learning the desired concept.
question
_____ format data is by far the most common form used in machine learning, though as you will see in later chapters, other forms are used occasionally in specialized cases.
answer
Matrix
question
numeric, categorical, ordinal
answer
If a feature represents a characteristic measured in numbers, it is unsurprisingly called numeric. Alternatively, if it measures an attribute that is represented by a set of categories, the feature is called categorical or nominal. A special case of categorical variables is called ordinal, which designates a nominal variable with categories falling in an ordered list.
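As a rough illustration of matrix format and the three feature types, the sketch below stores one example per row and one feature per column. The dataset, the column names, and the ordering in `condition_order` are all invented for this example.

```python
# Hypothetical dataset in matrix format: rows are examples, columns are features.
columns = ["price", "color", "condition"]
rows = [
    [12000.0, "red", "good"],
    [8000.0, "blue", "fair"],
    [16000.0, "red", "excellent"],
]

# price is numeric, color is categorical (nominal), condition is ordinal.
# The ordering of the ordinal categories is an assumption of this sketch.
condition_order = ["fair", "good", "excellent"]


def column(name):
    """Return all values of one feature across the examples."""
    idx = columns.index(name)
    return [row[idx] for row in rows]


# Arithmetic is meaningful for a numeric feature.
mean_price = sum(column("price")) / len(rows)
# A categorical feature is summarized by its distinct categories.
color_levels = sorted(set(column("color")))
# An ordinal feature can be mapped to ranks that respect its order.
condition_ranks = [condition_order.index(c) for c in column("condition")]
print(mean_price, color_levels, condition_ranks)
```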
question
predictive model
answer
A predictive model is used for tasks that involve, as the name implies, the prediction of one value using other values in the dataset. The learning algorithm attempts to discover and model the relationship among the target feature (the feature being predicted) and the other features. Despite the common use of the word "prediction" to imply forecasting, predictive models need not necessarily foresee future events. For instance, a predictive model could be used to predict past events, such as the date of a baby's conception using the mother's hormone levels; or, predictive models could be used in real time to control traffic lights during rush hours.
question
supervised learning
answer
The process of training a predictive model is known as supervised learning.
question
classification
answer
The often-used supervised machine learning task of predicting which category an example belongs to is known as classification.
question
class, levels
answer
The target feature to be predicted is a categorical feature known as the class and is divided into categories called levels. A class can have two or more levels, and the levels need not necessarily be ordinal. Because classification is so widely used in machine learning, there are many types of classification algorithms.
question
numeric prediction
answer
To predict numeric values, a common form of numeric prediction fits linear regression models to the input data. Although regression models are not the only type of numeric models, they are by far the most widely used. Regression methods are widely used for forecasting, as they quantify in exact terms the association between the inputs and the target, including both the magnitude and uncertainty of the relationship.
question
unsupervised learning
answer
Because there is no target to learn, the process of training a descriptive model is called unsupervised learning.
question
descriptive model
answer
A descriptive model is used for tasks that would benefit from the insight gained from summarizing data in new and interesting ways. As opposed to predictive models that predict a target of interest, in a descriptive model no single feature is more important than any other. In fact, because there is no target to learn, the process of training a descriptive model is called unsupervised learning.
question
pattern discovery
answer
The descriptive modeling task called pattern discovery is used to identify frequent associations within data. Pattern discovery is often used for market basket analysis on transactional purchase data.
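One simple way to look for frequent associations is to count how often pairs of items appear in the same basket. The sketch below is only pair counting on made-up transactions, not a full association rule algorithm; the `support` measure here is the share of baskets containing the pair.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactional purchase data: one set of items per basket.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

# Count every unordered pair of items that occurs together in a basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support: the fraction of all baskets that contain the pair.
support = {pair: count / len(baskets) for pair, count in pair_counts.items()}

most_common_pair, most_common_count = pair_counts.most_common(1)[0]
print(most_common_pair, most_common_count, support[most_common_pair])
```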
question
clustering
answer
The descriptive modeling task of dividing a dataset into homogeneous groups is called clustering.
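A minimal sketch of clustering with k-means on one-dimensional data: each value is assigned to its nearest center, then each center moves to the mean of its assigned values, and the two steps repeat. The data, the starting centers, and the fixed iteration count are assumptions chosen for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on numbers; returns the final centers and clusters."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical data with two visibly separate groups.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
print(centers, clusters)
```

The result depends on the starting centers; different starting points can give different groupings on less clearly separated data.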
question
four types of tasks to match a learning task to a ml approach
answer
classification, numeric prediction, pattern detection, clustering
question
For _____, more thought is needed to match a learning problem to an appropriate classifier.
answer
classification
question
supervised learning algorithms
answer
Nearest Neighbor, naive Bayes, Decision Trees, Classification Rule Learners, Linear Regression, Regression Trees, Model Trees, Neural Networks, Support Vector Machines
question
unsupervised learning algorithms
answer
Association Rules, k-means Clustering
question
algo: Nearest Neighbor
answer
supervised, classification
question
algo: naive Bayes
answer
supervised, classification
question
algo: Decision Trees
answer
supervised, classification
question
algo: Classification Rule Learners
answer
supervised, classification
question
algo: Linear Regression
answer
supervised, numeric prediction
question
algo: Regression Trees
answer
supervised, numeric prediction
question
algo: Model Trees
answer
supervised, numeric prediction
question
algo: Neural Networks
answer
supervised, dual use
question
algo: Support Vector Machines
answer
supervised, dual use
question
algo: Association Rules
answer
unsupervised, pattern detection
question
algo: k-means clustering
answer
unsupervised, clustering
question
(supervised) numeric prediction algos:
answer
Linear Regression, Regression Trees, Model Trees
question
(supervised) dual use algos:
answer
Neural Networks, Support Vector Machines
question
(supervised) single use classification algos:
answer
Nearest Neighbor, naive Bayes, Decision Trees, Classification Rule Learners
question
Pattern detection algo:
answer
Association Rules
question
Clustering algo:
answer
k-means clustering
question
In conceptual terms, learning involves the abstraction of data into a structured representation, and the generalization of this structure into action. In more practical terms, a machine learner uses data containing _______ and _______ of the concept to be learned, and summarizes this data in the form of a __________, which is then used for __________ or __________ purposes. These can further be divided into specific tasks including __________, __________, _________, and _________. Among the many options, machine learning algorithms are chosen on the basis of the _________ and the ___________.
answer
examples, features, model, predictive, descriptive, classification, numeric prediction, pattern detection, clustering, input data, learning task