Question 1

In R, to sort data in descending order, we place a negative sign in front of the variable passed to the order function.

Accepted Answer

A) True
B) False

Question 2

Four observations were binned into one group. In this group, the values are 40, 45, 66, and 33. What is the average of the group?

Accepted Answer

A)  48 
B)  47 
C)  45 
D)  46 
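
As a minimal sketch of the arithmetic behind this question, the average of a bin is the sum of its values divided by the number of values. Python is used here only for illustration, and the variable names are assumptions rather than anything from the course material.

```python
# Values from the question that were binned into one group.
bin_values = [40, 45, 66, 33]

# The bin average is the arithmetic mean: sum of the values / count of the values.
bin_mean = sum(bin_values) / len(bin_values)

print(bin_mean)  # 184 / 4 = 46.0
```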

Question 3

A foreign key (FK) is the only unique identifier in a table structure.

Accepted Answer

A) True
B) False

Question 4

Megan took a phone survey in which each question had an answer range from unsatisfied to completely satisfied to describe her purchase experience. Because the categories are in equal increments, each category can be recoded as a number, transforming it into what is called a category score.

Accepted Answer

A) True
B) False

Question 5

Subsetting is a technique used to convert numerical values into categorical variables.

Accepted Answer

A) True
B) False

Question 6

When too many variables are categorized in an analysis, several potential issues may occur. Which of the following is not one of the issues that may occur?

Accepted Answer

A)  model performance suffers. 
B)  rarely occurring categories may not be captured accurately. 
C)  difficulty in differentiating among observations. 
D)  an increase in the number of categories as the data set becomes larger. 

Question 7

Using the simple mean imputation strategy, what value would be placed in the missing observation in x1?

Accepted Answer

A)  18 
B)  82 
C)  80 
D)  66 
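
The table for this question is not reproduced here, so the following is only a sketch of how simple mean imputation works in general: each missing entry is replaced by the mean of the observed values in the same variable. The function name `impute_mean` and the sample column are hypothetical and are not the x1 values from the question.

```python
def impute_mean(column):
    """Replace None entries with the mean of the observed (non-missing) values."""
    observed = [v for v in column if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in column]


# Hypothetical column for illustration only (NOT the question's table).
x = [10, 20, None, 30]
imputed = impute_mean(x)
print(imputed)  # the missing entry is replaced by (10 + 20 + 30) / 3
```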

Question 8

In a data set with 20 variables, if 8% of the values, randomly spread across observations, are missing (blank), what is the probable percent of complete and usable observations?

Accepted Answer

A)  92% 
B)  8% 
C)  18.87% 
D)  15.29% 
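
One way to reason about this question, assuming each value is missing independently with probability 0.08, is that an observation is complete only if all 20 of its values are present. A minimal sketch of that calculation (variable names are illustrative):

```python
n_variables = 20   # variables per observation, from the question
p_missing = 0.08   # share of values that are missing, from the question

# Assumption: values go missing independently of one another, so the chance
# that all 20 values in one observation are present is (1 - 0.08) ** 20.
p_complete = (1 - p_missing) ** n_variables

print(f"{p_complete:.2%}")  # 18.87%
```

The result shows how quickly complete cases shrink: a small share of missing values per variable compounds across many variables.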

Question 9

Using R, what function is used to evaluate the categories in the variable to identify the dummy variables?

Accepted Answer

A)  referral 
B)  if 
C)  ifelse 
D)  view 

Question 10

Ann is analyzing a data set that contains two variables, Job Title and 401K. 401K contains the names of the three companies that carry the retirement accounts. Having an account is mandatory, so no observation is blank. If 401K were transformed into dummy variables, how many should be created?

Accepted Answer

A)  2 
B)  3 
C)  4 
D)  1 
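
A common convention is to represent a categorical variable with k categories using k - 1 dummy variables, with the omitted category serving as the reference. The sketch below illustrates that convention in plain Python; the helper `dummy_encode` and the company names are hypothetical and do not come from the question.

```python
def dummy_encode(values):
    """Encode a categorical column as k - 1 dummy columns.

    The first category in sorted order is treated as the reference
    category and gets no column of its own.
    """
    categories = sorted(set(values))
    _reference, *encoded = categories
    return {c: [1 if v == c else 0 for v in values] for c in encoded}


# Hypothetical 401K providers: three categories, no blanks.
providers = ["Alpha", "Beta", "Gamma", "Beta"]
dummies = dummy_encode(providers)

print(len(dummies))  # 2 dummy columns for 3 categories
print(dummies)       # {'Beta': [0, 1, 0, 1], 'Gamma': [0, 0, 1, 0]}
```

An observation with 0 in every dummy column belongs to the reference category, which is why a separate column for it would be redundant.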

Question 11

Kara is reviewing categories in which a series of numbers represents the type of loan. She would prefer that the actual name of the loan be retained when running her analysis. Using Analytic Solver, what function will allow Kara to retain the category names instead of recording them as numbers?

Accepted Answer

A)  log function 
B)  view function 
C)  IF function 
D)  head function 

Question 12

A dummy variable takes on a value of 1 or 0 to describe two categories of a variable.

Accepted Answer

A) True
B) False

Question 13

In the following table, there are four observations with three variables. Which variable is the best fit to be transformed into dummy variables?

Accepted Answer

A)  age 
B)  marital status 
C)  income 
D)  none are a good fit for a dummy variable. 

Question 14

Marcus wants to include the month of the year in the analysis as categories. How many dummy variables will be needed?

Accepted Answer

A)  12 
B)  11 
C)  6 
D)  1 

Question 15

Which term represents data items, events, or things stored in a database file?

Accepted Answer

A)  instance 
B)  entity 
C)  settings 
D)  quantitative 

Question 16

A non-relational database structure that can support the storage of a wide range of data, including structured, semi-structured, and unstructured data, is called ___________.

Accepted Answer

A)  SQL 
B)  Free Range 
C)  NoSQL 
D)  Recreational 

Question 17

The primary purpose of a(n) _____________ is to support decision-making and provide a composite view of the organization.

Accepted Answer

A)  data warehouse 
B)  data mart 
C)  entity 
D)  attribute 

Question 18

Four observations were binned into one group. In this group, the values are 40, 45, 38, and 33. What is the average of the group?

Accepted Answer

A)  41 
B)  40 
C)  38 
D)  39 

Question 19

SELECT, FROM, and WHERE are keywords used in statements written in __________.

Accepted Answer

A)  DBMS 
B)  XML 
C)  SQL 
D)  JAVA 
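
To show how these three keywords fit together in a query, here is a small sketch that runs one against an in-memory SQLite database through Python's standard `sqlite3` module. The `customers` table and its rows are hypothetical and exist only for this illustration.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO customers (name, state) VALUES (?, ?)",
    [("Ann", "CA"), ("Kara", "NY"), ("Marcus", "CA")],  # hypothetical rows
)

# SELECT picks the columns, FROM names the table, WHERE filters the rows.
rows = conn.execute(
    "SELECT name FROM customers WHERE state = ? ORDER BY name", ("CA",)
).fetchall()

print(rows)  # [('Ann',), ('Marcus',)]
conn.close()
```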

Question 20

Simple mean imputation is the best route for replacing large quantities of missing values within a data set without distorting the relationships among variables.

Accepted Answer

A) True
B) False


Exam 2: Data Management and Wrangling
