Deck 13: The Data Warehouse
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Question
Unlock Deck
Sign up to unlock the cards in this deck!
Unlock Deck
Unlock Deck
1/62
Play
Full screen (f)
Deck 13: The Data Warehouse
1
All of the following are steps in building a data warehouse, except ____.
A) data extraction
B) data manipulation
C) data cleaning
D) data transformation
E) data loading
A) data extraction
B) data manipulation
C) data cleaning
D) data transformation
E) data loading
B
2
All of the following are reasons that having separate files for each DSS application is wasteful, expensive and inefficient, except ____.
A) having separate files for each DSS application interferes with the transactional database environment by compromising its performance
B) different DSS applications often need the same data, causing duplicate files to be created for each application
C) while particular files support particular DSS applications, they tend to be inflexible and do not support closely related applications that require slightly different data
D) individual files tied to specific DSS applications do nothing to encourage other people and groups in the company to use the company's accumulated data to gain a competitive advantage over the competition
E) even if someone in the company is aware of existing DSS application data that they could use to their (and the company's) own advantage they generally can't get access to it because it is "owned" by the application for which it was created
A) having separate files for each DSS application interferes with the transactional database environment by compromising its performance
B) different DSS applications often need the same data, causing duplicate files to be created for each application
C) while particular files support particular DSS applications, they tend to be inflexible and do not support closely related applications that require slightly different data
D) individual files tied to specific DSS applications do nothing to encourage other people and groups in the company to use the company's accumulated data to gain a competitive advantage over the competition
E) even if someone in the company is aware of existing DSS application data that they could use to their (and the company's) own advantage they generally can't get access to it because it is "owned" by the application for which it was created
A
3
The process of adding externally acquired data to a data warehouse is known as ____.
A) data evolution
B) data embellishment
C) data enlarging
D) data enhancement
E) data enrichment
A) data evolution
B) data embellishment
C) data enlarging
D) data enhancement
E) data enrichment
E
4
The process of removing errors from data while building a data warehouse is known as ____.
A) data extraction
B) data manipulation
C) data cleaning
D) data transformation
E) data loading
A) data extraction
B) data manipulation
C) data cleaning
D) data transformation
E) data loading
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
5
Data in a typical data warehouse ____.
A) is frequently updated by changing existing attribute values as the values change in the operational database
B) grows as new data is appended to the end of the existing tables
C) does not require timestamps
D) is always fully detailed, not summarized data
E) is organized according to the company's TPS applications
A) is frequently updated by changing existing attribute values as the values change in the operational database
B) grows as new data is appended to the end of the existing tables
C) does not require timestamps
D) is always fully detailed, not summarized data
E) is organized according to the company's TPS applications
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
6
Data in a typical data warehouse may be denormalized to improve performance because ____.
A) the data does not have necessarily have to be absolutely current
B) data about each of the "subjects" in the data warehouse is typically collected from several of the company's transactional databases
C) data integrity problems are not a concern since the existing data is not updated
D) the data must be of high quality
E) the data may be aggregated
A) the data does not have necessarily have to be absolutely current
B) data about each of the "subjects" in the data warehouse is typically collected from several of the company's transactional databases
C) data integrity problems are not a concern since the existing data is not updated
D) the data must be of high quality
E) the data may be aggregated
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
7
A data mart (DM) ____.
A) must be the sum of all of a company's enterprise data warehouses
B) is designed to support a department or a related group of departments
C) cannot exist without an enterprise data warehouse being present
D) must be fully normalized
E) supports an entire company or a major part of one
A) must be the sum of all of a company's enterprise data warehouses
B) is designed to support a department or a related group of departments
C) cannot exist without an enterprise data warehouse being present
D) must be fully normalized
E) supports an entire company or a major part of one
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
8
A ____ design in a data warehouse data structure refers to the situation in which one dimension table leads to another dimension table.
A) network
B) hierarchical
C) snowflake
D) raindrop
E) solar
A) network
B) hierarchical
C) snowflake
D) raindrop
E) solar
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
9
____ is the process of copying data from the transactional databases in preparation for loading it into the data warehouse.
A) Data extraction
B) Data manipulation
C) Data cleaning
D) Data transformation
E) Data loading
A) Data extraction
B) Data manipulation
C) Data cleaning
D) Data transformation
E) Data loading
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
10
Data warehouses are often referred to as multi_____ databases because each occurrence of the subject is referenced by an occurrence of each of several characteristics of the subject.
A) dimensional
B) faceted
C) partitioned
D) determined
E) normal
A) dimensional
B) faceted
C) partitioned
D) determined
E) normal
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
11
____ is a decision support methodology based on viewing data in multiple dimensions.
A) Data mining
B) Market basket analysis
C) Neural networks
D) Online analytic processing
E) Data transformation
A) Data mining
B) Market basket analysis
C) Neural networks
D) Online analytic processing
E) Data transformation
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
12
Generally, one of the dimensions of any data warehouse is ____.
A) distance
B) time
C) energy
D) information
E) pallets
A) distance
B) time
C) energy
D) information
E) pallets
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
13
All of the following are characteristics of a data warehouse, except the data ____.
A) must be high quality
B) may be aggregated
C) is often denormalized
D) is not necessarily absolutely current
E) is volatile
A) must be high quality
B) may be aggregated
C) is often denormalized
D) is not necessarily absolutely current
E) is volatile
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
14
An enterprise data warehouse (EDW) ____.
A) must be the sum of all of a company's data marts
B) is designed to support a department or a related group of departments
C) cannot exist without data marts being present
D) must be fully normalized
E) supports an entire company or a major part of one
A) must be the sum of all of a company's data marts
B) is designed to support a department or a related group of departments
C) cannot exist without data marts being present
D) must be fully normalized
E) supports an entire company or a major part of one
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
15
There is a(n) _____ relationship between each dimension table entity and the fact table entity in a star schema data warehouse structure.
A) standard
B) structured
C) unstructured
D) many-to-many
E) one-to-many
A) standard
B) structured
C) unstructured
D) many-to-many
E) one-to-many
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
16
Of the following applications or application systems, the one that would be considered a decision support system is ____.
A) a system that links companies in a supply chain
B) a hospital application that makes patient records available to all doctors and other health care professionals who are treating patients
C) a university course and grade record keeping system
D) a system that provides the management of a retail chain with data helpful in choosing sites for new stores
E) a payroll system
A) a system that links companies in a supply chain
B) a hospital application that makes patient records available to all doctors and other health care professionals who are treating patients
C) a university course and grade record keeping system
D) a system that provides the management of a retail chain with data helpful in choosing sites for new stores
E) a payroll system
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
17
All of the following are categories or techniques of data transformation during the building of a data warehouse, except ____.
A) coding scheme conversion
B) unit of measure conversion
C) aggregation
D) questionable data conversion
E) combining values from different attributes in transactional databases into a single attribute in the data warehouse
A) coding scheme conversion
B) unit of measure conversion
C) aggregation
D) questionable data conversion
E) combining values from different attributes in transactional databases into a single attribute in the data warehouse
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
18
All of the following are characteristics of a data warehouse, except the data is ____.
A) subject oriented
B) always current
C) integrated
D) non-volatile
E) time variant
A) subject oriented
B) always current
C) integrated
D) non-volatile
E) time variant
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
19
If the enterprise data warehouse was created first in a company and then later data was extracted from it to create one or more data marts, this is known as ____ development.
A) minor-to-major
B) major-to-minor
C) bottom-up
D) top-down
E) sideways
A) minor-to-major
B) major-to-minor
C) bottom-up
D) top-down
E) sideways
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
20
A way to store data warehouse data in a relational database structure is with a model known as the ____ schema.
A) network
B) multi-pronged
C) pointer
D) star
E) ring
A) network
B) multi-pronged
C) pointer
D) star
E) ring
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
21
All of the following are desirable types of background knowledge for a data warehouse administrator to have, except an understanding of ____.
A) the company's business processes that underlies an understanding of the company's transactional data and databases
B) the company's transactional data and databases for selection and integration into the data warehouse
C) OLAP and data mining techniques so that the data warehouse design will properly support these processes
D) how to handle very large databases, in general
E) how to handle very small databases, in general
A) the company's business processes that underlies an understanding of the company's transactional data and databases
B) the company's transactional data and databases for selection and integration into the data warehouse
C) OLAP and data mining techniques so that the data warehouse design will properly support these processes
D) how to handle very large databases, in general
E) how to handle very small databases, in general
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
22
The data mining application that looks for one occurrence of a type of data being associated with another in the stored data is called ____.
A) drill-down analysis
B) online analytic processing
C) market basket analysis
D) reversion analysis
E) data transformation
A) drill-down analysis
B) online analytic processing
C) market basket analysis
D) reversion analysis
E) data transformation
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
23
Legacy DSS applications were not oriented towards file sharing.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
24
All of the following are data mining techniques, except ____.
A) case-based learning
B) decision trees
C) data enrichment
D) neural networks
E) genetic algorithms
A) case-based learning
B) decision trees
C) data enrichment
D) neural networks
E) genetic algorithms
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
25
Bottom-up development implies that the EDW was created first and then later data was extracted from the EDW to create one or more Data marts, initially and on an ongoing basis.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
26
A data mart is a small-scale data warehouse that is designed to support a small part of an organization, say a department or a related group of departments.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
27
In the star schema design, the dimensions are in the middle and the subjects radiate outwards as the rays of a star.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
28
Data warehouse data cannot be aggregated.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
29
Decision support systems are designed to make decisions for managers.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
30
All of the following are desirable types of background knowledge for a data warehouse administrator to have, except ____.
A) business expertise
B) data expertise
C) technical expertise
D) All of the above.
E) None of the above.
A) business expertise
B) data expertise
C) technical expertise
D) All of the above.
E) None of the above.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
31
Data warehouse data is characterized as volatile.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
32
The star schema approach to multidimensional data design is incompatible with relational database structures.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
33
A star schema has a "fact table," which represents the data warehouse "subject".
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
34
Data warehouse data is characterized as subject oriented.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
35
Data warehouses are often referred to as multidimensional databases because each occurrence of the subject is referenced by an occurrence of each of several dimensions or characteristics of the subject.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
36
Data warehouse data is time variant or historic.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
37
In the OLAP environment ____ refers to going back to the database and retrieving finer levels of data detail than you have already retrieved.
A) data mining
B) drill-down
C) slice
D) pivot
E) rotation
A) data mining
B) drill-down
C) slice
D) pivot
E) rotation
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
38
A data mart is a large-scale data warehouse that incorporates the data of an entire company or of a major division, site, or activity of a company.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
39
Due to the nature of DSS applications, data warehouse data must be up-to-the-moment current.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
40
Data warehouse data is less problematic if denormalized than operational data would be.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
41
A data warehouse administrator should have a combination of business, data, and technical expertise.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
42
Interchanging the data dimensions is known as "taking a slice" in the OLAP environment.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
43
Data cleaning or cleansing or scrubbing refers to fixing errors in data as it is being loaded into the operational database.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
44
Market basket analysis tries, for example, to match product purchase patterns with customer demographics.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
45
One type of data cleaning is to determine the correct value for missing data.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
46
Adding data acquired from outside of the company to the data warehouse is known as data extraction.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
47
Data in the data warehouse may be a perpetually moving target because of increasing and shifting user demands.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
48
A snowflake design is a feature of a star schema in which one dimension table leads to another.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
49
Data warehouses are not associated with the concept of metadata.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
50
Data loading takes place after data transformation and before data cleaning.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
51
Data transformation takes place after the data is loaded into the data warehouse.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
52
One type of data cleaning is to determine the correct value for impossible data.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
53
Usually, one of the dimensions of a star schema is time.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
54
Data mining would be impossible for people to do manually because of the huge amount of data present in the company's data warehouse.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
55
Drill-down refers to going to the data warehouse and retrieving courser or broader levels of data.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
56
Coding schemes used for attributes in different transactional databases must be reconciled as they are being merged into common data warehouse tables.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
57
Neural networks and genetic algorithms are among the techniques for data mining.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
58
Two major data warehouse usage approaches are on-line analytic processing and data mining.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
59
Online analytic processing refers to the searching out of hidden knowledge in the company's data that can give the company a competitive advantage in its marketplace.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
60
Data enrichment is the process of copying the data from the transactional databases in preparation for loading it into the data warehouse.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
61
Consider the following relational database for Grand Travel Airlines.
Grand Travel Airlines has to keep track of its flight and airplane history. A flight is uniquely identified by the combination of a flight number and a date. Every passenger who has flown on Grand Travel has a unique passenger number. For a particular passenger who has taken a particular flight, the company wants to keep track of the fare that she paid for it and the date that she made the reservation for it. Clearly, a passenger may have taken many flights (he must have taken at least one to be in the database) and every flight has had many passengers on it.
A pilot is identified by a unique pilot (or employee) number. A flight on a particular date has exactly one pilot. Each pilot has typically flown many flights but a pilot may be new to the company, is in training, and has not flown any flights, yet. Each airplane has a unique serial number. A flight on a particular date used one airplane. Each airplane has flown on many flights and dates, but a new airplane may not have been used at all, yet.
PILOT Relation
FLIGHT Relation
PASSENGER Relation
RESERVATION Relation
AIRPLANE Relation
a. Design a multidimensional database using a star schema for a data warehouse for the Grand Travel Airlines business environment. The subject will be "reservation" which represents a particular passenger reservation on a particular flight. Be sure to keep track of the fare that the passenger paid for the flight and the date of the reservation.
b. Describe three OLAP uses of this data warehouse.
c. Describe one data mining use of this data warehouse.
Grand Travel Airlines has to keep track of its flight and airplane history. A flight is uniquely identified by the combination of a flight number and a date. Every passenger who has flown on Grand Travel has a unique passenger number. For a particular passenger who has taken a particular flight, the company wants to keep track of the fare that she paid for it and the date that she made the reservation for it. Clearly, a passenger may have taken many flights (he must have taken at least one to be in the database) and every flight has had many passengers on it.
A pilot is identified by a unique pilot (or employee) number. A flight on a particular date has exactly one pilot. Each pilot has typically flown many flights but a pilot may be new to the company, is in training, and has not flown any flights, yet. Each airplane has a unique serial number. A flight on a particular date used one airplane. Each airplane has flown on many flights and dates, but a new airplane may not have been used at all, yet.





a. Design a multidimensional database using a star schema for a data warehouse for the Grand Travel Airlines business environment. The subject will be "reservation" which represents a particular passenger reservation on a particular flight. Be sure to keep track of the fare that the passenger paid for the flight and the date of the reservation.
b. Describe three OLAP uses of this data warehouse.
c. Describe one data mining use of this data warehouse.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck
62
Consider the following relational database for the Central Zoo. Central Zoo wants to maintain information about its animals, the enclosures in which they live, and its zookeepers and the services they perform for the animals. In addition, Central Zoo has a program by which people can be sponsor of animals. Central Zoo wants to track its sponsors, their dependents, and associated data.
Each animal has a unique animal number and each enclosure has a unique enclosure number. An animal can live in only one enclosure. An enclosure can have several animals in it or it can be currently empty. A zookeeper has a unique employee number. Every animal has been cared for by at least one and generally many zookeepers; each zookeeper has cared for at least one and generally many animals. Each time a zookeeper performs a specific, significant service for an animal the service type, date, and time are recorded. A zookeeper may perform a particular service on a particular animal more than once on a given day.
A sponsor, who has a unique sponsor number and a unique social security number, sponsors at least one and possibly several animals. An animal may have several sponsors or none. For each animal that a particular sponsor sponsors, the zoo wants to track the annual sponsorship contribution and renewal date. In addition, Central Zoo wants to keep track of each sponsor's dependents. A sponsor may have several dependents or none. A dependent is associated with exactly one sponsor.
ENCLOSURE Relation
ANIMAL Relation
ZOOKEEPER Relation
CARES FOR Relation
SPONSOR Relation
CONTRIBUTION Relation
DEPENDENT Relation
a. Design a multidimensional database using a star schema for a data warehouse for the Central Zoo business environment. The subject will be "sponsorship" which represents a particular sponsor sponsoring a particular animal. Be sure to keep track of the annual contribution and the renewal date for the sponsorship. Assume that the annual contribution can change with the renewal date, i.e. it can be different in different years.
b. Describe three OLAP uses of this data warehouse.
c. Describe one data mining use of this data warehouse.
Each animal has a unique animal number and each enclosure has a unique enclosure number. An animal can live in only one enclosure. An enclosure can have several animals in it or it can be currently empty. A zookeeper has a unique employee number. Every animal has been cared for by at least one and generally many zookeepers; each zookeeper has cared for at least one and generally many animals. Each time a zookeeper performs a specific, significant service for an animal the service type, date, and time are recorded. A zookeeper may perform a particular service on a particular animal more than once on a given day.
A sponsor, who has a unique sponsor number and a unique social security number, sponsors at least one and possibly several animals. An animal may have several sponsors or none. For each animal that a particular sponsor sponsors, the zoo wants to track the annual sponsorship contribution and renewal date. In addition, Central Zoo wants to keep track of each sponsor's dependents. A sponsor may have several dependents or none. A dependent is associated with exactly one sponsor.







a. Design a multidimensional database using a star schema for a data warehouse for the Central Zoo business environment. The subject will be "sponsorship" which represents a particular sponsor sponsoring a particular animal. Be sure to keep track of the annual contribution and the renewal date for the sponsorship. Assume that the annual contribution can change with the renewal date, i.e. it can be different in different years.
b. Describe three OLAP uses of this data warehouse.
c. Describe one data mining use of this data warehouse.
Unlock Deck
Unlock for access to all 62 flashcards in this deck.
Unlock Deck
k this deck