Exam 5: Text and Web Mining
70 Questions
Compare and contrast text mining and data mining.
(Essay)
Correct Answer:
Text mining is the semi-automated process of extracting patterns (useful information and knowledge) from large amounts of unstructured data sources. Data mining is the process of identifying valid, novel, potentially useful, and understandable patterns in data stored in structured databases, where the data are organized in records structured by categorical, ordinal, or continuous variables. Text mining resembles data mining in that it has the same purpose and uses the same processes; the difference is that the input to text mining is a collection of unstructured data files such as Word documents, PDF files, and so on.
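As a rough illustration of the contrast in this answer, the sketch below turns unstructured text into the kind of structured records that data mining expects. The function names (`tokenize`, `term_document_matrix`) and the two sample documents are hypothetical and chosen only for this example; real text mining tools apply far more preprocessing.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(documents):
    """Return (vocabulary, rows); each row holds the term counts of one document."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The vocabulary is the sorted set of every term seen in any document.
    vocabulary = sorted(set().union(*counts))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


# Hypothetical call-center notes standing in for unstructured input.
docs = [
    "The customer called about billing.",
    "Billing error: customer wants a refund.",
]
vocab, rows = term_document_matrix(docs)
```

Each row of `rows` is a fixed-length numeric record, one column per term in `vocab`, so the same pattern-finding methods used on structured databases can in principle be applied to it.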
Text mining can be used to increase cross-selling and up-selling by analyzing the unstructured data generated by call centers.
(True/False)
Correct Answer:
True
Using ________ as a rich source of knowledge and a strategic weapon, Kodak not only survives but excels in its market segment defined by innovation and constant change.
(Multiple Choice)
Correct Answer:
C
Which of the following refers to developing useful information from the links included in the Web documents?
(Multiple Choice)
At a very high level, the text mining process consists of each of the following tasks except:
(Multiple Choice)
A vast majority of business data are stored in text documents that are ________.
(Multiple Choice)
The term "stop-words" is used in text mining to ________ commonly used words.
(Short Answer)
________ applications focus on "who and how" questions by gathering and reporting direct feedback from site visitors, by benchmarking against other sites and offline channels, and by supporting predictive modeling of future visitor behavior.
(Short Answer)
________ is a branch of the field of linguistics and a part of natural language processing that studies the internal structure of words.
(Multiple Choice)
Which of the following is not one of the three main areas of Web mining?
(Multiple Choice)
Stemming is the process of reducing inflected words to their base or root form.
(True/False)
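To make the idea of reducing inflected words to a root form concrete, here is a deliberately crude suffix-stripping sketch. It is not the Porter stemmer or any other published algorithm; the suffix list, the minimum stem length, and the name `naive_stem` are assumptions made for illustration only.

```python
# Assumed suffix list, checked longest-first; a real stemmer uses many more rules.
SUFFIXES = ("ing", "ed", "es", "s")
MIN_STEM_LENGTH = 3  # assumed guard so very short words are left alone


def naive_stem(word):
    """Strip the first matching suffix, keeping at least MIN_STEM_LENGTH letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            return word[: -len(suffix)]
    return word


stems = [naive_stem(w) for w in ["connect", "connected", "connecting", "connects"]]
```

All four inflected forms map to `connect` here. A rule set this small will over-strip many ordinary words, which is why practical systems rely on established stemmers or lemmatizers.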
A(n) ________ is one or more Web pages that provide a collection of links to authoritative pages.
(Short Answer)
The two main approaches to text classification are ________ and ________.
(Multiple Choice)
One of the main approaches to text classification is ________ in which an expert's knowledge is encoded into the system either declaratively or in the form of procedural classification rules.
(Short Answer)
By applying a learning algorithm to parsed text, researchers from Stanford University's NLP lab have developed methods that can automatically identify the concepts and relationships between those concepts in the text.
(True/False)
________ is the grouping of similar documents without having a predefined set of categories.
(Short Answer)
Why will computers probably not be able to understand natural language the same way and with the same accuracy that humans do?
(Essay)
In ________, the problem is to group an unlabelled collection of objects, such as documents, customer comments, and Web pages, into meaningful groups without any prior knowledge.
(Multiple Choice)
________ is the process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data stored in structured databases, where the data are organized in records structured by categorical, ordinal, or continuous variables.
(Short Answer)
Stop words, such as a, am, the, and was, are words that are filtered out prior to or after processing of natural language data.
(True/False)
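The filtering described in this statement can be sketched in a few lines. The stop-word set below contains only the four words named in the statement; production stop lists are much longer, and the function name `remove_stop_words` is a hypothetical choice for this example.

```python
# Only the four words named in the statement; a real list would be far larger.
STOP_WORDS = {"a", "am", "the", "was"}


def remove_stop_words(text):
    """Lowercase the text, split on whitespace, and drop any stop words."""
    tokens = text.lower().split()
    return [token for token in tokens if token not in STOP_WORDS]


filtered = remove_stop_words("The report was a success")
```

With this input, `filtered` keeps only `report` and `success`. Whitespace splitting is the simplest possible tokenizer and does not handle punctuation; that is a simplification for the sketch.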