Deck 5: Text, Web, and Social Analytics
1
In sentiment analysis, it is hard to classify some subjects such as news as good or bad, but easier to classify others, e.g., movie reviews, in the same way.
True
2
Categorization and clustering of documents during text mining differ only in the preselection of categories.
True
3
Since little can be done about visitor Web site abandonment rates, organizations have to focus their efforts on increasing the number of new visitors.
False
4
In text mining, if an association between two concepts has 7% support, it means that 7% of the documents had both concepts represented in the same document.
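As an illustration of the support measure this card describes, the following minimal sketch counts the share of documents that contain both concepts. The toy documents, the concept names, and the `support` helper are assumptions invented for the example; they do not come from the deck.

```python
def support(documents, concept_a, concept_b):
    """Fraction of documents that mention both concepts (hypothetical helper)."""
    if not documents:
        return 0.0
    both = sum(
        1 for doc in documents
        if concept_a in doc and concept_b in doc
    )
    return both / len(documents)

# Toy corpus: each document is represented as the set of concepts found in it.
docs = [
    {"battery", "overheating"},
    {"battery", "price"},
    {"screen", "price"},
    {"battery", "overheating", "price"},
]

share = support(docs, "battery", "overheating")
print(f"support = {share:.0%}")
```

Here two of the four toy documents contain both concepts, so the support is 50%.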
5
Descriptive analytics for social media feature such items as your followers as well as the content in online conversations that help you to identify themes and sentiments.
6
Consistent high quality, higher publishing frequency, and longer time lag are all attributes of industrial publishing when compared to Web publishing.
7
Regional accents present challenges for natural language processing.
8
Web site visitors who critique and create content are more engaged than those who join networks and spectate.
9
Search engine optimization (SEO) techniques play a minor role in a Web site's search ranking because only well-written content matters.
10
In sentiment analysis, sentiment suggests a transient, temporary opinion reflective of one's feelings.
11
Companies understand that when their product goes "viral," the content of the online conversations about their product does not matter, only the volume of conversations.
12
Articles and auxiliary verbs are assigned little value in text mining and are usually filtered out.
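The filtering this card refers to is commonly done with a stop-word list. Below is a minimal sketch, assuming a tiny hand-made stop list and plain whitespace tokenization; the `STOP_WORDS` set and the `remove_stop_words` helper are illustrative names, and real stop lists are far longer.

```python
# Illustrative stop list only: a few articles and auxiliary verbs.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "were", "be", "has", "have", "will"}

def remove_stop_words(text):
    """Lowercase the text, split on whitespace, and drop stop words."""
    tokens = text.lower().split()
    return [t for t in tokens if t not in STOP_WORDS]

filtered = remove_stop_words("The battery is draining and the screen has cracked")
print(filtered)
```

The words that remain ("battery", "draining", "and", "screen", "cracked") carry most of the content; a fuller stop list would typically also remove conjunctions such as "and".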
13
Clickstream analysis does not need users to enter their perceptions of the Web site or other feedback directly to be useful in determining their preferences.
14
Generally, making a search engine more efficient makes it less effective.
15
Decentralization, the need for specialized skills, and immediacy of output are all attributes of Web publishing when compared to industrial publishing.
16
Current use of sentiment analysis in voice of the customer applications allows companies to change their products or services in real time in response to customer sentiment.
17
In the Hong Kong government case study, reporting time was the main benefit of using SAS Business Analytics to generate reports.
18
In the patent analysis case study, text mining of thousands of patents held by the firm and its competitors helped improve competitive intelligence, but was of little use in identifying complementary products.
19
In the financial services firm case study, text analysis for associate-customer interactions was completely automated and could detect whether they met the company's standards.
20
Text analytics is the subset of text mining that handles information retrieval and extraction, plus data mining.
21
What are the two main types of Web analytics?
A) old-school and new-school Web analytics
B) Bing and Google Web analytics
C) off-site and on-site Web analytics
D) data-based and subjective Web analytics
22
What types of documents are BEST suited to semantic labeling and aggregation to determine sentiment orientation?
A) medium- to large-sized documents
B) small- to medium-sized documents
C) large-sized documents
D) collections of documents
23
Web site usability may be rated poor if
A) the average number of page views on your Web site is large.
B) the time spent on your Web site is long.
C) Web site visitors download few of your offered PDFs and videos.
D) users fail to click on all pages equally.
24
What does Web content mining involve?
A) analyzing the universal resource locator in Web pages
B) analyzing the unstructured content of Web pages
C) analyzing the pattern of visits to a Web site
D) analyzing the PageRank and other metadata of a Web page
25
What is one major way in which Web-based social media differs from traditional publishing media?
A) Most Web-based media are operated by the government and large firms.
B) They use different languages of publication.
C) They have different costs to own and operate.
D) Web-based media have a narrower range of quality.
26
In text mining, tokenizing is the process of
A) categorizing a block of text in a sentence.
B) reducing multiple words to their base or root.
C) transforming the term-by-document matrix to a manageable size.
D) creating new branches or stems of recorded paragraphs.
27
In the research literature case study, the researchers analyzing academic papers extracted information from which source?
A) the paper abstract
B) the paper keywords
C) the main body of the paper
D) the paper references
28
What does advanced analytics for social media do?
A) It helps identify your followers.
B) It identifies links between groups.
C) It examines the content of online conversations.
D) It identifies the biggest sources of influence online.
29
What do voice of the market (VOM) applications of sentiment analysis do?
A) They examine customer sentiment at the aggregate level.
B) They examine employee sentiment in the organization.
C) They examine the stock market for trends.
D) They examine the "market of ideas" in politics.
30
In the opening vignette, the architectural system that supported Watson used all the following elements EXCEPT
A) massive parallelism to enable simultaneous consideration of multiple hypotheses.
B) an underlying confidence subsystem that ranks and integrates answers.
C) a core engine that could operate seamlessly in another domain without changes.
D) integration of shallow and deep knowledge.
31
In sentiment analysis, which of the following is an implicit opinion?
A) The hotel we stayed in was terrible.
B) The customer service I got for my TV was laughable.
C) The cruise we went on last summer was a disaster.
D) Our new mayor is great for the city.
32
Search engine optimization (SEO) is a means by which
A) Web site developers can negotiate better deals for paid ads.
B) Web site developers can increase Web site search rankings.
C) Web site developers index their Web sites for search engines.
D) Web site developers optimize the artistic features of their Web sites.
33
Breaking up a Web page into its components to identify worthy words/terms and indexing them using a set of rules is called
A) preprocessing the documents.
B) document analysis.
C) creating the term-by-document matrix.
D) parsing the documents.
34
In the Whirlpool case study, the company sought to better understand information coming from which source?
A) customer transaction data
B) delivery information
C) customer e-mails
D) goods moving through the internal supply chain
35
All of the following are challenges associated with natural language processing EXCEPT
A) dividing up a text into individual words in English.
B) understanding the context in which something is said.
C) distinguishing between words that have more than one meaning.
D) recognizing typographical or grammatical errors in texts.
36
What data discovery process, whereby objects are categorized into predetermined groups, is used in text mining?
A) clustering
B) association
C) classification
D) trend analysis
37
Understanding which keywords your users enter to reach your Web site through a search engine can help you understand
A) the hardware your Web site is running on.
B) the type of Web browser being used by your Web site visitors.
C) most of your Web site visitors' wants and needs.
D) how well visitors understand your products.
38
In text analysis, what is a lexicon?
A) a catalog of words, their synonyms, and their meanings
B) a catalog of customers, their words, and phrases
C) a catalog of letters, words, phrases, and sentences
D) a catalog of customers, products, words, and phrases
39
How is objectivity handled in sentiment analysis?
A) It is ignored because it does not appear in customer sentiment.
B) It is incorporated as a type of sentiment.
C) It is clarified with the customer who expressed it.
D) It is identified and removed as facts are not sentiment.
40
Which of the following statements about Web site conversion statistics is FALSE?
A) Web site visitors can be classed as either new or returning.
B) Visitors who begin a purchase on most Web sites must complete it.
C) The conversion rate is the number of people who take action divided by the number of visitors.
D) Analyzing exit rates can tell you why visitors left your Web site.
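For reference, the ratio described in option C can be worked through with a minimal sketch. The visitor and action counts below are made-up numbers, and `conversion_rate` is a hypothetical helper name.

```python
def conversion_rate(actions_taken, visitors):
    """Number of people who take action divided by the number of visitors."""
    if visitors == 0:
        return 0.0  # assumption: treat a page with no visitors as 0% rather than raising
    return actions_taken / visitors

# Made-up figures: 30 purchases out of 1,200 visitors.
rate = conversion_rate(30, 1200)
print(f"conversion rate = {rate:.1%}")
```

With these figures the rate is 2.5%.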
41
In the Lotte.com retail case, the company deployed SAS for Customer Experience Analytics to better understand the quality of customer traffic on their Web site, classify order rates, and see which ________ had the most visitors.
42
In the Social Network Analysis (SNA) for Telecommunications case, SNA can be used to detect ________, i.e., those visitors who are about to leave the website, and to persuade them to stay with you.
43
________, also called homonyms, are syntactically identical words with different meanings.
44
A(n) ________ engine is a software program that searches for Web sites or files based on keywords.
45
A(n) ________ is one or more Web pages that provide a collection of links to authoritative Web pages.
46
________ is a technique used to detect favorable and unfavorable opinions toward specific products and services using large numbers of textual data sources.
47
Web ________ are used to automatically read through the contents of Web sites.
48
________ is a connections metric for social networks that measures the ties that actors in a network have with others that are geographically close.
49
IBM's Watson utilizes a massively parallel, text mining-focused, probabilistic evidence-based computational architecture called ________.
50
In the Mining for Lies case study, a text-based deception-detection method used by Fuller and others in 2008 was based on a process known as ________, which relies on elements of data and text mining techniques.
51
When a word has more than one meaning, selecting the meaning that makes the most sense can only be accomplished by taking into account the context within which the word is used. This concept is known as ________.
52
Because the term-document matrix is often very large and rather sparse, an important optimization step is to reduce the ________ of the matrix.
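To make the size problem on this card concrete, here is a minimal sketch that builds a small matrix of term counts and then shrinks it by dropping terms that occur in only one document. Pruning rare terms is just one common way to shrink such a matrix and is an assumption of this example, as are the toy documents, the `min_docs` threshold, and the `prune_rare_terms` helper.

```python
from collections import Counter

# Toy corpus (made up for illustration).
docs = [
    "battery drains fast",
    "battery overheats",
    "screen cracked fast",
]

# One Counter of term frequencies per document.
counts = [Counter(doc.split()) for doc in docs]
vocabulary = sorted(set().union(*counts))

def prune_rare_terms(counts, vocabulary, min_docs=2):
    """Keep only terms that occur in at least `min_docs` documents."""
    return [
        term for term in vocabulary
        if sum(1 for c in counts if term in c) >= min_docs
    ]

kept = prune_rare_terms(counts, vocabulary)

# Rows are documents, columns are the kept terms.
matrix = [[c[term] for term in kept] for c in counts]
print(len(vocabulary), "terms before pruning,", len(kept), "after")
```

In this toy corpus only "battery" and "fast" appear in at least two documents, so six columns shrink to two.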
53
A ________ Web site contains links that send traffic directly to your Web site.
54
Web pages contain both unstructured information and ________, which are connections to other Web pages.
55
________ is mostly driven by sentiment analysis and is a key element of customer experience management initiatives, where the goal is to create an intimate relationship with the customer.
56
________ statistics help you understand whether your specific marketing objective for a Web page is being achieved.
57
When viewed as a binary feature, ________ classification is the binary classification task of labeling an opinionated document as expressing either an overall positive or an overall negative opinion.
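The binary labeling task this card describes can be sketched with a toy word-list approach. This is only one simple way to do it and is not taken from the deck: the `POSITIVE` and `NEGATIVE` word sets, the `label_opinion` helper, and the rule that ties count as positive are all assumptions made for illustration.

```python
# Tiny hand-made opinion word lists (illustrative only).
POSITIVE = {"great", "good", "excellent", "love"}
NEGATIVE = {"terrible", "bad", "awful", "hate"}

def label_opinion(text):
    """Label a document 'positive' or 'negative' by counting opinion words."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    # Assumption: ties are labeled positive so the output stays binary.
    return "positive" if score >= 0 else "negative"

print(label_opinion("The hotel we stayed in was terrible"))
print(label_opinion("Great food and excellent service"))
```

The first review is labeled negative and the second positive. A word-count rule like this misses negation and sarcasm, so it should be read as a demonstration of the task rather than a usable classifier.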
58
At a very high level, the text mining process can be broken down into three consecutive tasks, the first of which is to establish the ________.
59
________ Web analytics refers to measurement and analysis of data relating to your company that takes place outside your Web site.
60
________ is a segmentation metric for social networks that measures the strength of the bonds between actors in a social network.
61
In what ways does the Web pose great challenges for effective and efficient knowledge discovery through data mining?
62
How would you describe information extraction in text mining?
63
What is the difference between white hat and black hat SEO activities?
64
In the security domain, one of the largest and most prominent text mining applications is the highly classified ECHELON surveillance system. What is ECHELON assumed to be capable of doing?
65
Describe the query-specific clustering method as it relates to clustering.
66
Why are the users' page views and time spent on your Web site important metrics?
67
Identify, with a brief description, each of the four steps in the sentiment analysis process.
68
Natural language processing (NLP), a subfield of artificial intelligence and computational linguistics, is an important component of text mining. What is the definition of NLP?
69
What is search engine optimization (SEO) and why is it important for organizations that own Web sites?
70
What are the three categories of social media analytics technologies and what do they do?