Exam 9: Big Data, Cloud Computing, and Location Analytics: Concepts and Tools
In most cases, Hadoop is used to replace data warehouses.
False
In what ways can communications companies use geospatial analysis to harness their data effectively?
Communications companies often generate massive amounts of data every day. The ability to analyze that data quickly, with a high level of location-specific granularity, can better identify customer churn and help formulate location-specific strategies for increasing operational efficiency, quality of service, and revenue.
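For illustration, a minimal Python/pandas sketch of this kind of location-level churn analysis; the file name and the column names (customer_id, region, churned) are hypothetical, not from the original text:

```python
# A minimal sketch, assuming a hypothetical per-customer dataset that tags
# each record with the serving location; names here are illustrative only.
import pandas as pd

# Load customer records (hypothetical file with columns:
# customer_id, region, churned where churned is 0 or 1).
cdr = pd.read_csv("customer_records.csv")

# Churn rate per location: the high-granularity view that flags at-risk regions.
churn_by_region = (
    cdr.groupby("region")["churned"]
       .mean()
       .sort_values(ascending=False)
)

# Regions with the highest churn rates become candidates for targeted retention offers.
print(churn_by_region.head(10))
```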
Why are some portions of tape backup workloads being redirected to Hadoop clusters today?
• First, while it may appear inexpensive to store data on tape, the true cost comes with the difficulty of retrieval. Not only is the data stored offline, requiring hours if not days to restore, but tape cartridges themselves are also prone to degradation over time, making data loss a reality and forcing companies to factor in those costs. To make matters worse, tape formats change every couple of years, requiring organizations to either perform massive data migrations to the newest tape format or risk the inability to restore data from obsolete tapes.
• Second, it has been shown that there is value in keeping historical data online and accessible. As in the clickstream example, keeping raw data on a spinning disk for a longer duration makes it easy for companies to revisit data when the context changes and new constraints need to be applied. Searching thousands of disks with Hadoop is dramatically faster and easier than spinning through hundreds of magnetic tapes. Additionally, as disk densities continue to double every 18 months, it becomes economically feasible for organizations to hold many years' worth of raw or refined data in HDFS, as sketched below.
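To make the second point concrete, here is a minimal PySpark sketch of revisiting historical clickstream data kept in HDFS; the HDFS path, the JSON format, and the event_date/page fields are illustrative assumptions, not from the original text:

```python
# A minimal sketch, assuming clickstream logs stored as JSON on HDFS under
# a hypothetical path; the schema and the date filter are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("clickstream-revisit").getOrCreate()

# Raw data stays online in HDFS, so years-old sessions remain directly queryable.
clicks = spark.read.json("hdfs:///data/clickstream/")  # hypothetical location

# Re-apply a new constraint to old data without restoring anything from tape.
page_counts = (
    clicks.filter(F.col("event_date") >= "2018-01-01")
          .groupBy("page")
          .count()
          .orderBy(F.col("count").desc())
)

page_counts.show(10)
```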
Over the next few years, the industry will likely move toward more tightly coupled Hadoop and relational DBMS-based data warehouse technologies.
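One way such coupling already shows up in practice is Hadoop-ecosystem engines reading relational warehouse tables directly. A minimal PySpark sketch, assuming a hypothetical JDBC endpoint, table, and credentials (and a matching JDBC driver on the classpath):

```python
# A minimal sketch of Hadoop/RDBMS coupling: Spark, from the Hadoop
# ecosystem, reads a relational warehouse table over JDBC. The URL,
# table name, and credentials below are placeholders, not real values.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dw-hadoop-bridge").getOrCreate()

# Pull a warehouse table into the Hadoop side for joint analysis.
orders = (
    spark.read.format("jdbc")
         .option("url", "jdbc:postgresql://warehouse-host:5432/sales")  # hypothetical
         .option("dbtable", "orders")
         .option("user", "analyst")
         .option("password", "***")
         .load()
)

# Analyze the relational data alongside whatever else lives in the cluster.
orders.groupBy("region").count().show()
```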
Cloud computing originates from a reference to the Internet as a "cloud" and is a combination of several information technology components as services.
The analytics layer of the Big Data stack is experiencing what type of development currently?
MapReduce can be easily understood by skilled programmers due to its procedural nature.
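The canonical illustration of that procedural nature is word count, where the map and reduce steps are ordinary functions. A minimal pure-Python sketch of the pattern (not an actual Hadoop job; the function names are illustrative):

```python
# A minimal sketch of the MapReduce pattern in plain Python: map emits
# key-value pairs, a sort groups them by key, and reduce aggregates.
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    """Map step: emit (word, 1) for every word -- a plain procedural loop."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    """Reduce step: sum the counts for each word after sorting by key."""
    for word, group in groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0)):
        yield (word, sum(count for _, count in group))

lines = ["Hadoop stores data", "Hadoop processes data in parallel"]
for word, count in reduce_phase(map_phase(lines)):
    print(word, count)
```

In a real Hadoop job the same two functions run on many nodes at once, with the framework handling the sort-and-shuffle between them.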
What are the differences between stream analytics and perpetual analytics? When would you use one or the other?
Hadoop was designed to handle petabytes and exabytes of data distributed over multiple nodes in parallel.
OpenShift is Google's cloud application platform based on a PaaS model.
Managers should be more concerned about whether data is stored in a structured data warehouse or a Hadoop cluster, and less about the insights that can actually be derived from the data.