Multiple Choice
A company wants to run analytics on its Elastic Load Balancing logs stored in Amazon S3. A data analyst needs to be able to query all data from a desired year, month, or day. The data analyst should also be able to query a subset of the columns. The company requires minimal operational overhead and the most cost-effective solution. Which approach meets these requirements for optimizing and querying the log data?
A) Use an AWS Glue job nightly to transform new log files into .csv format and partition by year, month, and day. Use AWS Glue crawlers to detect new partitions. Use Amazon Athena to query data.
B) Launch a long-running Amazon EMR cluster that continuously transforms new log files from Amazon S3 into its Hadoop Distributed File System (HDFS) storage and partitions by year, month, and day. Use Apache Presto to query the optimized format.
C) Launch a transient Amazon EMR cluster nightly to transform new log files into Apache ORC format and partition by year, month, and day. Use Amazon Redshift Spectrum to query the data.
D) Use an AWS Glue job nightly to transform new log files into Apache Parquet format and partition by year, month, and day. Use AWS Glue crawlers to detect new partitions. Use Amazon Athena to query data.
Correct Answer:

Verified
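The two query requirements in the question (filter by year, month, or day; read only some columns) correspond to partition pruning and column projection. The sketch below illustrates both, assuming a Hive-style `year=/month=/day=` key layout in S3 and string-typed partition columns. The bucket, table, and column names are hypothetical and not taken from the question.

```python
from datetime import date
from typing import Optional, Sequence


def partition_prefix(base: str, day: date) -> str:
    """Build a Hive-style S3 prefix (year=/month=/day=) for one day of logs."""
    return (
        f"{base.rstrip('/')}/"
        f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"
    )


def build_query(
    table: str,
    columns: Sequence[str],
    year: int,
    month: Optional[int] = None,
    day: Optional[int] = None,
) -> str:
    """Build a SQL string that selects only the named columns and filters
    on partition columns, so the engine can skip unrelated partitions.

    Assumption: partition columns are strings, hence the quoted literals.
    """
    if day is not None and month is None:
        raise ValueError("day requires month")
    filters = [f"year = '{year:04d}'"]
    if month is not None:
        filters.append(f"month = '{month:02d}'")
    if day is not None:
        filters.append(f"day = '{day:02d}'")
    return (
        f"SELECT {', '.join(columns)} FROM {table} "
        f"WHERE {' AND '.join(filters)}"
    )


# Hypothetical names, for illustration only.
prefix = partition_prefix("s3://example-bucket/elb-logs/", date(2024, 3, 7))
query = build_query("elb_logs", ["request_url", "elb_status_code"], 2024, 3)
```

With a columnar file format, a query like this reads only the selected columns inside the matching partitions, which is what keeps the amount of data scanned (and therefore the per-query cost) low.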