Multiple Choice
You are designing a service that aggregates clickstream data in batch and delivers reports to subscribers by email only once per week. The data is extremely spiky, geographically distributed, high-scale, and unpredictable. How should you design this system?
A) Use a large Redshift cluster to perform the analysis, and a fleet of Lambda functions to insert records into the Redshift tables. Lambda will scale rapidly enough for the traffic spikes.
B) Use a CloudFront distribution with access log delivery to S3. Clicks should be recorded as query-string GET requests to the distribution. Reports are built and sent by periodically running EMR jobs over the access logs in S3.
C) Use API Gateway to invoke Lambda functions that call PutRecords on Kinesis, and run Spark on EMR calling GetRecords on Kinesis to scale with spikes. Spark on EMR writes the analysis to S3, and the results are sent out by email.
D) Use Amazon Elasticsearch Service and EC2 Auto Scaling groups. The Auto Scaling groups scale based on click throughput and stream into the Elasticsearch domain, which is also scalable. Use Kibana to generate reports periodically.
Correct Answer: