AWS Lambda is well suited to managing a data collection pipeline and to implementing a serverless architecture. In this post, we’ll discover how to build a serverless data pipeline in three simple steps using AWS Lambda functions, Kinesis Streams, Amazon Simple Queue Service (SQS), and Amazon API Gateway! How to build a serverless..
In the first article of our two-part series on Amazon EMR, we learned how to install Apache Spark and Apache Zeppelin on Amazon EMR. We also learned how to use the interactive shells for Scala, Python, and R to program for Spark. Let’s continue with the final part of this series. We’ll learn to perform simple..
Amazon EMR (Elastic MapReduce) provides a platform to provision and manage Amazon EC2-based data processing clusters. Amazon EMR clusters come with supported projects from the Apache Hadoop and Apache Spark ecosystems installed. You can either install from a predefined list of software or pick the ones that make the most..