Want to learn more about using Apache Spark and Zeppelin on Dataproc via the Google Cloud Platform? You’ve come to the right place. Cloud Dataproc is Google’s answer to Amazon EMR (Elastic MapReduce). Like EMR, Cloud Dataproc provisions and manages Compute Engine-based Apache Hadoop and Spark data processing clusters. If you are not familiar with Amazon EMR,..
Like a jigsaw puzzle, there are many components in the AWS big data ecosystem. Read this article and see how the components fit together to form a beautiful whole. If you are a data engineer, wouldn’t it be great if you could easily scale your existing infrastructure on-demand to support your real-time data pipelines? If you are..
Google Cloud Platform (GCP) has training, and there are smart ways of preparing for the Google Cloud Certification Exams. You might have read the recent news about Spotify building their new event delivery system on Google Cloud Platform (GCP). To scale with their huge volume of content, they have made numerous software architecture design changes to..
In the first article about Amazon EMR, in our two-part series, we learned to install Apache Spark and Apache Zeppelin on Amazon EMR. We also learned ways of using different interactive shells for Scala, Python, and R, to program for Spark. Let’s continue with the final part of this series. We’ll learn to perform simple..
Amazon EMR (Elastic MapReduce) provides a platform to provision and manage Amazon EC2-based data processing clusters. Amazon EMR clusters are installed with different supported projects in the Apache Hadoop and Apache Spark ecosystems. You can either choose to install from a predefined list of software, or pick and choose the ones that make the most..
SELinux provides tools to more finely control the activities allowed to users, processes, and daemons to limit the potential damage from vulnerabilities. In the third and final part of our server security series, we will look at how we can enhance the security of Linux-based AWS EC2 instances with SELinux. We will learn how to..
While AWS EC2 instances should be well protected by VPC security tools, you may still need to implement protection at the OS-level, and that means firewalld. This is the second part of our server security series. In this article, we will look at configuring firewall rules via firewalld on Red Hat Enterprise Linux. While Amazon..
Enhance the server security of a Red Hat Enterprise Linux EC2 instance by monitoring and applying system updates. This is the first part of our Server Security on AWS series. In this series, we will explore some ways to enhance the security of a Red Hat Enterprise Linux EC2 instance. We may also touch on..
This is the third and final part of our SystemTap series. This article assumes that you are familiar with SystemTap basics and that you have installed Docker on your AWS EC2 instance with a minimal Red Hat Enterprise Linux 7 platform container. Now we’ll explore working with actual SystemTap scripts to monitor processes and events…
In the first article in our SystemTap series, we learned how to install the powerful diagnostic tool, SystemTap, on an AWS EC2 instance and then wrote our very first “Hello World” script. We now need to explore some of the interesting (and more useful) scripts that come with SystemTap. Building a SystemTap target environment To..