From its official site,
Kubernetes, also known as K8s, is an open-source system for automating deployment, scaling, and management of containerized applications.
It groups containers that make up an application into logical units for easy management and discovery. Kubernetes was originally developed by Google and is now maintained by the Cloud Native Computing Foundation.
Why a container orchestration tool is needed:
From its official website,
Apache Kafka is an open-source distributed event streaming platform.
Kafka is a distributed system consisting of servers and clients. It runs as a cluster of one or more servers that can span multiple data centers or cloud regions.
Some servers form the storage layer (the brokers). Others run Kafka Connect to continuously import and export data, integrating Kafka with external sources.
Clients let you write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner.
In this post, let's get our hands dirty with Kubernetes. There are a few ways to set up Kubernetes.
A cloud-based setup is suitable for a multi-node cluster, which is not necessary for initial hands-on practice. Installing Minikube and kubectl is a good option, because it gives us a single-node cluster where both master & worker nodes will be…
In the first part, we saw some important components of a K8s cluster. In this part, let's talk about the components in the architecture.
Let’s have a basic setup of one name node with two application Pods like below.
From its official documentation,
ClickHouse is a fast, open-source, column-oriented OLAP database management system that allows generating analytical reports in real time using SQL queries.
ClickHouse manages extremely large volumes of data in a stable and sustainable manner. It currently powers Yandex.Metrica, the world’s second-largest web analytics platform, with over 13 trillion database records and over 20 billion events a day.
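To make "column-oriented" concrete, here is a minimal sketch in plain Python. It is not ClickHouse code and does not use any ClickHouse API; the table contents and the names `rows`, `columns`, `total_visits`, and `india_visits` are made up for illustration. It only shows why an analytical query that aggregates one or two fields benefits when each column is stored on its own.

```python
# Row-oriented layout: each record keeps all of its fields together.
# (Hypothetical sample data, not taken from any real system.)
rows = [
    {"page": "/home", "visits": 120, "country": "IN"},
    {"page": "/docs", "visits": 45, "country": "US"},
    {"page": "/blog", "visits": 80, "country": "IN"},
]

# Column-oriented layout: one list per column, values in the same row order.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Roughly "SELECT sum(visits)": only the visits column is read.
total_visits = sum(columns["visits"])

# Roughly "SELECT sum(visits) WHERE country = 'IN'": two columns are read,
# and the page column is never touched.
india_visits = sum(
    visits
    for visits, country in zip(columns["visits"], columns["country"])
    if country == "IN"
)

print(total_visits, india_visits)
```

A real column store adds compression, indexing, and vectorized execution on top of this idea; the sketch covers only the storage layout.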
What is ClickHouse for, and what is it not for?
Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
What is event streaming?
Event streaming means capturing data in real time from multiple sources in the form of streams of events. These streams are stored so that they can be retrieved, manipulated, and processed later. It is also possible to react to the events in real time.
How does Kafka do event streaming?
1. Kafka can write and read streams of events.
2. Kafka can store streams of events for as long as needed.
3. Kafka can process streams of events as they occur or retrospectively.
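The three capabilities above can be pictured with a toy, in-memory model. This is only a sketch in plain Python and not the Kafka client API: `Event`, `ToyTopic`, `publish`, and `read_from` are hypothetical names invented here, and a real Kafka topic is partitioned, replicated, and persisted to disk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Event:
    key: str
    value: str


@dataclass
class ToyTopic:
    """In-memory stand-in for a topic: an append-only list of events."""

    events: List[Event] = field(default_factory=list)

    def publish(self, event: Event) -> int:
        """Write: append an event and return its offset in the log."""
        self.events.append(event)
        return len(self.events) - 1

    def read_from(self, offset: int) -> List[Event]:
        """Read: return every stored event from the given offset onward."""
        return self.events[offset:]


topic = ToyTopic()
topic.publish(Event("alice", "paid 10"))
topic.publish(Event("bob", "paid 25"))

# Retrospective processing: events stay stored, so a reader can replay from offset 0.
replayed = [e.value for e in topic.read_from(0)]

# Processing as events occur: a reader that remembers its offset sees only new events.
next_offset = len(topic.events)
topic.publish(Event("alice", "paid 5"))
fresh = [e.value for e in topic.read_from(next_offset)]

print(replayed, fresh)
```

The design point the sketch tries to capture is that reading does not remove events from the log, which is what makes both replay and live consumption possible from the same stored stream.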
Before continuing, note that I have written an introduction to Apache Spark. You can head to the link if you are new to Apache Spark.
Welcome to some practical explanations of Apache Spark with Scala. Spark also supports Python through PySpark. For this post, I am continuing with Scala on my Windows installation of Apache Spark.
RDDs (Resilient Distributed Datasets) are the fundamental building blocks of Apache Spark: immutable collections of records partitioned across the nodes of a cluster. Spark represents the data it processes as RDDs.
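As a rough mental model only, the sketch below imitates two well-known RDD traits in plain Python: data split into partitions, and transformations that are recorded lazily and run only when an action such as `collect` is called. It does not use Spark or PySpark; `ToyRDD` and everything inside it are hypothetical names for illustration, and nothing here is distributed or fault tolerant.

```python
from typing import Any, Callable, Iterable, List, Tuple


class ToyRDD:
    """Toy stand-in for an RDD: fixed partitions plus a list of pending steps."""

    def __init__(
        self,
        partitions: Iterable[Iterable[Any]],
        steps: Tuple[Tuple[str, Callable], ...] = (),
    ) -> None:
        self._partitions = [list(p) for p in partitions]
        self._steps = steps

    def map(self, fn: Callable[[Any], Any]) -> "ToyRDD":
        # Transformation: returns a new ToyRDD and computes nothing yet.
        return ToyRDD(self._partitions, self._steps + (("map", fn),))

    def filter(self, pred: Callable[[Any], bool]) -> "ToyRDD":
        # Transformation: also lazy, only recorded.
        return ToyRDD(self._partitions, self._steps + (("filter", pred),))

    def collect(self) -> List[Any]:
        # Action: apply the recorded steps to each partition, then merge results.
        out: List[Any] = []
        for part in self._partitions:
            items = part
            for kind, fn in self._steps:
                if kind == "map":
                    items = [fn(x) for x in items]
                else:
                    items = [x for x in items if fn(x)]
            out.extend(items)
        return out


numbers = ToyRDD([[1, 2, 3], [4, 5, 6]])  # two partitions
even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
result = even_squares.collect()
print(result)
```

Because each transformation returns a new object and leaves the original untouched, `numbers` can still be collected afterwards and yields its original values.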
The growth of today's technology world is something we can't estimate. We are now in the era of Artificial Intelligence, and tomorrow it will change to something more advanced. For these advanced technologies, the amount of data that needs to be stored and processed is huge. Companies like Google process petabytes of data every day. The…
Apache Spark is an open-source, distributed data processing system for big data applications that uses in-memory caching for fast responses against data of almost any size. From its official site,
Apache Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs.
Four advantages of Apache Spark, according to its developers:
Runs workloads up to 100x faster than Hadoop MapReduce. It achieves high performance for both batch and streaming data.
Supports over 80 high-level operators to build parallel apps including industry rulers…
Nowadays, for almost any business, a mobile application is one of the best ways to reach customers, whether the business is a photo editor or a bank. As an automation Python developer, I wondered whether there was a way to create mobile applications in Python. That is how I came to know Kivy. Let's dive in and learn about it. From its website,
Kivy is a free, open-source, cross-platform Python library for rapid development of applications that make use of innovative user interfaces, such as multi-touch apps.
Yet another Pythonic-Automation guy.