We used to think writing (near) real-time applications that process multiple data streams was for the high-IQ crowd and well-funded teams. This belief was probably strengthened by the fact that we (at Qxf2) love Python … and our favorite language lacked good complex event processors and stream processors. So we were excited to discover that Wallaroo has set out to fill that gap.
Wallaroo describes itself as “a fast, elastic data processing engine that rapidly takes you from prototype to production by eliminating infrastructure complexity”. The charm for us lay in the fact that it could do stream processing and gave us an easy way to designate nodes, write business logic for them and define the structure of a distributed, stream processing application.
We decided to explore Wallaroo. We used a Docker image with sample applications that Wallaroo Labs published. We were able to do the tutorials fairly easily. In this post, we will walk you through one of the examples Wallaroo Labs has provided.
Wait! Are we using somebody else’s code for a post?
Yeah, we will be using Wallaroo Labs’ Docker image and their own code in this post. So no new code from our end. We know that seems lazy but we have a few good reasons:
a) The excellent posts on Wallaroo Labs’ blog are written for a far more technical audience. But we know there is a class of engineers who want to first see something work before they actually spend time learning something new. This post is meant for such engineers.
b) Google is struggling to rank Wallaroo Labs’ material well. So there is no simple “30-minute guide to Wallaroo” that a reader can scan and grok without having to try things out.
c) This post helped us articulate our understanding (or lack thereof!) of Wallaroo.
Getting started with Wallaroo using Docker
In this post, we will show you how to set up a Wallaroo environment using the Docker image and then run a sample “Celsius to Fahrenheit” application. We will be using a Linux box. Below are the steps involved:
A) Install Docker Community Edition(CE)
B) Get the official Wallaroo Docker image
C) Run a Wallaroo application in Docker
D) Start the Metrics UI
E) Run Giles Receiver
F) Run the “Celsius to Fahrenheit” application
G) Send Data to the application
H) Open “Celsius To Fahrenheit” application
I) Shut down the Cluster
J) Shut down Giles Sender and Giles Receiver
K) Shut down the Metrics UI
L) Shut down the Wallaroo container
A) Install Docker Community Edition(CE)
Install Docker Community Edition (CE) on your system so that you can pull the Wallaroo Docker image.
B) Get the official Wallaroo image
Once Docker is installed on your system, pull the official Wallaroo image.
Open a new terminal and run the following command.
docker pull wallaroo-labs-docker-wallaroolabs.bintray.io/release/wallaroo:0.3.1
Wallaroo Labs uses a number of names/words to describe the various components of Wallaroo. We are beginners but we will take a shot at giving you some sort of idea (even if it is inaccurate) about the different components. The following things are included in the Docker image:
- Machida: this is Wallaroo’s runtime Python environment. Technology that makes it easy to write applications that handle distributed data streams understandably needs its own runtime environment.
- Giles Sender: mimics an incoming data stream and supplies data to Wallaroo applications over TCP.
- Giles Receiver: mimics a data sink and receives data from Wallaroo over TCP.
- Cluster Shutdown tool: notifies the cluster to shut down cleanly.
- Metrics UI: receives and displays metrics for running Wallaroo applications.
- Wallaroo Source Code: full source code is provided, including Python example applications.
C) Run a Wallaroo application in Docker
Run the Wallaroo application in Docker by using the following command in a new terminal.
docker run --rm -it --privileged -p 4000:4000 \
  -v /tmp/wallaroo-docker/wallaroo-src:/src/wallaroo \
  -v /tmp/wallaroo-docker/python-virtualenv:/src/python-virtualenv \
  --name wally \
  wallaroo-labs-docker-wallaroolabs.bintray.io/release/wallaroo:0.3.1
D) Start the Metrics UI
Start the Metrics UI to receive and display metrics for the running Wallaroo application. Open a new terminal and run the following commands.
1. Enter the Wallaroo Docker container
docker exec -it wally environment-setup.sh
2. Start the Metrics UI
metrics_reporter_ui start
Verify it started up correctly by visiting http://localhost:4000
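If you would rather check this from a script than from a browser, here is a minimal sketch in Python. The function name metrics_ui_is_up and the 3-second timeout are our own choices for illustration and are not part of Wallaroo. It only checks that something answers HTTP on port 4000; it does not confirm that the page is really the Metrics UI.

```python
import urllib.error
import urllib.request


def metrics_ui_is_up(url="http://localhost:4000", timeout=3.0):
    """Return True if an HTTP GET to `url` succeeds (hypothetical helper)."""
    # Bypass any configured HTTP proxy, since the UI runs on the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

Calling metrics_ui_is_up() on the Docker host should return True once the Metrics UI has started, assuming the container was launched with -p 4000:4000 as shown earlier.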
E) Run Giles Receiver
Run Giles Receiver to receive data from Wallaroo over TCP. Open a new terminal and run the following commands.
1. Enter the Wallaroo Docker container
docker exec -it wally environment-setup.sh
2. Listen for data from the Wallaroo application
receiver --listen 127.0.0.1:5555 --no-write --ponythreads=1 --ponynoblock
You should see the line “Listening for data”, which indicates that Giles Receiver is up and running.
F) Run the “Celsius to Fahrenheit” application
Next, run the “Celsius to Fahrenheit” application. This is a stateless application that takes a floating point Celsius value and sends out a floating point Fahrenheit value. Open a new terminal and run the following commands.
1. Enter the Wallaroo Docker container
docker exec -it wally environment-setup.sh
2. Go to the python Celsius example directory
cd /src/wallaroo/examples/python/celsius
3. Run the Celsius to Fahrenheit application
machida --application-module celsius --in 127.0.0.1:7000 \
  --out 127.0.0.1:5555 --metrics 127.0.0.1:5001 --control 127.0.0.1:6000 \
  --data 127.0.0.1:6001 --name worker-name --external 127.0.0.1:5050 \
  --cluster-initializer --ponythreads=1 --ponynoblock
This tells the “Celsius to Fahrenheit” application that it should listen on port 7000 for incoming data, write outgoing data to port 5555, and send metrics data to port 5001.
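To make “stateless” concrete, here is a rough sketch in plain Python of the kind of work such an application does for each message. This is not the Wallaroo example code and it does not use the Wallaroo API. The function names are ours, and the wire format (a 4-byte big-endian length header followed by a 4-byte big-endian float, 8 bytes in total) is our assumption.

```python
import struct


def decode(message: bytes) -> float:
    """Unpack one framed message (assumed format: 4-byte length + 4-byte float)."""
    length, celsius = struct.unpack(">If", message)
    if length != 4:
        raise ValueError(f"unexpected payload length: {length}")
    return celsius


def celsius_to_fahrenheit(celsius: float) -> float:
    """Stateless step: the result depends only on the incoming value."""
    return celsius * 1.8 + 32


def encode(value: float) -> bytes:
    """Frame a float the same (assumed) way for the outgoing stream."""
    return struct.pack(">If", 4, value)


def process(message: bytes) -> bytes:
    """Decode, convert, and re-encode a single message."""
    return encode(celsius_to_fahrenheit(decode(message)))
```

Because no step remembers anything between messages, any worker could handle any message, which is what makes a computation like this easy to distribute.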
G) Send Data to the application
To send the data to the application, open a new terminal and run the following commands.
1. Enter the Wallaroo Docker container
docker exec -it wally environment-setup.sh
2. Start the sender with the following command
sender --host 127.0.0.1:7000 --messages 25000000 --binary \
  --batch-size 50 --interval 10_000_000 --repeat --no-write \
  --msg-size 8 --ponythreads=1 --ponynoblock \
  --file /src/wallaroo/examples/python/celsius/celsius.msg
Giles Sender reads messages from a pre-generated data file and sends them repeatedly until 25,000,000 messages have been sent.
If the sender is working correctly, you should see Connected printed to the screen.
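For a rough feel of how long the sender will run, here is some back-of-the-envelope arithmetic in Python. It assumes that --interval is given in nanoseconds and that the sender emits exactly one batch per interval. Both are our assumptions, so treat the result as an estimate only. The function name sender_plan is ours.

```python
def sender_plan(messages, batch_size, interval_ns):
    """Estimate batch count and run time (assumes one batch per interval)."""
    batches = -(-messages // batch_size)  # ceiling division
    seconds = batches * interval_ns / 1_000_000_000
    return batches, seconds


# Values taken from the sender command above.
batches, seconds = sender_plan(25_000_000, 50, 10_000_000)
print(batches, "batches, about", round(seconds / 60, 1), "minutes")
```

Under those assumptions the run works out to 500,000 batches and 5,000 seconds, or a little over 83 minutes at about 5,000 messages per second. That leaves plenty of time to explore the Metrics UI while data is flowing.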
H) Open “Celsius To Fahrenheit” application
To view the “Celsius To Fahrenheit” application in the browser, visit http://localhost:4000. You can look at different metrics related to the pipeline, worker and computations by clicking on the hyperlinks. The metric stats will get updated as data is processed through the application.
I) Shut down the Cluster
To shut down the cluster cleanly, open a new terminal and run the following commands.
1. Enter the Wallaroo Docker container
docker exec -it wally environment-setup.sh
2. Shut down the cluster
cluster_shutdown 127.0.0.1:5050
J) Shut down Giles Sender and Giles Receiver
Press Ctrl-C in the Giles Sender and Giles Receiver shells.
K) Shut down the Metrics UI
metrics_reporter_ui stop
L) Shut down the Wallaroo container
docker stop wally
Getting set up with Wallaroo using Docker was fairly simple. Our next step is to try and build an application using Wallaroo. Stay tuned for further updates…
If you liked what you read, know more about Qxf2.
References:
Here are some useful references which we followed.
1. Setting Up Your Environment for Wallaroo in Docker
2. Run a Wallaroo Application