
Month: January 2017

Using Apache Flink with Java 8

JDK 8 introduced many long-anticipated features to the Java language. The most notable of these was lambda functions, which made possible new frameworks such as Java 8 Streams, as well as new features in existing frameworks like JUnit 5.

Apache Flink also supports lambda functions, and in this post, I’ll show how to enable them and how to use them in your applications.


Calculating movies ratings distribution with Apache Flink

If you’ve been following recent news in the Big Data world, you’ve probably heard about Apache Flink. This platform for batch and stream processing, which is built on a few significant technical innovations, can become a real game changer and it is starting to compete with existing products like Apache Spark.

In this post, I would like to show how to implement a simple batch processing algorithm using Apache Flink. We will work with a dataset of movie ratings and produce a distribution of user ratings. In the process, I’ll show a few tricks that you can use to improve the performance of your Flink applications.


Apache Flink: A New Landmark on the Big Data Landscape

There is no shortage of Big Data applications and frameworks nowadays, and sometimes it may even seem that all niches have already been filled. That’s not how the creators of Apache Flink see it, though. Even though their project is not yet as well known as Spark or Hadoop, it has brought enough innovations to become a real game-changer in the world of Big Data.

In this article, I would like to introduce Apache Flink, describe its main features, and explain why it is different from other available solutions. I’ll end the article with an example of a simple stream processing application using Flink.


Generators in Python

In previous articles I’ve written about how to create an iterator in Python by implementing the iterator protocol or by using the yield keyword. In this article I’ll describe generators: a piece of Python syntax that can turn many iterators into one-liners.
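
As a rough illustration of the idea, the sketch below writes the same iterator twice: once as a function using yield, and once as a one-line generator expression. The names `squares` and `squares_one_liner` are hypothetical and are not taken from the articles themselves.

```python
def squares(numbers):
    """Generator function: lazily yields the square of each number."""
    for n in numbers:
        yield n * n


def squares_one_liner(numbers):
    """The same iterator, written as a generator expression."""
    return (n * n for n in numbers)


# Both produce values lazily; list() consumes them for display.
print(list(squares([1, 2, 3])))
print(list(squares_one_liner([1, 2, 3])))
```

Both versions compute each value only when it is requested, so neither builds the full list of results in memory up front.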


Anatomy of a Python Iterator

The iterator is a powerful pattern that was recognised at least as early as 1994, and since then it has been incorporated into the syntax of almost every modern programming language.

Python also implements this pattern, providing a concise syntax to iterate over lists, dictionaries and other data structures:

for i in [1, 2, 3, 4]:
    print(i)

In this article I will write about how an iterator is used in Python, how to implement your own iterator and what types of iterators exist in Python.
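
To give a flavour of what implementing your own iterator involves, here is a minimal sketch of a class that follows the iterator protocol. The class name `Countdown` is a hypothetical example, not code from the article. The `next` alias is there only so the same class also works under Python 2, where the method is called `next` rather than `__next__`.

```python
class Countdown(object):
    """Iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            # Signals the for loop that there are no more items.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    next = __next__  # Python 2 spelling of the same method


for i in Countdown(3):
    print(i)
```

The for loop calls `__iter__` once and then keeps calling `__next__` until StopIteration is raised.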


Python in One Hour. Part 1

This is a relatively short and concise article that will give you all that you need to know to start reading/writing Python code. It will start with defining a variable and will discuss types, data structures, if-else statements, loops, and functions.

This tutorial does not require any prior knowledge of Python. The only prerequisite for this article is knowledge of the basics of any other object-oriented programming language.

Most of the examples in this article are applicable in both Python 2 and Python 3. I point to differences between Python versions where applicable.


How to implement string interpolation in Python

String interpolation is the process of substituting values of local variables into placeholders in a string.

It is implemented in many programming languages, such as Scala:

//Scala 2.10+
var name = "John";
println(s"My name is $name")
>>> My name is John

Perl:

my $name = "John";
print "My name is $name";
>>> My name is John

CoffeeScript:

name = "John"
console.log "My name is #{name}"
>>> My name is John

and many others.

At first sight, it doesn’t seem possible to use string interpolation in Python. However, we can implement it with just two lines of Python code.
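
One possible way to do this is sketched below; it is not necessarily the approach the full article takes. The helper name `interpolate` is hypothetical. It relies on `sys._getframe`, which is a CPython implementation detail and may not be available in other Python implementations, and it only sees the caller's local variables.

```python
import sys


def interpolate(template):
    """Fill {name} placeholders from the caller's local variables."""
    # Frame 1 is the function that called interpolate().
    caller_locals = sys._getframe(1).f_locals
    return template.format(**caller_locals)


def greet():
    name = "John"
    return interpolate("My name is {name}")


print(greet())
```

The body of the helper is the two lines that do the work: one looks up the caller's locals, the other passes them to `str.format`.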
