Graphs processing with Apache Flink

Graphs are everywhere. The Internet, maps, and social networks, to name just a few, are all examples of massive graphs that contain vast amounts of useful information. Since these networks keep growing and processing them is becoming more and more common, we need better tools to do the job.

In this article, I’ll describe how we can use the Flink Gelly library to process large graphs and will provide a simple example of how to find the shortest path between two users in the Twitter graph.

Implementing Flink batch data connector

Apache Flink has a versatile set of connectors for external data sources. It can read data from and write data to databases and local and distributed file systems. However, sometimes what Flink provides is not enough, and we need to read some uncommon data format.

In this article, I will show you how to implement a custom connector for reading a dataset in Flink.

Using Apache Flink with Java 8

JDK 8 introduced a lot of long-anticipated features to the Java language. Among those, the most notable was the introduction of lambda functions. They made new frameworks such as Java 8 Streams possible, as well as new features in existing frameworks like JUnit 5.

Apache Flink also supports lambda functions, and in this post, I’ll show how to enable them and how to use them in your applications.

Calculating movies ratings distribution with Apache Flink

If you’ve been following recent news in the Big Data world, you’ve probably heard about Apache Flink. This platform for batch and stream processing, which is built on a few significant technical innovations, can become a real game changer and it is starting to compete with existing products like Apache Spark.

In this post, I would like to show how to implement a simple batch processing algorithm using Apache Flink. We will work with a dataset of movie ratings and will produce a distribution of user ratings. In the process, I’ll show a few tricks that you can use to improve the performance of your Flink applications.

Apache Flink: A New Landmark on the Big Data Landscape

There is no shortage of Big Data applications and frameworks nowadays, and sometimes it may even seem that all niches have already been filled. That’s not how the creators of Apache Flink see it, though. Even though their project is not yet as well known as Spark or Hadoop, it has brought enough innovations to become a real game-changer in the world of Big Data.

In this article, I would like to introduce Apache Flink, describe its main features, and explain why it is different from other available solutions. I’ll end the article with an example of a simple stream processing application using Flink.

Efficient Iterators in Python

While you can write iterators in Python by implementing the iterator protocol, it usually requires a lot of code and looks cumbersome. To facilitate this task, Python provides a powerful syntax for creating iterators. By using these constructions, we can write complex iterators in just a few lines of code.
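As a rough illustration of the difference, here is a minimal sketch in Python 3 that builds the same countdown sequence twice: once by implementing the iterator protocol by hand, and once with a generator function. The names `Countdown` and `countdown` are made up for this example and do not come from the article itself.

```python
class Countdown:
    """Class-based iterator: implements the protocol by hand."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Signal the end of the sequence with StopIteration.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """Generator function: same sequence, state is kept by the interpreter."""
    while start > 0:
        yield start
        start -= 1


# Both objects can be consumed by a plain for loop or by list().
class_based = list(Countdown(3))
generator_based = list(countdown(3))
```

Both versions yield 3, 2, 1, but the generator needs no explicit state handling and no `StopIteration` bookkeeping.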

Anatomy of a Python Iterator

The iterator is a powerful pattern that was recognised at least as early as 1994, and since then it has been incorporated into the syntax of almost every modern programming language.

Python also implements this pattern, providing a concise syntax to iterate over lists, maps, dictionaries and other data structures:

for i in [1, 2, 3, 4]:
    print(i)

In this article, I will describe how iterators are used in Python, how to implement your own iterator, and what types of iterators exist in Python.

Python in One Hour. Part 2

This is the second part of the “Python in 1 hour” tutorial. It will go into more advanced Python features that will help you to develop complex and robust applications.

The only prerequisite for this article is that you should be familiar with the content of the first part of this tutorial, which you can find here.
