Category: Apache Flink

Graph processing with Apache Flink

Graphs are everywhere. The Internet, maps, and social networks, to name just a few, are all examples of massive graphs that contain vast amounts of useful information. Since these networks keep growing and processing them is becoming more and more common, we need better tools to do the job.

In this article, I’ll describe how we can use Flink’s Gelly library to process large graphs, and I’ll provide a simple example of how to find the shortest path between two users in the Twitter graph.


Using Apache Flink with Java 8

JDK 8 introduced many long-anticipated features to the Java language. The most notable among them was lambda functions. They made possible new frameworks such as Java 8 Streams, as well as new features in existing frameworks like JUnit 5.

Apache Flink also supports lambda functions, and in this post, I’ll show how to enable them and how to use them in your applications.


Calculating movie ratings distribution with Apache Flink

If you’ve been following recent news in the Big Data world, you’ve probably heard about Apache Flink. This platform for batch and stream processing, which is built on a few significant technical innovations, could become a real game-changer, and it is starting to compete with existing products like Apache Spark.

In this post, I would like to show how to implement a simple batch processing algorithm using Apache Flink. We will work with a dataset of movie ratings and produce a distribution of user ratings. In the process, I’ll show a few tricks that you can use to improve the performance of your Flink applications.


Apache Flink: A New Landmark on the Big Data Landscape

There is no shortage of Big Data applications and frameworks nowadays, and sometimes it may even seem that all niches have already been filled. That’s not how the creators of Apache Flink see it, though. Even though their project is not yet as well known as Spark or Hadoop, it has brought enough innovations to become a real game-changer in the world of Big Data.

In this article, I would like to introduce Apache Flink, describe its main features, and explain why it is different from other available solutions. I’ll end the article with an example of a simple stream processing application using Flink.


© 2017 Brewing Codes
