Standalone vs YARN cluster for Flink

Flink offers two ways to set up a cluster: as a standalone cluster, or on top of YARN.

Here I’ll list the pros and cons of each, as a comparison.

Standalone mode

pros

  • no dependency on external components;
  • easy to add or remove TaskManagers in the cluster;
  • easy to debug and to retrieve logs;

cons

  • no job isolation, as the slots of a TaskManager share the same JVM (refer to Job Isolation on Flink);
  • needs ZooKeeper for recovery from node failures;
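
To give an idea of what the ZooKeeper dependency looks like in practice, here is a minimal sketch of the high-availability settings in flink-conf.yaml. The host names and the storage path are placeholders I made up, and the key names have changed across Flink releases (older versions used a recovery.* prefix), so treat this as an illustration and check the documentation for your version.

```yaml
# flink-conf.yaml (sketch only; hosts and paths below are placeholders)

# Use ZooKeeper to coordinate JobManager leader election and recovery
high-availability: zookeeper

# ZooKeeper quorum, as a comma-separated list of host:port pairs
high-availability.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181

# Durable storage for the metadata needed to recover a failed JobManager
high-availability.storageDir: hdfs:///flink/ha/

# Root znode under which Flink keeps its cluster state
high-availability.zookeeper.path.root: /flink
```

A standalone HA setup also usually lists more than one JobManager in conf/masters, so that a standby can take over.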

YARN mode

More specifically, you have two choices with YARN (see yarn setup):

  • set up a Flink session, which works like a virtual cluster;
  • run a Flink job directly on YARN.

pros

  • job isolation provided by YARN;
  • automatic recovery from node failures;
  • flexible resource capacity per TaskManager for different jobs;
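
As a rough sketch of the last two points, these are the kind of flink-conf.yaml settings involved when running on YARN. The values are only illustrative, not what we run. Recovery depends on YARN restarting failed containers, and the number of ApplicationMaster restarts is also capped on the YARN side by yarn.resourcemanager.am.max-attempts. The key for TaskManager memory differs between Flink versions, so I leave it out here.

```yaml
# flink-conf.yaml (sketch only; values are illustrative)

# How many times YARN may restart the Flink ApplicationMaster (JobManager)
# before the application is considered failed
yarn.application-attempts: 10

# Default number of slots per TaskManager; each session or job on YARN
# can be started with its own value
taskmanager.numberOfTaskSlots: 4
```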

cons

  • the extra cost of running and maintaining YARN;
  • so far, YARN is tied closely to a distributed file system (HDFS/AWS/GoogleCloud);

In our environment, we finally decided to go with YARN, as we value the isolation feature much more than the others: we need it to support multiple tenants.
