Hadoop MapReduce (MR)

Combiner (mini-reducer)

Goal: pre-aggregate the key-value pairs on the map side, before the shuffle, so that less data is sent to the reducers: ([apple,1],[apple,1]) -> ([apple,2])
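
The combiner idea can be sketched in plain Python, without Hadoop. This is only an illustration under assumptions: the function names (map_words, combine, reduce_counts) and the two sample inputs are hypothetical, and a real job would implement the combiner through Hadoop's own interfaces.

```python
from collections import defaultdict

def map_words(line):
    """Map phase: emit a (word, 1) pair for every word in the line."""
    return [(word, 1) for word in line.split()]

def combine(pairs):
    """Combiner: sum the counts per key within a single mapper's output."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

def reduce_counts(combined_outputs):
    """Reduce phase: merge the pre-aggregated pairs from all mappers."""
    totals = defaultdict(int)
    for pairs in combined_outputs:
        for key, value in pairs:
            totals[key] += value
    return dict(totals)

# Two hypothetical mappers, each running its own combiner locally.
mapper_1 = combine(map_words("apple apple pear"))
mapper_2 = combine(map_words("apple pear pear"))
result = reduce_counts([mapper_1, mapper_2])
print(mapper_1, mapper_2, result)
```

Each mapper emits three pairs, but only two pairs per mapper reach the reducer after combining. This works here because summation is associative and commutative; an operation without those properties (for example, a plain average) cannot reuse the reducer logic as a combiner unchanged.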


Apache Spark

Architecture

spark-submit

./bin/spark-submit --master <url> --deploy-mode <mode> app.py [args]
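
To make the argument order of the command above concrete, here is a small sketch that assembles it from Python. The helper build_spark_submit, the master URL local[2], and the argument input.txt are assumptions chosen for illustration; only the --master and --deploy-mode flags come from the command itself.

```python
import shlex

def build_spark_submit(master, deploy_mode, app, app_args=()):
    """Assemble the spark-submit argument list (hypothetical helper)."""
    return [
        "./bin/spark-submit",
        "--master", master,            # e.g. local[2] to run locally with 2 threads
        "--deploy-mode", deploy_mode,  # "client" or "cluster"
        app,                           # the application file comes after the options
        *app_args,                     # everything after it is passed to the application
    ]

cmd = build_spark_submit("local[2]", "client", "app.py", ["input.txt"])
print(shlex.join(cmd))  # shell-quoted form of the command
```

The list could be handed to subprocess.run(cmd, check=True) from the Spark installation directory. Options placed after app.py are treated as application arguments, not as spark-submit options.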

Deploy modes: