Big data notions

Some machine learning

Strengths of MapReduce

Bottlenecks of MapReduce

Some Big Data Implementations

Impala: analytical query engine (analyze big data interactively)

Spark: in-memory resilient distributed datasets (RDDs)

Good for iterative algorithms, since it can cache intermediate results in memory and reuse them across iterations. Also provides distributed algorithms for machine learning.
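To see why caching helps iterative algorithms, here is a minimal sketch in plain Python. It is a toy stand-in, not the Spark API: ToyDataset, expensive_load, and iterate_mean are hypothetical names made up for this illustration. Without caching, every iteration rebuilds the dataset; with caching, it is built once and reused.

```python
class ToyDataset:
    """Hypothetical stand-in for a lazily evaluated dataset (not the Spark API)."""

    def __init__(self, compute):
        self._compute = compute    # function that (re)builds the data
        self._cached = None        # holds the data once it has been cached
        self._use_cache = False
        self.compute_calls = 0     # how many times the data was rebuilt

    def cache(self):
        """Mark the dataset so the first computed result is kept in memory."""
        self._use_cache = True
        return self

    def collect(self):
        """Return the data, rebuilding it unless a cached copy exists."""
        if self._use_cache and self._cached is not None:
            return self._cached
        self.compute_calls += 1
        data = self._compute()
        if self._use_cache:
            self._cached = data
        return data


def expensive_load():
    """Pretend this reads and parses a large input."""
    return [float(x) for x in range(1, 6)]


def iterate_mean(dataset, steps=10, lr=0.5):
    """Iteratively move an estimate toward the mean, scanning the data each step."""
    estimate = 0.0
    for _ in range(steps):
        data = dataset.collect()
        gradient = sum(estimate - x for x in data) / len(data)
        estimate -= lr * gradient
    return estimate


if __name__ == "__main__":
    uncached = ToyDataset(expensive_load)
    cached = ToyDataset(expensive_load).cache()
    iterate_mean(uncached)
    iterate_mean(cached)
    print("rebuilds without cache:", uncached.compute_calls)
    print("rebuilds with cache:", cached.compute_calls)
```

In a MapReduce-style job, each iteration would typically reread its input from disk, which is the cost this toy counter stands in for.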

Hadoop: MapReduce implementation in Java
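The MapReduce model itself can be sketched with a word count in plain Python. This is only an illustration of the three phases (map, shuffle, reduce) on a single machine, not Hadoop's Java API; the function names are assumptions chosen for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of text records."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    docs = ["big data big ideas", "data beats ideas"]
    print(word_count(docs))
```

In a real cluster the map and reduce calls run in parallel on different nodes, and the shuffle moves data over the network; here everything runs in one process to keep the structure visible.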

Figure: Hadoop cluster (Hadoop_Cluster.svg)