For a long time, Hadoop and MapReduce were inseparable. Additional data manipulation libraries, such as Hive, were added to make querying the stored data easier. But as data volumes and cluster sizes grew, MapReduce became too slow, and with the release of Hadoop 2, YARN was introduced to schedule the resources in a Hadoop cluster. In parallel with the creation of YARN, Hortonworks, Microsoft's Hadoop partner, launched the Stinger initiative to speed up Hive by 100x in three waves: the first wave was the introduction of ORC files, the second optimized the query engine, and with the third Tez was released. In this session we will look at all the different aspects of Hive, Stinger, and Tez, from the history and future to the internals and practical use.