Session abstract:
Hadoop has its genesis in analyzing massive amounts of data with the MapReduce framework. SQL-on-Hadoop followed shortly after, paving the way for the whole schema-on-read notion. Discovering graph relationships in your data is the next logical step. Apache Giraph (modeled on Google’s Pregel) lets you apply the power of the bulk synchronous parallel (BSP) approach to unstructured data. In this talk we will focus on practical advice on how to get up and running with Apache Giraph in minutes, how to start analyzing simple data sets with built-in algorithms, and finally how to implement your own graph processing using the APIs provided by the project. We will then dive into how Giraph integrates with Hadoop ecosystem projects (Hive, HBase, Accumulo, etc.) and provide a whirlwind tour of the Giraph architecture.