We discussed our reading on Apache Hadoop, MapReduce, and distributed computing in general. Having concluded that our time is best spent setting up an Apache Hadoop cluster (with the hope of writing some of our own software later), we moved on to the practical details of setting up our test cluster in the ITL.
As we began to discuss setting up our test cluster, we spent some time on ITL usability. An important constraint on using the ITL for our cluster is that the lab must remain usable at all reasonable times. We discussed possible methods for letting users remove nodes from the cluster, automated systems for submitting job requests, and the possibility of running parts of the cluster in virtual machines to avoid some of these issues.
We finished the meeting by discussing our focus for the next week. We will all read the general section of the Apache Hadoop Docs and the MapReduce By Examples presentation. At next week's meeting we will pick a time (hopefully next weekend) to set up a few test clusters and begin learning to use Apache Hadoop. The links for the reading are below; if you want to read anything else about Hadoop or distributed computing in general, feel free.
See you all next Tuesday at 6PM!
Links:
- MapReduce By Examples
- Apache Hadoop Docs