Pro Hadoop Data Analytics

A book by Kerry Koitzsch, ISBN



The Apache Hadoop software library has come into its own. It is the basis for advanced distributed development at a host of companies, government institutions, and scientific research facilities. The Hadoop ecosystem now contains dozens of components for everything from search, databases, and data warehousing to image processing, deep learning, and natural language processing. With the advent of Hadoop 2, different resource managers may be used to provide an even greater level of sophistication and control than was previously possible. Competitors, replacements, successors, and mutations of the Hadoop technologies and architectures abound. These include Apache Flink, Apache Spark, and many others. The “death of Hadoop” has been announced many times by software experts and commentators. We have to face the question squarely: is Hadoop dead? It depends on the perceived boundaries of Hadoop itself. Do we consider Apache Spark, the in-memory successor to Hadoop’s batch file approach, a part of the Hadoop family simply because it also uses HDFS, the Hadoop file system? Many other “gray areas” exist in which newer technologies replace or enhance the original “Hadoop classic” features. Distributed computing is a moving target, and the boundaries of Hadoop and its ecosystem have changed remarkably over a few short years. In this book, we attempt to show some of the diverse and dynamic aspects of Hadoop and its associated ecosystem, and to convince you that, although changing, Hadoop is still very much alive, relevant to current software development, and particularly interesting to data analytics programmers.



Pages: 304
Published in: 2017
Average price: Free
Times purchased: 239



Kerry Koitzsch has more than twenty years of experience in computer science, image processing, and software engineering, and has worked extensively with Apache Hadoop and Apache Spark in particular. Kerry specializes in software consulting for customized big data applications, including distributed search, image analysis, stereo vision, and intelligent image retrieval systems. Kerry currently works for Kildane Software Technologies, Inc., a robotic systems and image analysis software provider in Sunnyvale, California.