Programming trend No. 10: Real parallelism begins to get practical for all
Computer architects have been talking about machines with true parallel architectures for years, but the programmers in the trenches are just starting to get the tools that make it possible.
The parallelism is appearing in two prominent areas: multinode databases and Hadoop jobs. Some mix the two.
Most NoSQL data stores offer to help spread the workload over multiple machines. Some offer automatic sharding, which splits the data set into pieces, synchronizes the machines that host a given piece, and directs queries to the right machines as necessary. Some offer duplication or backup, a feature that's a bit older; some do both.
Hadoop is an open source framework that will coordinate a number of machines working on a problem and compile their work into a single answer. The project imitates some of the Map/Reduce framework developed by Google to help synchronize Web crawling efforts, but the project has grown well beyond these roots.
Tools like this make it easier than ever to toss more than one machine at a problem. The infrastructure is now solid enough that the enterprise architects can rely on deploying racks of machines with only a bit of hand-holding and fussing.
No comments:
Post a Comment