Massively Parallel Processing (MPP) systems such as Teradata and Netezza are database appliances that have major advantages over other solutions

The data is stored in an organised fashion distributed across many nodes

The defined distribution, which is data dependent, is key to performance
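
As a rough illustration of why the choice of distribution key matters, the sketch below hashes rows onto nodes and compares a high-cardinality key with a low-cardinality one. It is a minimal sketch only: the table, the column names (order_id, status), the node count of 8 and the use of SHA-256 are all assumptions made for the example, and real appliances use their own internal hash functions.

```python
import hashlib
from collections import Counter


def node_for(key, num_nodes):
    """Map a distribution-key value to a node using a stable hash.

    SHA-256 is an assumption for this sketch, not what any appliance uses.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def rows_per_node(rows, key_column, num_nodes):
    """Count how many rows land on each node for a given distribution key."""
    counts = Counter(node_for(row[key_column], num_nodes) for row in rows)
    return [counts.get(node, 0) for node in range(num_nodes)]


def skew(counts):
    """Ratio of the busiest node to the average node (1.0 is perfectly even)."""
    return max(counts) / (sum(counts) / len(counts))


# Hypothetical table: 10,000 orders with a unique order_id
# and only three distinct status values.
statuses = ("open", "shipped", "closed")
rows = [{"order_id": i, "status": statuses[i % 3]} for i in range(10_000)]

by_id = rows_per_node(rows, "order_id", 8)
by_status = rows_per_node(rows, "status", 8)

print("distributed on order_id:", by_id, "skew", round(skew(by_id), 2))
print("distributed on status:  ", by_status, "skew", round(skew(by_status), 2))
```

Distributing on status can use at most three of the eight nodes, so the busiest node carries far more than its share and the rest sit idle. A query is only as fast as its slowest node, which is why the distribution has to be chosen with the data in mind.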

This should not be confused with distributed systems such as Big Data (Hadoop), which do not distribute data in an organised way

An appreciation of the technology and the principles behind it is key to developing and tuning on MPP systems

MPP can relate data in a similar way (on the face of it) to a traditional database; Big Data cannot
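
One way to picture how related data meets up on an MPP system is a join between two tables distributed on the same key. The sketch below is illustrative only: the tables, the column names, the four-node cluster and the SHA-256 hash are assumptions, and it ignores everything a real optimiser does. It shows that when both tables are hashed on the join column, each node can join its own slice without moving rows between nodes.

```python
import hashlib
from collections import defaultdict

NUM_NODES = 4  # hypothetical cluster size


def node_for(key):
    """Stable hash of the distribution key (SHA-256 is an assumption here)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_NODES


def distribute(rows, key_column):
    """Place each row on the node chosen by hashing its distribution key."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[key_column])].append(row)
    return nodes


def colocated_join(left_nodes, right_nodes, key_column):
    """Join node by node; rows with equal keys already share a node."""
    joined = []
    for node in range(NUM_NODES):
        # Build a local lookup from this node's slice of the right table.
        lookup = defaultdict(list)
        for row in right_nodes.get(node, []):
            lookup[row[key_column]].append(row)
        # Probe it with this node's slice of the left table.
        for row in left_nodes.get(node, []):
            for match in lookup.get(row[key_column], []):
                joined.append({**row, **match})
    return joined


# Hypothetical data: 5 customers and 12 orders, both distributed on customer_id.
customers = [{"customer_id": i, "name": f"customer {i}"} for i in range(5)]
orders = [{"order_id": 100 + i, "customer_id": i % 5, "amount": 10 * i} for i in range(12)]

result = colocated_join(
    distribute(orders, "customer_id"),
    distribute(customers, "customer_id"),
    "customer_id",
)
print(len(result), "joined rows")
```

If the two tables were distributed on different columns, one of them would first have to be redistributed or broadcast across the nodes, which is the kind of cost that careful design avoids.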

IBM markets Netezza as an appliance that requires no tuning. While this is true in some respects, a well-tuned solution can be orders of magnitude faster and more scalable than any alternative

MPP appliances have a high initial investment and (when done well) a high initial development cost, but a low long-term maintenance cost

Big Data, by comparison, has a low initial investment but a high development cost and a staggeringly high maintenance cost

Long term, if done well initially, MPP systems are more economical, more flexible and considerably higher performing

That said, Big Data has its place as cheap offline storage and when it is necessary to deal with unstructured data