Now I preparing to teach a student about Data Warehouses. These are technologies that aggregate data from multiple sources so they can be compared and analyzed for various purposes. They can:
There are many types of data warehouses like Vertica, Teradata, Oracle, and IBM. There is Apache Hive, a new open-source warehouse and the main one my student and I will be going over for this session. It part of the larger Hadoop ecosystem.
*Hadoop: distributed computing framework for processing millions of records. The process for Hadoop goes like this:
You must be signed in to post a comment!