Virtualization and the Data Warehouse
By Tony Marasco
Virtualization gives IT staff a way to lower total cost of ownership (TCO) and raise the availability of servers in their environment, and enterprise planning increasingly builds on this concept. Here, Precision.BI addresses virtualization within the framework of its current data warehouse product.
I was an early adopter of virtualization. In 2001, I constructed a virtual network of NT 4.0 servers running a Web reporting tool, and it performed well compared to the physical servers. The system provided high availability: automatic failover if one of the servers was taken offline for any reason. The virtual network was stable, and virtualization was recommended as an alternative to multiple physical machines. The one element that was not virtualized was the database server.
With Precision.BI PresentationCenter, virtualization is a recommended option. The Web servers require limited storage access, so virtual disks perform well in this environment. Multiple servers with 4 GB of RAM each can easily run on a single host with shared processors/cores, maximizing the use of the host system’s hardware. Scaling the server farm is a matter of adding another virtual server to the host.
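As a rough illustration of the sizing arithmetic behind running several 4 GB Web servers on one host, the sketch below estimates how many guests fit in a host's memory. The function name, the 4 GB reserved for the hypervisor, and the no-overcommit rule are assumptions made for this example, not Precision.BI guidance.

```python
def max_guests(host_ram_gb: int, guest_ram_gb: int = 4, reserved_gb: int = 4) -> int:
    """Estimate how many guests fit in host RAM without memory overcommit.

    reserved_gb is a hypothetical allowance for the hypervisor itself;
    adjust it to match the actual host.
    """
    if guest_ram_gb <= 0:
        raise ValueError("guest_ram_gb must be positive")
    usable_gb = max(host_ram_gb - reserved_gb, 0)
    # Integer division: a partial guest does not count.
    return usable_gb // guest_ram_gb


# A 32 GB host with 4 GB held back leaves room for seven 4 GB Web servers.
print(max_guests(32))  # 7
```

Memory is only one constraint. Shared processors/cores and disk I/O on the host would need a similar check before adding another virtual server.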
Precision.BI is an ad hoc system. Data are accessed as needed, with no pre-aggregation or preselection of data as would be found in a cube-based environment. Therefore, maximum throughput of the CPU, RAM, and storage is critical for optimum performance. For these reasons, running SQL Server in a virtual environment is not recommended if optimal throughput is desired.
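To make the contrast concrete, the sketch below compares an ad hoc request, which scans the detail rows every time, with a lookup against totals computed ahead of time, as a cube-based system would do. The rows, department names, and function names are invented for illustration and do not reflect how Precision.BI or SQL Server execute queries.

```python
from collections import defaultdict

# Hypothetical detail rows: (department, month, charge amount).
ROWS = [
    ("Cardiology", "Jan", 120.0),
    ("Cardiology", "Feb", 80.0),
    ("Oncology", "Jan", 200.0),
]


def ad_hoc_total(rows, department):
    """Ad hoc style: read every detail row on each request."""
    return sum(amount for dept, _month, amount in rows if dept == department)


def build_totals(rows):
    """Cube style: aggregate once, ahead of time."""
    totals = defaultdict(float)
    for dept, _month, amount in rows:
        totals[dept] += amount
    return dict(totals)


PRECOMPUTED = build_totals(ROWS)

# Both approaches give the same answer; they differ in the work done per request.
assert ad_hoc_total(ROWS, "Cardiology") == PRECOMPUTED["Cardiology"]
```

The ad hoc path does work proportional to the number of detail rows on every request, which is why CPU, RAM, and storage throughput matter so much for this kind of system.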
Precision.BI does use virtualization for quality assurance testing, including the database server, because optimal data throughput is not critical in such an environment. In addition, the ability to roll back changes (via snapshots) is valuable during testing. If your site is looking to deploy a test environment, virtualization of the test server is an option to consider.
As computing performance continues to increase and virtualization continues to improve, a virtual database server for data warehousing may become a strong option. At this time, however, virtualization of the database server is not recommended where optimal database access and throughput are required.