Saturday, February 2, 2008

Database Normalization: Performance & Storage Trade-offs

Database Normalization: Performance & Storage Trade-offs

Normal form (reduced redundancy) tends to penalize efficient retrieval of data, since data which may have been retrievable from one table in an unnormalized design may have to be retrieved from several tables in the normalized form.  Normal form, however, usually improves the performance of updates and deletions.

Although normalization results in more tables, less data storage is typically required because non-normalized data has more value redundancy.

For example, moving the attribute WAREHOUSE IDENTIFICATION CODE from part records means that a separate record is needed for each separate WAREHOUSE.  However, data storage is reduced because each part record does not contain a value for WAREHOUSE IDENTIFICATION CODE (assuming that the additional storage overhead of the additional tables is insignificant).

Normalization enhances the integrity of the data by minimizing data redundancy and inconsistency.  However, there may be additional performance costs for retrieval of data in certain applications.  As a result, the physical database design may deviate from the normalized form due to performance considerations.  Consider an application that wants the addresses of all warehouses that stock a certain part.  In the unnormalized form, the application searches for one record type.  With the normalized design, the application has to search for and join two record types.

No comments: