Wednesday, February 6, 2008

Database Normalization: First Normal Form

Data is in first normal form when organized into relations (records) that have no repeating groups of data items.

Description of a Relation

A relation is a flat file, or a two-dimensional table of data elements, in which each row is equivalent to a logical record and each column contains the values for one attribute of the relation. A relation should contain only records of identical format. Relations are often referred to as records when describing normalization.
 
Note:  There is no connection between the terms "relation" and "relationship."

First Rule of Normalization

A relation may not contain any repeating groups of attributes.

Second Rule of Normalization

Internal structures within data elements of a relation must be split into separate attributes.
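A minimal sketch of this rule in Python, assuming a hypothetical employee record whose "name" and "phone" fields each pack two values into one string (the field names and formats are illustrative assumptions):

```python
# Hypothetical record: "name" and "phone" each hide an internal structure.
raw = {"employee_id": 17, "name": "Doe, Jane", "phone": "212-5550142"}


def split_internal_structures(record):
    """Return a new record in which composite fields are split into separate attributes."""
    # Assumed format "Last, First": split on the first comma only.
    last_name, first_name = [part.strip() for part in record["name"].split(",", 1)]
    # Assumed format "area-local": split on the first hyphen only.
    area_code, local_number = record["phone"].split("-", 1)
    return {
        "employee_id": record["employee_id"],
        "last_name": last_name,
        "first_name": first_name,
        "area_code": area_code,
        "local_number": local_number,
    }


flat = split_internal_structures(raw)
```

After the split, each attribute holds a single value that can be queried or indexed on its own, for example all employees with a given area code.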

Applying the Rules to Achieve First Normal Form

Under first normal form, all occurrences of a record type must contain the same number of fields. First normal form permits derived attributes but excludes variable repeating fields and groups.
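The two conditions in this paragraph can be checked mechanically. The following is a rough sketch, assuming records are represented as Python dictionaries; the function name and sample data are hypothetical, and the check looks only at record shape, not at internal structures hidden inside strings:

```python
def violates_first_normal_form(records):
    """Return True if records differ in their fields or hold a repeating (collection) value."""
    if not records:
        return False
    expected_fields = set(records[0])
    for record in records:
        # Every occurrence of the record type must have the same fields.
        if set(record) != expected_fields:
            return True
        # A list, tuple, set, or dict stored in a field is a repeating group.
        if any(isinstance(value, (list, tuple, set, dict)) for value in record.values()):
            return True
    return False


flat_records = [
    {"student_id": 1, "name": "Ann"},
    {"student_id": 2, "name": "Bob"},
]
nested_records = [
    {"student_id": 1, "name": "Ann", "courses": ["Math", "Art"]},
]
```

Here `violates_first_normal_form(flat_records)` returns `False`, while the `courses` list in `nested_records` makes the second call return `True`.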

To normalize data into first normal form, apply the first two rules by:

·        moving attributes that repeat multiple times for an entity to a separate entity, and ensuring the attributes do not repeat in the new entity,

·        splitting internal structures into multiple attributes.

Most programming languages allow programmers to create records that are not flat, i.e., that contain repeating groups. For example, the OCCURS clause in COBOL defines a repeating group. In a record that contains a repeating group, the key of the complete record cannot uniquely identify a specific occurrence within the group. In general, a non-flat record is normalized by converting it into two or more flat records.
 
For example, to place this record in first normal form, the repeating group is removed and a separate record is created.
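A minimal sketch of this step in Python, using a hypothetical order record (the names `order_id`, `customer`, `items`, and `line_no` are assumptions for illustration, not the record referred to above):

```python
# Hypothetical non-flat record: "items" is a repeating group.
order = {
    "order_id": 1001,
    "customer": "Acme",
    "items": [
        {"product": "bolt", "quantity": 50},
        {"product": "nut", "quantity": 75},
    ],
}


def to_first_normal_form(record):
    """Split a record with an "items" repeating group into one header and flat line records."""
    # The header keeps every field except the repeating group.
    header = {key: value for key, value in record.items() if key != "items"}
    # Each occurrence becomes its own flat record, keyed by (order_id, line_no).
    lines = [
        {"order_id": record["order_id"], "line_no": number, **item}
        for number, item in enumerate(record["items"], start=1)
    ]
    return header, lines


order_header, order_lines = to_first_normal_form(order)
```

Each line record carries the parent key plus a line number, so every occurrence of the former repeating group can now be identified by its own key.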
