Tuesday, July 7, 2009

Normalization / De-Normalization

Normalization: It's the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminate redundant data (for example, storing the same data in more than one table) and ensure data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored.
1. Eliminate Repeating Groups - Make a separate table for each set of related attributes, and give each table a primary key.
2. Eliminate Redundant Data - If an attribute depends on only part of a multi-valued key, remove it to a separate table.
3. Eliminate Columns Not Dependent On Key - If attributes do not contribute to a description of the key, remove them to a separate table.
4. Isolate Independent Multiple Relationships - No table may contain two or more 1:n or n:m relationships that are not directly related.
5. Isolate Semantically Related Multiple Relationships - There may be practical constrains on information that justify separating logically related many-to-many relationships.
1st Normal Form (1NF)
Def: A table (relation) is in 1NF if
1. There are no duplicated rows in the table.
2. Each cell is single-valued (i.e., there are no repeating groups or arrays).
3. Entries in a column (attribute, field) are of the same kind.
Note: The order of the rows is immaterial; the order of the columns is immaterial.
Note: The requirement that there be no duplicated rows in the table means that the table has a key (although the key might be made up of more than one column—even, possibly, of all the columns).
2nd Normal Form (2NF)
Def: A table is in 2NF if it is in 1NF and if all non-key attributes are dependent on all of the key.
Note: Since a partial dependency occurs when a non-key attribute is dependent on only a part of the (composite) key, the definition of 2NF is sometimes phrased as, "A table is in 2NF if it is in 1NF and if it has no partial dependencies."
3rd Normal Form (3NF)
Def: A table is in 3NF if it is in 2NF and if it has no transitive dependencies.
Boyce-Codd Normal Form (BCNF)
Def: A table is in BCNF if it is in 3NF and if every determinant is a candidate key.
4th Normal Form (4NF)
Def: A table is in 4NF if it is in BCNF and if it has no multi-valued dependencies.
5th Normal Form (5NF)
Def: A table is in 5NF, also called "Projection-Join Normal Form" (PJNF), if it is in 4NF and if every join dependency in the table is a consequence of the candidate keys of the table.
Domain-Key Normal Form (DKNF)
Def: A table is in DKNF if every constraint on the table is a LOGICAL consequence of the definition of keys and domains.

De-Normalization:
Denormalization is a technique to move from higher to lower normal forms of database modeling in order to speed up database access. You may apply Denormalization in the process of deriving a physical data model from a logical form.

No comments:

Post a Comment