Introduction to Normalization in Databases

Normalization helps eliminate redundancies in data design. Redundancies are systematically removed using a three or four step approach. The four important normal forms are 1NF, 2NF, 3NF, and BCNF. This tutorial will cover the first three normal shapes.

The three normal forms apply only to Dr. Codd’s relational model, also known as the relational data model.

In first normal form (1NF), multivalued columns are not allowed. If there are multiple value columns, such as multiple subjects recorded by students in a course registration system or multiple phone numbers used to call patients in a patient registration system, they should be split into multiple records, one for each value. in the multi-value column. In this form of normalization, each record should be uniquely identifiable and thus would imply the use of the primary key.

In second normal form (2NF), in addition to the database being in first normal form, all non-key attributes must be functionally dependent only on the primary key. In the case of a composite key, the non-key attributes must fully depend on the entire composite key. For example, consider the following customer registration system. The following tuple (Customer ID, Order ID, Order Date, Store Location) has the primary key as Customer ID and Order ID. The two non-key attributes are Order Date and Store Location. Here, the store location is not dependent on the primary key, or in other words, this non-key attribute is not functionally dependent on the primary key. So this table needs to be split into two tables to make it 2NF.

Customer -Order (Customer ID, Order ID, Location ID) Location (Location ID, Location Name).

In third normal form, the database must be in second normal form, and in addition, normal form ensures that redundancy is further reduced by ensuring that each non-key attribute has no dependency on any other non-key attribute or have no transitive functional dependencies. In popular relational database terminology, this means that the data model has referential integrity constraints enabled by design.

An example of a database that is not in 3NF is the tuple with the attributes (Customer ID, Name, Address Line 1, Address Line 2, Address Line 3, PIN Code). This is not in 3NF since the address line elements depend only on the PIN code). Address line 3). Here, the Pincode attribute in the client tuple is the foreign key in the table that matches the primary key in the Pincode tuple. This results in what is called a referential integrity constraint where the foreign key in one table corresponds to the primary key in another table.

Leave a Reply

Your email address will not be published. Required fields are marked *