MS SQL Server 1st 2nd and 3rd Normal forms

Hello friends, welcome to shrash studio learning, in this article we will learn about three of the most asked database design topics — First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF). These are the core rules of database normalization — a set of guidelines that help you design clean, efficient, and bug-free tables. We will use one small example throughout, keep the English simple, and show how a messy table slowly becomes a perfectly designed schema.

What Is Normalization?

Normalization is the process of organizing the columns and tables of a database to remove duplicate data, avoid update problems, and make your queries simpler and faster. It is done in stages called normal forms. In this article we cover the first three — 1NF, 2NF, and 3NF — which are enough for almost every real project and almost every interview question.

Our Starting Messy Table

Let's imagine a small library that tracks book loans in one single table called BookLoans. The designer was in a hurry and dumped everything into one place.

LoanId BorrowerName BorrowerPhones LibraryBranch BookTitle AuthorName DaysBorrowed FinePerDay TotalFine
1 Anita Sharma 9876, 9888 Mumbai Central Atomic Habits James Clear 5 10 50
2 Rohit Verma 9123 Mumbai Central Deep Work Cal Newport 3 10 30
3 Anita Sharma 9876, 9888 Pune North Atomic Habits James Clear 2 15 30

This table works, but it hides many problems. "Mumbai Central" is repeated. Phone numbers are stuffed into one column. Author information is duplicated whenever the same book is loaned. And TotalFine could be calculated, not stored. Let's fix all of this, one normal form at a time.

First Normal Form (1NF)

1NF has two simple rules.

THE TWO RULES OF 1NF

Rule 1

Break data into the smallest logical pieces
No comma-separated lists or combined names

Rule 2

Avoid repeating groups & duplicate data
Move repeating values into master tables

Applying Rule 1: Make Columns Atomic

Our BorrowerName mixes first and last name, and BorrowerPhones stores multiple phones in one cell. Searching by last name or by one phone becomes painful and slow. Split them up.


-- Before (bad)
BorrowerName  = "Anita Sharma"
BorrowerPhones = "9876, 9888"

-- After (1NF Rule 1)
FirstName = "Anita"
LastName  = "Sharma"

-- And phones move to their own table:
Phones(BorrowerId, Phone)

Applying Rule 2: Remove Repeating Data

"Mumbai Central" and "Pune North" keep repeating for every loan from those branches. If "Mumbai Central" is ever renamed, we'd have to update dozens of rows. Move branches into their own master table and refer to them by an ID.


-- Master table
Branches(BranchId PK, BranchName)
  1, "Mumbai Central"
  2, "Pune North"

-- Loan table now just stores BranchId
BookLoans(LoanId, ..., BranchId FK)

Quick check: A table is in 1NF when every cell holds exactly one value and there are no repeating groups. Once done, your SELECT queries stop needing string-splitting tricks.

Second Normal Form (2NF)

2NF builds on 1NF. It says: every non-key column must depend on the WHOLE primary key, not just a part of it.

This rule only matters when your primary key is composite — made of two or more columns. In our example let's imagine the primary key of BookLoans is (LoanId, BookTitle). Now look at AuthorName. It depends only on BookTitle — not on LoanId. That's a partial dependency, and it breaks 2NF.

The Problem with Partial Dependencies

Because author data is glued to every loan row, if James Clear updates his pen name, we must update dozens of rows. One typo and suddenly the same author has two names in our data.

The Fix: Move Partial Columns Into Their Own Table

Put each book's own details in a Books master table and reference it from BookLoans via a BookId.


-- Book master table
Books(BookId PK, Title, Author, Genre)
  1, "Atomic Habits", "James Clear", "Self-help"
  2, "Deep Work",     "Cal Newport", "Productivity"

-- Loan table now just stores BookId
BookLoans(LoanId PK, BorrowerId, BranchId, BookId, DaysBorrowed, FinePerDay)

Now the author and genre live in one row inside Books. Updating them is a one-liner, not a bulk operation. That is 2NF in action.

Quick check: A table is in 2NF when (a) it is already in 1NF, and (b) no non-key column depends on just a part of the composite primary key. If your table has a single-column primary key, 2NF is automatically satisfied.

Third Normal Form (3NF)

3NF adds one more rule: no non-key column should depend on another non-key column. In simple words — no calculated fields and no indirect dependencies.

Look back at our loan row:


DaysBorrowed = 5
FinePerDay   = 10
TotalFine    = 50     ←  depends on the other two columns

TotalFine can always be calculated as DaysBorrowed * FinePerDay. Storing it is risky — if one of the two inputs changes, the stored total becomes wrong unless we remember to update it. That is a classic 3NF violation.

The Fix: Drop Calculated Columns

Delete TotalFine from the table. Whenever you need it, compute it on the fly in your query.


SELECT LoanId,
       DaysBorrowed,
       FinePerDay,
       DaysBorrowed * FinePerDay AS TotalFine
FROM BookLoans;

Another common 3NF violation is an indirect dependency. For example, if FinePerDay really depends on the branch (Mumbai charges 10, Pune charges 15), then FinePerDay should sit in the Branches table — not on every loan row.

Quick check: A table is in 3NF when (a) it is in 2NF, and (b) every non-key column depends only on the primary key — not on another non-key column.

Our Final, Fully-Normalized Design

After applying 1NF, 2NF, and 3NF, the messy single table has become a clean small schema:


Borrowers(BorrowerId PK, FirstName, LastName)

Phones(PhoneId PK, BorrowerId FK, Phone)

Branches(BranchId PK, BranchName, FinePerDay)

Books(BookId PK, Title, Author, Genre)

BookLoans(LoanId   PK,
          BorrowerId FK,
          BranchId   FK,
          BookId     FK,
          DaysBorrowed)

Every piece of information now lives in exactly one place. Updates touch one row. Queries stay simple. Duplicates are impossible.

Quick Comparison

Form Main Rule Fix
1NF Atomic values + no repeating groups Split columns; move repeated values to master tables
2NF No partial dependency on a composite PK Move partial-dependent columns to their own table
3NF No column depends on another non-key column Remove calculated fields; move indirect columns out

Common Mistakes

Mistake 1: Storing comma-separated values in a single column. It looks compact but breaks 1NF and kills query performance.

Mistake 2: Keeping a column that depends only on part of the key. Always ask — "does this column depend on the whole primary key?" If no, move it out.

Mistake 3: Storing calculated totals. They drift out of sync the moment inputs change. Calculate at query time.

Mistake 4: Over-normalizing. Breaking a phone number into three fields (country code, area code, number) is usually pointless. Normalize only where it actually helps.

Balance tip: Higher normal forms reduce duplication but add more joins. In read-heavy analytics systems, it is often fine to denormalize a little for speed. In transactional systems (OLTP), stick to 3NF.

The Interview Answer

If an interviewer asks "Explain 1NF, 2NF, and 3NF":

"1NF means every column should hold a single atomic value with no repeating groups — split combined fields and move repeating data into master tables. 2NF builds on 1NF and says no non-key column should depend on just a part of a composite primary key. 3NF builds on 2NF and says no non-key column should depend on another non-key column — that means no calculated fields and no indirect dependencies."

Then walk them through one example like our BookLoans table.

Summary

Database normalization is the art of cleaning up your tables so data lives in only one place. The first three normal forms — 1NF, 2NF, and 3NF — solve most real-world design problems.

1NF forces atomic values and removes repeating groups. 2NF removes partial dependencies on composite keys. 3NF removes calculated fields and indirect dependencies. Apply them in order and you will almost always end up with a clean, efficient schema.

In interviews, remember the pattern — explain each form in one line, then show it with a single small example. That single example shows the interviewer that you truly understand why each rule exists, not just what it is called.

Concept Key Takeaway
Normalization Organize data so each fact lives in one place
1NF Rule 1 Break columns into atomic values
1NF Rule 2 Move repeating values into master tables
2NF No partial dependency on a composite key
3NF No dependency between non-key columns
Composite key Primary key made of two or more columns
Calculated field Should be computed, not stored
Master table Stores a repeated value only once
Interview line Atomic → no partial → no transitive

Chakrapani U

Hi, I’m Chakrapani Upadhyaya, an IT professional with 15+ years of industry experience. Over the years, I have worked on web development, enterprise applications, database systems, and cloud-based solutions. Through this blog, I aim to simplify complex technical concepts and help learners grow from beginners to confident, industry-ready developers.

Previous Post Next Post

نموذج الاتصال