DBMS Concepts Interview Questions and Answers for Placements

Last Updated: March 2026

Introduction to DBMS

Database Management Systems (DBMS) form the backbone of modern software applications. Understanding DBMS concepts is crucial for technical interviews, as databases are integral to virtually every application. This guide covers essential DBMS topics from basic concepts to advanced transaction management and distributed systems.

Why DBMS is Essential for Placements

Core CS Subject: Standard part of computer science curriculum
Interview Staple: Asked in 80%+ of technical interviews
System Design Prerequisite: Foundation for designing scalable systems
Real-world Impact: Database decisions affect application performance

Key Concepts and Theory

ACID Properties

Property	Description	Ensured By
Atomicity	All operations complete or none	Transaction logs, rollback
Consistency	Database remains valid after transaction	Constraints, triggers, cascades
Isolation	Concurrent transactions don't interfere	Locking, MVCC
Durability	Committed data survives failures	Write-ahead logging, replication

Database Architecture Levels

┌─────────────────────────────────────────┐
│           EXTERNAL LEVEL                │  ← User views
│     (View 1)    (View 2)    (View N)    │
├─────────────────────────────────────────┤
│           CONCEPTUAL LEVEL              │  ← Logical schema
│      Tables, Relationships, Constraints │
├─────────────────────────────────────────┤
│           INTERNAL LEVEL                │  ← Physical storage
│      Indexes, File organization, Blocks │
└─────────────────────────────────────────┘

Normalization Forms

1NF (First Normal Form)

Atomic values (no multi-valued attributes)
Each cell contains single value
No repeating groups

2NF (Second Normal Form)

Must be in 1NF
No partial dependency (non-prime attributes depend on full candidate key)

3NF (Third Normal Form)

Must be in 2NF
No transitive dependency (non-prime attributes depend only on candidate key)

BCNF (Boyce-Codd Normal Form)

Stricter version of 3NF
For every FD X → Y, X must be a superkey

Practice Questions with Solutions

Question 1: What is a Transaction? Explain ACID properties. ⭐ [Easy]

A transaction is a logical unit of work that consists of one or more database operations. It represents a complete business operation that must succeed or fail as a whole.

ACID Properties Explained:

Atomicity (All or Nothing):

All operations in transaction complete successfully, or none do
If any operation fails, entire transaction rolls back
Example: Bank transfer - debit and credit both happen, or neither does

Consistency (Valid State to Valid State):

Database must remain in consistent state before and after transaction
All integrity constraints must be satisfied
Example: Account balance can't go negative if business rule forbids it

Isolation (Concurrent Transactions Don't Interfere):

Each transaction executes as if it's the only one
Intermediate states are invisible to other transactions
Achieved through locking or MVCC (Multi-Version Concurrency Control)

Durability (Permanent Changes):

Once committed, changes persist even if system crashes
Achieved through write-ahead logging (WAL) and checkpointing

Question 2: Explain different types of JOINs with examples. ⭐⭐ [Medium]

INNER JOIN: Returns only matching rows from both tables

SELECT e.name, d.dept_name
FROM employees e
INNER JOIN departments d ON e.dept_id = d.id;
-- Only employees with valid departments

LEFT JOIN: Returns all rows from left table, matching from right

SELECT e.name, d.dept_name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.id;
-- All employees, NULL for those without departments

RIGHT JOIN: Returns all rows from right table, matching from left

SELECT e.name, d.dept_name
FROM employees e
RIGHT JOIN departments d ON e.dept_id = d.id;
-- All departments, NULL for departments without employees

FULL OUTER JOIN: Returns all rows when match in either table

SELECT e.name, d.dept_name
FROM employees e
FULL OUTER JOIN departments d ON e.dept_id = d.id;
-- All employees and all departments

CROSS JOIN: Cartesian product of both tables

SELECT e.name, d.dept_name
FROM employees e
CROSS JOIN departments d;
-- Every employee paired with every department

SELF JOIN: Table joined with itself

SELECT e.name as employee, m.name as manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
-- Employee-manager hierarchy

Question 3: Difference between Primary Key, Unique Key, and Foreign Key? ⭐⭐ [Medium]

Feature	Primary Key	Unique Key	Foreign Key
Purpose	Uniquely identifies each row	Ensures no duplicates in column(s)	Maintains referential integrity
NULL Allowed	No	Yes (one)	Yes
Number per Table	One	Multiple	Multiple
Index Created	Yes (clustered)	Yes (non-clustered)	Yes
References	Cannot	Cannot	References PK/UK of other table

Primary Key:

CREATE TABLE students (
    id INT PRIMARY KEY,  -- Unique + NOT NULL
    email VARCHAR(100)
);

Unique Key:

CREATE TABLE students (
    id INT PRIMARY KEY,
    email VARCHAR(100) UNIQUE,  -- Unique but can be NULL once
    roll_number VARCHAR(20) UNIQUE
);

Foreign Key:

CREATE TABLE enrollments (
    id INT PRIMARY KEY,
    student_id INT,
    course_id INT,
    FOREIGN KEY (student_id) REFERENCES students(id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);

Question 4: What is Normalization? Explain 1NF, 2NF, 3NF, BCNF. ⭐⭐⭐ [Hard]

Normalization is the process of organizing data to minimize redundancy and dependency. It involves dividing large tables into smaller ones and defining relationships.

1NF (First Normal Form):

Eliminate repeating groups
Each cell contains atomic (indivisible) value
Each column contains values of single type

Violation Example:

Student: {id: 1, name: "John", subjects: "Math, Physics, Chem"}

1NF Compliant:

Student_Subjects: {student_id: 1, subject: "Math"}
Student_Subjects: {student_id: 1, subject: "Physics"}
Student_Subjects: {student_id: 1, subject: "Chem"}

2NF (Second Normal Form):

Must be in 1NF
No partial dependency (non-key attributes depend on full candidate key)
Relevant for composite keys

Example: Enrollment(student_id, course_id, student_name, course_name, grade)

student_name depends only on student_id (partial dependency)
Solution: Split into Students, Courses, Enrollments

3NF (Third Normal Form):

Must be in 2NF
No transitive dependency (A → B → C, so A → C indirectly)

Example: Employee(id, name, dept_id, dept_name, dept_location)

dept_name depends on dept_id, not directly on id
Solution: Split into Employees and Departments

BCNF (Boyce-Codd Normal Form):

Stricter than 3NF
For every functional dependency X → Y, X must be a superkey
Eliminates all anomalies based on functional dependencies

Question 5: Explain Indexing and its types. ⭐⭐⭐ [Hard]

What is Indexing?

Data structure that improves speed of data retrieval
Trade-off: Faster reads, slower writes (index maintenance)
Similar to book index - points to data location

Types of Indexes:

1. B-Tree Index (Default):

Balanced tree structure
Excellent for range queries and equality
Most common index type

CREATE INDEX idx_name ON employees(name);

2. Hash Index:

Uses hash function on indexed column
O(1) lookup for equality
Cannot support range queries

CREATE INDEX idx_hash ON employees USING HASH(email);

3. Bitmap Index:

Uses bitmaps for each distinct value
Efficient for columns with low cardinality
Common in data warehouses

4. Clustered Index:

Determines physical order of data
Only one per table (usually primary key)
Faster for range queries on indexed column

5. Non-Clustered Index:

Separate structure from data
Contains pointers to actual data
Multiple allowed per table

When to Use Indexes:

Columns frequently used in WHERE clauses
Join columns (foreign keys)
Columns used in ORDER BY

When NOT to Use:

Small tables
Frequently updated columns
Columns with low selectivity

Question 6: What is Deadlock? How to prevent it? ⭐⭐⭐ [Hard]

Deadlock is a situation where two or more transactions are waiting for each other to release locks, creating a circular wait.

Deadlock Example:

Transaction A: Locks Row 1, Requests Row 2
Transaction B: Locks Row 2, Requests Row 1
Result: Both wait forever

Four Necessary Conditions (Coffman Conditions):

Mutual Exclusion: Resources are non-shareable
Hold and Wait: Process holds resource while waiting for another
No Preemption: Resources cannot be forcibly taken
Circular Wait: Circular chain of processes waiting

Deadlock Prevention Strategies:

1. Lock Ordering:

Always acquire locks in consistent order
Prevents circular wait

// Always lock Account with lower ID first
if (account1.id < account2.id) {
    lock(account1);
    lock(account2);
} else {
    lock(account2);
    lock(account1);
}

2. Timeout:

Abort transaction if can't acquire lock within time
Simple but may cause unnecessary aborts

3. Wait-Die (Non-preemptive):

Older transaction waits for younger
Younger transaction dies (aborts) if requests lock held by older
Based on timestamps

4. Wound-Wait (Preemptive):

Older transaction wounds (aborts) younger holding needed lock
Younger waits if older holds the lock

5. Graph-Based Detection:

Build wait-for graph
Detect cycles (deadlocks)
Choose victim transaction to abort

Question 7: Difference between WHERE and HAVING clause? ⭐ [Easy]

Feature	WHERE	HAVING
Filters	Individual rows	Groups of rows
Execution	Before grouping	After grouping
Can use aggregates?	No	Yes
Applies to	Raw data	Result of GROUP BY

Example:

-- WHERE filters rows before grouping
SELECT department, AVG(salary) as avg_salary
FROM employees
WHERE salary > 30000        -- Filter individual employees
GROUP BY department
HAVING AVG(salary) > 50000; -- Filter groups

Execution Order:

FROM (get tables)
WHERE (filter rows)
GROUP BY (group rows)
HAVING (filter groups)
SELECT (choose columns)
ORDER BY (sort results)

Remember: WHERE filters rows, HAVING filters groups. WHERE can't use aggregate functions, HAVING can.

Question 8: Explain Database Sharding and Partitioning. ⭐⭐⭐ [Hard]

Partitioning: Dividing large table into smaller, manageable pieces within same database

Types of Partitioning:

1. Horizontal Partitioning (Row-based):

Split rows across partitions
Each partition has same columns, fewer rows

-- Range partitioning
CREATE TABLE sales (
    id INT,
    sale_date DATE,
    amount DECIMAL
) PARTITION BY RANGE (YEAR(sale_date)) (
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025)
);

2. Vertical Partitioning (Column-based):

Split columns across tables
Frequently accessed columns in one partition

Table: employees_main (id, name, email, department)
Table: employees_details (id, address, phone, bio)

Sharding: Distributing data across multiple database instances (servers)

Sharding Strategies:

1. Hash Sharding:

Hash of key determines shard
Even distribution
Hard to add/remove shards

shard = hash(customer_id) % number_of_shards

2. Range Sharding:

Data divided by key ranges
Easy range queries
Can cause hot spots

Shard 1: Customer IDs 1-1000
Shard 2: Customer IDs 1001-2000

3. Directory-Based Sharding:

Lookup service maps key to shard
Flexible, can rebalance
Additional lookup overhead

Partitioning vs Sharding:

Partitioning: Same server, different files/tables
Sharding: Different servers, distributed system

Question 9: What are Views, Materialized Views, and Stored Procedures? ⭐⭐ [Medium]

Views:

Virtual table based on SQL query
No physical storage of data
Always shows current data

CREATE VIEW high_earners AS
SELECT name, salary, department
FROM employees
WHERE salary > 100000;

Materialized Views:

Physical copy of query result
Data stored on disk
Must be refreshed to see updates
Better for complex, infrequently changing data

CREATE MATERIALIZED VIEW dept_summary AS
SELECT department, COUNT(*) as emp_count, AVG(salary) as avg_salary
FROM employees
GROUP BY department;

-- Refresh command
REFRESH MATERIALIZED VIEW dept_summary;

Stored Procedures:

Precompiled SQL statements stored in database
Accept parameters, can contain logic
Reduces network traffic, improves security

CREATE PROCEDURE GetEmployeeByDept(IN dept_name VARCHAR(50))
BEGIN
    SELECT * FROM employees
    WHERE department = dept_name;
END;

-- Call
CALL GetEmployeeByDept('Engineering');

Comparison:

Feature	View	Materialized View	Stored Procedure
Storage	Virtual	Physical	Code only
Data Freshness	Real-time	Stale until refresh	N/A
Performance	Query time	Fast read, slow refresh	Precompiled
Use Case	Simplify queries	Reporting, analytics	Business logic

Question 10: Explain CAP Theorem. ⭐⭐⭐ [Hard]

CAP Theorem states that a distributed data system can only guarantee two of the three properties:

C - Consistency: All nodes see the same data at the same time A - Availability: Every request receives a response (success or failure) P - Partition Tolerance: System continues to operate despite network partitions

The Trade-offs:

CP (Consistency + Partition Tolerance):

Sacrifices availability during partition
Examples: HBase, MongoDB (configured), Redis Cluster
Good for: Banking, stock trading

AP (Availability + Partition Tolerance):

Sacrifices consistency during partition
Examples: Cassandra, DynamoDB, CouchDB
Good for: Social media, analytics

CA (Consistency + Availability):

No partition tolerance
Only possible in single-node systems
Examples: Traditional RDBMS on single server

Practical Implications:

Scenario 1: Network partition occurs between data centers

CP System: Reject writes to maintain consistency
AP System: Accept writes, reconcile later

PACELC Theorem (Extension of CAP):

If Partitioned (P), choose Availability (A) or Consistency (C)
Else (E), choose Latency (L) or Consistency (C)

Real-world Systems:

Banking: CP (consistency over availability)
E-commerce: AP during sales, CP for inventory
Social Media: AP (eventual consistency acceptable)

Common Interview Follow-Up Questions

"When would you denormalize?" → Read-heavy workloads, reporting, when joins are too expensive
"Difference between DELETE and TRUNCATE?" → DELETE is DML (rollback, slow), TRUNCATE is DDL (fast, resets identity)
"How do you optimize slow queries?" → EXPLAIN plan, add indexes, rewrite queries, analyze table stats
"What is connection pooling?" → Reuse database connections to reduce overhead
"Explain eventual consistency" → AP systems guarantee consistency eventually, not immediately

Companies Asking DBMS Questions

Company	Frequency	Focus Areas
Oracle	Very High	Advanced SQL, Performance tuning
Amazon	Very High	DynamoDB, Distributed systems
Microsoft	Very High	SQL Server, Azure databases
Google	High	Spanner, Bigtable, Distributed DBs
Meta	High	MySQL at scale, Sharding
Uber	High	PostgreSQL, Schemaless
Netflix	High	Cassandra, NoSQL

Preparation Tips

Practice ER Diagrams: Design databases from requirements
Understand Trade-offs: Normalization vs performance
Learn Query Plans: Read and optimize execution plans
Know When to Use What: SQL vs NoSQL, different database types
Replication and Sharding: Essential for system design interviews
Transaction Isolation Levels: READ UNCOMMITTED to SERIALIZABLE
Indexing Strategies: When composite indexes help

FAQ

Q: When should I use NoSQL over SQL?
A: NoSQL for unstructured data, horizontal scaling needs, flexible schemas. SQL for complex queries, ACID requirements, structured relationships.

Q: What's the difference between clustering and non-clustering index?
A: Clustered index determines physical order of data (one per table). Non-clustered is separate structure with pointers (multiple allowed).

Q: Explain eventual consistency with example.
A: In AP systems, writes may not be immediately visible everywhere. Example: You post on social media, some friends see it instantly, others after few seconds.

Q: What's a covering index?
A: Index that contains all columns needed for a query, so database doesn't need to access the table. Very fast.

Q: Difference between optimistic and pessimistic locking?
A: Pessimistic locks data upfront (prevents conflicts). Optimistic checks for conflicts at commit time (better concurrency).

Strong DBMS fundamentals separate good engineers from great ones! 🗃️

Dbms Concepts Questions Placement