Dbms Concepts Questions Placement
DBMS Concepts Interview Questions and Answers for Placements
Last Updated: March 2026
Introduction to DBMS
Database Management Systems (DBMS) form the backbone of modern software applications. Understanding DBMS concepts is crucial for technical interviews, as databases are integral to virtually every application. This guide covers essential DBMS topics from basic concepts to advanced transaction management and distributed systems.
Why DBMS is Essential for Placements
- Core CS Subject: Standard part of computer science curriculum
- Interview Staple: Asked in 80%+ of technical interviews
- System Design Prerequisite: Foundation for designing scalable systems
- Real-world Impact: Database decisions affect application performance
Key Concepts and Theory
ACID Properties
| Property | Description | Ensured By |
|---|---|---|
| Atomicity | All operations complete or none | Transaction logs, rollback |
| Consistency | Database remains valid after transaction | Constraints, triggers, cascades |
| Isolation | Concurrent transactions don't interfere | Locking, MVCC |
| Durability | Committed data survives failures | Write-ahead logging, replication |
Database Architecture Levels
┌─────────────────────────────────────────┐
│ EXTERNAL LEVEL │ ← User views
│ (View 1) (View 2) (View N) │
├─────────────────────────────────────────┤
│ CONCEPTUAL LEVEL │ ← Logical schema
│ Tables, Relationships, Constraints │
├─────────────────────────────────────────┤
│ INTERNAL LEVEL │ ← Physical storage
│ Indexes, File organization, Blocks │
└─────────────────────────────────────────┘
Normalization Forms
1NF (First Normal Form)
- Atomic values (no multi-valued attributes)
- Each cell contains single value
- No repeating groups
2NF (Second Normal Form)
- Must be in 1NF
- No partial dependency (non-prime attributes depend on full candidate key)
3NF (Third Normal Form)
- Must be in 2NF
- No transitive dependency (non-prime attributes depend only on candidate key)
BCNF (Boyce-Codd Normal Form)
- Stricter version of 3NF
- For every FD X → Y, X must be a superkey
Practice Questions with Solutions
Question 1: What is a Transaction? Explain ACID properties. ⭐ [Easy]
A transaction is a logical unit of work that consists of one or more database operations. It represents a complete business operation that must succeed or fail as a whole.
ACID Properties Explained:
Atomicity (All or Nothing):
- All operations in transaction complete successfully, or none do
- If any operation fails, entire transaction rolls back
- Example: Bank transfer - debit and credit both happen, or neither does
Consistency (Valid State to Valid State):
- Database must remain in consistent state before and after transaction
- All integrity constraints must be satisfied
- Example: Account balance can't go negative if business rule forbids it
Isolation (Concurrent Transactions Don't Interfere):
- Each transaction executes as if it's the only one
- Intermediate states are invisible to other transactions
- Achieved through locking or MVCC (Multi-Version Concurrency Control)
Durability (Permanent Changes):
- Once committed, changes persist even if system crashes
- Achieved through write-ahead logging (WAL) and checkpointing
Question 2: Explain different types of JOINs with examples. ⭐⭐ [Medium]
INNER JOIN: Returns only matching rows from both tables
SELECT e.name, d.dept_name
FROM employees e
INNER JOIN departments d ON e.dept_id = d.id;
-- Only employees with valid departments
LEFT JOIN: Returns all rows from left table, matching from right
SELECT e.name, d.dept_name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.id;
-- All employees, NULL for those without departments
RIGHT JOIN: Returns all rows from right table, matching from left
SELECT e.name, d.dept_name
FROM employees e
RIGHT JOIN departments d ON e.dept_id = d.id;
-- All departments, NULL for departments without employees
FULL OUTER JOIN: Returns all rows when match in either table
SELECT e.name, d.dept_name
FROM employees e
FULL OUTER JOIN departments d ON e.dept_id = d.id;
-- All employees and all departments
CROSS JOIN: Cartesian product of both tables
SELECT e.name, d.dept_name
FROM employees e
CROSS JOIN departments d;
-- Every employee paired with every department
SELF JOIN: Table joined with itself
SELECT e.name as employee, m.name as manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
-- Employee-manager hierarchy
Question 3: Difference between Primary Key, Unique Key, and Foreign Key? ⭐⭐ [Medium]
| Feature | Primary Key | Unique Key | Foreign Key |
|---|---|---|---|
| Purpose | Uniquely identifies each row | Ensures no duplicates in column(s) | Maintains referential integrity |
| NULL Allowed | No | Yes (one) | Yes |
| Number per Table | One | Multiple | Multiple |
| Index Created | Yes (clustered) | Yes (non-clustered) | Yes |
| References | Cannot | Cannot | References PK/UK of other table |
Primary Key:
CREATE TABLE students (
id INT PRIMARY KEY, -- Unique + NOT NULL
email VARCHAR(100)
);
Unique Key:
CREATE TABLE students (
id INT PRIMARY KEY,
email VARCHAR(100) UNIQUE, -- Unique but can be NULL once
roll_number VARCHAR(20) UNIQUE
);
Foreign Key:
CREATE TABLE enrollments (
id INT PRIMARY KEY,
student_id INT,
course_id INT,
FOREIGN KEY (student_id) REFERENCES students(id)
ON DELETE CASCADE
ON UPDATE CASCADE
);
Question 4: What is Normalization? Explain 1NF, 2NF, 3NF, BCNF. ⭐⭐⭐ [Hard]
Normalization is the process of organizing data to minimize redundancy and dependency. It involves dividing large tables into smaller ones and defining relationships.
1NF (First Normal Form):
- Eliminate repeating groups
- Each cell contains atomic (indivisible) value
- Each column contains values of single type
Violation Example:
Student: {id: 1, name: "John", subjects: "Math, Physics, Chem"}
1NF Compliant:
Student_Subjects: {student_id: 1, subject: "Math"}
Student_Subjects: {student_id: 1, subject: "Physics"}
Student_Subjects: {student_id: 1, subject: "Chem"}
2NF (Second Normal Form):
- Must be in 1NF
- No partial dependency (non-key attributes depend on full candidate key)
- Relevant for composite keys
Example: Enrollment(student_id, course_id, student_name, course_name, grade)
- student_name depends only on student_id (partial dependency)
- Solution: Split into Students, Courses, Enrollments
3NF (Third Normal Form):
- Must be in 2NF
- No transitive dependency (A → B → C, so A → C indirectly)
Example: Employee(id, name, dept_id, dept_name, dept_location)
- dept_name depends on dept_id, not directly on id
- Solution: Split into Employees and Departments
BCNF (Boyce-Codd Normal Form):
- Stricter than 3NF
- For every functional dependency X → Y, X must be a superkey
- Eliminates all anomalies based on functional dependencies
Question 5: Explain Indexing and its types. ⭐⭐⭐ [Hard]
What is Indexing?
- Data structure that improves speed of data retrieval
- Trade-off: Faster reads, slower writes (index maintenance)
- Similar to book index - points to data location
Types of Indexes:
1. B-Tree Index (Default):
- Balanced tree structure
- Excellent for range queries and equality
- Most common index type
CREATE INDEX idx_name ON employees(name);
2. Hash Index:
- Uses hash function on indexed column
- O(1) lookup for equality
- Cannot support range queries
CREATE INDEX idx_hash ON employees USING HASH(email);
3. Bitmap Index:
- Uses bitmaps for each distinct value
- Efficient for columns with low cardinality
- Common in data warehouses
4. Clustered Index:
- Determines physical order of data
- Only one per table (usually primary key)
- Faster for range queries on indexed column
5. Non-Clustered Index:
- Separate structure from data
- Contains pointers to actual data
- Multiple allowed per table
When to Use Indexes:
- Columns frequently used in WHERE clauses
- Join columns (foreign keys)
- Columns used in ORDER BY
When NOT to Use:
- Small tables
- Frequently updated columns
- Columns with low selectivity
Question 6: What is Deadlock? How to prevent it? ⭐⭐⭐ [Hard]
Deadlock is a situation where two or more transactions are waiting for each other to release locks, creating a circular wait.
Deadlock Example:
Transaction A: Locks Row 1, Requests Row 2
Transaction B: Locks Row 2, Requests Row 1
Result: Both wait forever
Four Necessary Conditions (Coffman Conditions):
- Mutual Exclusion: Resources are non-shareable
- Hold and Wait: Process holds resource while waiting for another
- No Preemption: Resources cannot be forcibly taken
- Circular Wait: Circular chain of processes waiting
Deadlock Prevention Strategies:
1. Lock Ordering:
- Always acquire locks in consistent order
- Prevents circular wait
// Always lock Account with lower ID first
if (account1.id < account2.id) {
lock(account1);
lock(account2);
} else {
lock(account2);
lock(account1);
}
2. Timeout:
- Abort transaction if can't acquire lock within time
- Simple but may cause unnecessary aborts
3. Wait-Die (Non-preemptive):
- Older transaction waits for younger
- Younger transaction dies (aborts) if requests lock held by older
- Based on timestamps
4. Wound-Wait (Preemptive):
- Older transaction wounds (aborts) younger holding needed lock
- Younger waits if older holds the lock
5. Graph-Based Detection:
- Build wait-for graph
- Detect cycles (deadlocks)
- Choose victim transaction to abort
Question 7: Difference between WHERE and HAVING clause? ⭐ [Easy]
| Feature | WHERE | HAVING |
|---|---|---|
| Filters | Individual rows | Groups of rows |
| Execution | Before grouping | After grouping |
| Can use aggregates? | No | Yes |
| Applies to | Raw data | Result of GROUP BY |
Example:
-- WHERE filters rows before grouping
SELECT department, AVG(salary) as avg_salary
FROM employees
WHERE salary > 30000 -- Filter individual employees
GROUP BY department
HAVING AVG(salary) > 50000; -- Filter groups
Execution Order:
- FROM (get tables)
- WHERE (filter rows)
- GROUP BY (group rows)
- HAVING (filter groups)
- SELECT (choose columns)
- ORDER BY (sort results)
Remember: WHERE filters rows, HAVING filters groups. WHERE can't use aggregate functions, HAVING can.
Question 8: Explain Database Sharding and Partitioning. ⭐⭐⭐ [Hard]
Partitioning: Dividing large table into smaller, manageable pieces within same database
Types of Partitioning:
1. Horizontal Partitioning (Row-based):
- Split rows across partitions
- Each partition has same columns, fewer rows
-- Range partitioning
CREATE TABLE sales (
id INT,
sale_date DATE,
amount DECIMAL
) PARTITION BY RANGE (YEAR(sale_date)) (
PARTITION p2023 VALUES LESS THAN (2024),
PARTITION p2024 VALUES LESS THAN (2025)
);
2. Vertical Partitioning (Column-based):
- Split columns across tables
- Frequently accessed columns in one partition
Table: employees_main (id, name, email, department)
Table: employees_details (id, address, phone, bio)
Sharding: Distributing data across multiple database instances (servers)
Sharding Strategies:
1. Hash Sharding:
- Hash of key determines shard
- Even distribution
- Hard to add/remove shards
shard = hash(customer_id) % number_of_shards
2. Range Sharding:
- Data divided by key ranges
- Easy range queries
- Can cause hot spots
Shard 1: Customer IDs 1-1000
Shard 2: Customer IDs 1001-2000
3. Directory-Based Sharding:
- Lookup service maps key to shard
- Flexible, can rebalance
- Additional lookup overhead
Partitioning vs Sharding:
- Partitioning: Same server, different files/tables
- Sharding: Different servers, distributed system
Question 9: What are Views, Materialized Views, and Stored Procedures? ⭐⭐ [Medium]
Views:
- Virtual table based on SQL query
- No physical storage of data
- Always shows current data
CREATE VIEW high_earners AS
SELECT name, salary, department
FROM employees
WHERE salary > 100000;
Materialized Views:
- Physical copy of query result
- Data stored on disk
- Must be refreshed to see updates
- Better for complex, infrequently changing data
CREATE MATERIALIZED VIEW dept_summary AS
SELECT department, COUNT(*) as emp_count, AVG(salary) as avg_salary
FROM employees
GROUP BY department;
-- Refresh command
REFRESH MATERIALIZED VIEW dept_summary;
Stored Procedures:
- Precompiled SQL statements stored in database
- Accept parameters, can contain logic
- Reduces network traffic, improves security
CREATE PROCEDURE GetEmployeeByDept(IN dept_name VARCHAR(50))
BEGIN
SELECT * FROM employees
WHERE department = dept_name;
END;
-- Call
CALL GetEmployeeByDept('Engineering');
Comparison:
| Feature | View | Materialized View | Stored Procedure |
|---|---|---|---|
| Storage | Virtual | Physical | Code only |
| Data Freshness | Real-time | Stale until refresh | N/A |
| Performance | Query time | Fast read, slow refresh | Precompiled |
| Use Case | Simplify queries | Reporting, analytics | Business logic |
Question 10: Explain CAP Theorem. ⭐⭐⭐ [Hard]
CAP Theorem states that a distributed data system can only guarantee two of the three properties:
C - Consistency: All nodes see the same data at the same time A - Availability: Every request receives a response (success or failure) P - Partition Tolerance: System continues to operate despite network partitions
The Trade-offs:
CP (Consistency + Partition Tolerance):
- Sacrifices availability during partition
- Examples: HBase, MongoDB (configured), Redis Cluster
- Good for: Banking, stock trading
AP (Availability + Partition Tolerance):
- Sacrifices consistency during partition
- Examples: Cassandra, DynamoDB, CouchDB
- Good for: Social media, analytics
CA (Consistency + Availability):
- No partition tolerance
- Only possible in single-node systems
- Examples: Traditional RDBMS on single server
Practical Implications:
Scenario 1: Network partition occurs between data centers
- CP System: Reject writes to maintain consistency
- AP System: Accept writes, reconcile later
PACELC Theorem (Extension of CAP):
- If Partitioned (P), choose Availability (A) or Consistency (C)
- Else (E), choose Latency (L) or Consistency (C)
Real-world Systems:
- Banking: CP (consistency over availability)
- E-commerce: AP during sales, CP for inventory
- Social Media: AP (eventual consistency acceptable)
Common Interview Follow-Up Questions
- "When would you denormalize?" → Read-heavy workloads, reporting, when joins are too expensive
- "Difference between DELETE and TRUNCATE?" → DELETE is DML (rollback, slow), TRUNCATE is DDL (fast, resets identity)
- "How do you optimize slow queries?" → EXPLAIN plan, add indexes, rewrite queries, analyze table stats
- "What is connection pooling?" → Reuse database connections to reduce overhead
- "Explain eventual consistency" → AP systems guarantee consistency eventually, not immediately
Companies Asking DBMS Questions
| Company | Frequency | Focus Areas |
|---|---|---|
| Oracle | Very High | Advanced SQL, Performance tuning |
| Amazon | Very High | DynamoDB, Distributed systems |
| Microsoft | Very High | SQL Server, Azure databases |
| High | Spanner, Bigtable, Distributed DBs | |
| Meta | High | MySQL at scale, Sharding |
| Uber | High | PostgreSQL, Schemaless |
| Netflix | High | Cassandra, NoSQL |
Preparation Tips
- Practice ER Diagrams: Design databases from requirements
- Understand Trade-offs: Normalization vs performance
- Learn Query Plans: Read and optimize execution plans
- Know When to Use What: SQL vs NoSQL, different database types
- Replication and Sharding: Essential for system design interviews
- Transaction Isolation Levels: READ UNCOMMITTED to SERIALIZABLE
- Indexing Strategies: When composite indexes help
FAQ
Q: When should I use NoSQL over SQL?
A: NoSQL for unstructured data, horizontal scaling needs, flexible schemas. SQL for complex queries, ACID requirements, structured relationships.
Q: What's the difference between clustering and non-clustering index?
A: Clustered index determines physical order of data (one per table). Non-clustered is separate structure with pointers (multiple allowed).
Q: Explain eventual consistency with example.
A: In AP systems, writes may not be immediately visible everywhere. Example: You post on social media, some friends see it instantly, others after few seconds.
Q: What's a covering index?
A: Index that contains all columns needed for a query, so database doesn't need to access the table. Very fast.
Q: Difference between optimistic and pessimistic locking?
A: Pessimistic locks data upfront (prevents conflicts). Optimistic checks for conflicts at commit time (better concurrency).
Strong DBMS fundamentals separate good engineers from great ones! 🗃️