Course: COMP 163
Time: 08:00 AM - 09:20 AM on Monday, Wednesday, Friday
Location: John T Chambers Technology Center 114 (CTC 114)
Week | Topic | Assignment (Canvas + GitHub) | Database | Comment |
---|---|---|---|---|
1 (Jan 13, 15, 17) | Introduction to Databases Management System | Lab 1 + Homework 1 (2w) | SQLite | |
2 (Jan 22, 24) | ERM and SQL | Holiday on Monday | ||
3 (Jan 27, 29, 31) | Relational Algebra | Lab 2 + Homework 2 (2w) | ||
4 (Feb 3, 5, 7) | Relational Database Interfaces | PostgreSQL | ||
5 (Feb 10, 12, 14) | Relational Database Advanced SQL | Lab 3 + Homework 3 (2w), Project Part 1 (4w) | ||
6 (Feb 19, 21) | Relational Database Normalization | Holiday on Monday | ||
7 (Feb 24, 26, 28) | Relational Database Table Indexing | MariaDB | ||
8 (Mar 3, 5, 7) | Midterm Preparation | Midterm (Mar 7) | ||
9 ( - ) | Holiday Week | |||
10 (Mar 17, 19, 21) | Relational Database Transactions | Lab 4 + Homework 4 (2w), Project Part 2 (4w) | OracleSQL | |
11 (Mar 24, 26, 28) | Relational Database Security | |||
12 (Mar 31, Apr 2, Apr 4) | Relational Database and Data Distribution | Lab 5, Homework 5 (2w) | ||
13 (Apr 14, 16, 18) | Relational Database and Data Aggregation | ClickHouse | ||
14 (Apr 21, 23, 25) | Project Presentation | Project Presentation (15 min) | ||
15 (Apr 28) | Final Preparation | Classes End on Tuesday (Apr 29) | ||
16 (May 7) | Final (May 7, 10-11 AM) |
Databases solve problems of managing, storing, and retrieving vast amounts of data efficiently.
The system should allow users to rent bikes easily
The system should allow registered users to rent a bike by selecting a location, choosing an available bike, and completing payment through the application within three steps
Purpose:Bike Rental Database System to Manage rentals and track customers, bikes, and transactions
Entities are nouns
Examples: Customer, Bike, Rental Transaction
Definition: Relationships define how entities are connected
Relationships are verbs
Examples: Rents, Is part of
Conceptually (ERD) yes, technically no
Does it matter if you name an entity Customer or customer
Conceptually (ERD) no, technically it depends (Windows MySQL: no, Unix MySQL: yes)
Question: A "Customer rents a Bike."
Answer: This is a Relationship connecting the "Customer" and "Bike" entities.
Question: A "Bike."
Answer: This is an Entity representing a real-world object with attributes like "BikeID," "Model," and "Color."
A travel booking system should allow users to search for, compare, and book flights, hotels, and rental cars. Users must be able to filter results by price, date, and location, and securely complete payments
A travel booking system should allow users to search for, compare, and book flights, hotels, and rental cars. Users must be able to filter results by price, date, and location, and securely complete payments
ID Attributes: IDs are technically mandatory and serve as the primary key to uniquely identify each entity record.
Description: Optional attributes provide additional details about an entity but are not required to uniquely identify it
Data Type | Description | Example |
---|---|---|
INTEGER | A whole number, no decimal points | 42 (Max: ±2,147,483,647) |
BIGINT | A large integer value | 9223372036854775807 (Max: ±9,223,372,036,854,775,807) |
DECIMAL(p, s) | Exact numeric value with specified precision (p) and scale (s) | DECIMAL(10, 2): 12345.67 (Max: 10 digits, 2 after decimal) |
FLOAT | Approximate numeric value with floating-point precision | 3.14159 (Precision depends on implementation) |
CHAR(n) | Fixed-length character string with length n | CHAR(5): 'Hello' (Max: Defined by n) |
VARCHAR(n) | Variable-length character string with maximum length n | VARCHAR(50): 'Mountain Bike' (Max: Defined by n) |
TEXT | Variable-length text data | 'This is a long text.' (Max: Implementation dependent, e.g., 4GB in MySQL) |
DATE | Date value (YYYY-MM-DD) | 2025-01-14 (Range: 1000-01-01 to 9999-12-31) |
TIME | Time value (HH:MM:SS) | 14:30:00 (Range: -838:59:59 to 838:59:59 in MySQL) |
TIMESTAMP | Date and time value (YYYY-MM-DD HH:MM:SS) | 2025-01-14 14:30:00 (Range: 1970-01-01 to 2038-01-19 in UNIX-based systems) |
BOOLEAN | Logical value (TRUE or FALSE) | TRUE (Stored as 1 or 0 in most databases) |
BLOB | Binary Large Object for storing binary data | [Binary data] (Max: Implementation dependent, e.g., 4GB in MySQL) |
INTEGER, BIGINT, DECIMAL(p, s), FLOAT, CHAR(n), VARCHAR(n), TEXT, DATE, TIME, TIMESTAMP, BOOLEAN, BLOB
Constraint | Description |
---|---|
PRIMARY KEY | Uniquely identifies each record in a table. Automatically implies NOT NULL |
FOREIGN KEY | Ensures referential integrity by linking a column to the primary key in another table |
NOT NULL | Ensures a column cannot have NULL values |
UNIQUE | Ensures all values in a column are distinct |
DEFAULT | Sets a default value for a column when no value is provided |
INDEX | Improves the speed of data retrieval operations on a table |
AUTO_INCREMENT | Automatically generates a unique value for each new record in a column |
UNIQUE + NOT NULL | Combination often used for candidate keys or alternate unique identifiers |
Conceptual Term | Technical Term |
---|---|
Diagram | Schema |
Entity | Table |
Attribute | Column |
Record | Row |
Identifier | Primary Key |
Relationship | Foreign Key |
Optional Attribute | Nullable Column |
Mandatory Attribute | NOT NULL Column |
CREATE TABLE Customer (
CustomerID INTEGER PRIMARY KEY AUTOINCREMENT,
Name VARCHAR(100) NOT NULL,
Email VARCHAR(100) UNIQUE,
Phone VARCHAR(15)
);
CREATE TABLE Bike (
BikeID INTEGER PRIMARY KEY AUTOINCREMENT,
Model VARCHAR(100) NOT NULL,
Color VARCHAR(50),
Condition VARCHAR(50),
Availability BOOLEAN DEFAULT TRUE
);
CREATE TABLE RentalTransaction (
RentalID INTEGER PRIMARY KEY AUTOINCREMENT,
CustomerID INTEGER NOT NULL,
BikeID INTEGER NOT NULL,
BookingDate DATE NOT NULL,
ReturnDate DATE,
FOREIGN KEY (CustomerID) REFERENCES Customer(CustomerID),
FOREIGN KEY (BikeID) REFERENCES Bike(BikeID)
);
CREATE TABLE Customer (
CustomerID INTEGER PRIMARY KEY AUTOINCREMENT,
Name VARCHAR(100) NOT NULL,
Email VARCHAR(100) UNIQUE,
Phone VARCHAR(15)
);
CREATE TABLE Bike (
BikeID INTEGER PRIMARY KEY AUTOINCREMENT,
Model VARCHAR(100) NOT NULL,
Color VARCHAR(50),
Condition VARCHAR(50),
Availability BOOLEAN DEFAULT TRUE
);
CREATE TABLE RentalTransaction (
RentalID INTEGER PRIMARY KEY AUTOINCREMENT,
CustomerID INTEGER NOT NULL,
BikeID INTEGER NOT NULL,
BookingDate DATE NOT NULL,
ReturnDate DATE,
FOREIGN KEY (CustomerID) REFERENCES Customer(CustomerID),
FOREIGN KEY (BikeID) REFERENCES Bike(BikeID)
);
Why must Front-End, Back-End, and Database data types be aligned?
Front-End | Back-End | Database | Result |
---|---|---|---|
Text input for phone number | String validation for format | INTEGER type | Formatted numbers (e.g., +1-123-456) are rejected by the database |
Dropdown allows decimal selection | Expects integer values | FLOAT type | Back-end fails due to unexpected decimal input |
Date picker with local format | Assumes ISO 8601 format | DATE type | Date conversion errors lead to failed database insertion |
Question 1: What does an entity represent in an ERD?
Correct Answer: B. A real-world object or concept
Question 2: What is the purpose of a relationship in an ERD?
Correct Answer: C. To show how entities connect
Question 3: What is the purpose of the primary key in a database?
Correct Answer: B. To uniquely identify each row in a table
Question 1: What does a table in a database represent?
Correct Answer: B. A structured collection of rows and columns
Question 2: What does a table attribute represent?
Correct Answer: C. A column that defines a property of the entity
Question 3: Which of the following is a valid SQL data type?
Correct Answer: A. INTEGER
Question 1: What does a table in a database represent?
Correct Answer: B. A structured collection of rows and columns
Question 1: What is the purpose of a primary key?
Correct Answer: B. To uniquely identify each row in a table
Question 2: What does the NOT NULL constraint ensure?
Correct Answer: B. A column cannot have NULL values
Question 1: How do you select all bicycles that are available?
Correct Answer: A. SELECT * FROM Bicycle WHERE Available = TRUE;
Question 2: How do you update bicycle 2 to set it as unavailable?
Correct Answer: B. UPDATE Bicycle SET Available = FALSE WHERE BicycleID = 2;
Plant Watering Database Application
@startuml entity "Entity1" as E1 { +Attribute1 : +Attribute2 : +Attribute3 : } entity "Entity2" as E2 { +Attribute1 : +Attribute2 : +Attribute3 : } entity "Entity3" as E3 { +Attribute1 : +Attribute2 : +Attribute3 : } @enduml
sqlite3 database_name.db
CREATE TABLE table_name ( column1 datatype, column2 datatype, ... );
INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);
sqlite3 database_name.db < script.sql
sqlite3 database_name.db > .read script.sql > SELECT * FROM table_name;
SELECT PlantID, Name, WaterLevel, WaterThreshold FROM SensorData WHERE WaterLevel < (SELECT WaterThreshold FROM Plant WHERE Plant.PlantID = SensorData.PlantID) ORDER BY Name;
PlantUML | SQL Create Query |
---|---|
entity "Plant" as Plant { +PlantID : INTEGER +Name : VARCHAR(100) +WaterThreshold : FLOAT } |
CREATE TABLE Plant ( PlantID INTEGER PRIMARY KEY, Name VARCHAR(100) NOT NULL, WaterThreshold FLOAT ); |
PlantUML | SQL Create Query |
---|---|
entity "Sensor" as Sensor { +SensorID : INTEGER +Location : VARCHAR(100) +Type : VARCHAR(50) } |
CREATE TABLE Sensor ( SensorID INTEGER PRIMARY KEY, Location VARCHAR(100), Type VARCHAR(50) ); |
entity "SensorData" as SensorData { +SensorDataID : INTEGER +SensorID : INTEGER +PlantID : INTEGER +WaterLevel : FLOAT +Timestamp : DATETIME } |
CREATE TABLE SensorData ( SensorDataID INTEGER PRIMARY KEY, SensorID INTEGER NOT NULL, PlantID INTEGER NOT NULL, WaterLevel FLOAT, Timestamp DATETIME, FOREIGN KEY (SensorID) REFERENCES Sensor(SensorID), FOREIGN KEY (PlantID) REFERENCES Plant(PlantID) ); |
PlantUML | SQL Create Query |
---|---|
entity "WateringLog" as WateringLog { +LogID : INTEGER +PlantID : INTEGER +WateredAt : DATETIME } |
CREATE TABLE WateringLog ( LogID INTEGER PRIMARY KEY, PlantID INTEGER NOT NULL, WateredAt DATETIME, FOREIGN KEY (PlantID) REFERENCES Plant(PlantID) ); |
What is a difference between SQL and SQLite?
Correct Answers: A. and B
Which SQLite command is used to display the table schema?
Correct Answer: C. .schema table_name;
Which SQLite command executes SQL statements from a file?
Correct Answer: D. .read file_name.sql;
Which is the correct syntax to create a table in SQLite?
Correct Answer: B
Which syntax updates the age of student 'Alice' in the table?
Correct Answer: A
How do you write an SQL query to create a table named 'Book' with columns for ID, Title, and Author?
Please write answer here
How to model relationships between entities in ERD?
1 Entity
0 Relationship
2 Entity
1 Relationship
3 Entity
2 Relationship
Each record links to one record
Entities: 1
One record links to many records
Entities: 2
Many records link to many records
Entities: 3
Cardinality | When to Use | Example |
---|---|---|
1:1 | When each record in one table corresponds to exactly one record in another table | User and Profile |
1:N | When one record can be associated with multiple records in another table | Library and Books |
N:N | When multiple records in both tables can be related to each other | Students and Courses |
How to query relationships between entities in ERD?
SELECT Book.title FROM Book WHERE Book.isbnId = '978-1-';
SELECT Book.title FROM Book, ISBN WHERE Book.bookId = ISBN.bookId AND ISBN.isbnId = '978-1-';
SELECT Book.title FROM Book, ISBN, Book_ISBN WHERE Book.bookId = Book_ISBN.bookId AND ISBN.isbnId = Book_ISBN.isbnId AND ISBN.isbnId = '978-1-';
# | Scenario | Which Cardinality? |
---|---|---|
1 | Book and ISBN | |
2 | Author and Book | |
3 | Student and borrow Book | |
4 | Student and Student ID | |
5 | Book and Publisher | |
6 | Publisher and Books | |
7 | Library and Book | |
8 | Book and Format (e.g., eBook, hardcover) | |
9 | Book and Catalog Number in the Library System | |
10 | Reader and Book Review |
Property | Description |
---|---|
Uniqueness | Each value must be unique across all records |
Non-null | Primary key values cannot be NULL |
Immutability | Once assigned, the primary key value should not change |
Single-column or Composite | Can consist of one column or a combination of multiple columns |
Indexed | Primary keys are automatically indexed for faster lookups |
CREATE TABLE Book (
bookId INTEGER PRIMARY KEY,
title VARCHAR(255),
libraryId INTEGER FOREIGN KEY REFERENCES Library(libraryId)
);
CREATE TABLE Library (
libraryId INTEGER PRIMARY KEY,
name VARCHAR(255),
location VARCHAR(255)
);
libraryId INTEGER FOREIGN KEY REFERENCES Library(libraryId) -- establishes a one-to-many relationship:
-- Insert data into Library
INSERT INTO Library (libraryId, name, location)
VALUES (1, 'Central Library', 'Downtown');
-- Insert data into Book
INSERT INTO Book (bookId, title, libraryId)
VALUES (101, 'Database Design', 1);
-- Retrieve books and their library details
SELECT Book.title, Library.name, Library.location
FROM Book, Library
WHERE Book.libraryId = Library.libraryId;
bookId | title | libraryId |
---|---|---|
1 | Introduction to SQL | 101 |
2 | Database Design Principles | 101 |
3 | Advanced Database Systems | 102 |
4 | Big Data Analytics | 103 |
5 | Machine Learning with SQL | 101 |
Query Type | Description | Example |
---|---|---|
Data Definition (DDL) | Used to define and modify database schema and attributes | CREATE TABLE Library (...); |
Data Control (DCL) | Used to control access to data within the database | GRANT SELECT ON Book TO user; |
Data Manipulation (DML) | Used to retrieve, insert, update, and delete data in tables | SELECT * FROM Book; |
Transaction Control (TCL) | Used to manage transactions and ensure data integrity | COMMIT; |
# | SQL Keyword | Type (DDL, DML, DCL, TCL) |
---|---|---|
1 | SELECT | |
2 | INSERT | |
3 | UPDATE | |
4 | DELETE | |
5 | CREATE | |
6 | ALTER | |
7 | DROP | |
9 | GRANT | |
10 | REVOKE | |
11 | COMMIT | |
12 | ROLLBACK | |
15 | BEGIN |
Lab Exercise: Relationships and Foreign Keys in SQLite
How to create a table with a foreign key reference in SQLite?
A. CREATE TABLE Orders (
orderId INTEGER,
customerId INTEGER FOREIGN KEY REFERENCES Customers(customerId)
);
B. CREATE TABLE Orders (
orderId INTEGER PRIMARY KEY,
customerId INTEGER REFERENCES Customers(customerId)
);
C. CREATE TABLE Orders (
orderId INTEGER,
customerId INTEGER,
FOREIGN KEY (customerId) REFERENCES Customers(customerId)
);
Correct Answer: B, C
What happens (default) if a referenced row in the parent table?
Correct Answer: B
How to add a foreign key constraint to an existing SQLite table?
A. ALTER TABLE Orders ADD FOREIGN KEY (customerId) REFERENCES Customers(customerId);
B. MODIFY TABLE Orders ADD FOREIGN KEY (customerId) REFERENCES Customers(customerId);
C. SQLite does not support adding foreign keys to existing tables;
D. UPDATE TABLE Orders SET FOREIGN KEY (customerId) REFERENCES Customers(customerId);
Correct Answer: C
Which of the following table names are permitted in SQLite?
A. CREATE TABLE customer_data (...);
B. CREATE TABLE order (...);
C. CREATE TABLE _sales_record (...);
D. CREATE TABLE Sales-Report (...);
Correct Answer: C, A (Table names should not use SQL keywords like 'order', and should avoid special characters)
Concept | Description | Example |
---|---|---|
Union (⋃) | Combines elements from two sets, removing duplicates | ![]() |
Intersection (∩) | Finds common elements between two sets | ![]() |
Set Difference (−) | Elements present in one set but not in the other | ![]() |
In set theory, a set is a collection of any distinct elements
Relational Algebra ⊆ Set Theory
In relational algebra, a tuple is a single unique row in a relation
It was created to logically analyze relationships between elements
IBM had vast data but struggled with fast and accurate queries
@startuml entity Bicycles_Available { bikeId INTEGER brand VARCHAR(50) } entity Bicycles_Sold { bikeId INTEGER brand VARCHAR(50) } Bicycles_Available ||--o{ Bicycles_Sold : Difference @enduml
@startuml entity Bicycles { bikeId INTEGER brand VARCHAR(50) } entity Stores { storeId INTEGER location VARCHAR(100) } Bicycles ||--|| Stores : Product @enduml
@startuml entity Bicycles { bikeId INTEGER brand VARCHAR(50) } entity Sales { saleId INTEGER bikeId INTEGER } Bicycles ||--|| Sales : Join @enduml
@startuml entity Bicycles { bikeId INTEGER model VARCHAR(50) } entity Stores { storeId INTEGER } Bicycles ||--o{ Stores : Division @enduml
@startuml entity Sales { saleId INTEGER price INTEGER } @enduml
@startuml entity BikeInventory { bikeId INTEGER brand VARCHAR(50) } @enduml
Operator | Symbol | Description | Example |
---|---|---|---|
Selection | σ | Filters rows based on a condition | σ(Price < 500)(Bicycles) |
Projection | π | Selects specific columns | π(brand, price)(Bicycles) |
Union | ⋃ | Combines two relations, removing duplicates | Bicycles_Available ⋃ Bicycles_Sold |
Set Difference | − | Finds tuples in one relation but not the other | Bicycles_Available − Bicycles_Sold |
Cartesian Product | × | Combines all tuples from two relations | Bicycles × Stores |
Intersection | ∩ | Finds common tuples between relations | Bicycles_Available ∩ Bicycles_Sold |
Join | ⨝ | Combines tuples based on common attributes | Bicycles ⨝ Stores |
Division | ÷ | Finds tuples matching all values in another relation | Projects ÷ RequiredParts |
Renaming | ρ | Renames a relation or attributes | ρ(BikeInventory)(Bicycles) |
Before Optimization:
π(brand) (σ(price < 500) (Bicycles × Stores))
After Optimization:
σ(price < 500) (π(brand) (Bicycles × Stores))
Before Optimization:
σ(Bicycles.storeId = Stores.storeId) (Bicycles × Stores)
After Optimization:
Bicycles ⨝ Stores
Before Optimization:
π(brand) (π(bikeId, brand) (Bicycles))
After Optimization:
π(brand) (Bicycles)
Before Optimization:
π(brand) (Bicycles ⨝ Stores ⨝ Suppliers)
After Optimization:
π(brand) (Bicycles ⨝ Stores)
Before Optimization:
(Bicycles ⨝ Stores) ⨝ Suppliers
After Optimization:
(Stores ⨝ Suppliers) ⨝ Bicycles
Before Optimization:
σ(price < 500) (π(brand, price) (Bicycles ⨝ Stores))
After Optimization:
σ(price < 500) (Bicycles) ⨝ π(brand) (Stores)
Before Optimization:
Bicycles ⨝ σ(DiscountedPrice = price * 0.9) (Stores)
After Optimization:
Precompute DiscountedPrice column and then join
σ Price > 10 ∧ Name = 'Rose' ∧ Location = 'Local' (Flowers ⨝ Suppliers)
Flowers ⨝ Suppliers → σ Filters
Comparisons: 1,000 × 500 = 500K
(σ Location = 'Local' (Suppliers)) ⨝ (σ Name = 'Rose' ∧ Price > 10 (Flowers))
Comparisons: 100 × 50 = 5K
Filtering first reduces data size, making both the join and select the result faster
┌───────────────┐ ┌────────────┐ ┌───────────────┐ │ Relation │ ───► │ Operator │ ───► │ Relation │ └───────────────┘ └────────────┘ └───────────────┘
π Name ( σ Price > 10 ( Flowers ⨝ ( σ Location = 'Local' ( Suppliers ⨝ ( σ Revenue > 1000 ( Companies ) ) ) ) ) )
π Name ( -- (6) Project only the "Name" attribute σ Price > 10 ( -- (5) Select flowers priced over $10 Flowers ⨝ ( -- (4) Join Flowers with filtered Suppliers σ Location = 'Local' ( -- (3) Select only local suppliers Suppliers ⨝ ( -- (2) Join Suppliers with filtered Companies σ Revenue > 1000 ( -- (1) Select companies with revenue > $1000 Companies -- Input relation: Companies ) -- Output relation: High-revenue companies ) -- Output relation: Local suppliers from revenue companies ) -- Output relation: Flowers with local suppliers ) -- Output relation: Flowers priced over $10 ) -- Output relation: Names of flowers )
Before | After -------------------------------------- | -------------------------------------- π Name ( | π Name ( σ Price > 10 ( | (σ Price > 10 (Flowers)) ⨝ ( Flowers ⨝ ( | (σ Location = 'Local' (Suppliers)) ⨝ ( σ Location = 'Local' ( | σ Revenue > 1000 (Companies) Suppliers ⨝ ( | ) σ Revenue > 1000 ( | ) Companies | ) ) | ) | ) | ) | ) | ) |
SELECT f.Name
FROM (
(SELECT * FROM Flowers WHERE Price > 10) f
JOIN (
(SELECT * FROM Suppliers WHERE Location = 'Local') s
JOIN (SELECT * FROM Companies WHERE Revenue > 1000) c
ON s.CompanyID = c.CompanyID
)
ON f.SupplierID = s.SupplierID
);
Aspect | Relational Algebra | SQL |
---|---|---|
Definition | Theoretical foundation for database queries | Practical language for interacting with relational databases |
Operators/Syntax |
Uses mathematical operators: σ (Selection), π (Projection), ∪ (Union), ∩ (Intersection), − (Difference), ⋈ (Join). |
Uses English-like commands: SELECT, INSERT, DELETE, JOIN. |
Purpose | Conceptual understanding of query mechanisms | Execution of queries in real databases |
Usage | Primarily in academic and theoretical contexts | Widely used in industry for database management |
CREATE TABLE Orders (OrderID INT PRIMARY KEY, CustomerID INT FOREIGN KEY REFERENCES Customers(CustomerID)); -- A
CREATE TABLE Orders (OrderID INT, CustomerID INT, FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)); -- B
CREATE TABLE Orders (OrderID INT, CustomerID INT, FOREIGN KEY REFERENCES Customers(CustomerID)); -- C
CREATE TABLE Orders (OrderID INT PRIMARY KEY, FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)) -- D;
CREATE TABLE Table (ID INT PRIMARY KEY, Name VARCHAR(50));
CREATE TABLE Orders (OrderID INT, CustomerID INT, FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID))
CREATE TABLE Orders (OrderID INT, CustomerID INT, FOREIGN KEY REFERENCES Customers(CustomerID));
CREATE TABLE Orders (OrderID INT, FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID), CustomerID INT);
CREATE TABLE Orders (OrderID INT, CustomerID INT, FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID));
Interface | Role | Purpose | Technology |
---|---|---|---|
Admin Interface (GUI) | Database Administrators | User management, schema modifications, monitoring | SQLite Browser |
Command-Line Interface (CLI) | Developers & DBAs | Direct SQL execution, automation, scripting | SQLite CLI |
Application Programming Interface (API) | Developers | Programmatic database interaction, integrating apps | Neon Tech |
Web-Based UI | Regular Users | Accessing, viewing, and interacting with data in a user-friendly way | - |
postgresql://user:password@host:port/database?sslmode=require
import psycopg2
conn = psycopg2.connect("postgresql://user:pass@host/db?sslmode=require")
cur = conn.cursor()
cur.execute("SELECT version();")
print(cur.fetchone())
git clone https://github.com/SE4CPS/DMS.git
cd DMS
Interface | Role | Purpose | Technology |
---|---|---|---|
Admin Interface (GUI) | Database Administrators | User management, schema modifications, monitoring | SQLite Browser |
Command-Line Interface (CLI) | Developers & DBAs | Direct SQL execution, automation, scripting | SQLite CLI |
Application Programming Interface (API) | Developers | Programmatic database interaction, integrating apps | Neon Tech |
Web-Based UI | Regular Users | Accessing, viewing, and interacting with data in a user-friendly way | - |
Database | License | Restrictions |
---|---|---|
SQLite | Public Domain | No restrictions, free for any use |
PostgreSQL | PostgreSQL License (BSD-style) | Requires keeping copyright notice |
PostgreSQL Database Management System Copyright (c) 1996-2024, The PostgreSQL Global Development Group Permission to use, copy, modify, and distribute this software and its documentation for any purpose, without fee, and without a written agreement is hereby granted, provided that the above copyright notice and this paragraph appear in all copies. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF, OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Is this an Admin Task, Developer Task, or User Task?
Is this an Admin Task, Developer Task, or User Task?
Is this an Admin Task, Developer Task, or User Task?
Is this an Admin Task, Developer Task, or User Task?
Is this an Admin Task, Developer Task, or User Task?
@app.route('/flowers')
def manage_flowers():
conn = get_db_connection()
cur = conn.cursor()
cur.execute("SELECT * FROM flowers")
flowers = cur.fetchall()
cur.close()
conn.close()
return render_template('flowers.html', flowers=flowers)
Is this a Backend Database Developer Task?
Is this a Backend Database Developer Task?
Is this a Backend Database Developer Task?
Is fetching data from an API a frontend or backend task?
Who is responsible for validating user input before storing it in the database?
Does the frontend directly modify the database?
Which component is responsible for ensuring data validation and access control?
Which component plays a key role in optimizing system performance and preventing security breaches?
Which component is essential for managing user requests, processing data, and ensuring secure communication?
Which component typically handles REST API requests and responses?
Which component is primarily responsible for handling database connection strings?
CREATE TABLE Book (
book_id SERIAL PRIMARY KEY,
uuid UUID DEFAULT gen_random_uuid(),
electronic_article_number VARCHAR(50) UNIQUE NOT NULL,
isbn VARCHAR(512) UNIQUE NOT NULL,
title VARCHAR(200) NOT NULL,
author VARCHAR(100) NOT NULL,
price DECIMAL(10,2) NOT NULL,
stock INT NOT NULL
);
UUID (Universally Unique Identifier) is a 128-bit unique identifier
Example: 550e8400-e29b-41d4-a716-446655440000
-- Supported in PostgreSQL, MySQL (8.0+), MariaDB, and others
CREATE TABLE Users (
user_id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
name VARCHAR(100) NOT NULL
);
import uuid
# Generate a new UUID
user_id = uuid.uuid4()
print("Generated UUID:", user_id)
import { v4 as uuidv4 } from 'uuid';
// Generate a new UUID
const userId = uuidv4();
console.log("Generated UUID:", userId);
// Generate a UUID using the browser's crypto API
function generateUUID() {
return crypto.randomUUID();
}
console.log("Generated UUID:", generateUUID());
SELECT * FROM flowers;
SELECT * FROM flowers ORDER BY price DESC LIMIT 1;
SELECT * FROM customers;
SELECT orders.order_id, customers.name, orders.order_date, orders.total
FROM orders
JOIN customers ON orders.customer_id = customers.customer_id;
SELECT DISTINCT customers.name, customers.email
FROM customers
JOIN orders ON customers.customer_id = orders.customer_id;
SELECT COUNT(*) FROM orders;
SELECT SUM(total) FROM orders;
SELECT * FROM order_details WHERE order_id = 1;
SELECT * FROM flowers WHERE stock > 0;
SELECT * FROM orders WHERE order_date > '2024-01-01';
SELECT * FROM flowers WHERE price > 10.00;
SELECT * FROM customers WHERE phone IS NOT NULL;
SELECT * FROM orders WHERE total > 50.00;
SELECT * FROM customers WHERE email LIKE '%@gmail.com';
SELECT * FROM orders WHERE DATE(order_date) = '2024-02-01';
SELECT * FROM flowers WHERE color = 'Red';
SELECT * FROM flowers WHERE LOWER(color) = 'red';
SELECT * FROM customers WHERE address IS NULL;
Which of the following best describes an entity in an ER Model?
Correct Answer: A real-world object with attributes
Which SQL clause is used to filter records based on a condition?
Correct Answer: WHERE
Which of the following is a major limitation of SQLite compared to PostgreSQL?
Correct Answer: No support for full outer joins
Which relational algebra operation is used to retrieve specific attributes from a relation?
Correct Answer: Projection (π)
What is the primary purpose of a foreign key in a relational database?
Correct Answer: To enforce referential integrity
The bike store needs a database to manage its inventory, sales, and customer interactions
The store wants to keep track of all bikes and accessories, along with their suppliers
They need to see which products have suppliers and which do not
Customers can place orders for bikes and accessories
The store wants a report showing all customers and their orders, but also customers who have never ordered
The store needs to generate reports comparing all products, including those that have never been sold
Employees handle sales and inventory updates
They need a way to see all employees and the sales they have made, while also seeing employees with no recorded sales
Returns only matching rows from both tables
SELECT *
FROM A
INNER JOIN B
ON A.id = B.id;
A.id | A.val | B.id | B.val |
---|---|---|---|
1 | X | 1 | Y |
Returns all rows from the left table and matching rows from the right
SELECT *
FROM A
LEFT JOIN B
ON A.id = B.id;
A.id | A.val | B.id | B.val |
---|---|---|---|
1 | X | 1 | Y |
2 | Z | NULL | NULL |
3 | M | 3 | N |
4 | P | NULL | NULL |
Returns all rows from the right table and matching rows from the left
SELECT *
FROM A
RIGHT JOIN B
ON A.id = B.id;
A.id | A.val | B.id | B.val |
---|---|---|---|
1 | X | 1 | Y |
NULL | NULL | 2 | Z |
3 | M | 3 | N |
NULL | NULL | 4 | P |
Automatically joins columns with the same name and identical values
SELECT *
FROM A
NATURAL JOIN B;
id | name | value |
---|---|---|
1 | Alpha | 10 |
2 | Beta | 20 |
id | name | score |
---|---|---|
1 | Alpha | 90 |
3 | Gamma | 85 |
Matches rows where both id and name are the same
id | name | value | score |
---|---|---|---|
1 | Alpha | 10 | 90 |
Returns all rows from both tables
Includes unmatched rows with NULL values where no match is found
SELECT *
FROM A
FULL OUTER JOIN B
ON A.id = B.id;
id | A.val |
---|---|
1 | X |
2 | Y |
id | B.val |
---|---|
2 | Z |
3 | W |
id | A.val | B.val |
---|---|---|
1 | X | NULL |
2 | Y | Z |
3 | NULL | W |
Includes unmatched rows from both tables
Generates the Cartesian product of two tables
SELECT * FROM A CROSS JOIN B;
id | name |
---|---|
1 | Alice |
2 | Bob |
id | color |
---|---|
10 | Red |
20 | Blue |
All possible combinations of rows
A.id | A.name | B.id | B.color |
---|---|---|---|
1 | Alice | 10 | Red |
1 | Alice | 20 | Blue |
2 | Bob | 10 | Red |
2 | Bob | 20 | Blue |
Consider the following use cases on a flower database
Use Case | Tables Involved | Expected Result |
---|---|---|
Get only flowers that have matching orders | Flowers, Orders | Only flowers that have been ordered |
List all flowers with their orders (if any) | Flowers, Orders | All flowers, showing NULL for those with no orders |
Show all flowers and all orders, even if unmatched | Flowers, Orders | All flowers and all orders, showing NULL where unmatched |
Feature | SQLite | PostgreSQL |
---|---|---|
Auto-increment ID | INTEGER PRIMARY KEY | SERIAL PRIMARY KEY |
Full Outer Join | Not supported (use UNION of LEFT and RIGHT JOIN) | FULL OUTER JOIN |
Foreign Keys | PRAGMA foreign_keys = ON | Enforced by default |
Command | Description |
---|---|
SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'your_table'; | List column names and data types for a specific table |
SELECT table_name FROM information_schema.tables WHERE table_schema = 'public'; | Get all table names from the public schema |
SELECT * FROM pg_catalog.pg_tables; | Get detailed information about all tables |
SELECT * FROM pg_indexes WHERE tablename = 'your_table'; | Get all indexes of a specific table |
SELECT A.name, A.role, M.name AS mission_name
FROM Astronauts A
LEFT JOIN Missions M
ON A.missions = M.id;
SELECT M.name, M.destination
FROM Missions M
LEFT JOIN Astronauts A
ON M.id = A.missions
WHERE A.id IS NULL;
SELECT AE.species, AE.communication, M.name AS mission_name
FROM AlienEncounters AE
INNER JOIN Missions M
ON AE.missionId = M.id;
Query:
SELECT bikes.model, sales.price FROM bikes INNER JOIN sales ON bikes.id = sales.bike_id;
Bikes +----+-----------+------+ | id | model | type | +----+-----------+------+ | 1 | RoadX | Road | | 2 | SpeedPro | Road | | 3 | TrailMax | MTB | +----+-----------+------+
Sales +----+----------+--------+ | id | bike_id | price | +----+----------+--------+ | 1 | 1 | 1200 | | 2 | 2 | 1500 | +----+----------+--------+
Result +-----------+--------+ | model | price | +-----------+--------+ | RoadX | 1200 | | SpeedPro | 1500 | +-----------+--------+
Query:
SELECT bikes.model, sales.price FROM bikes LEFT JOIN sales ON bikes.id = sales.bike_id;
Bikes +----+-----------+------+ | id | model | type | +----+-----------+------+ | 1 | RoadX | Road | | 2 | SpeedPro | Road | | 3 | TrailMax | MTB | | 4 | CityRide | City | +----+-----------+------+
Sales +----+----------+--------+ | id | bike_id | price | +----+----------+--------+ | 1 | 1 | 1200 | | 2 | 2 | 1500 | +----+----------+--------+
Result +-----------+--------+ | model | price | +-----------+--------+ | RoadX | 1200 | | SpeedPro | 1500 | | TrailMax | NULL | | CityRide | NULL | +-----------+--------+
Query:
SELECT bikes.model, sales.price FROM bikes RIGHT JOIN sales ON bikes.id = sales.bike_id;
Bikes +----+-----------+------+ | id | model | type | +----+-----------+------+ | 1 | RoadX | Road | | 2 | SpeedPro | Road | | 3 | TrailMax | MTB | +----+-----------+------+
Sales +----+----------+--------+ | id | bike_id | price | +----+----------+--------+ | 1 | 1 | 1200 | | 2 | 2 | 1500 | | 3 | 5 | 800 | +----+----------+--------+
Result +-----------+--------+ | model | price | +-----------+--------+ | RoadX | 1200 | | SpeedPro | 1500 | | NULL | 800 | +-----------+--------+
A metric is a measurable value used to track performance
A KPI (Key Performance Indicator) is a metric that measures success
A target is a predefined goal for a metric or KPI
A benchmark is a standard used for comparison
Question | SQL Query |
---|---|
Total sales revenue (Metric) |
|
Average order value (KPI) |
|
Number of unique customers |
|
Highest sale recorded |
|
Compare sales with target (Benchmark) |
|
The GROUP BY clause groups rows with the same values in specified columns
SELECT column, AGG_FUNCTION(column)
FROM table
GROUP BY column;
SELECT product, SUM(sales) AS total_sales
FROM sales
GROUP BY product;
+----+------------+--------+ | id | product | sales | +----+------------+--------+ | 1 | Bike A | 500 | | 2 | Bike B | 700 | | 3 | Bike A | 300 | | 4 | Bike C | 900 | | 5 | Bike B | 400 | +----+------------+--------+
Result: +------------+-------------+ | product | total_sales | +------------+-------------+ | Bike A | 800 | | Bike B | 1100 | | Bike C | 900 | +------------+-------------+
SELECT column, AGG_FUNCTION(column)
FROM table
GROUP BY column
HAVING condition;
SELECT product, SUM(sales) AS total_sales
FROM sales
GROUP BY product
HAVING SUM(sales) > 900;
Before GROUP BY
+----+------------+--------+ | id | product | sales | +----+------------+--------+ | 1 | Bike A | 500 | | 2 | Bike B | 700 | | 3 | Bike A | 300 | | 4 | Bike C | 900 | | 5 | Bike B | 400 | +----+------------+--------+
After GROUP BY + HAVING
Result: +------------+-------------+ | product | total_sales | +------------+-------------+ | Bike B | 1100 | | Bike C | 900 | +------------+-------------+
The CHECK constraint enforces rules on column values
CREATE TABLE bikes (
id INT PRIMARY KEY,
model VARCHAR(50),
price DECIMAL(10,2) CHECK (price > 0),
stock INT CHECK (stock >= 0),
category VARCHAR(20) CHECK (category IN ('Road', 'MTB', 'City', 'Hybrid')),
discount DECIMAL(5,2) CHECK (discount BETWEEN 0 AND 50),
weight DECIMAL(5,2) CHECK (weight > 5)
);
SELECT A.role, COUNT(A.missions) AS total_missions
FROM Astronauts A
GROUP BY A.role;
SELECT name, missions FROM Astronauts
ORDER BY missions DESC LIMIT 1;
SELECT name, destination FROM Missions
WHERE status = 'Ongoing';
SELECT name FROM Astronauts
WHERE role = 'Scientist' AND missions = 0;
SELECT name, crew FROM Missions
ORDER BY crew DESC LIMIT 1;
SELECT name FROM Spaceship
WHERE fuelLevel = 0;
SELECT m.name, a.species FROM AlienEncounters a
JOIN Missions m ON a.missionId = m.id
WHERE a.communication = 'Hostile';
SELECT name, missions FROM Astronauts
WHERE role = 'Engineer';
SELECT team, score FROM Score
WHERE score > 0;
SELECT name, destination FROM Missions
WHERE destination LIKE 'P%';
SELECT m.name, a.species FROM Missions m
LEFT JOIN AlienEncounters a ON m.id = a.missionId;
SELECT name, crew FROM Missions
ORDER BY crew ASC LIMIT 1;
SELECT name, fuelLevel FROM Spaceship
WHERE fuelLevel > 10;
SELECT COUNT(*) FROM Missions
WHERE status = 'Completed';
SELECT name FROM Astronauts
WHERE role = 'Engineer' AND missions > 0;
SELECT team, score FROM Score
ORDER BY score DESC LIMIT 1;
SELECT name, missions FROM Astronauts
WHERE role = 'Scientist';
SELECT m.name, COUNT(a.id) AS encounters FROM Missions m
JOIN AlienEncounters a ON m.id = a.missionId
GROUP BY m.name
ORDER BY encounters DESC LIMIT 1;
SELECT name, shiptype, fuelLevel FROM Spaceship;
SELECT name, destination FROM Missions
WHERE LENGTH(destination) > 6;
+--------------+ | Requirements | --> +--------------+
+-----+ | ERD | --> +-----+
+----------+ | Database | --> +----------+
+-------+ | Table | --> +-------+
+------+ | Data | +------+
Characteristic | Structured Data | Unstructured Data |
---|---|---|
Format | Organized (tables) | Freeform (text, images) |
Storage | Relational databases (SQL) | NoSQL databases, file systems |
Querying | Easy to query | Difficult to analyze |
Examples | Spreadsheets, SQL databases | Emails, social media posts, videos |
+------+ | Data | --> +------+
+--------------+ | Requirements | --> +--------------+
+-----+ | ERD | --> +-----+
+----------+ | Database | --> +----------+
+-------+ | Table | +-------+
+------------+----------------------+---------------------+--------------------+ | customer | products | phone | address | +------------+----------------------+---------------------+--------------------+ | Alice | Laptop, Mouse | 123-456-7890 | 12 Main St, NY | | Bob | Phone, Charger | 987-654-3210 | 34 Elm St, CA | | Alice | Keyboard, Monitor | 123-456-7890 | 12 Main St, NY | +------------+----------------------+---------------------+--------------------+How to query all keyboard products?
+------------+----------------------+---------------------+--------------------+ | customer | products | phone | address | +------------+----------------------+---------------------+--------------------+ | Alice | Laptop, Mouse | 123-456-7890 | 12 Main St, NY | | Bob | Phone, Charger | 987-654-3210 | 34 Elm St, CA | | Alice | Keyboard, Monitor | 123-456-7890 | 12 Main St, NY | +------------+----------------------+---------------------+--------------------+How to query all products of the customer 1?
+------------+----------------------+---------------------+--------------------+ | customer | products | phone | address | +------------+----------------------+---------------------+--------------------+ | Alice | Laptop, Mouse | 123-456-7890 | 12 Main St, NY | | Bob | Phone, Charger | 987-654-3210 | 34 Elm St, CA | | Alice | Keyboard, Monitor | 123-456-7890 | 12 Main St, NY | +------------+----------------------+---------------------+--------------------+How to query all products of the customers in CA?
Problems:
+--------------+ | Unstructured | --> +--------------+
+------------+ | Structured | --> +------------+
+--------------+ | Unnormalized | --> +--------------+
+-----+ | NF1 | +-----+
Definition: A table is in First Normal Form when all its attributes contain only atomic (indivisible) values, and each record is unique
+----+-----------+-----------------+ | id | name | hobbies | +----+-----------+-----------------+ | 1 | Alice | Reading | | 2 | Bob | Cycling | | 3 | Charlie | Reading, Hiking | +----+-----------+-----------------+
+----+-----------+-----------------+ | id | name | hobbies | +----+-----------+-----------------+ | 1 | Alice | Reading | | 2 | Bob | Cycling | | 3 | Charlie | Reading | | 4 | Charlie | Hiking | +----+-----------+-----------------+
+------------+---------+----------------+-----------+------------+-------+ | customer_id| name | phone | products | city | state | +------------+---------+----------------+-----------+------------+-------+ | 1 | Alice | 123-456-7890 | Laptop | New York | NY | | 2 | Alice | 123-456-7890 | Mouse | New York | NY | | 3 | Bob | 987-654-3210 | Phone | California | CA | | 4 | Bob | 987-654-3210 | Charger | California | CA | | 5 | Alice | 123-456-7890 | Keyboard | New York | NY | | 6 | Alice | 123-456-7890 | Monitor | New York | NY | +------------+---------+----------------+-----------+------------+-------+How to query all keyboard products?
+------------+---------+----------------+-----------+------------+-------+ | customer_id| name | phone | products | city | state | +------------+---------+----------------+-----------+------------+-------+ | 1 | Alice | 123-456-7890 | Laptop | New York | NY | | 2 | Alice | 123-456-7890 | Mouse | New York | NY | | 3 | Bob | 987-654-3210 | Phone | California | CA | | 4 | Bob | 987-654-3210 | Charger | California | CA | | 5 | Alice | 123-456-7890 | Keyboard | New York | NY | | 6 | Alice | 123-456-7890 | Monitor | New York | NY | +------------+---------+----------------+-----------+------------+-------+How to query all products of the customer 1?
+------------+---------+----------------+-----------+------------+-------+ | customer_id| name | phone | products | city | state | +------------+---------+----------------+-----------+------------+-------+ | 1 | Alice | 123-456-7890 | Laptop | New York | NY | | 2 | Alice | 123-456-7890 | Mouse | New York | NY | | 3 | Bob | 987-654-3210 | Phone | California | CA | | 4 | Bob | 987-654-3210 | Charger | California | CA | | 5 | Alice | 123-456-7890 | Keyboard | New York | NY | | 6 | Alice | 123-456-7890 | Monitor | New York | NY | +------------+---------+----------------+-----------+------------+-------+How to query all products of the customers in CA?
Step | Description | Example |
---|---|---|
1. Identify repeating groups | Find attributes with multiple values in a single row | ❌ Alice → Laptop, Mouse |
2. Remove repeating groups | Split multi-valued attributes into separate rows | ✔ Alice → Laptop ✔ Alice → Mouse |
3. Ensure atomic values | Each field should hold a single value | ❌ "New York, NY 10001" → ✔ "New York" (city), "NY" (state), "10001" (zip) |
4. Add a unique identifier | Ensure each row has a primary key | ✔ Add "purchase_id" |
5. Remove duplicate rows | Ensure no duplicate records exist | ❌ Alice, Laptop (x2) → ✔ Remove extra entry |
Once data is in 1NF:
+------------+---------+----------------+-----------+------------+-------+ | customer_id| name | phone | products | city | state | +------------+---------+----------------+-----------+------------+-------+ | 1 | Alice | 123-456-7890 | Laptop | New York | NY | | 2 | Alice | 123-456-7890 | Mouse | New York | NY | | 3 | Bob | 987-654-3210 | Phone | California | CA | | 4 | Bob | 987-654-3210 | Charger | California | CA | | 5 | Alice | 123-456-7890 | Keyboard | New York | NY | | 6 | Alice | 123-456-7890 | Monitor | New York | NY | +------------+---------+----------------+-----------+------------+-------+
+------------+---------+----------------+-----------+------------+-------+ | customer_id| name | phone | products | city | state | +------------+---------+----------------+-----------+------------+-------+ | 1 | Alice | 123-456-7890 | Laptop | New York | NY | | 2 | Alice | 123-456-7890 | Mouse | New York | NY | | 3 | Bob | 987-654-3210 | Phone | California | CA | | 4 | Bob | 987-654-3210 | Charger | California | CA | | 5 | Alice | 123-456-7890 | Keyboard | New York | NY | | 6 | Alice | 123-456-7890 | Monitor | New York | NY | +------------+---------+----------------+-----------+------------+-------+
+------------+----------------+-----------+ | customer_id| phone | city | +------------+----------------+-----------+ | 1 | 123-456-7890 | New York | | 3 | 987-654-3210 | California| +------------+----------------+-----------+
+------------+-----------+ | product_id | product | +------------+-----------+ | 1 | Laptop | | 2 | Mouse | | 3 | Phone | +------------+-----------+
+----------+-------------+ | order_id | customer_id | +----------+-------------+ | 1 | 1 | | 2 | 3 | +----------+-------------+
CREATE TABLE OrderDetails (
order_id INT,
product_id INT,
quantity INT,
PRIMARY KEY (order_id, product_id)
);
Step | Description | Example |
---|---|---|
1. Identify partial dependencies | Find attributes that depend on part of a composite key | ❌ (order_id, product_id) → product_name |
2. Remove partial dependencies | Move dependent attributes to a new table | ✔ Create "Products" table with product_id → product_name |
3. Ensure full dependency | All attributes must depend on the full primary key | ❌ order_id → customer_name → ✔ Store customer_name in "Customers" |
4. Split into separate tables | Break the table into independent entities | ✔ Orders, Customers, and Products as separate tables |
5. Use foreign keys | Link tables using primary and foreign keys | ✔ orders.customer_id → customers.customer_id |
+------------+----------------+-----------+ | customer_id| phone | city | +------------+----------------+-----------+ | 1 | 123-456-7890 | New York | | 3 | 987-654-3210 | California| +------------+----------------+-----------+
+------------+-----------+ | product_id | product | +------------+-----------+ | 1 | Laptop | | 2 | Mouse | | 3 | Phone | +------------+-----------+
+----------+-------------+ | order_id | customer_id | +----------+-------------+ | 1 | 1 | | 2 | 3 | +----------+-------------+
+------------+----------------+--------------+-------+ | customer_id| phone | city | state | +------------+----------------+--------------+-------+ | 1 | 123-456-7890 | New York | NY | | 3 | 987-654-3210 | Los Angeles | CA | +------------+----------------+--------------+-------+
State depends on City, not the other way around
+------------+----------------+---------+ | customer_id| phone | city_id | +------------+----------------+---------+ | 1 | 123-456-7890 | 101 | | 3 | 987-654-3210 | 102 | +------------+----------------+---------+
+---------+-------------+-------+ | city_id | city | state | +---------+-------------+-------+ | 101 | New York | NY | | 102 | Los Angeles| CA | +---------+-------------+-------+
A table in NF1 is in 3NF if all non-key attributes depend only on the primary key (NF2) and do not depend on other non-key attributes (NF3)
Step | Description | Example |
---|---|---|
1. Identify transitive dependencies | Find attributes that depend on a non-key attribute instead of the primary key | ❌ customer_id → city → state |
2. Remove transitive dependencies | Move dependent attributes to a separate table | ✔ Create "Cities" table with city → state |
3. Ensure direct dependency | All non-key attributes must depend only on the primary key | ❌ customer_id → city → state → ✔ Store city_id in "Customers" |
4. Split into separate tables | Move transitive dependencies to their own entities | ✔ Customers, Orders, and Cities as separate tables |
5. Use foreign keys | Link tables using primary and foreign keys | ✔ customers.city_id → cities.city_id |
Normal Form | Summmary |
---|---|
1NF | Eliminate repeating groups; ensure atomic values |
2NF | Remove partial dependencies; all attributes must depend on the whole primary key |
3NF | Remove transitive dependencies; non-key attributes must depend only on the primary key |
BCNF | Every determinant must be a candidate key; stricter than 3NF |
4NF | Eliminate multi-valued dependencies; each row represents a single fact |
5NF | Decompose tables to remove redundancy while preserving all joins |
+----------+------------+-----------------------+ | order_id | customer | products | +----------+------------+-----------------------+ | 1 | Alice | Laptop, Mouse | | 2 | Bob | Phone, Charger | +----------+------------+-----------------------+
+----------+-----------+-------------+ | order_id | customer | product_id | +----------+------------+------------+ | 1 | Alice | 101 | | 1 | Alice | 102 | | 2 | Bob | 103 | | 2 | Bob | 104 | +----------+-----------+-------------+
+----------+------------+----------+-------+ | order_id | customer | product | city | +----------+------------+----------+-------+ | 1 | Alice | Laptop | NY | | 1 | Alice | Mouse | NY | | 2 | Bob | Phone | CA | | 2 | Bob | Charger | CA | +----------+------------+----------+-------+
+------------+----------------+-------------+-------+---------+ | customer_id| phone | city | state | zip_code| +------------+----------------+-------------+-------+---------+ | 1 | 123-456-7890 | New York | NY | 10001 | | 3 | 987-654-3210 | Los Angeles | CA | 90001 | +------------+----------------+-------------+-------+---------+
+----------+-------------+ | order_id | customer_id | +----------+-------------+ | 1 | 1 | | 2 | 3 | +----------+-------------+
+------------+----------------+---------+ | customer_id| phone | city_id | +------------+----------------+---------+ | 1 | 123-456-7890 | 101 | | 3 | 987-654-3210 | 102 | +------------+----------------+---------+
+---------+-------------+---------+ | city_id | city | zip_code | +---------+-------------+---------+ | 101 | New York | 10001 | | 102 | Los Angeles| 90001 | +---------+-------------+---------+
+------+ | Data | --> +------+
+--------------+ | Requirements | --> +--------------+
+-----+ | ERD | --> +-----+
+----------+ | Database | --> +----------+
+-------+ | Table | +-------+
+---------+------------+---------------+ | part_id | name | parent_part_id| +---------+------------+---------------+ | 1 | Car | NULL | <-- Root (Top-Level Part) | 2 | Engine | 1 | <-- Engine belongs to Car | 3 | Wheel | 1 | <-- Wheel belongs to Car | 4 | Piston | 2 | <-- Piston belongs to Engine | 5 | Bolt | 4 | <-- Bolt belongs to Piston +---------+------------+---------------+
WITH RECURSIVE CarParts AS (
-- Start with the whole car (top-level part)
SELECT t1.part_id, t1.name AS part_name,
t1.parent_part_id AS parent_id,
NULL AS parent_name, 1 AS level
FROM parts AS t1
WHERE t1.parent_part_id IS NULL
UNION ALL
-- Recursively join parts to their parent parts and track levels
SELECT t2.part_id, t2.name, t2.parent_part_id,
c.part_name AS parent_name, c.level + 1
FROM parts AS t2
JOIN CarParts AS c ON t2.parent_part_id = c.part_id
)
SELECT * FROM CarParts;
+---------+------------+-----------+-------------+--------+
| part_id | part_name | parent_id | parent_name | level |
+---------+------------+-----------+-------------+--------+
| 1 | Car | NULL | NULL | 1 |
| 2 | Engine | 1 | Car | 2 |
| 3 | Wheel | 1 | Car | 2 |
| 4 | Piston | 2 | Engine | 3 |
| 5 | Bolt | 4 | Piston | 4 |
+---------+------------+-----------+-------------+--------+
+----------+------------+------------------+ | order_id | customer | products | +----------+------------+------------------+ | 1 | Alice | Laptop, Mouse | | 2 | Bob | Phone, Charger | +----------+------------+------------------+
Unnormalized (UNF) – The "products" contains multiple values
+----------+------------+----------+-------+ | order_id | customer | product | city | +----------+------------+----------+-------+ | 1 | Alice | Laptop | NY | | 1 | Alice | Mouse | NY | | 2 | Bob | Phone | CA | | 2 | Bob | Charger | CA | +----------+------------+----------+-------+
2NF – "city" depends on "customer"
Which of the following violates 1NF?
Answer: A
How do you fix a table that violates 1NF?
Answer: B
Which table violates 2NF?
Answer: B
How do you convert a 1NF table to 2NF?
Answer: B
Which table violates 3NF?
Answer: B
How do you fix a table that violates 3NF?
Answer: B
Which of these real-world cases breaks 1NF?
Answer: A
What is the key difference between 2NF and 3NF?
Answer: D
Which is NOT a benefit of normalization?
Answer: C
Which is a possible downside of normalization?
Answer: A
+------+ | Data | --> +------+
+-----+ | NF1 | --> +-----+
+--------------+ | Requirements | --> +--------------+
+-----------+ | NF2 + NF3 | --> +-----------+
+-------+ | Table | +-------+
+------+ | Data | --> +------+
+-----+ | NF1 | --> +-----+
+--------------+ | Requirements | --> +--------------+
+-----------+ | NF2 + NF3 | --> +-----------+
+-------+ | Table | +-------+
+------+ | Data | --> +------+
+-----+ | NF1 | --> +-----+
+--------------+ | Requirements | --> +--------------+
+-----------+ | NF2 + NF3 | --> +-----------+
+-------+ | Table | +-------+
❓ Which missions had the most hostile alien encounters?
+------+ | Data | --> +------+
+-----+ | NF1 | --> +-----+
+--------------+ | Requirements | --> +--------------+
+-----------+ | NF2 + NF3 | --> +-----------+
+-------+ | Table | +-------+
+------+ | Data | --> +------+
+-----+ | NF1 | --> +-----+
+--------------+ | Requirements | --> +--------------+
+-----------+ | NF2 + NF3 | --> +-----------+
+-------+ | Table | +-------+
Table: employees
employee_id | name | department | salary |
---|---|---|---|
1 | Alice | HR | 60000 |
2 | Bob | IT | 75000 |
3 | Charlie | Finance | 82000 |
Which SQL query retrieves the names of employees in the IT department?
Table: orders
order_id | customer_name | order_date | total_amount |
---|---|---|---|
101 | John | 2024-02-10 | 250.50 |
102 | Emma | 2024-02-11 | 180.75 |
103 | Michael | 2024-02-12 | 99.99 |
Write an SQL query to find the total sales amount for all orders placed after 2024-02-10
Table: students
student_id | name | age | gpa |
---|---|---|---|
1 | Jane | 21 | 3.8 |
2 | Tom | 22 | 3.5 |
3 | Lucy | 20 | 3.9 |
Write an SQL query to count how many students have a GPA greater than 3.6
Table: products
product_id | product_name | category | price |
---|---|---|---|
1 | Laptop | Electronics | 1200.00 |
2 | Headphones | Electronics | 150.00 |
3 | Shoes | Fashion | 80.00 |
Which SQL query retrieves the most expensive product in the Electronics category?
Table: customers
customer_id | name | city |
---|---|---|
1 | Emma | New York |
2 | Daniel | Los Angeles |
3 | Olivia | New York |
How would you write an SQL query to count how many customers are from New York?
Which action best helps a table comply with 3NF?
Answer: B
How do you resolve a table that violates 2NF?
Answer: B
What is the first step to ensure a table meets 1NF requirements?
Answer: B
What should you do to normalize a table from 1NF to 2NF?
Answer: A
What is the best way to choose a primary key in an entity-relationship diagram (ERD)?
Answer: A
Which SQL query retrieves all employees with a salary greater than 50,000?
Answer: B
Which SQL join returns only matching records from both tables?
Answer: C
Which statement best describes a foreign key?
Answer: B
Which SQL query correctly finds the total number of employees?
Answer: A
How do you fix a table that violates 3NF?
Answer: B
Which SQL query retrieves departments with more than 10 employees?
Answer: B
Which condition must be met for a table to be in 1st Normal Form (1NF)?
Answer: A
How does row count affect query performance?
ID | Name | Value |
---|---|---|
1 | Alice | 100 |
2 | Bob | 200 |
3 | Charlie | 150 |
4 | Dave | 300 |
5 | Eve | 250 |
ID | Name | Value |
---|---|---|
1 | Alice | 100 |
2 | Bob | 200 |
3 | Charlie | 150 |
4 | Dave | 300 |
5 | Eve | 250 |
...45 more rows |
ID | Name | Value |
---|---|---|
1 | Alice | 100 |
2 | Bob | 200 |
3 | Charlie | 150 |
4 | Dave | 300 |
5 | Eve | 250 |
...495 more rows |
Time estimates assuming 0.01ms (10µs) per row
Table | Row Count (n) | Operation | Time Complexity (Big O) | Total Time |
---|---|---|---|---|
Table A | 10,000 | Full Table Scan | O(n) = O(10,000) | 100ms |
Table B | 100,000 | Full Table Scan | O(n) = O(100,000) | 1,000ms (1s) |
Table C | 1,000,000 | Full Table Scan | O(n) = O(1,000,000) | 10,000ms (10s) |
Real-world examples of data generation
Example | Data Rate | Records per Second | Time to 1 Million Records |
---|---|---|---|
GPS Data (Driving, 1Hz) | 1 record per second | 1 | 1,000,000s (~11.6 days) |
GPS Data (Driving, 10Hz) | 10 records per second | 10 | 100,000s (~1.16 days) |
Social Media Posts (Busy Platform) | 1,000 posts per second | 1,000 | 1,000s (~16.7 minutes) |
IoT Sensor (High-Frequency) | 10,000 readings per second | 10,000 | 100s (~1.67 minutes) |
ID | Latitude | Longitude | Timestamp | PostText | SensorValue | Source |
---|---|---|---|---|---|---|
1 | 40.7128 | -74.0060 | 2025-02-23 10:00:00 | NULL | NULL | GPS (10Hz) |
2 | 40.7130 | -74.0058 | 2025-02-23 10:00:00.1 | NULL | NULL | GPS (10Hz) |
3 | NULL | NULL | 2025-02-23 10:00:01 | Great day! | NULL | Social Media |
4 | NULL | NULL | 2025-02-23 10:00:02.001 | NULL | 23.5 | IoT Sensor |
1000000 | 41.0000 | -73.0000 | 2025-02-24 15:23:45 | NULL | NULL | GPS (10Hz) |
Big O | n = 1 | n = 100 | n = 10,000 |
---|---|---|---|
O(1) | 1 | 1 | 1 |
O(log n) [base 2] | 0 | ~6.6 | ~13.3 |
O(n) | 1 | 100 | 10,000 |
O(n log n) [base 2] | 0 | ~660 | ~133,000 |
O(n²) | 1 | 10,000 | 100,000,000 |
FlowerID (Indexed) | Name | Color | Price |
---|---|---|---|
101 | Rose | Red | 2.50 |
305 | Tulip | Yellow | 1.75 |
204 | Lily | White | 3.00 |
FlowerID | Name | Color (Indexed) | Price |
---|---|---|---|
101 | Rose | Red | 2.50 |
102 | Tulip | Red | 1.75 |
204 | Lily | White | 3.00 |
205 | Daisy | Red | 1.25 |
Color is indexed, FlowerID varies widely
FlowerID (PK) | Name | Color (Indexed) | Price |
---|---|---|---|
101 | Rose | Red | 2.50 |
5000 | Tulip | Red | 1.75 |
999999 | Daisy | Red | 1.25 |
204 | Lily | White | 3.00 |
A B-tree groups Color values with pointers
B-tree Level | Structure |
---|---|
Root | [Blue, Red, White, Green] |
Leaf (Blue) | "Blue" → [300, 7500] |
Leaf (Red) | "Red" → [101, 5000, 999999] |
Leaf (White) | "White" → [204, 15000] |
Leaf (Green) | "Green" → [405, 6000, 888888] |
An indexed attribute can be found in log(n) steps using a B-tree
Year | Milestone |
---|---|
2009 | MariaDB project founded by Michael "Monty" Widenius after leaving MySQL |
2010 | MariaDB 5.1 released as a drop-in replacement for MySQL |
2012 | Debian and Fedora replace MySQL with MariaDB as the default database |
2013 | MariaDB 10 introduced with improved performance and replication |
2015 | MariaDB Corporation formed, MariaDB 10.1 released with encryption features |
2017 | MariaDB 10.2 adds JSON support and Window functions |
2019 | MariaDB 10.4 introduces system-versioned tables and new authentication |
2022 | MariaDB 10.9+ improves replication and backup features |
Feature | SQLite | PostgreSQL | MariaDB |
---|---|---|---|
License | Public Domain | PostgreSQL License (MIT-like) | GPLv2 (Community), Enterprise License |
Storage Type | Single-file DB | Client-Server | Client-Server |
Scalability | Not scalable (Single-file) | Highly scalable (Multi-threaded, Multi-node) | Scalable (Replication, Clustering) |
ACID Compliance | Partial (Write-ahead logging) | Fully ACID-compliant | Fully ACID-compliant |
Concurrency | Single-threaded | Multi-threaded with MVCC | Multi-threaded with MVCC |
Performance | Fast for small DBs | Optimized for complex queries | Optimized for OLTP workloads |
Use Cases | Embedded systems, Mobile apps | Data Warehousing, Enterprise Apps | Web apps, MySQL replacement |
Replication | Not supported | Asynchronous, Streaming Replication | Supported |
Understanding the implications of the GNU General Public License v2
Answer: B (To improve query performance)
Answer: C (Unique index)
Database | Index Creation |
---|---|
SQLite | CREATE INDEX idx_color ON Flowers(color); |
PostgreSQL | CREATE INDEX idx_color ON Flowers(color); |
MariaDB | CREATE INDEX idx_color ON Flowers(color); |
Which SQL query is used to define an index?
Correct answers: A
Which data structure is most commonly used to store indexes in SQL databases?
Correct answer: D
What is the key difference between a Binary Tree and a B-Tree?
Correct answer: D
What is the time complexity for finding a value in an indexed attribute versus a non-indexed attribute?
Correct answer: B
How does adding an index affect the storage complexity of a database?
Correct answer: A
In which scenario is adding an index not recommended?
Correct answer: A
Please read page 128 in Textbook 1 on Natural Joins and explain one risk when using a Natural Join
Without explicitly specifying the join condition, there is a risk of accidentally choosing the same name for a new column to be added to the table even though the columns are unrelated to one another. In which case there would be unintended inclusion or exclusion of columns due to automatically matched column names.
Please read page 141 in Textbook 1 on Multi-Table Query Performance and explain which types of queries typically require JOINS
Queries especially relevant to related information or simply that we are retrieving information from multiple sources. Querying aggregated data from multiple tables to perform calculations would require joins. The act of filtering data would also require joins because we can apply conditions that help eliminate duplicates or unnecessary information
Please read page 149 in Textbook 1 with Figure 7-12 and explain what unmatched rows mean and how an OUTER JOIN handles them
Unmatched rows are rows in a table that do not have corresponding matches in another table, Outer join handles them by providing a way to keep them. This is done by filling the missing values from the unmatched table with NULLs
Please read pages 67–71 in Textbook 2 on GROUP BY and explain the difference between the two queries:
SELECT type_id, COUNT(*) FROM foods GROUP BY type_id HAVING COUNT(*) < 20; SELECT type_id, COUNT(*) FROM foods;
The major difference between the two is that one uses GROUP BY and HAVING and the other is a simple select query. The use of GROUP BY and HAVING allows us to aggregate the rows in foods by type_id with the condition of “having” the count of less than 20. The second query doesn’t aggregate or filter the data. The output is very different
"For the modification to succeed, the expressions for all constraints must evaluate to true."
In the event where the values of expressions are compared, all constraints for the expressions must evaluate to true for the test of comparison to evaluate to true. Check involves verifying data integrity and so if all the constraints are met then the result is true otherwise false or null. The idea of a modification could be an insert or update, if any of the constraints evaluate to false then it is not successful and rejected
πname,price(σcolor=′red′∧stockQuantity>20(Flowers))
πname(σflowerID ≠ NULL(Flowers⋈Orders))
Using join before selection, the first expression performs a join first, then filters yellow flowers
It is inefficient since it joins all flowers with all orders, though only yellow flowers are needed
T1← σ color=′Yellow′ (F) T2← π name (T2) T3← T1⋈T2
Upon sending a GET /api/flowers?color=red request, the backend server processes it by extracting the query parameter
SELECT * FROM Flowers WHERE color = 'red';
Web interface:
Advantage: Accessible without SQL knowledge
Disadvantage: Limited query complexity
Console-based queries:
Advantage: Allows complex, customized queries
Disadvantage: Requires SQL knowledge
SELECT category, AVG(price) AS average_price FROM Flowers GROUP BY category; SELECT category, AVG(price) AS average_price FROM Flowers GROUP BY category HAVING AVG(price) > 10;
“Having” allows us to create the condition for price
CREATE TABLE flower ( id INT PRIMARY KEY, color VARCHAR(50), heightInCm INT CHECK (heightInCm > 0), petalCount INT CHECK (petalCount >= 1), species VARCHAR(100), bloomingSeason VARCHAR(50), price DECIMAL(10, 2) CHECK (price >= 0), fragranceLevel INT );
SELECT f.id AS flower_id, f.name AS flower_name, f.color, s.id AS supplier_id, s.name AS supplier_name, s.location FROM flower f INNER JOIN supplier s ON f.supplier_id = s.id;
SELECT c.id AS customer_id, c.name AS customer_name, c.email, o.id AS order_id, o.order_date, o.amount FROM customer c LEFT JOIN order o ON c.id = o.customer_id;
SELECT p.id AS product_id, p.name AS product_name, p.price, s.id AS sale_id, s.quantity_sold FROM product p RIGHT JOIN sale s ON p.id = s.product_id;
# 1️⃣ Clone the repository
git clone https://github.com/SE4CPS/DMS.git
# 2️⃣ Navigate to the project folder
cd DMS/project/part1
# 3️⃣ Check current teams
dir # (Windows) or ls (Mac/Linux)
# 4️⃣ Create a new team folder
mkdir team-n
cd team-n
# 5️⃣ Make your code changes
# 6️⃣ Stage your new folder
git add team-n
# 7️⃣ Commit your changes
git commit -m "Added team-n folder"
# 8️⃣ Push changes to the repository
git push origin main
Why is an ERD primarily useful in database design?
Correct: B
How many errors are in this SQL query?
SELECT name age FROM users WHERE username = "admin";
Correct: C
Which entity-relationship setup best fits an online course platform?
Correct: A
Which set of attributes best fits an ERD for a library system?
Correct: C
Which SQL data types best fit this sample data?
Correct: C
Which SQL constraints best enforce data integrity?
Correct: B
Which of the following SQL operations ensures data integrity by enforcing rules on the values allowed in a column?
Correct answer: A, B, C
Which approach is best for generating unique IDs in a database?
Correct: A, B, C
Why did Edgar F. Codd choose relational algebra for databases?
Correct: A
How does relational algebra operate on data?
Correct: A
Which relational algebra operation is used to retrieve specific columns from a relation?
Correct: B
Which transformation optimizes the following relational algebra query?
π_name(σ_age > 30 (Employee ⨝ Department))
Correct: C
Which SQLite dot command is used to display the database schema?
Correct: B
How do PostgreSQL and SQLite handle primary key generation differently?
Correct: A
How do a console, API, and web interface differ?
Correct: B
Which SQL command is used to add a new column to an existing table?
Correct: D
Which SQL command is used to create a new table?
Correct: A
Which query correctly retrieves departments with more than 10 employees?
Correct: B
Which SQL query correctly calculates the total salary, highest salary, lowest salary, average salary, and employee count per department?
Correct: B
Which query correctly groups employees by department and counts them?
SELECT department, COUNT(*) FROM employees ___ department;
Correct: A
Which query retrieves departments with more than 5 employees?
SELECT department, COUNT(*) FROM employees GROUP BY department ___ COUNT(*) > 5;
Correct: C
Which query filters before grouping rather than after?
SELECT department, COUNT(*) FROM employees ___ salary > 50000 GROUP BY department;
Correct: B
Which JOIN retrieves only matching rows from both tables?
SELECT employees.name, departments.name FROM employees ___ JOIN departments ON employees.dept_id = departments.id;
Correct: C
Which JOIN retrieves all rows from the left table and matching rows from the right?
SELECT employees.name, departments.name FROM employees ___ JOIN departments ON employees.dept_id = departments.id;
Correct: A
Which JOIN retrieves all rows from both tables, filling missing matches with NULLs?
SELECT employees.name, departments.name FROM employees ___ JOIN departments ON employees.dept_id = departments.id;
Correct: D
Which statement correctly describes normalization levels in databases?
Correct: A, C, D
Why is normalization important in database design?
Correct: A
How does indexing improve query speed?
Correct: C
When is indexing most beneficial?
Correct: D
A flower shop tracks orders and customers. Each order is linked to a customer. Propose four attributes for each table and write two CREATE TABLE statements ensuring good data quality with appropriate primary key, foreign key, data types and constraints, including CHECK constraints
The Inventor of the Relational Database Model
Edgar Codd introduced the relational model in 1970 to solve these issues
Example (Hierarchical Data): ├── Customers │ ├── Orders │ │ ├── Order Items │ │ ├── Payments
Before relational databases, data was difficult to manage, leading to inefficiency and inconsistency
Entities: Customer (CustomerID, Name, Email) Order (OrderID, CustomerID, OrderDate) Product (ProductID, Name, Price) Relations: Customer - Order (One-to-Many via CustomerID) Order - Product (Many-to-Many via OrderID & ProductID)
Which SQL query correctly retrieves all rows from the flower table where the species is 'Rose' and the color is 'Red'?
Correct: B
Which SQL query correctly retrieves only the species and color columns from the flower table?
Correct: A
Which SQL query correctly retrieves only the species and color columns from the flower table?
Correct: A
What is the purpose of the GROUP BY clause in SQL?
Correct: B
A florist wants to find out how many flowers of each type were sold. Which SQL query should they use?
Correct: B
Design an Entity-Relationship Diagram (ERD) for a flower ordering system with the following requirements:
The following table represents unnormalized flower order data:
| order_id | customer_name | customer_phone | order_date | flower_details | total_price | |----------|--------------|----------------|-------------|------------------------------|-------------| | 101 | Alice Green | 555-1234 | 2024-03-01 | Rose-Red-2, Tulip-Yellow-3 | 25.00 | | 102 | Bob White | 555-5678 | 2024-03-02 | Lily-White-1 | 10.00 |
Transform this data step by step into 1NF, 2NF, and 3NF
A florist wants to find flower types that were sold more than 10 times. Which SQL query should they use?
Correct: B
A florist wants to find the number of different flower types sold. Which SQL query should they use?
Correct: C
How many errors (1, 2, 3, or 4) are present in the following SQL query?
SELECT name, price FROM flowers ORDER BY color;
Correct: A (1 error)
Which SQL query correctly finds the least expensive flower?
Correct: C
In relational algebra optimization, which approach is typically used to improve query performance?
Correct: A
What is the difference between an Entity, a Relation, and a Table?
Correct: A
What do SQLite’s AUTOINCREMENT and PostgreSQL’s SERIAL have common?
Correct: A
Which SQL data type is best for storing the value 2.99?
Correct: C
What is the primary purpose of using JOINs in SQL?
Correct: A
Which SQL JOIN type typically generates the most data?
Correct: D
Which SQL JOIN type typically generates the least data?
Correct: A
Which of the following is not an aggregate function in SQL?
Correct: D
Why is relational algebra important for relational databases?
Correct: A
Which statement is true about the UNION operator in SQL?
JOIN
condition to workCorrect: C
Which SQL query retrieves all unique flower names that appear in either the spring_flowers or summer_flowers tables?
Correct: A
If you need a database that allows full customization without license restrictions, would you choose SQLite, PostgreSQL, or MariaDB? Consider their licensing models and explain your choice
Time: 8:00 - 9:00 AM
Why is an ERD primarily useful in database design?
Correct: D
How many errors are in the SQL query? Assume table, attributes, and data types are correct
SELECT name, stock FROM Flower WHERE name = 'rose';
Correct: A
Which set of entities and relationships best represents how a flower ordering system tracks customer purchase?
Correct: C
Which of the following table names is NOT allowed for an order tracking table in SQL?
Correct: B (ORDER is a reserved SQL keyword for sorting)
Which SQL data types best fit this flower ordering system sample data?
Correct: C
Which SQL constraints best enforce data integrity in a flower ordering system?
Correct: D
Which SQL constraint ensures that a flower's price is always greater than zero, enforcing data integrity?
Correct: C (price > 0),
When is AUTO_INCREMENT not a good choice for generating unique IDs in a flower ordering system?
Correct: B
What database concept did Edgar F. Codd invent?
Correct: B
What is the main purpose of relational algebra in relational databases?
Correct: B
Which relational algebra operation is used to filter rows in a relation?
Correct: B
Which transformation optimizes the following relational algebra query for a flower ordering system?
π_flowerName (σ_price > 10 (Orders ⨝ Flowers))
Correct: C
In SQLite, how many files are used to store the entire database?
Correct: A
What is a key difference between SQLite and PostgreSQL in how they run?
Correct: D
How do a database console, API, and web interface differ?
Correct: B
Which SQL command is used to remove a column from an existing table?
Correct: B or D
Which SQL command is used to delete an entire table from a database?
Correct: C
Which query correctly retrieves flower types that have been ordered more than 10 times?
Correct: B
Which JOIN retrieves only matching rows from both tables?
SELECT * FROM employees ___ JOIN departments ON employees.dept_id = departments.id;
Correct: C
Please draw an Entity-Relationship Diagram (ERD) for a flower supply system with two entities. Ensure that your diagram clearly indicates primary keys (PK), foreign keys (FK), and attribute types and constraints.
A transaction partially executes, and some operations succeed while others fail, leaving the database in an inconsistent state. Which ACID property is violated?
Correct Answer: A) Atomicity
Two transactions read and update the same record simultaneously, leading to inconsistent data due to uncommitted changes. Which ACID property is violated?
Correct Answer: C) Isolation
State | Description | SQL Keyword (SQLite, PostgreSQL, MariaDB, Oracle) |
---|---|---|
Active | The transaction is in progress and operations are being executed | BEGIN TRANSACTION; / START TRANSACTION; |
Partially Committed | All operations have been executed, but the final commit is pending | |
Committed | All changes are permanently saved to the database, completing the transaction | COMMIT; |
Failed | An error occurred during execution, preventing further progress | |
Rollback | The transaction is explicitly reversed and returning the database to its previous state | ROLLBACK; |
Concurrency control ensures that multiple transactions can execute simultaneously without interfering with each other, maintaining data integrity and consistency
Method | Description | Used In | Example (T1, T2, Value X) |
---|---|---|---|
Lock-Based Concurrency | Transactions acquire locks on data to prevent conflicts | PostgreSQL, MySQL (InnoDB), Oracle |
T1 locks X=10 and updates it to X=20 T2 tries to update X but must wait until T1 commits or rolls back |
Optimistic Concurrency Control | Transactions proceed without locking, but check for conflicts before committing | SQLite, PostgreSQL (MVCC), Oracle |
T1 reads X=10, performs calculations T2 updates X=20 and commits T1 tries to commit X=15 but detects conflict and must restart |
Timestamp-Based Concurrency | Each transaction gets a timestamp and is executed based on ordering rules | Oracle, PostgreSQL (MVCC), SQL Server |
T1 (timestamp 100) reads X=10 T2 (timestamp 200) updates X=20 and commits T1 tries to update X=15 but is aborted due to an older timestamp |
Multiversion Concurrency Control (MVCC) | Each transaction sees a snapshot of the database, avoiding locks | PostgreSQL, MySQL (InnoDB), Oracle |
T1 starts and sees X=10 T2 updates X=20 and commits T1 still sees X=10 until it commits and starts a new transaction |
Method | Best When ✅ | Not Ideal When ❌ |
---|---|---|
Lock-Based Concurrency | ✅ High-conflict environments where strict consistency is required (e.g., financial transactions) | ❌ High-read workloads, as locks can cause contention and slow down performance |
Optimistic Concurrency Control (OCC) | ✅ Read-heavy workloads with infrequent conflicts (e.g., analytics, dashboards) | ❌ High-write workloads where conflicts are common, leading to frequent transaction retries |
Timestamp-Based Concurrency | ✅ Systems that require strict ordering of transactions (e.g., blockchain, distributed databases) | ❌ Systems where real-time changes need immediate visibility to other transactions |
Multiversion Concurrency Control (MVCC) | ✅ Databases with many concurrent reads and some writes, avoiding read locks (e.g., PostgreSQL, MySQL InnoDB) | ❌ When strict consistency is required, as it may allow anomalies like write skew |
Isolation ensures that transactions execute independently, preventing conflicts and inconsistencies when multiple transactions run concurrently
Isolation Level | Description | Best When ✅ | Not Ideal When ❌ |
---|---|---|---|
Read Uncommitted | Allows transactions to read uncommitted changes from other transactions, leading to invalid reads | ✅ Performance is the highest priority and minor inconsistencies are acceptable | ❌ Data consistency is required, as invalid reads can occur |
Read Committed | Transactions can only read committed data, preventing invalid reads but allowing non-repeatable reads | ✅ General-purpose applications where occasional inconsistencies are tolerable | ❌ When repeatable reads are necessary to maintain consistency |
Repeatable Read | Ensures that transactions read the same data multiple times without seeing changes from other transactions | ✅ When avoiding non-repeatable reads is critical, such as in financial applications | ❌ Systems with high concurrency, as it may reduce performance due to increased locking |
Serializable | Enforces full isolation by executing transactions as if they run sequentially, eliminating concurrency issues | ✅ When absolute consistency is required, such as in banking and accounting systems | ❌ High-concurrency environments, as it can lead to significant performance bottlenecks |
Isolation Level | Transaction | Step | Value of X |
---|---|---|---|
Read Uncommitted | T1 | Starts and reads X | X=10 |
T2 | Starts, updates X to 20, does not commit | X=20 (invalid read visible to T1) | |
T1 | Reads X again | X=20 | |
T2 | Rolls back changes | X=10 (T1 already saw uncommitted changes) | |
Read Committed | T1 | Starts and reads X | X=10 |
T2 | Starts, updates X to 20, commits | X=20 | |
T1 | Reads X again | X=20 (sees committed changes) | |
T2 | Updates X to 25, commits | X=25 (T1 sees new committed changes) | |
Repeatable Read | T1 | Starts and reads X | X=10 |
T2 | Starts, updates X to 20, commits | X=20 | |
T1 | Reads X again | X=10 (does not see T2's changes) | |
T1 | Reads X once more | X=10 (still sees original snapshot) | |
Serializable | T1 | Starts and reads X | X=10 |
T2 | Starts, tries to update X to 20 | Blocked by T1 | |
T1 | Commits | X=10 | |
T2 | Updates X to 20 | X=20 (now visible to new transactions) |
Which isolation level allows invalid reads, making it the least strict but fastest?
Answer: D) Read Uncommitted
Which isolation level is generally recommended for preventing anomalies while maintaining good concurrency in most applications?
Answer: C) Read Committed
Hotel Reservation
SELECT RoomID FROM Rooms
WHERE RoomID NOT IN (
SELECT RoomID FROM Reservations WHERE CheckInDate < '2025-06-15' AND CheckOutDate > '2025-06-10')
LIMIT 1;
INSERT INTO Reservations (ReservationID, CustomerID, RoomID, CheckInDate, CheckOutDate)
VALUES (202, 15, 301, '2025-06-10', '2025-06-15');
INSERT INTO Payments (PaymentID, ReservationID, Amount, PaymentDate)
VALUES (501, 202, 499.99, '2025-03-15');
What happens if a transaction fails during execution?
Correct Answer: C. All changes are rolled back - Atomicity ensures transactions are all-or-nothing
Which statement best describes the Consistency property?
Correct Answer: B. Database moves from one valid state to another - Consistency maintains data integrity rules
What is the primary purpose of the Isolation property?
Correct Answer: C. To prevent concurrent transactions from interfering - Isolation ensures concurrent transactions appear sequential
Overlapping Hotel Reservations
ReservationID | RoomID | CheckInDate | CheckOutDate |
---|---|---|---|
301 | 101 | 2025-06-08 | 2025-06-12 |
302 | 101 | 2025-06-10 | 2025-06-15 |
303 | 101 | 2025-06-14 | 2025-06-18 |
304 | 102 | 2025-06-15 | 2025-06-20 |
305 | 103 | 2025-06-01 | 2025-06-05 |
Hotel Reservation with Transaction
START TRANSACTION;
SELECT RoomID FROM Rooms
WHERE RoomID NOT IN (
SELECT RoomID FROM Reservations WHERE CheckInDate < '2025-06-15' AND CheckOutDate > '2025-06-10')
LIMIT 1
FOR UPDATE;
INSERT INTO Reservations (ReservationID, CustomerID, RoomID, CheckInDate, CheckOutDate)
VALUES (202, 15, 301, '2025-06-10', '2025-06-15');
INSERT INTO Payments (PaymentID, ReservationID, Amount, PaymentDate)
VALUES (501, 202, 499.99, '2025-03-15');
COMMIT;
Transaction Locking in Hotel Reservation
Time | Transaction 1 (User A) – Locks Room 301 | Transaction 2 (User B) |
---|---|---|
T1 | START TRANSACTION; |
|
T2 | SELECT RoomID FROM Rooms WHERE RoomID NOT IN (...) LIMIT 1 FOR UPDATE; (Locks Room 301) |
|
T3 | INSERT INTO Reservations (...); |
(Waiting... Cannot access Room 301) |
T4 | INSERT INTO Payments (...); |
(Waiting... Still blocked) |
T5 | COMMIT; (Lock on Room 301 released) |
|
T6 | (Transaction 1 is done) | SELECT RoomID FROM Rooms WHERE RoomID NOT IN (...) LIMIT 1 FOR UPDATE; (Locks next available room) |
ACID Properties in Transactions
Property | Description |
---|---|
Atomicity | Ensures that a transaction is fully completed or fully rolled back. No partial execution |
Consistency | Ensures that the database remains in a valid state before and after a transaction |
Isolation | Ensures that concurrent transactions do not interfere with each other |
Durability | Ensures that once a transaction is committed, it is permanently saved even in case of system failure |
Oracle 1977 - today
Year | Milestone |
---|---|
1977 | Oracle Corporation founded in Santa Clara, California, by Larry Ellison |
1979 | Oracle Version 2 – first commercially available SQL-based RDBMS (1982 IBM DB2) |
1983-1985/td> | Introduced transaction management, concurrency control, read consistency, and client-server architecture |
1988-1992 | PL/SQL added, along with stored procedures, triggers, and integrity constraints |
1997-1999 | Object-relational features, partitioning, built-in Java support, and internet integration |
2001-2003 | Added XML support, Real Application Clusters (RAC), grid computing, and auto-tuning |
2007-2013 | Advanced compression, partitioning, multi-tenant architecture for cloud |
2018-2021 | AI-powered autonomous database, blockchain tables, and machine learning optimizations |
Relational Databases
Feature | SQLite | PostgreSQL | MariaDB | Oracle DB |
---|---|---|---|---|
Storage | Single-file database (lightweight) | Client-server architecture | Client-server architecture | Client-server architecture |
ACID Compliance | Yes | Yes | Yes | Yes |
Best For | Embedded systems, mobile apps | Complex queries, enterprise apps | MySQL alternative, web apps | Large-scale enterprise solutions |
Performance | Fast for small databases | Optimized for complex queries | Optimized for web applications | Highly optimized for enterprise workloads |
Scalability | Limited (single file) | High (supports replication, clustering) | High (supports clustering, replication) | Very High (supports RAC, sharding) |
Concurrency | Limited (single writer, multiple readers) | Strong concurrency support | Good concurrency support | Advanced concurrency management |
License | Public domain (free) | Open-source (PostgreSQL License) | Open-source (GPL) | Proprietary (some free versions) |
Licensing: Oracle vs Others
Database | License Type | Cost |
---|---|---|
Oracle | Proprietary (Oracle License) | Paid (Free for limited versions like Oracle XE) |
PostgreSQL | Open-source (PostgreSQL License, BSD-like) | Free |
MariaDB | Open-source (GPL v2) | Free |
SQLite | Public domain (No restrictions) | Free |
Lab Practice
Controls who can access data
Restrict DB/API access via unique API keys
postgresql://username:password@host:port/database?sslmode=require
Restrict DB/API access to specific IP addresses
Model | Who Controls Access? | Access Logic | Example | Supported by Relational DBs |
---|---|---|---|---|
RBAC | System (via Roles) | User has role with permission | Doctor role can read patient record | ✅ Widely Supported |
DAC | Record Owner | Owner grants access | Patient grants doctor access | ✅ Partially Supported |
MAC | System (via Labels) | Access requires clearance | Only staff with "Confidential" clearance can read | ❌ Rarely Supported |
GRANT SELECT ON patient_records TO doctor_alex;
CREATE ROLE physician;
CREATE USER dr_smith WITH PASSWORD 'secure123';
GRANT physician TO dr_smith;
GRANT SELECT ON employees TO physician;
Allows read-only access to a table
REVOKE SELECT ON employees FROM physician;
Removes previously granted privileges
REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA public FROM PUBLIC;
Database | RBAC Support | IP-Based Access | API Key Control |
---|---|---|---|
SQLite | ❌ | ❌ | External Only |
PostgreSQL | ✅ Column-Level | ✅ via pg_hba.conf | External Only |
MariaDB | ✅ Roles + Fine-Grained | ✅ via host-based grants | External Only |
OracleDB | ✅ RBAC + VPD | ✅ via network ACLs | External Only |
Plain Text | Encrypted Text |
---|---|
Patient Name: Alice | U2FsdGVkX1+gTx4kV7K9Zw== |
SSN: 123-45-6789 | 8b1a9953c4611296a827abf8c47804d7 |
Diagnosis: Flu | 5d41402abc4b2a76b9719d911017c592 |
CREATE TABLE patients (
id SERIAL,
name TEXT,
ssn TEXT ENCRYPTED FOR STORAGE
);
Plain Text | Encrypted Text |
---|---|
Patient Name: Alice | U2FsdGVkX1+gTx4kV7K9Zw== |
SSN: 123-45-6789 | 8b1a9953c4611296a827abf8c47804d7 |
Diagnosis: Flu | 5d41402abc4b2a76b9719d911017c592 |
Database | Encryption at Rest | Encryption in Transit | Column-Level | TDE Support |
---|---|---|---|---|
SQLite | ❌ (via extensions) | ❌ | ❌ | ❌ |
PostgreSQL | ✅ (manual) | ✅ (SSL) | ✅ (PGCrypto) | ❌ |
MariaDB | ✅ | ✅ | ✅ (via functions) | ✅ |
OracleDB | ✅ | ✅ | ✅ | ✅ |
user = input("Enter username: ")
query = f"SELECT * FROM users WHERE name = '{user}'"
cursor.execute(query)
Input: alice' OR '1'='1
Query: SELECT * FROM users WHERE name = 'alice' OR '1'='1'
Bypasses authentication and returns all users
user = input("Enter username: ")
query = "SELECT * FROM users WHERE name = %s"
cursor.execute(query, (user,))
Using parameters prevents SQL injection
import logging
logging.basicConfig(level=logging.INFO)
query = "SELECT * FROM patients"
logging.info(f"Running query: {query}")
cursor.execute(query)
query = "SELECT * FROM patients WHERE name = %s"
params = ("Alice",)
logging.info("Query: %s | Params: %s", query, params)
cursor.execute(query, params)
log_statement = 'all'
general_log = ON
A doctor’s laptop with a local database of patient records is stolen. The database file is unprotected and contains names, SSNs, and diagnoses
What security measure would have reduced the risk?
A staff member accessed over 200 patient records within 5 minutes — far beyond typical behavior for their role
What access control or monitoring strategy should be used?
A public-facing web app allowed attackers to run raw SQL through a login form, exposing the entire user database
What protection should have been implemented?
CREATE EXTENSION IF NOT EXISTS pgcrypto;
Technical Aspect | PGP Encryption (Asymmetric) | Symmetric Encryption (AES, DES) |
---|---|---|
Key Type | Public-Private Key Pair | Single Shared Secret Key |
Encryption Speed | Slower; computationally intensive | Faster; suitable for large data |
Key Management | Complex; public key distribution required | Simpler; secure key sharing essential |
Security Strength | High; secure over insecure channels | Medium to High; risk if key compromised |
Use Case Examples | Email encryption, digital signatures | File encryption, database encryption |
Reference: PostgreSQL pgcrypto Documentation
DROP TABLE secure_data;
CREATE TABLE IF NOT EXISTS secure_data(
id SERIAL PRIMARY KEY,
sensitive_info BYTEA NOT NULL
);
INSERT INTO secure_data (sensitive_info)
VALUES (pgp_sym_encrypt('Secret Information', 'my_strong_password'));
SELECT id, ENCODE(sensitive_info, 'hex') AS encrypted_info
FROM secure_data;
SELECT id, pgp_sym_decrypt(sensitive_info, 'my_strong_password')
AS decrypted_info
FROM secure_data;
SET password_encryption = 'scram-sha-256';
CREATE ROLE username WITH LOGIN PASSWORD 'your_secure_password';
SELECT rolname, rolsuper, rolcreaterole, rolcreatedb, rolcanlogin
FROM pg_roles;
SET password_encryption = 'md5';
CREATE ROLE username WITH LOGIN PASSWORD 'your_secure_password';
CREATE USER Bob PASSWORD 'bob_pass_123!';
CREATE USER Alice PASSWORD 'alice_pass_123!';
-- Grant full access to Bob
GRANT ALL PRIVILEGES ON Flowers TO Bob;
-- Grant read-only access to Alice
GRANT SELECT ON Flowers TO Alice;
-- Show all users and roles
SELECT rolname FROM pg_roles;
-- Switch user temporarily
SET ROLE Alice;
-- Switch back
RESET ROLE;
-- Switch to user 'Alice' (read-only)
SET ROLE Alice;
-- Successful query (Alice can SELECT)
SELECT * FROM flowers;
-- Failed query (Alice cannot DELETE)
DELETE FROM flowers WHERE id = 1;
RESET ROLE;
-- Switch to user 'bob' (full access)
SET ROLE Bob;
-- Successful query (Bob can DELETE)
DELETE FROM flowers WHERE id = 1;
RESET ROLE;
REVOKE INSERT ON flowers FROM Bob;
CREATE ROLE viewer;
CREATE ROLE manager;
CREATE USER alice PASSWORD 'alice_pass';
CREATE USER bob PASSWORD 'bob_pass';
GRANT viewer TO alice;
GRANT manager TO bob;
GRANT SELECT ON flowers_core TO viewer;
GRANT SELECT, INSERT, UPDATE ON flowers_pricing TO manager;
GRANT alice TO current_user;
SET ROLE alice;
SELECT * FROM flowers_core;
SELECT * FROM flowers_pricing;
RESET ROLE;
CREATE TABLE flowers_core (
id INTEGER PRIMARY KEY,
name TEXT
);
CREATE TABLE flowers_pricing (
id INTEGER PRIMARY KEY,
price REAL
);
INSERT INTO flowers_core (id, name) VALUES (1, 'Rose');
INSERT INTO flowers_pricing (id, price) VALUES (1, 2.99);
GRANT SELECT ON flowers_core TO viewer;
GRANT SELECT, INSERT, UPDATE ON flowers_pricing TO manager;
Technical Area | Importance of Logging |
---|---|
Query Performance | Identify slow queries and optimize them |
Security | Track unauthorized access attempts |
Error Diagnosis | Understand and debug failed operations |
Auditing | Maintain compliance and trace changes |
Replication & Backup | Ensure consistency and troubleshoot issues |
-- In postgresql.conf:
logging_collector = on
log_directory = 'log'
log_filename = 'postgresql-%Y-%m-%d.log'
log_statement = 'all'
log_min_error_statement = error
log_min_messages = warning
GRANT USAGE ON SCHEMA public TO Alice;
GRANT SELECT ON team1_flowers TO Alice;
-- GRANT ALL PRIVILEGES ON team1_flowers TO Alice;
GRANT Alice TO current_user;
SET ROLE Alice;
SELECT * FROM team1_flowers;
RESET ROLE;
sqlite3 mydatabase.db
.echo ON
.tables
.schema
sqlite3 mydatabase.db
.echo ON
.output my_log.txt
.schema
Technical Aspect | SQLCipher | Standard SQLite |
---|---|---|
Encryption Method | Transparent AES-256 Encryption | No built-in encryption |
Data-at-rest Security | Strong; protects database files on disk | Weak; plaintext data storage |
Performance Impact | Moderate overhead due to encryption operations | Minimal; faster access (unencrypted) |
Use Cases | Mobile apps, sensitive data storage | General-purpose lightweight storage |
Key Management | Application-managed passphrase | Not applicable |
Reference: SQLCipher GitHub Documentation
sqlcipher mydatabase.db
sqlite> PRAGMA key = 'your_secure_password';
sqlite> CREATE TABLE flowers(id INTEGER, name TEXT);
sqlite> INSERT INTO flowers VALUES (1, 'Rose');
sqlite> .quit
A bike rental company uses a PostgreSQL database to manage rentals, customer accounts, and location check-ins.
Which role-based controls, logging settings, and SQL security features would you apply?
A local furniture business tracks product inventory, employee tasks, and customer orders in PostgreSQL.
How would you design user roles, query logging, and local database encryption?
A flower shop uses both an online order system and an in-store terminal, with all data synced to a central PostgreSQL database.
What access control model, SQL safety, and fragmentation strategy makes sense here?
A staff member accessed over 200 patient records within 5 minutes — far beyond typical behavior for their role
What access control or monitoring strategy should be used?
A public-facing web app allowed attackers to run raw SQL through a login form, exposing the entire user database
What protection should have been implemented?
Aspect | Horizontal Partitioning | Vertical Partitioning |
---|---|---|
Definition | Rows are split across nodes (sharding) | Columns are split across nodes |
Structure | Each shard has all columns but different rows | Each partition holds subset of columns |
Performance | Good for large-scale read/write traffic | Good for optimizing access to specific attributes |
Use Case | User data by region or ID | Separating frequently vs infrequently accessed fields |
Challenge | Re-assembling data across shards | Joining data from multiple partitions |
Replication Type | Description | Pros | Cons |
---|---|---|---|
Primary-Secondary | One primary node handles writes, secondaries replicate and serve reads | Simpler to manage; Good for read-heavy workloads | Primary can become a bottleneck; No write scalability |
Multi-Primary | Multiple nodes can accept writes and sync changes | Increased availability and write scalability | Conflict resolution can be complex |
Peer-to-Peer | All nodes are equal and can read/write | Highly scalable; Fault-tolerant | Complex conflict resolution; Requires more coordination |
Synchronous | Data is replicated to all nodes in real-time | Strong consistency | Higher latency; Reduced availability during failures |
Asynchronous | Replication happens with a delay | Low latency; Better performance | Risk of data inconsistency; May cause eventual consistency issues |
Model | Description | Use Case |
---|---|---|
Strong Consistency | Every read reflects the most recent write | Transactions, critical systems |
Eventual Consistency | Reads may be stale, but data will converge over time | Social media feeds, caches |
Causal Consistency | Operations that are causally related are seen in order | Collaborative apps, messaging |
Read-Your-Writes | A process always reads its own updates | User sessions, dashboards |
Monotonic Reads | Reads never go backward in time | Progressive data views |
import psycopg2
conns = [psycopg2.connect(host=h, dbname="garden", user="u", password="p") for h in ["db1", "db2", "db3"]]
def replicate(flower_id, flower_data):
for conn in conns:
conn.cursor().execute("INSERT INTO flowers (id, data) VALUES (%s, %s)", (flower_id, flower_data))
conn.commit()
replicate(7, '{"type": "Rose", "color": "Red"}')
import psycopg2
conns = [psycopg2.connect(host=h, dbname="garden", user="u", password="p") for h in ["db1", "db2", "db3"]]
def insert_shard(flower_id, flower_data):
conn = conns[flower_id % len(conns)]
conn.cursor().execute("INSERT INTO flowers (id, data) VALUES (%s, %s)", (flower_id, flower_data))
conn.commit()
insert_shard(7, '{"type": "Rose", "color": "Red"}')
import psycopg2
partitions = {
"Rose": psycopg2.connect(host="db1", dbname="garden", user="u", password="p"),
"Tulip": psycopg2.connect(host="db2", dbname="garden", user="u", password="p"),
"Daisy": psycopg2.connect(host="db3", dbname="garden", user="u", password="p")
}
def insert_by_type(flower_type, flower_data):
conn = partitions[flower_type]
conn.cursor().execute("INSERT INTO flowers (type, data) VALUES (%s, %s)", (flower_type, flower_data))
conn.commit()
insert_by_type("Rose", '{"color": "Red", "season": "Spring"}')
Aspect | Distributed Databases | Traditional SQL Databases |
---|---|---|
Complexity | High setup & maintenance complexity | Simpler to deploy and manage |
Consistency | Eventual consistency trade-offs | Strong ACID consistency |
Latency | Higher due to network coordination | Lower in local settings |
Scalability | Horizontally scalable | Limited vertical scalability |
Query Support | Limited joins, often denormalized | Rich SQL query and joins |
CAP Theorem | Cannot guarantee all 3: Consistency, Availability, Partition Tolerance | Not designed for partition tolerance |
Which describes semantic horizontal partitioning in distributed databases?
What does the CAP theorem state about distributed systems?
Which database is commonly categorized as CP in the CAP theorem?
What kind of partitioning splits data by rows?
In distributed databases, what is a typical challenge of replication?
Which replication type allows multiple nodes to accept writes?
Which consistency model ensures every read returns the latest write?
What kind of partitioning splits columns across different nodes?
Why is single-node MySQL considered AC (not CP)?
What is a benefit of inserting the same flower data into both PostgreSQL and SQLite?
Which node stores "Rose" flower data in our semantic partitioning example?
Which concept does hash-based partitioning demonstrate in our example?
Which describes semantic horizontal partitioning in distributed databases?
What does the UUID field help us achieve in a distributed database?
Why do we use UUID instead of primary key when syncing data?
Which operation simulates fault tolerance in our application?
What does the manual sync function do in our demo?
What feature does the query aggregation function demonstrate?
Which of the following best describes our demo's architecture?
Class Schedule:
Time: 08:00 AM – 09:20 AM on Monday, Wednesday, Friday
Location: John T Chambers Technology Center 114 (CTC 114)
A Database Management System (DMS) is designed for the efficient storage, access, and update of data. This course explores DMS from two main perspectives:
Teaching Assistant: | Ms. Konika Reddy Saddikuti |
Email: | k_saddikuti@u.pacific.edu |
Office Location: | Student Support Center |
Office Hours: | By appointment |
Instructor: | Dr. Solomon Berhe |
Email: | sberhe@pacific.edu |
Zoom: | Zoom Meeting Link |
Office: | CTC 117 |
Office Hours: | Mon/Wed/Fri, 2:00–3:15 PM (or by appointment) |
Textbooks:
Homework: | Includes written exercises and programming assignments. Submit via Canvas or GitHub. |
Team Projects: | Two projects (database design, development, and application integration) |
Exams: | One midterm and one final. Students may bring one handwritten note sheet (8.5x11) |
All course materials, announcements, and assignments will be posted on Canvas. Check regularly for updates
Students are expected to act with integrity, honesty, and responsibility. Violations of the Honor Code will be reported to the Office of Student Conduct and Community Standards
For more details, see the University Academic Honesty Policy
Source Document: Course Syllabus
Table of Contents | DMS | Office Hours: Mon, Wed, Fri 2:00 PM - 3:15 PM, CTC 117 | sberhe@pacific.edu | k_saddikuti@u.pacific.edu | Syllabus | Join Zoom | GitHub Repository | Docs | SQL IDE | PlantUML