Course: COMP 163
Time: 08:00 AM - 09:15 AM on Monday, Wednesday, Friday
Location: John T Chambers Technology Center 114 (CTC 114)
| Week | Topic | Assignment (Canvas + GitHub) | Database | Comment |
|---|---|---|---|---|
| 1 (Jan 12, 14, 16) | Introduction to Database Management Systems | Lab 1 + Homework 1 (2w) | SQLite | |
| 2 (Jan 21, 23) | Entity Relationship Model | | | Holiday on Monday (Jan 19) |
| 3 (Jan 26, 28, 30) | Relational Algebra | Lab 2 + Homework 2 (2w) | | |
| 4 (Feb 2, 4, 6) | Relational Database Interfaces | | PostgreSQL | |
| 5 (Feb 9, 11, 13) | Relational Database Advanced SQL | Lab 3 + Homework 3 (2w) | | |
| 6 (Feb 18, 20) | Relational Database Normalization | Project Part 1 (4w) | | Holiday on Monday (Feb 16) |
| 7 (Feb 23, 25, 27) | Relational Database Table Indexing | | MariaDB | |
| 8 (Mar 2, 4, 6) | Midterm Preparation | Midterm (Mar 6) | | |
| 9 | Spring Break | | | |
| 10 (Mar 16, 18, 20) | Relational Database Transactions | Lab 4 + Homework 4 (2w), Project Part 2 (4w) | OracleSQL | |
| 11 (Mar 23, 25, 27) | Relational Database Security | | | |
| 12 (Mar 30, Apr 1, 3) | Relational Database and Data Distribution | Lab 5 + Homework 5 (2w) | | |
| 13 (Apr 6, 8, 10) | Relational Database and Data Aggregation | | ClickHouse | |
| 14 (Apr 13, 15, 17) | Relational Database and NoSQL, Semester Review | Final Preparation | | |
| 15 (Apr 20, 22, 24) | Project Presentation (15 min) | | | |
| 16 (Apr 27) | Project Presentation (15 min) | | | Classes End (Apr 28) |
| 17 (May 4) | Final Exam: Monday, May 4th, 10-11 AM | | | |
Flower
------
id: 101
name: "Rose"
color: "Red"
price: 2.50
Flowers
-------
101 | Rose | Red | 2.50
102 | Tulip | Yellow | 1.75
103 | Lily | White | 3.00
104 | Daisy | White | 1.25
105 | Orchid | Purple | 4.50
...
(Thousands of rows)
Rule:
price >= 0
✔ Rose | 2.50
✔ Tulip | 1.75
✘ Lily | -3.00 ← rejected
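The rule above can be enforced with a CHECK constraint. A minimal sketch against an in-memory SQLite database (column names mirror the example; the table layout is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE flowers (
        id INTEGER PRIMARY KEY,
        name TEXT,
        price REAL CHECK (price >= 0)   -- the rule: price must be non-negative
    )
""")
conn.execute("INSERT INTO flowers (name, price) VALUES ('Rose', 2.50)")   # accepted
conn.execute("INSERT INTO flowers (name, price) VALUES ('Tulip', 1.75)")  # accepted

rejected = False
try:
    conn.execute("INSERT INTO flowers (name, price) VALUES ('Lily', -3.00)")
except sqlite3.IntegrityError:
    rejected = True   # CHECK constraint failed: the negative price never lands
print(rejected)  # True
```

The bad row is rejected at insert time, so the rule holds no matter which application sends the data.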
SELECT name, price
FROM flowers
WHERE color = 'White';
Result:
Lily | 3.00
Daisy | 1.25
App / Website / API / Software
↓
--------- DBMS ---------
↓
Database (disk / SSD)
| System | How it Stores Data | How you Query | Main Problem |
|---|---|---|---|
| Paper | Handwritten records | Search by reading | Slow, hard to scale |
| Files (folders) | Documents in directories | Manual naming + search | Duplicates, inconsistent versions |
| Spreadsheets | Rows and columns | Filters and formulas | Hard for multi user + rules |
| Custom App (ad hoc) | Program specific format | Whatever the app supports | No standard, hard to maintain |
| Type | Data Shape | Good For | Example |
|---|---|---|---|
| Relational | Tables (rows/columns) | Business data, consistency | PostgreSQL, MySQL |
| Document | JSON like documents | Flexible fields, web apps | MongoDB |
| Key Value | Key → value | Caching, sessions | Redis |
| Column Store | Columns (analytics) | Fast aggregates | ClickHouse |
| Graph | Nodes + edges | Relationships, networks | Neo4j |
Each component has a distinct role, but all three must work together.
-- Generated by an LLM
DELETE FROM orders;
-- Missing WHERE clause
-- Result: all rows deleted
Conclusion: LLMs can assist, but humans must review SQL before execution.
flowers(id, name, color, price)
orders(id, flowerId, qty)
| StudentId | Name | Major |
|---|---|---|
| 101 | Ana | CS |
| 102 | Ben | EE |
| 103 | Cam | CS |
SELECT name, price
FROM flowers
WHERE price > 2.00;
Can this query be used in programs written in Java, C++, and Python?
Correct: D
SQL is a database query language. The same SQL query can be sent from Java, C++, Python, or any other language through a database API.
Connection conn = DriverManager.getConnection(url);
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("""
SELECT name, price
FROM flowers
WHERE price > 2.00
""");
import sqlite3

conn = sqlite3.connect("flowers.db")
cur = conn.cursor()
cur.execute("""
SELECT name, price
FROM flowers
WHERE price > 2.00
""")
sqlite3* db;
sqlite3_open("flowers.db", &db);
sqlite3_exec(db,
"SELECT name, price FROM flowers WHERE price > 2.00",
nullptr, nullptr, nullptr);
const sqlite3 = require("sqlite3");
const db = new sqlite3.Database("flowers.db");
db.all(
`SELECT name, price
FROM flowers
WHERE price > 2.00`, callback);
Which component is executing the SQL query?
Correct: C
The DBMS parses, plans, and executes SQL. Programming languages only send the query and receive results.
Java · Python · C++ · JS
send SQL string
parse
plan
execute
flowers.db
tables
rows
indexes
What is the role of a programming language when working with a database?
Correct: C
Applications use programming languages to communicate with the DBMS, but SQL remains the interface for data access.
# step-by-step instructions
# how to do it
for flower in flowers:
if flower.price > 2.00:
print(flower.name, flower.price)
-- describe the result
-- what you want
SELECT name, price
FROM flowers
WHERE price > 2.00;
Query A:
SELECT * FROM flowers;
Query B:
INSERT INTO flowers(name, price)
VALUES ('Rose', 2.50);
.db file
flowers.db
├─ flowers
├─ suppliers
└─ orders
Java · Python · C++ · JS
send SQL string
parse
plan
execute
flowers.db
tables
rows
indexes
sqlite3 flowers.db
enter SQL
| Type | Description | Flower Example |
|---|---|---|
| INTEGER | Whole numbers | flower_id |
| REAL | Decimal numbers | price |
| TEXT | Strings | name, color |
| BLOB | Binary data | image |
| NULL | Missing value | unknown_color |
flowers(
id,
name,
color,
price
)
| Column | Type | Reason |
|---|---|---|
| id | INTEGER | Primary identifier |
| name | TEXT | Flower name |
| color | TEXT | Readable attribute |
| price | REAL | Decimal value |
CREATE TABLE flowers (
id INTEGER PRIMARY KEY,
name TEXT,
color TEXT,
price REAL
);
price REAL CHECK (price >= 0)
.db file
flowers.db
├─ flowers
├─ suppliers
└─ orders
| Type | Description | Flower Example |
|---|---|---|
| INTEGER | Whole numbers | flower_id |
| REAL | Decimal numbers | price |
| TEXT | Strings | name, color |
| BLOB | Binary data | image |
| NULL | Missing value | unknown_color |
flowers(
id,
name,
color,
price
)
| Column | Type | Reason |
|---|---|---|
| id | INTEGER | Unique identifier |
| name | TEXT | Flower name |
| color | TEXT | Human readable |
| price | REAL | Decimal value |
CREATE TABLE flowers (
id INTEGER PRIMARY KEY,
name TEXT,
color TEXT,
price REAL
);
In SQLite, INTEGER PRIMARY KEY is special: it aliases the internal rowid, so ids can be auto-generated.
price REAL CHECK (price >= 0)
inventory
id INTEGER PRIMARY KEY
name TEXT NOT NULL
category TEXT NOT NULL
price REAL NOT NULL
quantity INTEGER NOT NULL
CREATE TABLE inventory (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
category TEXT NOT NULL,
price REAL NOT NULL,
quantity INTEGER NOT NULL
);
.schema inventory
INSERT INTO inventory (name, category, price, quantity)
VALUES ('Oak Table', 'Table', 299.99, 4);
INSERT INTO inventory (name, category, price, quantity)
VALUES ('Leather Sofa', 'Sofa', 899.00, 2);
INSERT INTO inventory (name, category, price, quantity)
VALUES ('Desk Lamp', 'Lamp', 39.50, 12);
SELECT id, name, category, price, quantity
FROM inventory;
SELECT name, quantity
FROM inventory
WHERE quantity <= 3;
Add a supplier column (TEXT) to inventory.
ALTER TABLE inventory
ADD COLUMN supplier TEXT;
.schema inventory
UPDATE inventory
SET supplier = 'Pacific Furnishings'
WHERE name = 'Oak Table';
UPDATE inventory
SET supplier = 'West Coast Leather'
WHERE name = 'Leather Sofa';
UPDATE inventory
SET supplier = 'BrightHome'
WHERE name = 'Desk Lamp';
SELECT id, name, supplier, quantity
FROM inventory;
SELECT name, category, quantity
FROM inventory
WHERE supplier = 'BrightHome';
Constraints: UNIQUE, NOT NULL, DEFAULT. First the concept, then the SQL (using our Furniture Store case study).
inventory
---------
id | name | category | price | quantity
1 | Oak Table | Table | 299.99 | 4
2 | Leather Sofa | Sofa | 899.00 | 2
3 | Desk Lamp | Lamp | 39.50 | 12
Good:
id = 1, 2, 3, 4, ...
Bad:
id = 1, 1, 2, 3 (duplicate)
Without a key:
UPDATE inventory
SET price = 10
WHERE category = 'Lamp';
Could affect many rows.
inventory.id
id is not "meaningful"; it is just an identifier.
name and category can change; id should not.
CREATE TABLE inventory (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
category TEXT NOT NULL,
price REAL NOT NULL,
quantity INTEGER NOT NULL
);
INTEGER PRIMARY KEY can be auto-generated: you may omit id during INSERT.
INSERT INTO inventory (name, category, price, quantity)
VALUES ('Oak Table', 'Table', 299.99, 4);
-- id assigned automatically
id INTEGER PRIMARY KEY AUTOINCREMENT
CREATE TABLE inventory (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
category TEXT NOT NULL,
price REAL NOT NULL,
quantity INTEGER NOT NULL
);
SQLite will keep increasing id values even if rows are deleted.
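This behavior is easy to verify. A minimal sketch against an in-memory database (table trimmed to two columns for brevity):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE inventory (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO inventory (name) VALUES ('Oak Table')")   # id = 1
conn.execute("INSERT INTO inventory (name) VALUES ('Desk Lamp')")   # id = 2
conn.execute("DELETE FROM inventory WHERE id = 2")
cur = conn.execute("INSERT INTO inventory (name) VALUES ('Leather Sofa')")
print(cur.lastrowid)  # 3 — AUTOINCREMENT never reuses a deleted id
```

Even though id 2 is gone, the next row gets id 3: AUTOINCREMENT tracks the largest id ever assigned.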
SELECT id, name, category, price, quantity
FROM inventory;
UPDATE inventory
SET price = 279.99
WHERE id = 1;
DELETE FROM inventory
WHERE id = 1;
Primary keys exist for correctness and control.
Rules enforced by the database to protect data quality.
price REAL NOT NULL
sku TEXT UNIQUE
quantity INTEGER DEFAULT 0
Prevents missing critical information.
name TEXT NOT NULL,
category TEXT NOT NULL,
price REAL NOT NULL
Prevents duplicate values across rows.
sku TEXT UNIQUE
Provides a value when none is supplied.
quantity INTEGER DEFAULT 0
CREATE TABLE inventory (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
category TEXT NOT NULL,
sku TEXT UNIQUE,
price REAL NOT NULL,
quantity INTEGER DEFAULT 0
);
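Both constraints can be exercised in a few lines. A minimal sketch (values are invented; the schema matches the one above, minus price for brevity):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE inventory (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        sku TEXT UNIQUE,
        quantity INTEGER DEFAULT 0
    )
""")
# DEFAULT: quantity omitted, so the database fills in 0
conn.execute("INSERT INTO inventory (name, sku) VALUES ('Oak Table', 'TBL-1')")
qty = conn.execute("SELECT quantity FROM inventory WHERE sku = 'TBL-1'").fetchone()[0]
print(qty)  # 0

# UNIQUE: a second row with the same sku is rejected
duplicate_rejected = False
try:
    conn.execute("INSERT INTO inventory (name, sku) VALUES ('Copy', 'TBL-1')")
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```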
Real systems connect data across tables.
| id (PK) | name | country |
|---|---|---|
| 1 | Pacific Furnishings | USA |
| 2 | West Coast Leather | USA |
| 3 | BrightHome | Canada |
| id (PK) | name | category | supplier_id (FK) |
|---|---|---|---|
| 101 | Oak Table | Table | 1 |
| 102 | Leather Sofa | Sofa | 2 |
| 103 | Desk Lamp | Lamp | 3 |
Each inventory item references exactly one supplier using a foreign key.
The database enforces that supplier_id = 1 must exist in the
supplier table before the inventory row can be inserted.
A reference to a primary key in another table.
supplier_id INTEGER
REFERENCES supplier(id)
CREATE TABLE supplier (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
contact_email TEXT UNIQUE
);
CREATE TABLE inventory (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
category TEXT NOT NULL,
price REAL NOT NULL,
quantity INTEGER DEFAULT 0,
supplier_id INTEGER,
FOREIGN KEY (supplier_id)
REFERENCES supplier(id)
);
Constraints and keys are not optional. They are the foundation of reliable, maintainable databases.
The furniture store now wants to manage suppliers separately.
Create a supplier table. Think about what uniquely identifies a supplier and what information should never be missing.
id: unique supplier identifier
name: supplier name
email: contact email
phone: optional contact number
Constraints: id is INTEGER PRIMARY KEY AUTOINCREMENT, name is NOT NULL, email is UNIQUE.
CREATE TABLE supplier (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL,
phone TEXT
);
.schema supplier
Add supplier_id to inventory, referencing supplier(id). In SQLite, foreign keys must be enabled explicitly.
PRAGMA foreign_keys = ON;
Run this before creating or using foreign key constraints.
CREATE TABLE inventory (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
category TEXT NOT NULL,
price REAL NOT NULL,
quantity INTEGER DEFAULT 0,
supplier_id INTEGER NOT NULL,
FOREIGN KEY (supplier_id)
REFERENCES supplier(id)
);
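With foreign keys enabled, SQLite rejects an inventory row that references a missing supplier. A minimal sketch (supplier id 99 is a deliberately nonexistent value; the schema is trimmed to the FK-relevant columns):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # must be enabled per connection
conn.executescript("""
    CREATE TABLE supplier (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL
    );
    CREATE TABLE inventory (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        supplier_id INTEGER NOT NULL,
        FOREIGN KEY (supplier_id) REFERENCES supplier(id)
    );
    INSERT INTO supplier (name) VALUES ('BrightHome');                  -- id = 1
    INSERT INTO inventory (name, supplier_id) VALUES ('Desk Lamp', 1);  -- ok
""")
fk_rejected = False
try:
    # supplier 99 does not exist, so the constraint fires
    conn.execute("INSERT INTO inventory (name, supplier_id) VALUES ('Ghost', 99)")
except sqlite3.IntegrityError:
    fk_rejected = True
print(fk_rejected)  # True
```

Without the PRAGMA, the bad row would be silently accepted, which is exactly why the lab asks you to run it first.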
You now have: entities, primary keys, constraints, and a foreign key relationship enforced by SQLite.
Due: January 30, 11:59 PM
Goal: model, design, and implement a small relational database from a real-world
scenario.
A local library wants to track books loaned to members.
Entities are the things the system needs to track.
| Good Entities | Bad Entities |
|---|---|
| Book | City |
| Member | Chair |
| Staff | Building |
Relationships describe how entities are connected.
Output: libraryERD.png
@startuml
entity "Book" {
+book_id : INTEGER
title : TEXT
isbn : TEXT
}
entity "Member" {
+member_id : INTEGER
name : TEXT
email : TEXT
}
Book -- Member : borrows
@enduml
Verify using the .schema command
| Operation | What You Do |
|---|---|
| Create | INSERT at least 3 rows |
| Read | SELECT all rows |
| Update | Change a value in one row |
| Delete | Remove one specific row |
This lab connects concepts → design → SQL implementation.
Due: January 30, 11:59 PM
Focus: conceptual understanding of SQL, SQLite, and course expectations.
This homework ensures you understand the role of SQL, the difference between language and system, and the scope of the course.
Please read Chapters 1 and 2 of the following books:
Goal: general understanding, not mastery of syntax.
Do not worry about remembering every command.
Answer in your own words. Short, clear explanations are preferred.
Explain the purpose of each keyword:
CREATE, SELECT, UPDATE, DELETE, INSERT
Precision of thought matters more than syntax.
Please read the course syllabus carefully:
Submitting this homework confirms that you reviewed the syllabus.
This homework sets the foundation for everything that follows.
-- Work order:
-- 1) requirements
-- 2) ERD
-- 3) SQL schema
-- 4) queries
The system should allow users to rent bikes easily
The system should allow registered users to rent a bike by selecting a location, choosing an available bike, and completing payment through the application within three steps
Purpose: Bike Rental Database System to manage rentals and track customers, bikes, and transactions
-- Nouns: user, bike, location, payment, transaction
-- Verbs: rent, select, choose, pay
-- Candidate entities:
-- - Customer
-- - Bike
-- - Location
-- - Payment
-- - RentalTransaction
-- Entity candidates:
-- Customer
-- Bike
-- RentalTransaction
-- Location
-- Payment
-- Relationship examples:
-- Customer rents Bike
-- Bike located at Location
-- Customer pays Payment
-- ERD view:
-- Customer --(rents)--> Bike
-- Entity will have attributes:
-- BikeID, Model, Color, Condition, Availability
-- Prefer consistent names:
-- Customer, Bike, RentalTransaction
-- Avoid reserved words for table names
-- Safe habit:
-- Pick a naming style and keep it.
-- Example: PascalCase tables, snake_case columns
A travel booking system should allow users to search for, compare, and book flights, hotels, and rental cars. Users must be able to filter results by price, date, and location, and securely complete payments.
-- Practice:
-- turn nouns into entity boxes
-- connect boxes with verb labels
-- Customer attributes:
-- CustomerID, Name, Email, Phone
-- Bike attributes:
-- BikeID, Model, Color, Condition, Availability
-- Primary key examples:
-- Customer(CustomerID)
-- Bike(BikeID)
-- RentalTransaction(RentalID)
-- Optional examples:
-- Customer.Email, Customer.Phone
-- Bike.Color, Bike.Condition
-- RentalTransaction.ReturnDate
-- BookingDate DATE
-- Availability BOOLEAN
-- Name VARCHAR(100)
| Data Type | Description | Example |
|---|---|---|
| INTEGER | A whole number, no decimal points | 42 |
| BIGINT | A large integer value | 9223372036854775807 |
| DECIMAL(p, s) | Exact numeric value | DECIMAL(10, 2) |
| FLOAT | Approximate numeric value | 3.14159 |
| CHAR(n) | Fixed length string | CHAR(5) |
| VARCHAR(n) | Variable length string | VARCHAR(50) |
| TEXT | Variable length text | "long notes" |
| DATE | Date | 2025-01-14 |
| TIME | Time | 14:30:00 |
| TIMESTAMP | Date and time | 2025-01-14 14:30:00 |
| BOOLEAN | TRUE or FALSE | TRUE |
| BLOB | Binary data | [bytes] |
-- Common constraints:
-- PRIMARY KEY
-- FOREIGN KEY
-- NOT NULL
-- UNIQUE
-- DEFAULT
| Constraint | Description |
|---|---|
| PRIMARY KEY | Uniquely identifies each record; implies NOT NULL |
| FOREIGN KEY | Ensures referential integrity by linking to another table |
| NOT NULL | Column cannot be NULL |
| UNIQUE | All values in a column are distinct |
| DEFAULT | Default value when no value is provided |
| AUTO_INCREMENT | Auto generates unique integer id (DB specific) |
| Conceptual Term | Technical Term |
|---|---|
| Diagram | Schema |
| Entity | Table |
| Attribute | Column |
| Record | Row |
| Identifier | Primary Key |
| Relationship | Foreign Key |
| Optional Attribute | Nullable Column |
| Mandatory Attribute | NOT NULL Column |
CREATE TABLE Customer (
CustomerID INTEGER PRIMARY KEY AUTOINCREMENT,
Name VARCHAR(100) NOT NULL,
Email VARCHAR(100) UNIQUE,
Phone VARCHAR(15)
);
CREATE TABLE Bike (
BikeID INTEGER PRIMARY KEY AUTOINCREMENT,
Model VARCHAR(100) NOT NULL,
Color VARCHAR(50),
Condition VARCHAR(50),
Availability BOOLEAN DEFAULT TRUE
);
CREATE TABLE RentalTransaction (
RentalID INTEGER PRIMARY KEY AUTOINCREMENT,
CustomerID INTEGER NOT NULL,
BikeID INTEGER NOT NULL,
BookingDate DATE NOT NULL,
ReturnDate DATE,
FOREIGN KEY (CustomerID) REFERENCES Customer(CustomerID),
FOREIGN KEY (BikeID) REFERENCES Bike(BikeID)
);
-- Example risk:
-- UI collects phone as text "+1-123-456"
-- DB column INTEGER
-- Insert fails or loses formatting
| Front-End | Back-End | Database | Result |
|---|---|---|---|
| Text input for phone number | String validation for format | INTEGER type | Formatted numbers rejected |
| Dropdown allows decimal selection | Expects integer values | FLOAT type | Unexpected decimal input |
| Date picker with local format | Assumes ISO format | DATE type | Date conversion errors |
-- Entity → table later
-- Example: Customer, Bike
-- Relationship example:
-- Customer rents Bike
-- Primary key column:
-- CustomerID INTEGER PRIMARY KEY
Cardinality matters: 1:1, 1:N, N:N
1 Entity
0 Relationship
2 Entity
1 Relationship
3 Entity
2 Relationship
-- Design tip:
-- prefer the simplest ERD that meets requirements
-- add complexity only when needed
| Cardinality | Meaning | Example |
|---|---|---|
| 1:1 | Each record links to one record | User and Profile |
| 1:N | One record links to many records | Customer and Rentals |
| N:N | Many records link to many records | Students and Courses |
| Cardinality | When to Use | Example |
|---|---|---|
| 1:1 | Each record in one table corresponds to exactly one record in another table | User and Profile |
| 1:N | One record can be associated with multiple records in another table | Library and Books |
| N:N | Multiple records in both tables can be related to each other | Students and Courses |
SELECT Book.title
FROM Book
WHERE Book.isbnId = '978-1-';
SELECT Book.title
FROM Book, ISBN
WHERE Book.bookId = ISBN.bookId
AND ISBN.isbnId = '978-1-';
SELECT Book.title
FROM Book, ISBN, Book_ISBN
WHERE Book.bookId = Book_ISBN.bookId
AND ISBN.isbnId = Book_ISBN.isbnId
AND ISBN.isbnId = '978-1-';
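The three-table (N:N) query above can be exercised directly. A minimal sketch in SQLite — the titles and full ISBN values here are invented placeholders, since the slide shows only a truncated prefix:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Book (bookId INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE ISBN (isbnId TEXT PRIMARY KEY);
    CREATE TABLE Book_ISBN (bookId INTEGER, isbnId TEXT);
    INSERT INTO Book VALUES (1, 'SQL Basics'), (2, 'Set Theory');
    INSERT INTO ISBN VALUES ('978-1-111'), ('978-1-222');
    INSERT INTO Book_ISBN VALUES (1, '978-1-111'), (2, '978-1-222');
""")
rows = conn.execute("""
    SELECT Book.title
    FROM Book, ISBN, Book_ISBN
    WHERE Book.bookId = Book_ISBN.bookId    -- junction table links both sides
      AND ISBN.isbnId = Book_ISBN.isbnId
      AND ISBN.isbnId = '978-1-111'
""").fetchall()
print(rows)  # [('SQL Basics',)]
```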
| # | Scenario | Which Cardinality? |
|---|---|---|
| 1 | Book and ISBN | |
| 2 | Author and Book | |
| 3 | Student and borrow Book | |
| 4 | Student and Student ID | |
| 5 | Book and Publisher | |
| 6 | Publisher and Books | |
| 7 | Library and Book | |
| 8 | Book and Format (e.g., eBook, hardcover) | |
| 9 | Book and Catalog Number in the Library System | |
| 10 | Reader and Book Review |
-- Next step:
-- use your ERD to justify every CREATE TABLE line
@startuml
entity "EntityA" as A {
+id : INTEGER
attribute_1 : VARCHAR
attribute_2 : INTEGER
}
entity "EntityB" as B {
+id : INTEGER
attribute_1 : VARCHAR
}
A -- B : "relationship"
@enduml
Requirements → ERM → Tables → Data
-- Work order:
-- 1) requirements
-- 2) ERM / ERD
-- 3) schema
-- 4) data
-- 5) queries
The store should sell furniture products to customers and record each order.
Each order can include multiple products with quantities and prices at the time of purchase.
The store should track inventory per product, so staff know what is in stock and what needs reordering.
-- Design choice:
-- Keep the first version small.
-- Add delivery, returns, warehouses later.
-- Minimal set of tables:
-- 1) Product (includes inventory count)
-- 2) Customer
-- 3) CustomerProductSales (captures sales rows with two FKs)
-- Tradeoff:
-- Normalized: Order + OrderItem (best design)
-- Constrained: store each sale in CustomerProductSales
-- Conceptual ERM:
-- Customer N:N Product (via CustomerProductSales)
-- Product carries stock columns (qty_on_hand, reorder_level)
@startuml
hide circle
skinparam linetype ortho
skinparam shadowing false
entity "Customer" as Customer {
+customer_id : INTEGER
--
name : VARCHAR
email : VARCHAR
phone : VARCHAR
}
entity "Product" as Product {
+product_id : INTEGER
--
sku : VARCHAR
name : VARCHAR
category : VARCHAR
price : DECIMAL
qty_on_hand : INTEGER
reorder_level : INTEGER
}
entity "CustomerProductSales" as CustomerProductSales {
+sale_id : INTEGER
--
customer_id : INTEGER (FK)
product_id : INTEGER (FK)
sale_date : DATE
qty : INTEGER
unit_price : DECIMAL
status : VARCHAR
}
Customer ||--o{ CustomerProductSales : buys
Product ||--o{ CustomerProductSales : sold_in
@enduml
-- Example row meaning:
-- customer 1 bought product 5
-- qty = 2 at unit_price = 89.00 on 2026-01-10
-- Primary keys are stable identifiers.
-- They should not carry meaning.
-- Constraints catch errors early.
-- They protect data quality.
-- Customer ||--o{ CustomerProductSales
-- FK: CustomerProductSales.customer_id → Customer.customer_id
-- FK: CustomerProductSales.product_id → Product.product_id
| Table | Purpose | Key Columns |
|---|---|---|
| Customer | Who buys | customer_id (PK), email (UNIQUE) |
| Product | What is sold and stock count | product_id (PK), sku (UNIQUE), qty_on_hand |
| CustomerProductSales | Sales rows (customer ↔ product) | sale_id (PK), customer_id (FK), product_id (FK) |
CREATE TABLE Customer (
customer_id INTEGER PRIMARY KEY AUTOINCREMENT,
name VARCHAR(100) NOT NULL,
email VARCHAR(100) UNIQUE,
phone VARCHAR(30)
);
CREATE TABLE Product (
product_id INTEGER PRIMARY KEY AUTOINCREMENT,
sku VARCHAR(40) NOT NULL UNIQUE,
name VARCHAR(120) NOT NULL,
category VARCHAR(60),
price DECIMAL(10,2) NOT NULL,
qty_on_hand INTEGER NOT NULL DEFAULT 0,
reorder_level INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE CustomerProductSales (
sale_id INTEGER PRIMARY KEY AUTOINCREMENT,
customer_id INTEGER NOT NULL,
product_id INTEGER NOT NULL,
sale_date DATE NOT NULL,
status VARCHAR(30) NOT NULL DEFAULT 'NEW',
qty INTEGER NOT NULL,
unit_price DECIMAL(10,2) NOT NULL,
FOREIGN KEY (customer_id) REFERENCES Customer(customer_id),
FOREIGN KEY (product_id) REFERENCES Product(product_id)
);
-- Insert order:
-- 1) Customer
-- 2) Product
-- 3) CustomerProductSales
INSERT INTO Customer (name, email, phone) VALUES
('Ava Kim', 'ava.kim@example.com', '209-555-0101'),
('Noah Patel', 'noah.patel@example.com', '209-555-0112');
INSERT INTO Product (sku, name, category, price, qty_on_hand, reorder_level) VALUES
('SOFA-1001', 'Linen Sofa 3 Seat', 'Sofa', 899.00, 5, 2),
('TBLE-2001', 'Oak Dining Table', 'Table', 749.00, 2, 1),
('CHAI-3001', 'Walnut Accent Chair', 'Chair', 299.00, 7, 3);
INSERT INTO Product (sku, name, category, price, qty_on_hand, reorder_level) VALUES
('DSKR-4001', 'Standing Desk', 'Desk', 499.00, 3, 1),
('LAMP-5001', 'Floor Lamp', 'Lighting', 89.00, 12, 5),
('RUGG-6001', 'Wool Rug 5x8', 'Rug', 179.00, 4, 2);
INSERT INTO CustomerProductSales (customer_id, product_id, sale_date, status, qty, unit_price)
VALUES
(1, 1, '2026-01-10', 'PAID', 1, 899.00),
(1, 5, '2026-01-10', 'PAID', 2, 89.00);
INSERT INTO CustomerProductSales (customer_id, product_id, sale_date, status, qty, unit_price)
VALUES
(2, 2, '2026-01-11', 'NEW', 1, 749.00),
(2, 3, '2026-01-11', 'NEW', 2, 299.00),
(1, 6, '2026-01-12', 'PAID', 1, 179.00),
(1, 4, '2026-01-12', 'PAID', 1, 499.00);
SELECT product_id, sku, name, price, qty_on_hand, reorder_level
FROM Product
ORDER BY category, price DESC;
| Condition | Meaning | Action |
|---|---|---|
| qty_on_hand = 0 | Out of stock | Block purchase |
| qty_on_hand ≤ reorder_level | Low stock | Restock soon |
| qty_on_hand > reorder_level | Healthy stock | No action |
SELECT sku, name, qty_on_hand, reorder_level
FROM Product
WHERE qty_on_hand <= reorder_level
ORDER BY qty_on_hand ASC;
SELECT c.name,
COUNT(s.sale_id) AS sale_rows,
SUM(s.qty * s.unit_price) AS total_spent
FROM Customer c
JOIN CustomerProductSales s ON s.customer_id = c.customer_id
GROUP BY c.customer_id
ORDER BY total_spent DESC;
UPDATE Product
SET qty_on_hand = qty_on_hand + 10
WHERE sku = 'TBLE-2001';
UPDATE Product
SET price = 799.00
WHERE sku = 'SOFA-1001';
UPDATE CustomerProductSales
SET status = 'FULFILLED'
WHERE sale_id = 1;
DELETE FROM CustomerProductSales
WHERE sale_id = 3
  AND status = 'NEW';
| Goal | SQL | Example |
|---|---|---|
| Read data | SELECT | List low stock products |
| Add data | INSERT | Create new sales rows |
| Change data | UPDATE | Restock inventory |
| Remove data | DELETE | Delete a test sale |
-- Guard rails:
-- qty_on_hand >= 0
-- price > 0
-- qty > 0
-- valid status values
-- Next iteration:
-- Add Order and OrderItem tables
-- Group line items under an order_id
-- Stronger constraints + joins
-- Grow the model only when needed.
-- Keep the first version stable.
-- Habit:
-- justify each column with a requirement
Modify requirements and update the ERM and tables
Georg Cantor – Founder of Set Theory
items = []
items.append("chair")
items.append("table")
items.append("chair") # duplicate allowed!
print(items) # ['chair', 'table', 'chair']
s = set()
s.add("chair")
s.add("table")
s.add("chair") # ignored!
print(s) # {'chair', 'table'}
Sets:
A = {1, 2}
B = {2, 3}
A ∪ B = {1, 2, 3}
A ∩ B = {2}
Logic (universe U = {1, 2, 3, 4}):
P(x): x ∈ A
Q(x): x ∈ B
P(x) ∨ Q(x): true for 1, 2, 3
P(x) ∧ Q(x): true for 2
¬P(x): true for 3, 4
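The same correspondence can be checked in Python, where set operators mirror the logical connectives. A minimal sketch — the universe U = {1, 2, 3, 4} is an assumption inferred from the ¬P(x) line, since complement only makes sense relative to a universe:

```python
A = {1, 2}
B = {2, 3}
U = {1, 2, 3, 4}          # assumed universe

union = A | B              # P(x) ∨ Q(x): x ∈ A or x ∈ B
intersection = A & B       # P(x) ∧ Q(x): x ∈ A and x ∈ B
complement_A = U - A       # ¬P(x): x ∉ A, relative to U

print(sorted(union))         # [1, 2, 3]
print(sorted(intersection))  # [2]
print(sorted(complement_A))  # [3, 4]
```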
s = {"chair", "table"}
print(len(s)) # 2
s.add("chair") # ignored
s.add("desk") # added
s.add("chair") # ignored again
print(len(s)) # still 3
print(s) # {'chair', 'table', 'desk'}
CREATE TABLE Furniture (
id INTEGER PRIMARY KEY AUTOINCREMENT,
item VARCHAR(50) UNIQUE
);
INSERT INTO Furniture (item) VALUES ('chair'), ('table');
-- Duplicate item → raises exception
INSERT INTO Furniture (item) VALUES ('chair');
-- Error: UNIQUE constraint failed: Furniture.item
-- Using OR IGNORE to mimic set (duplicates ignored)
INSERT OR IGNORE INTO Furniture (item) VALUES ('chair');
INSERT OR IGNORE INTO Furniture (item) VALUES ('desk');
INSERT OR IGNORE INTO Furniture (item) VALUES ('chair');
SELECT COUNT(*) FROM Furniture; -- 3
Set = {chair, table, lamp} with no duplicates
const furniture = new Set([
"chair",
"table",
"lamp"
]);
Set<String> furniture = new HashSet<>();
furniture.add("chair");
furniture.add("table");
furniture.add("lamp");
furniture = {"chair", "table", "lamp"}
CREATE TABLE Furniture (
item TEXT PRIMARY KEY
);
INSERT INTO Furniture VALUES
('chair'), ('table'), ('lamp');
| Operation | Symbol | Description |
|---|---|---|
| Union | ⋃ | All unique elements |
| Intersection | ∩ | Common elements only |
| Difference | − | In A but not in B |
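SQL has direct counterparts for all three operations: UNION, INTERSECT, and EXCEPT. A minimal sketch in SQLite (tables A and B and their rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (item TEXT);
    CREATE TABLE B (item TEXT);
    INSERT INTO A VALUES ('chair'), ('table');
    INSERT INTO B VALUES ('table'), ('lamp');
""")
union = [r[0] for r in conn.execute(
    "SELECT item FROM A UNION SELECT item FROM B ORDER BY item")]
inter = [r[0] for r in conn.execute(
    "SELECT item FROM A INTERSECT SELECT item FROM B")]
diff = [r[0] for r in conn.execute(
    "SELECT item FROM A EXCEPT SELECT item FROM B")]
print(union)  # ['chair', 'lamp', 'table']
print(inter)  # ['table']
print(diff)   # ['chair']
```

Like the mathematical operators, all three compound selects remove duplicates from the result.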
Ordered sequence of elements
Bicycles relation:
bikeId | brand | type | price
1 | Trek | Mountain | 1200 ← tuple 1
2 | Giant | Road | 1500 ← tuple 2
3 | Cannondale | Hybrid | 1000 ← tuple 3
Each row = one tuple (ordered sequence of elements)
Bikes relation (a set of tuples):
bikeId | brand | type | price
1 | Trek | Mountain | 1200
2 | Giant | Road | 1500
3 | Cannondale | Hybrid | 1000
• Each row = one tuple
• No duplicate rows allowed
• All rows share the same columns (attributes)
Input Bicycles:
bikeId | brand | type | price
1 | Trek | Mountain | 1699
2 | Giant | Road | 1100
3 | Cannondale | Hybrid | 999
After σ(price < 1200):
bikeId | brand | type | price
2 | Giant | Road | 1100
3 | Cannondale | Hybrid | 999
Input:
bikeId | brand | model | price | color
1 | Trek | Roscoe 7 | 1699 | Blue
2 | Giant | Contend | 1299 | Black
3 | Trek | Marlin 5 | 799 | Red
After π(brand, price):
brand | price
Trek | 1699
Giant | 1299
Trek | 799
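Both operators map one-to-one onto SQL: σ becomes WHERE, π becomes the SELECT list. A minimal sketch using the selection example's rows (the column layout follows the first input table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bicycles (bikeId INTEGER, brand TEXT, type TEXT, price INTEGER)")
conn.executemany("INSERT INTO bicycles VALUES (?, ?, ?, ?)", [
    (1, 'Trek', 'Mountain', 1699),
    (2, 'Giant', 'Road', 1100),
    (3, 'Cannondale', 'Hybrid', 999),
])
# σ(price < 1200): WHERE keeps only matching rows
selected = conn.execute("SELECT * FROM bicycles WHERE price < 1200").fetchall()
print([row[1] for row in selected])  # ['Giant', 'Cannondale']

# π(brand, price): the SELECT list keeps only the named columns
projected = conn.execute("SELECT brand, price FROM bicycles").fetchall()
print(projected)  # [('Trek', 1699), ('Giant', 1100), ('Cannondale', 999)]
```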
Available:
bikeId | brand
1 | Trek
2 | Giant
Sold:
bikeId | brand
2 | Giant
4 | Scott
After Union:
bikeId | brand
1 | Trek
2 | Giant
4 | Scott
Stock:
bikeId | brand
1 | Trek
2 | Giant
3 | Cannondale
Sold:
bikeId | brand
2 | Giant
After Difference:
bikeId | brand
1 | Trek
3 | Cannondale
Bikes (2 rows) × Stores (2 rows) = 4 rows
Bikes: Stores:
1 Trek A Berlin
2 Giant B Munich
Result:
bikeId | brand | store | city
1 | Trek | A | Berlin
1 | Trek | B | Munich
2 | Giant | A | Berlin
2 | Giant | B | Munich
Orders: Customers:
orderId | custId | amt custId | name
101 | 5 | 299 5 | Alice
102 | 7 | 450 7 | Bob
After Natural Join:
orderId | custId | amt | name
101 | 5 | 299 | Alice
102 | 7 | 450 | Bob
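The natural join above translates to an explicit join condition in SQL. A minimal sketch with the same rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orderId INTEGER, custId INTEGER, amt INTEGER);
    CREATE TABLE customers (custId INTEGER, name TEXT);
    INSERT INTO orders VALUES (101, 5, 299), (102, 7, 450);
    INSERT INTO customers VALUES (5, 'Alice'), (7, 'Bob');
""")
rows = conn.execute("""
    SELECT o.orderId, o.custId, o.amt, c.name
    FROM orders o
    JOIN customers c ON o.custId = c.custId  -- match on the shared attribute
    ORDER BY o.orderId
""").fetchall()
print(rows)  # [(101, 5, 299, 'Alice'), (102, 7, 450, 'Bob')]
```

SQLite also accepts `NATURAL JOIN`, which matches on all same-named columns automatically; spelling the condition out is the safer habit.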
Enrollments: Required:
student | course course
A | Math Math
A | Physics Physics
B | Math
After Division (students who took ALL required):
student
A
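Division has no single SQL keyword; the standard encoding is a double NOT EXISTS, which reads as "students for whom no required course is missing." A minimal sketch with the rows above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE enrollments (student TEXT, course TEXT);
    CREATE TABLE required (course TEXT);
    INSERT INTO enrollments VALUES ('A', 'Math'), ('A', 'Physics'), ('B', 'Math');
    INSERT INTO required VALUES ('Math'), ('Physics');
""")
rows = conn.execute("""
    SELECT DISTINCT e.student
    FROM enrollments e
    WHERE NOT EXISTS (                 -- there is no required course...
        SELECT 1 FROM required r
        WHERE NOT EXISTS (             -- ...that this student has not taken
            SELECT 1 FROM enrollments e2
            WHERE e2.student = e.student
              AND e2.course = r.course
        )
    )
""").fetchall()
print(rows)  # [('A',)]
```

Student B is excluded because Physics is required but missing from B's enrollments.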
Before:
bikeId | brand | price
After ρ(Inv ← Bicycles):
Inv.bikeId | Inv.brand | Inv.price
| Operator | Symbol | Description | Example |
|---|---|---|---|
| Selection | σ | Filter rows | σ(price < 1200)(Bikes) |
| Projection | π | Select columns | π(brand, price)(Bikes) |
| Union | ⋃ | Combine, no duplicates | Avail ⋃ Sold |
| Difference | − | A minus B | Stock − Sold |
| Cartesian Product | × | All pairs | Bikes × Stores |
| Natural Join | ⨝ | Match on common attrs | Orders ⨝ Customers |
| Division | ÷ | Match all values | Students ÷ Required |
| Renaming | ρ | Rename relation/attrs | ρ(Inv)(Bikes) |
σ → fewer rows
π → fewer columns
⨝ → fewer comparisons
Relations
↓
Selection
↓
Projection
↓
Join
↓
Result
Before:
π brand (σ price < 500 (Bicycles ⨝ Stores))
After:
π brand ((σ price < 500 (Bicycles)) ⨝ Stores)
Step 1: Bicycles ⨝ Stores
→ large intermediate result
Step 2: σ price < 500
→ many rows discarded late
π brand
Step 1: σ price < 500 (Bicycles)
→ small input
Step 2: filtered Bicycles ⨝ Stores
→ fewer comparisons
π brand
Before:
π brand (π bikeId, brand (Bicycles))
After:
π brand (Bicycles)
Before:
π brand (Bicycles ⨝ Stores ⨝ Suppliers)
After:
π brand (Bicycles ⨝ Stores)
(Bicycles ⨝ Stores) ⨝ Suppliers
Bicycles ⨝ (Stores ⨝ Suppliers)
Query:
σ R.name = 'West' (O ⨝ C ⨝ R)
(O ⨝ C) ⨝ R
O ⨝ C → ~100,000,000,000 rows
O ⨝ (C ⨝ (σ R='West'(R)))
R: 10 → 1
C: 100,000 → 10,000
O ⨝ C → ~1,000,000,000 rows
Before:
σ price < 500 (π brand, price (Bicycles ⨝ Stores))
After:
(π brand, price (σ price < 500 (Bicycles)))
⨝ Stores
σ Price > 10 ∧ Name='Rose' ∧ Location='Local'
(Flowers ⨝ Suppliers)
Flowers ⨝ Suppliers
→ apply filters
1,000 × 500 = 500K
(σ Local (Suppliers))
⨝
(σ Rose ∧ Price > 10 (Flowers))
100 × 50 = 5K
(σ Rose ∧ Red (Flowers))
⨝ Supply
⨝ (σ Local (Suppliers))
π Name (
σ Price > 10 (
Flowers ⨝
(σ Local (Suppliers ⨝
σ Revenue > 1000 (Companies)))
)
)
π Name (
(σ Price > 10 (Flowers))
⨝
(σ Local (Suppliers ⨝
σ Revenue > 1000 (Companies)))
)
Write Operation
↓
[ No Journal ]
↓
Data File
SQLite
↓ write
OS Buffer
↓ (later)
Disk
Query
↓
Temp Tables
↓
RAM (Memory)
Sequence (1…N)
↓
Value assignment
↓
Skew introduced
↓
Large tables
seq = [1, 2, 3, ..., 200000]
INSERT customers
FOR EACH x IN seq:
(x, 'West' or 'Other')
WITH RECURSIVE
seq(x)
↓
x starts at 1
↓
x = x + 1
↓
STOP when x reaches limit
↓
INSERT generated rows
-- Define a recursive sequence
WITH RECURSIVE seq(x) AS (
-- Base case: start at 1
SELECT 1
UNION ALL
-- Recursive step: increment
SELECT x + 1
FROM seq
-- Abort condition
WHERE x < 200000
)
-- Insert generated values
INSERT INTO customers
SELECT
x, -- unique id
CASE -- controlled skew
WHEN x % 100 = 0
THEN 'West'
ELSE 'Other'
END
FROM seq;
What is x in seq(x)?
x is the column name of the recursive table.
seq is a temporary result set.
seq has one column: x.
seq = table, x = column
WITH RECURSIVE seq(x) AS (...)
seq
----
x
----
1
2
3
...
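The recursive CTE above can be run end-to-end from Python. A minimal sketch with the limit lowered from 200,000 to 1,000 so it finishes instantly (table name and skew rule follow the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, region TEXT)")
conn.execute("""
    WITH RECURSIVE seq(x) AS (
        SELECT 1                               -- base case
        UNION ALL
        SELECT x + 1 FROM seq WHERE x < 1000   -- increment until the limit
    )
    INSERT INTO customers
    SELECT x,
           CASE WHEN x % 100 = 0 THEN 'West' ELSE 'Other' END
    FROM seq
""")
total = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
west = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE region = 'West'").fetchone()[0]
print(total, west)  # 1000 10
```

Exactly 1 in 100 rows lands in 'West', which is the controlled skew the benchmark relies on.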
Time ─────────────────────────────▶
|──────────── real (wall clock) ────────────|
|──────── user (CPU work) ───────|
| sys |
Example:
real = user + sys + waiting
(what the user feels)
Run Time:
real 0.077
user 0.075025
sys 0.000000
SELECT COUNT(*)
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE c.region = 'West';
σ region='West' (Orders ⨝ Customers)
SELECT *
FROM sales s
JOIN products p ON s.product_id = p.id
WHERE p.category = 'Bike';
σ category='Bike' (Sales ⨝ Products)
SELECT o.id
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN regions r ON c.region_id = r.id
WHERE r.name = 'West';
Orders ⨝ (Customers ⨝ σ name='West' (Regions))
SELECT *
FROM enrollments e
JOIN students s ON e.student_id = s.id
WHERE s.major = 'CS';
Enrollments ⨝ σ major='CS' (Students)
SELECT *
FROM supply sp
JOIN suppliers s ON sp.supplier_id = s.id
JOIN flowers f ON sp.flower_id = f.id
WHERE s.location = 'Local';
Supply ⨝ (σ location='Local' (Suppliers) ⨝ Flowers)
SELECT *
FROM logs l
JOIN users u ON l.user_id = u.id
WHERE u.active = 1;
Logs ⨝ σ active=1 (Users)
SELECT *
FROM orders o
JOIN items i ON o.id = i.order_id
WHERE o.date > '2025-01-01';
σ date>'2025-01-01' (Orders) ⨝ Items
SELECT *
FROM activity a
JOIN conditions c ON a.id = c.activity_id
WHERE c.type = 'once';
Activity ⨝ σ type='once' (Conditions)
SELECT *
FROM payments p
JOIN users u ON p.user_id = u.id
JOIN countries c ON u.country_id = c.id
WHERE c.code = 'US';
Payments ⨝ (Users ⨝ σ code='US' (Countries))
SELECT *
FROM reviews r
JOIN products p ON r.product_id = p.id
WHERE p.rating > 4;
Reviews ⨝ σ rating>4 (Products)
SELECT COUNT(*)
FROM (
SELECT customer_id
FROM customers
WHERE region = 'West'
) c
JOIN orders o ON o.customer_id = c.customer_id;
-- Before
σ region='West' (Orders ⨝ Customers)
-- After
Orders ⨝ (π customer_id (σ region='West' (Customers)))
SELECT s.sale_id, s.product_id, s.qty
FROM (
SELECT id
FROM products
WHERE category = 'Bike'
) p
JOIN sales s ON s.product_id = p.id;
-- Before
σ category='Bike' (Sales ⨝ Products)
-- After
Sales ⨝ (π id (σ category='Bike' (Products)))
SELECT o.id
FROM (
SELECT id
FROM regions
WHERE name = 'West'
) r
JOIN customers c ON c.region_id = r.id
JOIN orders o ON o.customer_id = c.customer_id;
-- Before
π o.id (σ name='West' (Orders ⨝ Customers ⨝ Regions))
-- After
π o.id (
Orders ⨝
(Customers ⨝ (π id (σ name='West' (Regions))))
)
SELECT e.student_id, e.course_id
FROM enrollments e
JOIN (
SELECT id
FROM students
WHERE major = 'CS'
) s ON e.student_id = s.id;
-- Before
σ major='CS' (Enrollments ⨝ Students)
-- After
Enrollments ⨝ (π id (σ major='CS' (Students)))
SELECT sp.supplier_id, sp.flower_id
FROM (
SELECT id
FROM suppliers
WHERE location = 'Local'
) s
JOIN supply sp ON sp.supplier_id = s.id
JOIN flowers f ON sp.flower_id = f.id;
-- Before
σ location='Local' (Supply ⨝ Suppliers ⨝ Flowers)
-- After
((Supply ⨝ (π id (σ location='Local' (Suppliers)))) ⨝ Flowers)
SELECT l.log_id, l.user_id, l.action
FROM logs l
JOIN (
SELECT id
FROM users
WHERE active = 1
) u ON l.user_id = u.id;
-- Before
σ active=1 (Logs ⨝ Users)
-- After
Logs ⨝ (π id (σ active=1 (Users)))
SELECT o.id, i.item_id, i.qty
FROM (
SELECT id
FROM orders
WHERE date > '2025-01-01'
) o
JOIN items i ON o.id = i.order_id;
-- Before
σ date>'2025-01-01' (Orders ⨝ Items)
-- After
((π id (σ date>'2025-01-01' (Orders))) ⨝ Items)
SELECT a.id, a.title, c.type
FROM activity a
JOIN (
SELECT activity_id, type
FROM conditions
WHERE type = 'once'
) c ON a.id = c.activity_id;
-- Before
σ type='once' (Activity ⨝ Conditions)
-- After
Activity ⨝ (π activity_id, type (σ type='once' (Conditions)))
SELECT p.payment_id, p.amount
FROM (
SELECT id
FROM countries
WHERE code = 'US'
) c
JOIN users u ON u.country_id = c.id
JOIN payments p ON p.user_id = u.id;
-- Before
π payment_id, amount (σ code='US' (Payments ⨝ Users ⨝ Countries))
-- After
π payment_id, amount (
Payments ⨝
(Users ⨝ (π id (σ code='US' (Countries))))
)
SELECT r.review_id, r.product_id, r.stars
FROM reviews r
JOIN (
SELECT id
FROM products
WHERE rating > 4
) p ON r.product_id = p.id;
-- Before
σ rating>4 (Reviews ⨝ Products)
-- After
Reviews ⨝ (π id (σ rating>4 (Products)))
                     ┌──────────────────────┐
                     │   Admin Interface    │
                     │  (CLI · GUI · Tools) │
                     └──────────┬───────────┘
                                │
                                │
┌──────────────────────┐        │        ┌──────────────────────┐
│   Application Code   │────────┼────────│   Data Tools / ETL   │
│ (Web · Mobile · API) │        │        │  (Import · Export)   │
└──────────┬───────────┘        │        └──────────┬───────────┘
           │                    │                   │
           │                    ▼                   │
           │            ┌────────────────┐          │
           └───────────▶│      DBMS      │◀─────────┘
                        │  (SQLite / PG) │
                        │  parse · exec  │
                        │ enforce rules  │
                        └────────────────┘
                                ▲
                                │
                     ┌──────────┴───────────┐
                     │    Reporting / BI    │
                     │  (Queries · Views)   │
                     └──────────────────────┘
┌──────────────────────┐
│         User         │
│  (DBA · Developer)   │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│       Terminal       │
│    (CLI · Shell)     │
│    SQL · scripts     │
└──────────┬───────────┘
           │
           │ SQL / Batch Jobs
           ▼
┌──────────────────────┐
│       Database       │
│   (Tables · Rules)   │
│  constraints · ids   │
└──────────────────────┘
┌──────────────────────┐
│         User         │
│   (Clicks · Forms)   │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│         GUI          │
│  (Desktop · Web UI)  │
│   buttons · tables   │
└──────────┬───────────┘
           │
           │ SQL / Queries
           ▼
┌──────────────────────┐
│       Database       │
│   (Tables · Rules)   │
│  constraints · ids   │
└──────────────────────┘
┌──────────────────────┐
│         User         │
│  (Browser · Client)  │
└──────────┬───────────┘
           │ Internet (HTTP / HTTPS)
           ▼
┌──────────────────────┐
│       Frontend       │
│   (Web UI · Forms)   │
│   HTML · CSS · JS    │
└──────────┬───────────┘
           │ API Calls
           ▼
┌──────────────────────┐
│       Backend        │
│    (Logic · Auth)    │
│  REST · Validation   │
└──────────┬───────────┘
           │ SQL
           ▼
┌──────────────────────┐
│       Database       │
│   (Tables · Rules)   │
│  constraints · ids   │
└──────────────────────┘
┌──────────────────────┐          ┌──────────────────────┐
│        Users         │          │    Cloud Platform    │
│  (Browser · Mobile)  │          │  (Network · Policy)  │
└──────────┬───────────┘          └──────────┬───────────┘
           │                                 │
           │ Internet (HTTPS)                │ Managed Services
           ▼                                 ▼
┌──────────────────────┐          ┌──────────────────────┐
│  Cloud App Services  │          │    Cloud Database    │
│ (Frontend + Backend) │          │   (Managed RDBMS)    │
│ APIs · Auth · Logic  │          │  backups · replicas  │
└──────────────────────┘          └──────────────────────┘
CREATE TABLE Orders (OrderID INT PRIMARY KEY, CustomerID INT FOREIGN KEY REFERENCES Customers(CustomerID)); -- A
CREATE TABLE Orders (OrderID INT, CustomerID INT, FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)); -- B
CREATE TABLE Orders (OrderID INT, CustomerID INT, FOREIGN KEY REFERENCES Customers(CustomerID)); -- C
CREATE TABLE Orders (OrderID INT PRIMARY KEY, FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)) -- D
CREATE TABLE Table (ID INT PRIMARY KEY, Name VARCHAR(50));
CREATE TABLE Orders (OrderID INT, CustomerID INT, FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID))
CREATE TABLE Orders (OrderID INT, CustomerID INT, FOREIGN KEY REFERENCES Customers(CustomerID));
CREATE TABLE Orders (OrderID INT, FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID), CustomerID INT);
CREATE TABLE Orders (OrderID INT, CustomerID INT, FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID));
                     ┌──────────────────────┐
                     │   Admin Interface    │
                     │  (CLI · GUI · Tools) │
                     └──────────┬───────────┘
                                │
                                │
┌──────────────────────┐        │        ┌──────────────────────┐
│   Application Code   │────────┼────────│   Data Tools / ETL   │
│ (Web · Mobile · API) │        │        │  (Import · Export)   │
└──────────┬───────────┘        │        └──────────┬───────────┘
           │                    │                   │
           │                    ▼                   │
           │            ┌────────────────┐          │
           └───────────▶│      DBMS      │◀─────────┘
                        │  (SQLite / PG) │
                        │  parse · exec  │
                        │ enforce rules  │
                        └────────────────┘
                                ▲
                                │
                     ┌──────────┴───────────┐
                     │    Reporting / BI    │
                     │  (Queries · Views)   │
                     └──────────────────────┘
Time
↓
GUI tools
↓
CLI automation
↓
APIs for apps
↓
Web dashboards
| Interface | Role | Purpose | Technology |
|---|---|---|---|
| Admin Interface (GUI) | Database Administrators | User management, schema modifications, monitoring | SQLite Browser |
| Command-Line Interface (CLI) | Developers and DBAs | Direct SQL execution, automation, scripting | SQLite CLI |
| Application Programming Interface (API) | Developers | Programmatic database interaction, integrating apps | Neon Tech |
| Web-Based UI | Regular Users | Accessing, viewing, and interacting with data in a user-friendly way | - |
App
↓ (driver)
Database
↓
SQL results
Driver examples:
psycopg2 (Python)
pg (Node)
JDBC (Java)
postgresql://user:password@host:port/database?sslmode=require
Risk increases when DB is exposed
Prefer:
App → Backend → DB
over:
App → DB
import psycopg2

# Connect with SSL required; credentials shown inline for brevity only
conn = psycopg2.connect("postgresql://user:pass@host/db?sslmode=require")
cur = conn.cursor()
cur.execute("SELECT version();")
print(cur.fetchone())
cur.close()
conn.close()
Keep credentials out of code
Least privilege roles
Rotate secrets
User interface
Forms · Buttons · Lists
Application logic
Validation · APIs
Data storage
Tables · Constraints
git clone https://github.com/SE4CPS/2026-COMP-163
cd 2026-COMP-163
| Interface | Role | Purpose | Technology |
|---|---|---|---|
| Admin Interface (GUI) | Database Administrators | User management, schema modifications, monitoring | SQLite Browser |
| Command-Line Interface (CLI) | Developers and DBAs | Direct SQL execution, automation, scripting | SQLite CLI |
| Application Programming Interface (API) | Developers | Programmatic database interaction, integrating apps | Neon Tech |
| Web-Based UI | Regular Users | Accessing, viewing, and interacting with data in a user-friendly way | - |
Admin → database ownership tasks
Dev → build apps on top
User → consume data safely
CLI · GUI · Tools
backups · users · schema · monitoring
                     ┌──────────────────────┐
                     │   Admin Interface    │
                     │  (CLI · GUI · Tools) │
                     └──────────┬───────────┘
                                │
                                │
┌──────────────────────┐        │        ┌──────────────────────┐
│   Application Code   │────────┼────────│   Data Tools / ETL   │
│ (Web · Mobile · API) │        │        │  (Import · Export)   │
└──────────┬───────────┘        │        └──────────┬───────────┘
           │                    │                   │
           │                    ▼                   │
           │            ┌────────────────┐          │
           └───────────▶│      DBMS      │◀─────────┘
                        │  (SQLite / PG) │
                        │  parse · exec  │
                        │ enforce rules  │
                        └────────────────┘
                                ▲
                                │
                     ┌──────────┴───────────┐
                     │    Reporting / BI    │
                     │  (Queries · Views)   │
                     └──────────────────────┘
Traditional rule: treat admin actions like production change control.
Admin Interface = "How we control the DB"
Not:
- how users browse data
- how apps call SQL
Yes:
- how owners keep it stable
| Area | Typical Tasks | Why it matters | Common Tools |
|---|---|---|---|
| Identity | Users, roles, grants, password rotation | Least privilege | psql, pgAdmin, cloud console |
| Reliability | Backups, restore tests, replication checks | Recovery time | pg_dump, snapshots |
| Schema | Migrations, constraints, indexes, data fixes | Correctness | migration tools, SQL |
| Observability | Slow queries, locks, storage growth, alerts | Predictability | metrics, logs, dashboards |
psql
pg_dump / pg_restore
SQL scripts
cron / CI jobs
pgAdmin
DBeaver
SQLite Browser
Cloud consoles
┌──────────────┐      grants       ┌──────────────┐
│    Admin     │──────────────────▶│     Role     │
│   (owner)    │                   │   (policy)   │
└──────┬───────┘                   └──────┬───────┘
       │                                  │
       │ assigns                          │ used by
       ▼                                  ▼
┌──────────────┐                   ┌──────────────┐
│     User     │                   │     App      │
│   (human)    │                   │  (service)   │
└──────┬───────┘                   └──────┬───────┘
       │                                  │
       └────────────── SQL ───────────────┘
                       ▼
                ┌──────────────┐
                │     DBMS     │
                │ checks rules │
                └──────────────┘
Traditional rule: practice restore like a fire drill.
┌──────────────┐     pg_dump      ┌──────────────┐
│   Primary    │─────────────────▶│    Backup    │
│     DBMS     │                  │    Store     │
└──────┬───────┘                  └──────┬───────┘
       │                                 │
       │ restore test                    │ retention
       ▼                                 ▼
┌────────────────┐   pg_restore   ┌──────────────┐
│   Staging DB   │◀───────────────│   Policies   │
│ (verification) │                │ (30d / 90d)  │
└───────┬────────┘                └──────────────┘
        │
        ▼
  confidence report
| Step | Action | Admin goal |
|---|---|---|
| 1 | Plan change | Know impact |
| 2 | Run in staging | Catch issues early |
| 3 | Deploy during window | Minimize disruption |
| 4 | Validate with queries | Confirm correctness |
| 5 | Record + audit | Traceability |
Signals → Actions
Slow queries → optimize SQL / add index
Locks → fix transactions
Disk growth → retention / cleanup
Auth failures → rotate secrets / investigate
Backup failures → stop and repair immediately
┌──────────────────────┐
│        Users         │
└──────────┬───────────┘
           │ HTTPS
           ▼
┌──────────────────────┐
│       Frontend       │
└──────────┬───────────┘
           │ REST
           ▼
┌──────────────────────┐
│       Backend        │
└──────────┬───────────┘
           │ SQL
           ▼
┌──────────────────────┐
│         DBMS         │
└──────────┬───────────┘
           ▲
           │ privileged (admin)
┌──────────┴───────────┐
│   Admin Interface    │
│  (CLI · GUI · Tools) │
└──────────────────────┘
| Task | Admin? | Why |
|---|---|---|
| Creating a backup | Yes | Reliability ownership |
| Managing user permissions | Yes | Policy and access control |
| Writing a new API endpoint | No | Application layer work |
| Building a dashboard view | No | User or BI interface work |
Admin work:
- keeps system stable
- changes ownership settings
- restores service when broken
If it changes policy, backups, or schema:
it is admin.
Download DB Browser for SQLite
┌──────────────┐   open .db    ┌──────────────────────┐
│   Student    │──────────────▶│   DB Browser (GUI)   │
│ (local file) │               │ tables · schema · SQL│
└──────────────┘               └──────────┬───────────┘
                                          │
                                          ▼
                               ┌──────────────┐
                               │  SQLite DB   │
                               │ tables · rows│
                               └──────────────┘
Is this an Admin Task, Developer Task, or User Task?
Is this an Admin Task, Developer Task, or User Task?
Is this an Admin Task, Developer Task, or User Task?
Is this an Admin Task, Developer Task, or User Task?
Is this an Admin Task, Developer Task, or User Task?
Traditional flow: learn SQLite first, then move to PostgreSQL.
Jump to PostgreSQL section
Backend
↓ (SSL)
PostgreSQL
Frontend
↓ HTTP
Backend (REST)
↓ SQL
Database
@app.route('/flowers')
def manage_flowers():
conn = get_db_connection()
cur = conn.cursor()
cur.execute("SELECT * FROM flowers")
flowers = cur.fetchall()
cur.close()
conn.close()
return render_template('flowers.html', flowers=flowers)
Is this a Backend Database Developer Task?
Is this a Backend Database Developer Task?
Is this a Backend Database Developer Task?
Is fetching data from an API a frontend or backend task?
Who is responsible for validating user input before storing it in the database?
Does the frontend directly modify the database?
We want to keep track of the flower inventory.
Start on paper before using any tool.
@startuml
entity FlowerEntity {
* id : int
--
attribute_one : text
attribute_two : text
attribute_three : number
}
@enduml
Constraints protect the data, not the code.
Decide at the database level first.
# first time only
git clone https://github.com/SE4CPS/2026-COMP-163.git
# every class / work session
cd 2026-COMP-163
git pull
Which component is responsible for ensuring data validation and access control?
Which component plays a key role in optimizing system performance and preventing security breaches?
Which component is essential for managing user requests, processing data, and ensuring secure communication?
Which component typically handles REST API requests and responses?
Which component is primarily responsible for handling database connection strings?
CREATE TABLE Book (
book_id SERIAL PRIMARY KEY,
uuid UUID DEFAULT gen_random_uuid(),
electronic_article_number VARCHAR(50) UNIQUE NOT NULL,
isbn VARCHAR(512) UNIQUE NOT NULL,
title VARCHAR(200) NOT NULL,
author VARCHAR(100) NOT NULL,
price DECIMAL(10,2) NOT NULL,
stock INT NOT NULL
);
A UUID (Universally Unique Identifier) is a 128-bit identifier designed to be unique across systems without central coordination
Example: 550e8400-e29b-41d4-a716-446655440000
-- Supported in PostgreSQL, MySQL (8.0+), MariaDB, and others
CREATE TABLE Users (
user_id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
name VARCHAR(100) NOT NULL
);
import uuid
# Generate a new UUID
user_id = uuid.uuid4()
print("Generated UUID:", user_id)
import { v4 as uuidv4 } from 'uuid';
// Generate a new UUID
const userId = uuidv4();
console.log("Generated UUID:", userId);
// Generate a UUID using the browser's crypto API
function generateUUID() {
return crypto.randomUUID();
}
console.log("Generated UUID:", generateUUID());
SELECT * FROM flowers;
SELECT * FROM flowers ORDER BY price DESC LIMIT 1;
SELECT * FROM customers;
SELECT orders.order_id, customers.name, orders.order_date, orders.total
FROM orders
JOIN customers ON orders.customer_id = customers.customer_id;
SELECT DISTINCT customers.name, customers.email
FROM customers
JOIN orders ON customers.customer_id = orders.customer_id;
SELECT COUNT(*) FROM orders;
SELECT SUM(total) FROM orders;
SELECT * FROM order_details WHERE order_id = 1;
SELECT * FROM flowers WHERE stock > 0;
SELECT * FROM orders WHERE order_date > '2024-01-01';
SELECT * FROM flowers WHERE price > 10.00;
SELECT * FROM customers WHERE phone IS NOT NULL;
SELECT * FROM orders WHERE total > 50.00;
SELECT * FROM customers WHERE email LIKE '%@gmail.com';
SELECT * FROM orders WHERE DATE(order_date) = '2024-02-01';
SELECT * FROM flowers WHERE color = 'Red';
SELECT * FROM flowers WHERE LOWER(color) = 'red';
SELECT * FROM customers WHERE address IS NULL;
aggregates · grouping · ordering · limiting · analysis patterns
SELECT
columns, aggregates
FROM
table
WHERE
row filters
GROUP BY
groups
HAVING
group filters
ORDER BY
sort
LIMIT / OFFSET
page
Traditional habit:
Read the query in order:
FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT
-- For Module 5 practice (single table focus)
-- flowers(flower_id, name, color, price, stock, category, supplier)
-- Example categories: Rose, Tulip, Lily
-- Example suppliers: BloomCo, PetalWorks, GardenDirect
ORDER BY price DESC, name ASC
First:
highest price
Then:
alphabetical name
| flower_id | name | color | price | stock | category | supplier |
|---|---|---|---|---|---|---|
| 1 | Rose | Red | 12.50 | 30 | Rose | BloomCo |
| 2 | Tulip | Yellow | 6.00 | 80 | Tulip | GardenDirect |
| 3 | Lily | White | 9.25 | 15 | Lily | PetalWorks |
| 4 | Orchid | Purple | 19.99 | 6 | Orchid | BloomCo |
| 5 | Daisy | White | 3.50 | 120 | Daisy | GardenDirect |
SELECT flower_id, name, price
FROM flowers
ORDER BY price DESC, name ASC;
In the logical order of a query, ORDER BY is applied:
ORDER BY + LIMIT
= Top N pattern
LIMIT alone
= arbitrary subset (no guaranteed order)
| flower_id | name | price | stock | category |
|---|---|---|---|---|
| 1 | Rose | 12.50 | 30 | Rose |
| 2 | Tulip | 6.00 | 80 | Tulip |
| 3 | Lily | 9.25 | 15 | Lily |
| 4 | Orchid | 19.99 | 6 | Orchid |
| 5 | Daisy | 3.50 | 120 | Daisy |
SELECT name, price
FROM flowers
ORDER BY price DESC
LIMIT 3;
Pick the best explanation:
COUNT(*)
= total rows
COUNT(stock)
= rows where stock is NOT NULL
| flower_id | name | stock | supplier |
|---|---|---|---|
| 1 | Rose | 30 | BloomCo |
| 2 | Tulip | 80 | GardenDirect |
| 3 | Lily | 15 | PetalWorks |
| 4 | Orchid | 6 | BloomCo |
| 5 | Daisy | 120 | GardenDirect |
SELECT COUNT(*) AS flower_count
FROM flowers;
Choose the correct statement:
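A quick sqlite3 check of the COUNT(*) vs COUNT(column) difference, using the sample rows above but with Orchid's stock set to NULL:

```python
import sqlite3

# COUNT(*) counts rows; COUNT(stock) skips NULLs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flowers (name TEXT, stock INT)")
conn.executemany("INSERT INTO flowers VALUES (?, ?)",
                 [("Rose", 30), ("Tulip", 80), ("Lily", 15),
                  ("Orchid", None), ("Daisy", 120)])
rows, non_null = conn.execute(
    "SELECT COUNT(*), COUNT(stock) FROM flowers").fetchone()
print(rows, non_null)  # 5 4
```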
MIN(price) = cheapest flower
MIN(stock) = lowest stock level
| name | price | stock |
|---|---|---|
| Rose | 12.50 | 30 |
| Tulip | 6.00 | 80 |
| Lily | 9.25 | 15 |
| Orchid | 19.99 | 6 |
| Daisy | 3.50 | 120 |
SELECT MIN(price) AS cheapest_price
FROM flowers;
Pick the best answer:
MAX(price) = highest price value
ORDER BY price DESC LIMIT 1
= the row with highest price
| name | price | stock |
|---|---|---|
| Rose | 12.50 | 30 |
| Tulip | 6.00 | 80 |
| Lily | 9.25 | 15 |
| Orchid | 19.99 | 6 |
| Daisy | 3.50 | 120 |
SELECT MAX(price) AS highest_price
FROM flowers;
You want the full flower record with the highest price.
SUM(stock) = total units on hand
SUM(price * stock) = inventory value estimate
| name | price | stock |
|---|---|---|
| Rose | 12.50 | 30 |
| Tulip | 6.00 | 80 |
| Lily | 9.25 | 15 |
| Orchid | 19.99 | 6 |
| Daisy | 3.50 | 120 |
SELECT SUM(stock) AS total_units
FROM flowers;
Pick the most fitting use case:
AVG(price) = typical price
In PostgreSQL:
ROUND(AVG(price), 2)
| name | price | category |
|---|---|---|
| Rose | 12.50 | Rose |
| Tulip | 6.00 | Tulip |
| Lily | 9.25 | Lily |
| Orchid | 19.99 | Orchid |
| Daisy | 3.50 | Daisy |
SELECT ROUND(AVG(price), 2) AS avg_price
FROM flowers;
What does AVG compute?
GROUP BY category
Each category becomes one output row:
- COUNT(*)
- AVG(price)
- SUM(stock)
| name | category | price | stock |
|---|---|---|---|
| Rose | Rose | 12.50 | 30 |
| Rose (Mini) | Rose | 8.75 | 25 |
| Tulip | Tulip | 6.00 | 80 |
| Lily | Lily | 9.25 | 15 |
| Orchid | Orchid | 19.99 | 6 |
SELECT
category,
COUNT(*) AS items,
ROUND(AVG(price), 2) AS avg_price,
SUM(stock) AS total_stock
FROM flowers
GROUP BY category
ORDER BY items DESC;
When you GROUP BY category, each result row represents:
WHERE: individual rows
HAVING: grouped results
If you need COUNT(*) in the filter:
use HAVING
| name | category | price |
|---|---|---|
| Rose | Rose | 12.50 |
| Rose (Mini) | Rose | 8.75 |
| Rose (Garden) | Rose | 10.00 |
| Tulip | Tulip | 6.00 |
| Orchid | Orchid | 19.99 |
SELECT
category,
COUNT(*) AS items
FROM flowers
GROUP BY category
HAVING COUNT(*) >= 2
ORDER BY items DESC;
You want categories where COUNT(*) is at least 2. Which clause?
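A runnable version of the HAVING example (Python sqlite3, sample rows from the slide): only categories with at least two items survive the group filter.

```python
import sqlite3

# HAVING filters grouped results, after aggregation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flowers (name TEXT, category TEXT, price REAL)")
conn.executemany("INSERT INTO flowers VALUES (?, ?, ?)",
                 [("Rose", "Rose", 12.50), ("Rose (Mini)", "Rose", 8.75),
                  ("Rose (Garden)", "Rose", 10.00),
                  ("Tulip", "Tulip", 6.00), ("Orchid", "Orchid", 19.99)])
groups = conn.execute("""
SELECT category, COUNT(*) AS items
FROM flowers
GROUP BY category
HAVING COUNT(*) >= 2
ORDER BY items DESC
""").fetchall()
print(groups)  # [('Rose', 3)]
```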
DISTINCT applies to SELECT output
COUNT(DISTINCT x)
= number of unique values of x
| name | supplier | category |
|---|---|---|
| Rose | BloomCo | Rose |
| Orchid | BloomCo | Orchid |
| Tulip | GardenDirect | Tulip |
| Daisy | GardenDirect | Daisy |
| Lily | PetalWorks | Lily |
SELECT COUNT(DISTINCT supplier) AS supplier_count
FROM flowers;
What is being counted?
ORDER BY category ASC, price DESC
First group by category (visually),
then show expensive items at top
| name | category | price |
|---|---|---|
| Rose | Rose | 12.50 |
| Rose (Garden) | Rose | 12.50 |
| Tulip | Tulip | 6.00 |
| Tulip (Premium) | Tulip | 6.00 |
| Orchid | Orchid | 19.99 |
SELECT name, category, price
FROM flowers
ORDER BY category ASC, price DESC;
Choose the best reason:
Example goal:
Average price of items that are in stock
WHERE stock > 0
then AVG(price)
| name | price | stock | category |
|---|---|---|---|
| Rose | 12.50 | 30 | Rose |
| Tulip | 6.00 | 0 | Tulip |
| Lily | 9.25 | 15 | Lily |
| Orchid | 19.99 | 0 | Orchid |
| Daisy | 3.50 | 120 | Daisy |
SELECT ROUND(AVG(price), 2) AS avg_in_stock_price
FROM flowers
WHERE stock > 0;
WHERE changes the aggregate results because it:
Pattern:
GROUP BY supplier
HAVING SUM(stock) < threshold
Find weak suppliers
| name | supplier | stock |
|---|---|---|
| Rose | BloomCo | 30 |
| Orchid | BloomCo | 6 |
| Tulip | GardenDirect | 80 |
| Daisy | GardenDirect | 120 |
| Lily | PetalWorks | 15 |
SELECT
supplier,
SUM(stock) AS total_stock
FROM flowers
GROUP BY supplier
HAVING SUM(stock) < 25
ORDER BY total_stock ASC;
When does HAVING filter happen?
Example:
Top 2 categories by total stock
GROUP BY category
ORDER BY SUM(stock) DESC
LIMIT 2
| category | name | stock |
|---|---|---|
| Rose | Rose | 30 |
| Rose | Rose (Mini) | 25 |
| Tulip | Tulip | 80 |
| Daisy | Daisy | 120 |
| Lily | Lily | 15 |
SELECT
category,
SUM(stock) AS total_stock
FROM flowers
GROUP BY category
ORDER BY total_stock DESC
LIMIT 2;
Pick the most correct query pattern:
COALESCE(stock, 0)
If stock is NULL:
treat it as 0 for reporting
| name | stock | supplier |
|---|---|---|
| Rose | 30 | BloomCo |
| Tulip | 80 | GardenDirect |
| Lily | 15 | PetalWorks |
| Orchid | NULL | BloomCo |
| Daisy | 120 | GardenDirect |
SELECT SUM(COALESCE(stock, 0)) AS total_units
FROM flowers;
Choose the correct interpretation:
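Runnable check (Python sqlite3, with Orchid's stock NULL as in the sample table). Note that SUM already skips NULLs; COALESCE makes the "treat NULL as 0" intent explicit:

```python
import sqlite3

# Total units with NULL stock treated as 0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flowers (name TEXT, stock INT)")
conn.executemany("INSERT INTO flowers VALUES (?, ?)",
                 [("Rose", 30), ("Tulip", 80), ("Lily", 15),
                  ("Orchid", None), ("Daisy", 120)])
total = conn.execute(
    "SELECT SUM(COALESCE(stock, 0)) FROM flowers").fetchone()[0]
print(total)  # 245
```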
CASE
WHEN price >= 10 THEN 'Premium'
ELSE 'Standard'
END
| name | price |
|---|---|
| Rose | 12.50 |
| Tulip | 6.00 |
| Lily | 9.25 |
| Orchid | 19.99 |
| Daisy | 3.50 |
SELECT
CASE
WHEN price >= 10 THEN 'Premium'
ELSE 'Standard'
END AS tier,
COUNT(*) AS items
FROM flowers
GROUP BY tier
ORDER BY items DESC;
Pick the best description:
Page size = 5
Page 1: LIMIT 5 OFFSET 0
Page 2: LIMIT 5 OFFSET 5
| name | price |
|---|---|
| Orchid | 19.99 |
| Rose | 12.50 |
| Lily | 9.25 |
| Tulip | 6.00 |
| Daisy | 3.50 |
SELECT name, price
FROM flowers
ORDER BY price DESC
LIMIT 2 OFFSET 2;
Choose the correct use:
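A pagination check in Python sqlite3 with the sample prices: with a page size of 2, LIMIT 2 OFFSET 2 returns page 2 of the price-sorted listing.

```python
import sqlite3

# Page 2 (rows 3-4) of a DESC price sort.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flowers (name TEXT, price REAL)")
conn.executemany("INSERT INTO flowers VALUES (?, ?)",
                 [("Orchid", 19.99), ("Rose", 12.50), ("Lily", 9.25),
                  ("Tulip", 6.00), ("Daisy", 3.50)])
page2 = conn.execute("""
SELECT name, price
FROM flowers
ORDER BY price DESC
LIMIT 2 OFFSET 2
""").fetchall()
print(page2)  # [('Lily', 9.25), ('Tulip', 6.0)]
```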
Example UUID:
550e8400-e29b-41d4-a716-446655440000
Same format everywhere
No central coordination needed
SERIAL:
local, incremental
predictable
UUID:
global, random
safer for APIs
-- UUID column type
id UUID
-- Valid in PRIMARY KEY
PRIMARY KEY (id)
-- Enable extension once
CREATE EXTENSION IF NOT EXISTS "pgcrypto";
-- Generate random UUID
SELECT gen_random_uuid();
CREATE TABLE orders (
order_id UUID
DEFAULT gen_random_uuid(),
customer_name TEXT NOT NULL,
created_at TIMESTAMP NOT NULL
DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (order_id)
);
-- Insert without UUID
INSERT INTO orders (customer_name)
VALUES ('Alice');
SELECT category, COUNT(*) AS items
FROM flowers
GROUP BY category
HAVING COUNT(*) >= 3
ORDER BY items DESC
LIMIT 5;
You want to filter rows where stock > 0 before computing SUM(stock).
SELECT COUNT(DISTINCT category)
FROM flowers;
SELECT category, price, COUNT(*)
FROM flowers
GROUP BY category;
What is wrong?
Goal: return the row for the cheapest flower with stock > 0
SELECT
COUNT(*),
MIN(price),
MAX(price),
ROUND(AVG(price),2)
FROM flowers;
| name | price |
|---|---|
| Rose | 12.50 |
| Tulip | 6.00 |
| Lily | 9.25 |
| Orchid | 19.99 |
| Daisy | 3.50 |
SELECT
COUNT(*) AS items,
MIN(price) AS min_price,
MAX(price) AS max_price,
ROUND(AVG(price), 2) AS avg_price
FROM flowers;
SELECT COUNT(*), MIN(price), MAX(price)
FROM flowers;
Find categories with avg price >= 10
GROUP BY category
HAVING AVG(price) >= 10
| name | category | price |
|---|---|---|
| Rose | Rose | 12.50 |
| Rose (Mini) | Rose | 8.75 |
| Orchid | Orchid | 19.99 |
| Tulip | Tulip | 6.00 |
| Lily | Lily | 9.25 |
SELECT
category,
ROUND(AVG(price), 2) AS avg_price
FROM flowers
GROUP BY category
HAVING AVG(price) >= 10
ORDER BY avg_price DESC;
Which level is filtered?
GROUP BY category, supplier
One output row per (category, supplier) pair
| name | category | supplier | stock |
|---|---|---|---|
| Rose | Rose | BloomCo | 30 |
| Rose (Mini) | Rose | GardenDirect | 25 |
| Tulip | Tulip | GardenDirect | 80 |
| Lily | Lily | PetalWorks | 15 |
| Orchid | Orchid | BloomCo | 6 |
SELECT
category,
supplier,
SUM(stock) AS total_stock
FROM flowers
GROUP BY category, supplier
ORDER BY category ASC, total_stock DESC;
Each result row represents:
Find categories where total stock < 40
GROUP BY category
HAVING SUM(stock) < 40
| category | name | stock |
|---|---|---|
| Rose | Rose | 30 |
| Rose | Rose (Mini) | 25 |
| Orchid | Orchid | 6 |
| Lily | Lily | 15 |
| Tulip | Tulip | 80 |
SELECT
category,
SUM(stock) AS total_stock
FROM flowers
GROUP BY category
HAVING SUM(stock) < 40
ORDER BY total_stock ASC;
You already grouped and computed SUM(stock). Now you filter by that sum using:
ORDER BY AVG(price) DESC
Sort groups by their computed average
| category | prices |
|---|---|
| Rose | 12.50, 8.75, 10.00 |
| Orchid | 19.99 |
| Tulip | 6.00 |
| Lily | 9.25 |
SELECT
category,
ROUND(AVG(price), 2) AS avg_price
FROM flowers
GROUP BY category
ORDER BY AVG(price) DESC;
This ordering is based on:
Pick the clause that defines the buckets:
Pick the clause that filters aggregated results:
Select the correct query:
COUNT(*) produces:
Choose the best answer:
Practice prompts (write the SQL):
-- PostgreSQL: one table for advanced query practice
CREATE TABLE book_inventory (
book_id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
author TEXT NOT NULL,
genre TEXT NOT NULL,
publisher TEXT NOT NULL,
price NUMERIC(10,2) NOT NULL,
stock INT NOT NULL,
published_year INT NOT NULL
);
-- Seed data idea (students can add more rows)
-- genre: 'Databases', 'Software Engineering', 'AI', 'Security'
-- publisher: 'Prentice Hall', 'O''Reilly', 'Pearson', 'MIT Press'
-- Typical questions:
-- 1) Show each book with publisher name
-- 2) Show each book with author names
-- 3) Find missing relationships (orphans)
-- That is why JOIN exists.
ERD relationship
↓ (implement)
PK/FK columns
↓ (query)
JOIN ON keys
-- PostgreSQL example question:
-- Show each book with its author name
-- Need:
-- books + authors + relationship (book_authors)
CREATE TABLE authors (
author_id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
country TEXT
);
CREATE TABLE publishers (
publisher_id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE books (
book_id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
publisher_id INT REFERENCES publishers(publisher_id),
published_year INT NOT NULL,
price NUMERIC(10,2) NOT NULL
);
CREATE TABLE book_authors (
book_id INT REFERENCES books(book_id),
author_id INT REFERENCES authors(author_id),
PRIMARY KEY (book_id, author_id)
);
| author_id | name | country |
|---|---|---|
| 1 | Ada Lovelace | UK |
| 2 | Alan Turing | UK |
| 3 | Grace Hopper | USA |
| 4 | Edsger Dijkstra | Netherlands |
| book_id | title | publisher_id | published_year | price |
|---|---|---|---|---|
| 10 | Foundations of Computing | 100 | 2020 | 59.00 |
| 11 | Practical SQL Systems | 101 | 2019 | 49.50 |
| 12 | Algorithms the Classic Way | 100 | 2018 | 72.25 |
| 13 | Systems Thinking | NULL | 2021 | 39.99 |
| publisher_id | name |
|---|---|
| 100 | Classic Press |
| 101 | Data House |
| 102 | Future Books |
| book_id | author_id |
|---|---|
| 10 | 1 |
| 10 | 2 |
| 11 | 3 |
| 12 | 4 |
INNER JOIN = intersection
A ∩ B
Only matching keys from both tables
SELECT
b.book_id,
b.title,
p.name AS publisher
FROM books b
INNER JOIN publishers p
ON b.publisher_id = p.publisher_id
ORDER BY b.book_id;
Pick the best statement:
LEFT JOIN = keep left
All A
Plus matching B
Else NULL for B
SELECT
b.book_id,
b.title,
p.name AS publisher
FROM books b
LEFT JOIN publishers p
ON b.publisher_id = p.publisher_id
ORDER BY b.book_id;
SELECT
b.book_id,
b.title,
p.name AS publisher
FROM books b
LEFT JOIN publishers p
ON b.publisher_id = p.publisher_id
WHERE p.name = 'Classic Press'
ORDER BY b.book_id;
WHERE runs after the join. Rows where p.name is NULL are removed.
| book_id | title | publisher |
|---|---|---|
| 10 | Foundations of Computing | Classic Press |
| 12 | Algorithms the Classic Way | Classic Press |
⚠ Books without a publisher are gone.
LEFT JOIN behaves like INNER JOIN.
SELECT
b.book_id,
b.title,
p.name AS publisher
FROM books b
LEFT JOIN publishers p
ON b.publisher_id = p.publisher_id
AND p.name = 'Classic Press'
ORDER BY b.book_id;
Filter is applied during the join. Left rows are preserved.
| book_id | title | publisher |
|---|---|---|
| 10 | Foundations of Computing | Classic Press |
| 11 | Practical SQL Systems | NULL |
| 12 | Algorithms the Classic Way | Classic Press |
| 13 | Systems Thinking | NULL |
✔ All books remain.
✔ Publisher attached only when it matches.
LEFT JOIN then WHERE (right_col = ...)
can behave like INNER JOIN
Safer:
LEFT JOIN ... ON ... AND (right_col = ...)
SELECT
b.book_id,
b.title,
p.name AS publisher
FROM books b
LEFT JOIN publishers p
ON b.publisher_id = p.publisher_id
AND p.name = 'Classic Press'
ORDER BY b.book_id;
What can happen?
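The difference is easy to demonstrate with the books/publishers sample data: the WHERE form returns 2 rows (the LEFT JOIN collapses to an INNER JOIN), while the ON form keeps all 4 books.

```python
import sqlite3

# Filtering a LEFT JOIN's right table: WHERE vs ON.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publishers (publisher_id INT PRIMARY KEY, name TEXT);
CREATE TABLE books (book_id INT PRIMARY KEY, title TEXT, publisher_id INT);
INSERT INTO publishers VALUES
  (100, 'Classic Press'), (101, 'Data House'), (102, 'Future Books');
INSERT INTO books VALUES
  (10, 'Foundations of Computing', 100),
  (11, 'Practical SQL Systems', 101),
  (12, 'Algorithms the Classic Way', 100),
  (13, 'Systems Thinking', NULL);
""")

# WHERE runs after the join: NULL-extended rows are dropped.
where_filter = conn.execute("""
SELECT b.book_id FROM books b
LEFT JOIN publishers p ON b.publisher_id = p.publisher_id
WHERE p.name = 'Classic Press'
""").fetchall()

# Filter inside ON: all left rows are preserved.
on_filter = conn.execute("""
SELECT b.book_id FROM books b
LEFT JOIN publishers p ON b.publisher_id = p.publisher_id
  AND p.name = 'Classic Press'
""").fetchall()

print(len(where_filter), len(on_filter))  # 2 4
```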
RIGHT JOIN = keep right
Usually rewrite as LEFT JOIN
by swapping sides
-- RIGHT JOIN form
SELECT p.name, b.title
FROM books b
RIGHT JOIN publishers p
ON b.publisher_id = p.publisher_id;
-- Preferred LEFT JOIN form
SELECT p.name, b.title
FROM publishers p
LEFT JOIN books b
ON b.publisher_id = p.publisher_id;
FULL OUTER JOIN = union of keys
All A and all B
Match when possible
Else NULLs
SELECT
b.title,
p.name AS publisher
FROM books b
FULL OUTER JOIN publishers p
ON b.publisher_id = p.publisher_id
ORDER BY p.name, b.title;
| book_id | title | publisher_id |
|---|---|---|
| 10 | Foundations of Computing | 100 |
| 11 | Practical SQL Systems | 101 |
| 12 | Algorithms the Classic Way | 100 |
| 13 | Systems Thinking | NULL |
Book 13 has no publisher_id.
| publisher_id | name |
|---|---|
| 100 | Classic Press |
| 101 | Data House |
| 102 | Future Books |
Publisher 102 has no books.
SELECT
b.title,
p.name AS publisher
FROM books b
FULL OUTER JOIN publishers p
ON b.publisher_id = p.publisher_id
ORDER BY p.name, b.title;
| title | publisher |
|---|---|
| Systems Thinking | NULL |
| Algorithms the Classic Way | Classic Press |
| Foundations of Computing | Classic Press |
| Practical SQL Systems | Data House |
| NULL | Future Books |
Both “missing sides” appear: book without publisher → publisher is NULL, publisher without books → title is NULL.
CROSS JOIN = all combinations
Rows = |A| × |B|
| author_id | name |
|---|---|
| 1 | Ada Lovelace |
| 2 | Alan Turing |
2 rows
| publisher_id | name |
|---|---|
| 100 | Classic Press |
| 101 | Data House |
| 102 | Future Books |
3 rows
SELECT
a.name AS author,
p.name AS publisher
FROM authors a
CROSS JOIN publishers p
ORDER BY a.name, p.name;
| author | publisher |
|---|---|
| Ada Lovelace | Classic Press |
| Ada Lovelace | Data House |
| Ada Lovelace | Future Books |
| Alan Turing | Classic Press |
| Alan Turing | Data House |
| Alan Turing | Future Books |
Cartesian product: rows = |A| × |B| = 2 × 3 = 6.
SELECT
a.name AS author,
p.name AS publisher
FROM authors a
CROSS JOIN publishers p
ORDER BY a.name, p.name;
authors ← book_authors → books
Two joins:
authors to bridge
bridge to books
SELECT
b.title,
a.name AS author
FROM books b
JOIN book_authors ba
ON ba.book_id = b.book_id
JOIN authors a
ON a.author_id = ba.author_id
ORDER BY b.title, a.name;
SELECT
b.title,
COUNT(*) AS author_count
FROM books b
JOIN book_authors ba
ON ba.book_id = b.book_id
GROUP BY b.title
ORDER BY author_count DESC, b.title;
| book_id | title |
|---|---|
| 10 | Foundations of Computing |
| 11 | Practical SQL Systems |
| 12 | Algorithms the Classic Way |
| 13 | Systems Thinking |
One row per book
| book_id | author_id |
|---|---|
| 10 | 1 |
| 10 | 2 |
| 11 | 3 |
| 12 | 4 |
Many rows per book possible
SELECT
b.title,
COUNT(*) AS author_count
FROM books b
JOIN book_authors ba
ON ba.book_id = b.book_id
GROUP BY b.title
ORDER BY author_count DESC, b.title;
Joined rows (one per book-author match)
| title | author_id |
|---|---|
| Foundations of Computing | 1 |
| Foundations of Computing | 2 |
| Practical SQL Systems | 3 |
| Algorithms the Classic Way | 4 |
Grouped result (one row per book)
| title | author_count |
|---|---|
| Foundations of Computing | 2 |
| Algorithms the Classic Way | 1 |
| Practical SQL Systems | 1 |
After joining books to book_authors, a book row may appear multiple times, because the bridge table holds one row per (book, author) pair.
SELECT
b.book_id,
b.title
FROM books b
LEFT JOIN book_authors ba
ON ba.book_id = b.book_id
WHERE ba.book_id IS NULL
ORDER BY b.book_id;
| book_id | title |
|---|---|
| 10 | Foundations of Computing |
| 11 | Practical SQL Systems |
| 12 | Algorithms the Classic Way |
| 13 | Systems Thinking |
All books (starting point)
| book_id | author_id |
|---|---|
| 10 | 1 |
| 10 | 2 |
| 11 | 3 |
| 12 | 4 |
Bridge rows (missing book_id = 13)
SELECT
b.book_id,
b.title
FROM books b
LEFT JOIN book_authors ba
ON ba.book_id = b.book_id
WHERE ba.book_id IS NULL
ORDER BY b.book_id;
After LEFT JOIN (unmatched become NULL)
| book_id | title | ba.book_id |
|---|---|---|
| 10 | Foundations of Computing | 10 |
| 10 | Foundations of Computing | 10 |
| 11 | Practical SQL Systems | 11 |
| 12 | Algorithms the Classic Way | 12 |
| 13 | Systems Thinking | NULL |
Filtered result (books with no authors)
| book_id | title |
|---|---|
| 13 | Systems Thinking |
Traditional habit: LEFT JOIN + IS NULL reveals missing relationships.
Anti join goal:
Keep A rows where no B match exists
| book_id | title |
|---|---|
| 10 | Foundations of Computing |
| 11 | Practical SQL Systems |
| 12 | Algorithms the Classic Way |
| 13 | Systems Thinking |
We want books that have no match in Table B.
| book_id | author_id |
|---|---|
| 10 | 1 |
| 10 | 2 |
| 11 | 3 |
| 12 | 4 |
Missing match: book_id = 13
Goal: keep A rows where no B match exists.
-- Pattern 1 (common): LEFT JOIN then IS NULL
SELECT b.book_id, b.title
FROM books b
LEFT JOIN book_authors ba
ON ba.book_id = b.book_id
WHERE ba.book_id IS NULL
ORDER BY b.book_id;
-- Pattern 2 (also common): NOT EXISTS
SELECT b.book_id, b.title
FROM books b
WHERE NOT EXISTS (
SELECT 1
FROM book_authors ba
WHERE ba.book_id = b.book_id
)
ORDER BY b.book_id;
Result table
| book_id | title |
|---|---|
| 13 | Systems Thinking |
Traditional habit: use anti joins for data quality (missing links, orphan checks).
SELECT
b.book_id,
b.title
FROM books b
WHERE NOT EXISTS (
SELECT 1
FROM book_authors ba
WHERE ba.book_id = b.book_id
)
ORDER BY b.book_id;
| book_id | title |
|---|---|
| 10 | Foundations of Computing |
| 11 | Practical SQL Systems |
| 12 | Algorithms the Classic Way |
| 13 | Systems Thinking |
| book_id | author_id |
|---|---|
| 10 | 1 |
| 10 | 2 |
| 11 | 3 |
| 12 | 4 |
Missing: book_id = 13 has no rows here.
SELECT
b.book_id,
b.title
FROM books b
WHERE NOT EXISTS (
SELECT 1
FROM book_authors ba
WHERE ba.book_id = b.book_id
)
ORDER BY b.book_id;
Result table
| book_id | title |
|---|---|
| 13 | Systems Thinking |
Traditional habit: use NOT EXISTS when you only need a yes/no match, not joined columns.
Semi join:
Keep A rows where a match exists in B
Use EXISTS
| book_id | title |
|---|---|
| 10 | Foundations of Computing |
| 11 | Practical SQL Systems |
| 12 | Algorithms the Classic Way |
| 13 | Systems Thinking |
| book_id | author_id |
|---|---|
| 10 | 1 |
| 10 | 2 |
| 11 | 3 |
| 12 | 4 |
Exists match for: 10, 11, 12
No match for: 13
Goal: keep A rows where a B match exists.
SELECT
b.book_id,
b.title
FROM books b
WHERE EXISTS (
SELECT 1
FROM book_authors ba
WHERE ba.book_id = b.book_id
)
ORDER BY b.book_id;
Result table
| book_id | title |
|---|---|
| 10 | Foundations of Computing |
| 11 | Practical SQL Systems |
| 12 | Algorithms the Classic Way |
Traditional habit: use EXISTS to avoid duplicates when the right side is “many”.
SELECT
b.book_id,
b.title
FROM books b
WHERE EXISTS (
SELECT 1
FROM book_authors ba
WHERE ba.book_id = b.book_id
)
ORDER BY b.book_id;
authors a1 JOIN authors a2
Same table, different roles
CREATE TABLE editions (
edition_id INT PRIMARY KEY,
book_id INT REFERENCES books(book_id),
edition_number INT NOT NULL,
previous_edition_id INT REFERENCES editions(edition_id)
);
previous_edition_id → edition_id
| edition_id | edition_number | previous_edition_id |
|---|---|---|
| 1 | 1 | NULL |
| 2 | 2 | 1 |
| 3 | 3 | 2 |
SELECT
e.edition_number,
prev.edition_number AS prev
FROM editions e
LEFT JOIN editions prev
ON e.previous_edition_id = prev.edition_id;
| edition | prev |
|---|---|
| 1 | NULL |
| 2 | 1 |
| 3 | 2 |
SELECT
e.edition_id,
e.edition_number,
prev.edition_number AS previous_edition_number
FROM editions e
LEFT JOIN editions prev
ON e.previous_edition_id = prev.edition_id
ORDER BY e.edition_id;
JOIN ... ON b.price BETWEEN band.min AND band.max
Not just equality
< > BETWEEN
| band | min | max |
|---|---|---|
| Budget | 0 | 39.99 |
| Standard | 40 | 69.99 |
| Premium | 70 | 999 |
SELECT
b.title,
pb.band
FROM books b
JOIN price_bands pb
ON b.price BETWEEN pb.min AND pb.max;
| title | band |
|---|---|
| Foundations of Computing | Standard |
| Algorithms the Classic Way | Premium |
| Systems Thinking | Budget |
CREATE TABLE price_bands (
band TEXT PRIMARY KEY,
min_price NUMERIC(10,2) NOT NULL,
max_price NUMERIC(10,2) NOT NULL
);
-- Example rows:
-- ('Budget', 0.00, 39.99)
-- ('Standard', 40.00, 69.99)
-- ('Premium', 70.00, 9999.99)
SELECT
b.title,
b.price,
pb.band
FROM books b
JOIN price_bands pb
ON b.price BETWEEN pb.min_price AND pb.max_price
ORDER BY b.price DESC;
books → publishers
books → book_authors → authors
Keep each link explicit
SELECT
b.title,
p.name AS publisher,
a.name AS author
FROM books b
LEFT JOIN publishers p
ON b.publisher_id = p.publisher_id
JOIN book_authors ba
ON ba.book_id = b.book_id
JOIN authors a
ON a.author_id = ba.author_id
ORDER BY b.title, a.name;
If you want unique books only:
SELECT DISTINCT b.book_id, b.title
...
| book_id | title | author |
|---|---|---|
| 10 | Foundations of Computing | Ada |
| 10 | Foundations of Computing | Turing |
Same book appears multiple times.
SELECT DISTINCT
b.book_id,
b.title
FROM books b
JOIN book_authors ba
ON ba.book_id = b.book_id;
| book_id | title |
|---|---|
| 10 | Foundations of Computing |
One row per book.
SELECT DISTINCT
b.book_id,
b.title
FROM books b
JOIN book_authors ba
ON ba.book_id = b.book_id
ORDER BY b.book_id;
Prefer explicit ON
USING is acceptable if column names are stable
Avoid NATURAL JOIN
| books | publishers |
|---|---|
| publisher_id | publisher_id |
-- Explicit (best)
JOIN publishers p
ON b.publisher_id = p.publisher_id
-- Shorthand
JOIN publishers p
USING (publisher_id)
-- Avoid
NATURAL JOIN publishers
Traditional habit: prefer explicit ON.
-- If both tables have publisher_id
SELECT
b.title,
p.name AS publisher
FROM books b
JOIN publishers p
USING (publisher_id);
SELECT
b.title,
COALESCE(p.name, 'Unknown Publisher') AS publisher
FROM books b
LEFT JOIN publishers p
ON b.publisher_id = p.publisher_id;
COALESCE replaces NULL with a label
| title | publisher |
|---|---|
| Foundations of Computing | Classic Press |
| Systems Thinking | NULL |
NULL hides missing information.
SELECT
b.title,
COALESCE(p.name, 'Unknown') AS publisher
FROM books b
LEFT JOIN publishers p
ON b.publisher_id = p.publisher_id;
| title | publisher |
|---|---|
| Foundations of Computing | Classic Press |
| Systems Thinking | Unknown |
Clear and explicit in reports.
SELECT b.title, p.name
FROM (
SELECT * FROM books
WHERE published_year >= 2020
) b
LEFT JOIN publishers p
ON b.publisher_id = p.publisher_id;
SELECT b.title, p.name
FROM books b
JOIN publishers p
ON b.publisher_id = p.publisher_id
WHERE p.name = 'Data House';
SELECT name
FROM books b
JOIN publishers p ON ...
-- Ambiguous: which name?
SELECT p.name
-- Clear
FROM books b
LEFT JOIN publishers p ON ...
JOIN book_authors ba ON ...
JOIN authors a ON ...
One step at a time
SELECT
COUNT(*) AS joined_rows,
COUNT(DISTINCT b.book_id) AS unique_books
FROM books b
JOIN book_authors ba
ON ba.book_id = b.book_id;
SELECT
p.name AS publisher,
COUNT(*) AS book_rows
FROM publishers p
LEFT JOIN books b
ON b.publisher_id = p.publisher_id
GROUP BY p.name
ORDER BY book_rows DESC;
What is being counted?
SELECT
b.title,
STRING_AGG(a.name, ', ' ORDER BY a.name) AS authors
FROM books b
JOIN book_authors ba ON ba.book_id = b.book_id
JOIN authors a ON a.author_id = ba.author_id
GROUP BY b.title
ORDER BY b.title;
STRING_AGG
| title | author |
|---|---|
| Foundations of Computing | Ada |
| Foundations of Computing | Turing |
Same book, many rows.
SELECT
b.title,
STRING_AGG(a.name, ', ' ORDER BY a.name) AS authors
FROM books b
JOIN book_authors ba ON ba.book_id = b.book_id
JOIN authors a ON a.author_id = ba.author_id
GROUP BY b.title
ORDER BY b.title;
| title | authors |
|---|---|
| Foundations of Computing | Ada, Turing |
One row, readable list.
SELECT
a.name,
COUNT(*) AS books_written
FROM authors a
JOIN book_authors ba
ON ba.author_id = a.author_id
GROUP BY a.name
ORDER BY books_written DESC, a.name
LIMIT 5;
| author | book_id |
|---|---|
| Ada | 10 |
| Ada | 12 |
| Turing | 10 |
Bridge creates one row per (author, book).
SELECT
a.name,
COUNT(*) AS books_written
FROM authors a
JOIN book_authors ba
ON ba.author_id = a.author_id
GROUP BY a.name
ORDER BY books_written DESC, a.name
LIMIT 5;
| author | books_written |
|---|---|
| Ada | 2 |
| Turing | 1 |
Ranked summary.
SELECT
b.title,
b.published_year,
a.name AS author
FROM books b
JOIN book_authors ba ON ba.book_id = b.book_id
JOIN authors a ON a.author_id = ba.author_id
WHERE b.published_year >= 2020
ORDER BY b.published_year DESC, b.title, a.name;
| title | year |
|---|---|
| Systems Thinking | 2021 |
| Foundations of Computing | 2020 |
| Algorithms the Classic Way | 2018 |
Rows with year < 2020 (here, 2018) will be filtered out.
SELECT
b.title,
b.published_year,
a.name AS author
FROM books b
JOIN book_authors ba ON ba.book_id = b.book_id
JOIN authors a ON a.author_id = ba.author_id
WHERE b.published_year >= 2020
ORDER BY b.published_year DESC, b.title, a.name;
| title | year | author |
|---|---|---|
| Systems Thinking | 2021 | Grace |
| Foundations of Computing | 2020 | Ada |
| Foundations of Computing | 2020 | Turing |
Filter first, then join.
SELECT
p.name,
COUNT(b.book_id) AS book_count
FROM publishers p
LEFT JOIN books b
ON b.publisher_id = p.publisher_id
GROUP BY p.name
HAVING COUNT(b.book_id) = 0
ORDER BY p.name;
| publisher | book_id |
|---|---|
| Classic Press | 10 |
| Data House | 11 |
| Future Books | NULL |
NULL means no matching book.
SELECT
p.name,
COUNT(b.book_id) AS book_count
FROM publishers p
LEFT JOIN books b
ON b.publisher_id = p.publisher_id
GROUP BY p.name
HAVING COUNT(b.book_id) = 0
ORDER BY p.name;
| publisher | book_count |
|---|---|
| Future Books | 0 |
Data quality check.
In a LEFT JOIN from publishers to books, COUNT(*) would count the NULL row of a bookless publisher as 1; COUNT(b.book_id) skips NULLs and correctly reports 0.
-- Bad: missing ON creates cross join
SELECT b.title, p.name
FROM books b
JOIN publishers p;
-- Good: correct ON
SELECT b.title, p.name
FROM books b
JOIN publishers p
ON b.publisher_id = p.publisher_id;
-- Example: index the FK for faster joins
CREATE INDEX idx_books_publisher_id
ON books(publisher_id);
EXPLAIN
SELECT b.title, p.name
FROM books b
JOIN publishers p
ON b.publisher_id = p.publisher_id;
Goal: list all books, show publisher name when it exists.
SELECT b.title, p.name
FROM books b
LEFT JOIN publishers p
ON b.publisher_id = p.publisher_id
WHERE p.name = 'Classic Press';
What is the risk?
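One way to answer (a sketch using this deck's sample rows): WHERE p.name = 'Classic Press' rejects the NULL-extended rows, so books without a publisher disappear and the LEFT JOIN silently behaves like an INNER JOIN. Moving the filter into ON preserves every book:

```sql
-- Risk: WHERE discards rows where p.name IS NULL,
-- so the LEFT JOIN degrades to an INNER JOIN.
-- Safer: filter inside ON to keep all books.
SELECT b.title, p.name
FROM books b
LEFT JOIN publishers p
  ON b.publisher_id = p.publisher_id
 AND p.name = 'Classic Press';
```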
Goal: show author names and the titles they wrote.
You must use book_authors.
SELECT
a.name,
b.title
FROM authors a
JOIN book_authors ba
ON ba.author_id = a.author_id
JOIN books b
ON b.book_id = ba.book_id
ORDER BY a.name, b.title;
-- Orphaned books.publisher_id
SELECT
b.book_id,
b.title,
b.publisher_id
FROM books b
LEFT JOIN publishers p
ON b.publisher_id = p.publisher_id
WHERE b.publisher_id IS NOT NULL
AND p.publisher_id IS NULL
ORDER BY b.book_id;
SELECT
p.name AS publisher,
b.title,
b.price
FROM publishers p
JOIN (
SELECT publisher_id, MIN(price) AS min_price
FROM books
WHERE publisher_id IS NOT NULL
GROUP BY publisher_id
) m
ON m.publisher_id = p.publisher_id
JOIN books b
ON b.publisher_id = m.publisher_id
AND b.price = m.min_price
ORDER BY p.name, b.title;
NULL = NULL is not true
Prefer PK and FK joins
Avoid joining by names
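A minimal sketch of the NULL rule, using this deck's sample data (book 13 has publisher_id NULL): an equality join never matches a NULL key, because NULL = NULL evaluates to unknown rather than true.

```sql
-- Book 13 (publisher_id IS NULL) matches nothing:
-- NULL = NULL is not TRUE, so an INNER JOIN drops the row.
SELECT b.title, p.name
FROM books b
JOIN publishers p
  ON b.publisher_id = p.publisher_id;  -- Systems Thinking never appears

-- Keep NULL-keyed rows with a LEFT JOIN or an explicit IS NULL check.
```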
Traditional habit:
Write joins so a reader can verify keys
without guessing
Traditional habit: start with the join keys, then refine.
-- Template
SELECT ...
FROM ...
JOIN ... ON ...
LEFT JOIN ... ON ...
WHERE ...
GROUP BY ...
HAVING ...
ORDER BY ...
LIMIT ...;
CREATE TABLE trail (
trail_id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
difficulty TEXT NOT NULL,
capacity INT NOT NULL
);
CREATE TABLE reservation (
reservation_id SERIAL PRIMARY KEY,
hiker_name TEXT NOT NULL,
hike_date DATE NOT NULL,
trail_id INT REFERENCES trail(trail_id)
);
trail (1) -------- (many) reservation
r.trail_id = t.trail_id
Write an INNER JOIN
between reservation and trail.
Add a WHERE clause.
Which JOIN keeps
all rows from trail?
Use COUNT()
Use GROUP BY
Use GROUP BY
Use HAVING
Use COUNT(*)
Use ORDER BY
Use LIMIT
Use LIMIT
Use OFFSET
SELECT
r.reservation_id,
r.hiker_name,
r.hike_date,
t.name AS trail_name
FROM reservation r
JOIN trail t
ON r.trail_id = t.trail_id;
SELECT
r.hiker_name,
r.hike_date,
t.name AS trail_name
FROM reservation r
JOIN trail t
ON r.trail_id = t.trail_id
WHERE t.difficulty = 'Hard';
SELECT
t.trail_id,
t.name AS trail_name,
r.reservation_id,
r.hiker_name,
r.hike_date
FROM trail t
LEFT JOIN reservation r
ON r.trail_id = t.trail_id
ORDER BY t.trail_id, r.hike_date;
SELECT
t.trail_id,
t.name AS trail_name,
COUNT(r.reservation_id) AS total_reservations
FROM trail t
LEFT JOIN reservation r
ON r.trail_id = t.trail_id
GROUP BY t.trail_id, t.name
ORDER BY total_reservations DESC, t.trail_id;
SELECT
t.trail_id,
t.name AS trail_name,
t.capacity,
COUNT(r.reservation_id) AS total_reservations
FROM trail t
LEFT JOIN reservation r
ON r.trail_id = t.trail_id
GROUP BY t.trail_id, t.name, t.capacity
HAVING COUNT(r.reservation_id) > t.capacity
ORDER BY total_reservations DESC, t.trail_id;
SELECT COUNT(*) AS total_reservations
FROM reservation;
SELECT
reservation_id,
hiker_name,
hike_date,
trail_id
FROM reservation
ORDER BY hike_date, reservation_id
LIMIT 3;
SELECT
reservation_id,
hiker_name,
hike_date,
trail_id
FROM reservation
ORDER BY hike_date, reservation_id
LIMIT 3 OFFSET 3;
Write 2 queries:
1) MIN/MAX/AVG on trail.capacity
2) GROUP BY trail.difficulty with COUNT(*)
-- 1) capacity stats
SELECT
MIN(capacity) AS min_capacity,
MAX(capacity) AS max_capacity,
AVG(capacity) AS avg_capacity
FROM trail;
-- 2) trails per difficulty
SELECT
difficulty,
COUNT(*) AS trails_count
FROM trail
GROUP BY difficulty
ORDER BY trails_count DESC;
In simple terms:
SQL is important because it gives us
a reliable and consistent way to
store, retrieve, and manage data.
Without SQL, structured data systems
would not scale or communicate well.
SQL:
Structured Query Language
Standard for relational databases.
SQLite:
A software system that
implements SQL.
Language ≠ Database
CREATE TABLE students (
student_id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
major TEXT
);
SELECT name, major
FROM students
WHERE major = 'CS';
INSERT INTO students (student_id, name, major)
VALUES (1, 'Ana', 'CS');
UPDATE students
SET major = 'Data Science'
WHERE student_id = 1;
DELETE FROM students
WHERE student_id = 1;
| Student Question | Instructor Response |
|---|---|
| How are attendance and participation points calculated, and is there a maximum? | Attendance and participation are calculated based on consistent presence and meaningful engagement. |
| For team projects, how are individual contributions evaluated within a single team grade? | There is one overall team project grade, but each submission must clearly state individual responsibilities, and contributions may be reviewed through GitHub activity and documentation, with adjustments made if significant imbalance is observed. |
| Clarification on grading breakdown: What are the percentage values for labs, homework, projects, and exams? | The grading breakdown is as follows: Labs 15 percent, Homework 15 percent, Projects 40 percent, and Exams 30 percent. |
| How is assignment submission on GitHub acknowledged and recorded? | Assignment submission on GitHub is acknowledged by confirming that the code has been uploaded to the repository. |
Relational Algebra: Selection and Projection
Relational Algebra: Join (⨝)
Relational Algebra Optimization
API and Database Interaction
Web vs Console Interface
GROUP BY and Aggregation
CHECK Constraints
INNER JOIN
LEFT JOIN
RIGHT JOIN
Natural Join Risk (Textbook 1, p.128)
Which queries typically require JOINS? (Textbook 1, p.141)
Unmatched rows and OUTER JOIN (Textbook 1, p.149, Fig 7-12)
GROUP BY query difference (Textbook 2, p.67–71)
CHECK constraints meaning (Textbook 2, p.99)
Submission guidance
git clone https://github.com/SE4CPS/2026-COMP-163.git
cd Project/Part-1/Sample
UPDATE teamX_flowers
SET water_level =
water_level - (5 * (CURRENT_DATE - last_watered));
| order_id | customer | items | prices |
|---|---|---|---|
| 1001 | Alice | Rose,Tulip | 12.50,6.00 |
| 1002 | Bob | Lily,Orchid | 9.25,15.00 |
-- Not reliable in UNF
SELECT * FROM Orders WHERE items = 'Rose';
-- Hacky and unsafe
SELECT * FROM Orders WHERE items LIKE '%Rose%';
| order_id | flower | unit_price |
|---|---|---|
| 1001 | Rose | 12.50 |
| 1001 | Tulip | 6.00 |
| 1002 | Lily | 9.25 |
| 1002 | Orchid | 15.00 |
SELECT * FROM OrderItems WHERE flower = 'Rose';
| customer | phones |
|---|---|
| Alice | 0101,0109 |
| Clara | 0103,0111 |
-- Impossible to enforce uniqueness
SELECT COUNT(*) FROM Customers WHERE phones = '0101';
| customer | phone |
|---|---|
| Alice | 0101 |
| Alice | 0109 |
| Clara | 0103 |
| Clara | 0111 |
SELECT * FROM CustomerPhones WHERE phone = '0101';
| supplier | location |
|---|---|
| BloomCo | Stockton,CA |
| FloraSupreme | Sacramento,CA |
-- Cannot filter by state safely
SELECT * FROM Suppliers WHERE location LIKE '%CA';
| supplier | city | state |
|---|---|---|
| BloomCo | Stockton | CA |
| FloraSupreme | Sacramento | CA |
SELECT * FROM Suppliers WHERE state = 'CA';
| order_id | tags |
|---|---|
| 1001 | ["gift","red"] |
| 1002 | ["wedding"] |
-- Hard to count per tag
SELECT COUNT(*) FROM Orders WHERE tags = 'gift';
| order_id | tag |
|---|---|
| 1001 | gift |
| 1001 | red |
| 1002 | wedding |
SELECT COUNT(*) FROM OrderTags WHERE tag = 'gift';
| order_id | flower |
|---|---|
| 1001 | Rose |
| 1001 | Rose |
-- Which row is correct?
DELETE FROM OrderItems WHERE order_id = 1001 AND flower = 'Rose';
| PK | PK |
|---|---|
| order_id | flower |
| 1001 | Rose |
| 1001 | Tulip |
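The PK | PK layout above can be enforced in DDL; a sketch (column types and the unit_price column are assumed from nearby slides) where the composite primary key rejects a second (1001, 'Rose') row:

```sql
-- Composite primary key: at most one row per (order_id, flower).
CREATE TABLE OrderItems (
    order_id   INT  NOT NULL,
    flower     TEXT NOT NULL,
    unit_price NUMERIC(10,2),          -- assumed from later slides
    PRIMARY KEY (order_id, flower)     -- duplicate (1001, 'Rose') now rejected
);
```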
| order_id | items | prices |
|---|---|---|
| 1001 | Rose,Tulip | 12.50,6.00 |
| 1002 | Lily,Orchid | 9.25,15.00 |
| order_id | item | price |
|---|---|---|
| 1001 | Rose | 12.50 |
| 1001 | Tulip | 6.00 |
| 1002 | Lily | 9.25 |
| 1002 | Orchid | 15.00 |
| supplier | address |
|---|---|
| BloomCo | 123 Market, Stockton, CA |
| FloraSupreme | 45 Garden, Sacramento, CA |
| supplier | street | city | state |
|---|---|---|---|
| BloomCo | 123 Market | Stockton | CA |
| FloraSupreme | 45 Garden | Sacramento | CA |
| PK | PK | |
|---|---|---|
| order_id | flower | unit_price |
| 1001 | Rose | 12.50 |
| 1001 | Tulip | 6.00 |
| 1002 | Lily | 9.25 |
| 1002 | Orchid | 15.00 |
reservation_id | hiker_name | trails                    | dates
---------------------------------------------------------------------------
5001           | Alice      | Half Dome, Full Day       | 2026-06-01, 2026-06-02
5001           | Bob        | Yosemite Falls, Morning   | 2026-06-03
5002           | Clara      | Half Dome, Yosemite Falls | 2026-06-04, 2026-06-05
5002           | Tom        | Mist Trail                | 2026-06-06
| PK | PK | |||
|---|---|---|---|---|
| order_id | flower | unit_price | customer | phone |
| 1001 | Rose | 12.50 | Alice | 0101 |
| 1001 | Tulip | 6.00 | Alice | 0101 |
| 1002 | Lily | 9.25 | Bob | 0102 |
| 1002 | Orchid | 15.00 | Bob | 0102 |
-- Update touches multiple rows (symptom)
UPDATE OrderItems SET phone = '9999' WHERE order_id = 1001;
| Column | Depends on | Why this is bad |
|---|---|---|
| customer | order_id | Repeats per item row |
| phone | order_id | Repeats per item row |
| unit_price | flower | Repeats across many orders |
| PK | ||
|---|---|---|
| order_id | customer | phone |
| 1001 | Alice | 0101 |
| 1002 | Bob | 0102 |
| PK | |||
|---|---|---|---|
| order_item_id | order_id | flower | unit_price |
| 1 | 1001 | Rose | 12.50 |
| 2 | 1001 | Tulip | 6.00 |
| 3 | 1002 | Lily | 9.25 |
| 4 | 1002 | Orchid | 15.00 |
-- Phone changes once
UPDATE Orders SET phone = '9999' WHERE order_id = 1001;
| PK | ||
|---|---|---|
| flower_id | name | list_price |
| 1 | Rose | 12.50 |
| 2 | Tulip | 6.00 |
| 3 | Lily | 9.25 |
| 4 | Orchid | 15.00 |
| PK | ||
|---|---|---|
| order_item_id | order_id | flower_id |
| 1 | 1001 | 1 |
| 2 | 1001 | 2 |
| 3 | 1002 | 3 |
| 4 | 1002 | 4 |
-- Price changes once
UPDATE Flowers SET list_price = 13.00 WHERE flower_id = 1;
| PK | ||||
|---|---|---|---|---|
| flower_id | name | list_price | supplier_name | supplier_addr |
| 1 | Rose | 12.50 | BloomCo | 123 Market |
| 2 | Tulip | 6.00 | BloomCo | 123 Market |
| 3 | Lily | 9.25 | FloraSupreme | 45 Garden |
| 4 | Orchid | 15.00 | FloraSupreme | 45 Garden |
-- Update anomaly (address repeated)
UPDATE Flowers SET supplier_addr = '125 Market' WHERE supplier_name = 'BloomCo';
| PK | |||
|---|---|---|---|
| supplier_id | name | street | city |
| 10 | BloomCo | 123 Market | Stockton |
| 20 | FloraSupreme | 45 Garden | Sacramento |
| PK | |||
|---|---|---|---|
| flower_id | name | list_price | supplier_id |
| 1 | Rose | 12.50 | 10 |
| 2 | Tulip | 6.00 | 10 |
| 3 | Lily | 9.25 | 20 |
| 4 | Orchid | 15.00 | 20 |
-- Address changes once
UPDATE Suppliers SET street = '125 Market' WHERE supplier_id = 10;
| PK | ||
|---|---|---|
| order_item_id | order_id | flower_id |
| 1 | 1001 | 1 |
| 2 | 1001 | 2 |
| 3 | 1002 | 3 |
| 4 | 1002 | 4 |
SELECT o.order_id, o.customer, f.name, s.name AS supplier
FROM Orders o
JOIN OrderItems oi ON oi.order_id = o.order_id
JOIN Flowers f ON f.flower_id = oi.flower_id
JOIN Suppliers s ON s.supplier_id = f.supplier_id;
| PK | PK | |||||
|---|---|---|---|---|---|---|
| order_id | flower | customer | phone | unit_price | supplier_name | supplier_addr |
| 1001 | Rose | Alice | 0101 | 12.50 | BloomCo | 123 Market |
| 1001 | Tulip | Alice | 0101 | 6.00 | BloomCo | 123 Market |
| 1002 | Lily | Bob | 0102 | 9.25 | FloraSupreme | 45 Garden |
| 1002 | Orchid | Bob | 0102 | 15.00 | FloraSupreme | 45 Garden |
| PK | ||
|---|---|---|
| order_id | customer | phone |
| 1001 | Alice | 0101 |
| 1002 | Bob | 0102 |
| PK | PK | |||
|---|---|---|---|---|
| order_id | flower | unit_price | supplier_name | supplier_addr |
| 1001 | Rose | 12.50 | BloomCo | 123 Market |
| 1001 | Tulip | 6.00 | BloomCo | 123 Market |
| 1002 | Lily | 9.25 | FloraSupreme | 45 Garden |
| 1002 | Orchid | 15.00 | FloraSupreme | 45 Garden |
| PK | ||
|---|---|---|
| supplier_id | name | addr |
| 10 | BloomCo | 123 Market |
| 20 | FloraSupreme | 45 Garden |
| PK | ||
|---|---|---|
| flower_id | name | supplier_id |
| 1 | Rose | 10 |
| 2 | Tulip | 10 |
| 3 | Lily | 20 |
| 4 | Orchid | 20 |
| PK | ||
|---|---|---|
| order_id | customer | phone |
| 1001 | Alice | 0101 |
| 1002 | Bob | 0102 |
| PK | PK | |
|---|---|---|
| order_id | flower_id | unit_price |
| 1001 | 1 | 12.50 |
| 1001 | 2 | 6.00 |
| 1002 | 3 | 9.25 |
| 1002 | 4 | 15.00 |
| Year | Normal Form |
|---|---|
| 1970 | 1NF |
| 1971 | 2NF |
| 1971 | 3NF |
| Year | ~Cost per MB |
|---|---|
| 1970 | $1,000,000+ |
| 1980 | $10,000 |
| 1990 | $100 |
| 2000 | $1 |
| 2010 | $0.10 |
| 2020 | <$0.02 |
| Constraint | Works on Atomic Value? |
|---|---|
| PRIMARY KEY | Yes |
| UNIQUE | Yes |
| NOT NULL | Yes |
| CHECK | Yes |
| FOREIGN KEY | Yes |
reservation_id | hiker_name | trails | dates
---------------------------------------------------------------------------
5001 | Alice | Half Dome, Mist Trail | 2026-06-01, Feb
5001 | Bob | Yosemite Falls | 2026-06-03
5002 | Clara | Half Dome, Yosemite Falls | 2026-06-04, Apr
5002 | Tom | Mist Trail | 2026-06-06
-- Imported legacy data (not in 1NF)
"River Path" | "Easy, Scenic" | 25 | "Sunny, Windy"
"Canyon Loop" | "Medium, Shaded" | 15 | "Cloudy"
"Summit Ridge"| "Hard, Rocky, Steep" | 10 | "Snowy, Cold"
[
{
"name": "Pine Valley Trail",
"guide": "Emma Johnson, Liam Brown"
},
{
"name": "Red Rock Ridge",
"guide": "Carlos Martinez, Ava Wilson"
},
{
"name": "Silver Lake Path",
"guide": "Olivia Chen"
}
]
trail_name,guide_name,guide_phone
River Path,Emma Johnson,555-0101
Canyon Loop,Emma Johnson,555-0101
Summit Ridge,Carlos Martinez,555-0202
Forest Walk,Carlos Martinez,555-0202
-- Change request: allow legacy raw values
ALTER TABLE trail
ADD COLUMN difficulty_raw TEXT,
ADD COLUMN weather_raw TEXT;
-- Insert legacy data
INSERT INTO trail (name, difficulty_raw, weather_raw, capacity)
VALUES
('River Path', 'Easy, Scenic', 'Sunny, Windy', 25),
('Canyon Loop', 'Medium, Shaded', 'Cloudy', 15),
('Summit Ridge','Hard, Rocky, Steep', 'Snowy, Cold', 10);
-- Add JSONB column
ALTER TABLE trail
ADD COLUMN guides JSONB;
-- Update existing rows (match by name)
UPDATE trail
SET guides = '["Emma Johnson", "Liam Brown"]'
WHERE name = 'River Path';
UPDATE trail
SET guides = '["Carlos Martinez", "Ava Wilson"]'
WHERE name = 'Canyon Loop';
UPDATE trail
SET guides = '["Olivia Chen"]'
WHERE name = 'Summit Ridge';
-- Select JSON data
SELECT name, guides
FROM trail;
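If a report later needs one row per guide, PostgreSQL can expand the JSONB array; a sketch using jsonb_array_elements_text (PostgreSQL-specific):

```sql
-- One output row per guide name stored in the JSONB array.
SELECT name, jsonb_array_elements_text(guides) AS guide
FROM trail
WHERE guides IS NOT NULL;
```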
-- 1) Guide table
CREATE TABLE guide (
guide_id SERIAL PRIMARY KEY,
guide_name TEXT NOT NULL,
guide_phone TEXT NOT NULL
);
-- 2) Trail assignment table
CREATE TABLE trail_assignment (
assignment_id SERIAL PRIMARY KEY,
trail_name TEXT NOT NULL,
guide_id INT NOT NULL,
FOREIGN KEY (guide_id) REFERENCES guide(guide_id)
);
-- Insert guide data
INSERT INTO guide (guide_name, guide_phone) VALUES
('Emma Johnson', '555-0101'),
('Carlos Martinez', '555-0202');
-- Insert assignments (using FK references)
INSERT INTO trail_assignment (trail_name, guide_id) VALUES
('River Path', 1),
('Canyon Loop', 1),
('Summit Ridge', 2),
('Forest Walk', 2);
CREATE TABLE posts (
post_id BIGINT PRIMARY KEY AUTO_INCREMENT,
user_id BIGINT NOT NULL,
created_at DATETIME NOT NULL,
content TEXT,
likes_count INT DEFAULT 0
) ENGINE=InnoDB;
| Query | Intent | What the engine must do |
|---|---|---|
| WHERE user_id = 5001 | Profile feed | Find all rows for one user |
| WHERE created_at >= '2026-01-01' | Recent posts | Locate start point then read in order |
| WHERE user_id = 5001 ORDER BY created_at DESC LIMIT 20 | Latest 20 posts | Find user range, read newest first, stop early |
SELECT *
FROM posts
WHERE user_id = 5001;
| n (rows) | Worst-case checks | Impact |
|---|---|---|
| 10,000 | 10,000 | Often acceptable |
| 1,000,000 | 1,000,000 | Noticeable latency, higher I/O |
| 50,000,000 | 50,000,000 | Can dominate latency and reduce throughput |
| Plan | Time Complexity | Work Growth |
|---|---|---|
| Full table scan | O(n) | Linear |
| B-Tree lookup | O(log n) | Very slow growth |
| B-Tree range scan | O(log n + k) | Depends on output size |
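The growth difference in the table can be felt with a toy comparison in Python. A sorted list stands in for index order here; this is an illustration of O(n) vs O(log n), not a real B-Tree.

```python
import bisect

n = 1_000_000
data = list(range(n))          # sorted keys, standing in for index order

def scan_checks(target):
    """Full table scan: count rows examined until the target is found."""
    checks = 0
    for value in data:
        checks += 1
        if value == target:
            break
    return checks

def indexed_lookup(target):
    """Binary search: O(log n) comparisons, like descending a B-Tree."""
    i = bisect.bisect_left(data, target)
    return data[i] if i < n and data[i] == target else None

worst_case = scan_checks(n - 1)
print(worst_case)              # 1000000 rows examined in the worst case
print(indexed_lookup(123456))  # found in ~20 comparisons
```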
Root (few pages)
↓
Internal nodes (few pages)
↓
Leaf pages (many pages)
50
/ \
30 70
/ \ / \
20 40 60 80
[30 | 60]
/ | \
[10 20] [40 50] [70 80 90]
[ 5000 | 9000 ]
/ | \
[1000|3000] [6000|7000] [10000|15000]
/ \ | / \
... ... ... ... ...
CREATE INDEX idx_price
ON flowers(price);
Level 1 [50]
Level 2 [30] [70]
Level 3 [20] [40] [60] [80]
Level 4 ...
Level 5 ...
Level 1 [30 | 60 | 90]
Level 2 [10 20] [40 50] [70 80] [100 110]
Example:
Fanout ≈ 1000
Level 1 → 1 node
Level 2 → 1000 nodes
Level 3 → 1,000,000 entries
Only 3 page reads to reach any row.
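The fanout arithmetic above generalizes. This small helper (my naming) counts levels under the slide's model, where level k holds fanout^(k-1) nodes, using integer math to avoid floating-point surprises:

```python
def btree_levels(n_rows, fanout):
    # Smallest number of levels such that the tree can address n_rows
    # entries, assuming level k holds fanout**(k-1) nodes.
    levels = 1
    while fanout ** (levels - 1) < n_rows:
        levels += 1
    return levels

print(btree_levels(1_000_000, 1000))  # 3 page reads, matching the example
print(btree_levels(1_000_000, 2))     # 21: a binary tree needs far more reads
```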
Assume max 3 keys per node
Before insert:
[ 10 | 20 | 30 ]
Insert 40 → overflow:
[ 10 | 20 | 30 | 40 ]
[30]
/ \
[10 | 20] [40]
High fanout example:
Level 1: [ 5000 | 9000 ]
/ | \
Level 2: [1000|3000] [6000|7000] [10000|15000]
/ \ | / \
Level 3: ... ... ... ... ...
More keys per node → Fewer levels
Fewer levels → Fewer page reads
Height 3 tree:
Root → Internal → Leaf
= 3 page reads
Height 6 tree:
= 6 page reads
Index on user_id, then verify the plan with EXPLAIN:
CREATE INDEX idx_posts_user
ON posts(user_id);
SELECT *
FROM posts
WHERE user_id = 42;
CREATE INDEX idx_posts_created
ON posts(created_at);
SELECT *
FROM posts
WHERE created_at >= '2026-01-01'
AND created_at < '2026-02-01'
ORDER BY created_at;
Filter by user_id and order by created_at with LIMIT:
CREATE INDEX idx_user_created
ON posts(user_id, created_at);
SELECT *
FROM posts
WHERE user_id = 42
ORDER BY created_at DESC
LIMIT 10;
A composite index serves the user_id filter and the created_at ordering together:
SELECT post_id, created_at
FROM posts
WHERE user_id = 5001
ORDER BY created_at DESC
LIMIT 20;
| Predicate | Index Friendly? | Reason |
|---|---|---|
| = | Yes | Exact match |
| BETWEEN | Yes | Range scan |
| LIKE 'abc%' | Often | Prefix search |
| LIKE '%abc' | No | Leading wildcard breaks ordering |
Good: user_id = 5001
Good: created_at BETWEEN a AND b
Bad: content LIKE '%hello%'
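SQLite can demonstrate the same distinction with EXPLAIN QUERY PLAN: an index-friendly predicate reports SEARCH, while a leading wildcard forces SCAN. Table and index names follow the earlier examples; exact plan wording varies by SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE posts (
    post_id INTEGER PRIMARY KEY, user_id INTEGER, content TEXT)""")
conn.execute("CREATE INDEX idx_posts_user ON posts(user_id)")

def plan(sql):
    # EXPLAIN QUERY PLAN: detail string is column 3 of each plan row
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()[0][3]

print(plan("SELECT * FROM posts WHERE user_id = 5001"))
# e.g. 'SEARCH posts USING INDEX idx_posts_user (user_id=?)'
print(plan("SELECT * FROM posts WHERE content LIKE '%hello%'"))
# e.g. 'SCAN posts'
```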
content is TEXT (large)
SHOW INDEX FROM posts;
ANALYZE TABLE posts;
Improves cardinality estimates used by the optimizer.
| At Design Time | After Deployment | Consequence |
|---|---|---|
| Assumed common filters | Different filters dominate | Need new indexes |
| Estimated row counts | Data grows faster than expected | Scans become too slow |
| Simple queries expected | Composite patterns appear | Need composite indexes |
Use EXPLAIN to see plan choices:
EXPLAIN
SELECT *
FROM posts
WHERE user_id = 5001;
| Write Pattern | What indexes add | Risk |
|---|---|---|
| Frequent INSERT | Each index must be updated | Throughput drops |
| UPDATE of indexed column | Reposition key in B-Tree | Higher write cost |
| Bulk ingestion | Many index page writes | Long ingest times |
| Operation | Cost with 0 extra indexes | Cost with k indexes |
|---|---|---|
| INSERT | Lower overhead | O(k · log n) |
| UPDATE indexed column | Lower overhead | O(k · log n) |
| DELETE | Lower overhead | O(k · log n) |
CREATE TABLE posts (
id INT AUTO_INCREMENT PRIMARY KEY,
user_id INT,
created_at DATETIME,
content VARCHAR(100)
);
-- Note: MariaDB caps recursion depth (max_recursive_iterations, default 1000),
-- so the cross-join generator in the next block is more reliable for 500k rows
WITH RECURSIVE seq AS (
SELECT 1 AS n
UNION ALL
SELECT n + 1 FROM seq WHERE n < 500000
)
INSERT INTO posts (user_id, created_at, content)
SELECT
FLOOR(RAND()*10000),
NOW() - INTERVAL FLOOR(RAND()*365) DAY,
CONCAT('Post ', n)
FROM seq;
DROP TABLE IF EXISTS posts;
CREATE TABLE posts (
id INT AUTO_INCREMENT PRIMARY KEY,
user_id INT NOT NULL,
created_at DATETIME NOT NULL,
content VARCHAR(100) NOT NULL
);
INSERT INTO posts (user_id, created_at, content)
SELECT
FLOOR(RAND() * 10000) AS user_id,
NOW() - INTERVAL FLOOR(RAND() * 365) DAY AS created_at,
CONCAT('Post ', n) AS content
FROM (
SELECT
(a.d + 10*b.d + 100*c.d + 1000*d.d + 10000*e.d + 100000*f.d) + 1 AS n
FROM (SELECT 0 d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
CROSS JOIN (SELECT 0 d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
CROSS JOIN (SELECT 0 d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
CROSS JOIN (SELECT 0 d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) d
CROSS JOIN (SELECT 0 d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) e
CROSS JOIN (SELECT 0 d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4) f
) nums
WHERE n <= 500000;
SELECT COUNT(*) FROM posts;
EXPLAIN ANALYZE
SELECT *
FROM posts
WHERE user_id = 500
ORDER BY created_at DESC
LIMIT 20;
CREATE INDEX idx_user_created
ON posts(user_id, created_at DESC);
EXPLAIN ANALYZE
SELECT *
FROM posts
WHERE user_id = 500
ORDER BY created_at DESC
LIMIT 20;
Inspect with SHOW INDEX and EXPLAIN:
SHOW INDEX FROM tasks;
EXPLAIN SELECT * FROM tasks WHERE status='done';
-- Todo example
CREATE INDEX idx_status
ON tasks(status);
Whether clustered or not, the internal structure is still a B-Tree.
| Aspect | B-Tree Index | Clustered Index |
|---|---|---|
| What is sorted? | Only the index | The actual table rows |
| How many allowed? | Many | Only one |
| Example | idx_status | PRIMARY KEY (id) |
| Data storage | Separate from table | Data stored in that order |
| Database | Is PRIMARY KEY auto-indexed? | Are rows stored in PK order (clustered)? | Beginner takeaway |
|---|---|---|---|
| MariaDB (InnoDB) | Yes (auto B-Tree) | Yes (clustered by PK) | Table data “lives” in the PK B-Tree |
| SQLite | Yes (practically, via rowid / PK) | Often yes (rowid order is the storage order) | Rowid acts like the built-in key ordering |
| PostgreSQL | Yes (auto B-Tree) | No (heap table, not clustered by default) | PK index exists, but rows are stored separately |
CREATE TABLE tasks (
id INT PRIMARY KEY,
title VARCHAR(100),
status VARCHAR(20),
created_at DATETIME
);
Range scan on the primary key id:
SELECT *
FROM tasks
WHERE id BETWEEN 100 AND 200;
EXPLAIN QUERY PLAN
SELECT * FROM tasks WHERE status='done';
PostgreSQL: EXPLAIN (ANALYZE, BUFFERS)
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM tasks WHERE status='done';
| Feature | MariaDB | SQLite | PostgreSQL |
|---|---|---|---|
| Default index | B-Tree | B-Tree | B-Tree |
| Runs as | Server | Embedded | Server |
| Clustered PK | Yes (InnoDB) | Rowid behavior | No (heap table) |
| Advanced index types | Limited | Minimal | Many |
Goal: filter by status and sort by created_at.
-- Logical schema
id, title, status, priority, created_at
DROP TABLE IF EXISTS tasks;
CREATE TABLE tasks (
id INT AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(255) NOT NULL,
status VARCHAR(20) NOT NULL,
priority INT NOT NULL,
created_at DATETIME NOT NULL
) ENGINE=InnoDB;
-- SQLite
CREATE TABLE tasks (
id INTEGER PRIMARY KEY,
title TEXT NOT NULL,
status TEXT NOT NULL,
priority INTEGER NOT NULL,
created_at TEXT NOT NULL
);
-- PostgreSQL
CREATE TABLE tasks (
id SERIAL PRIMARY KEY,
title VARCHAR(255) NOT NULL,
status VARCHAR(20) NOT NULL,
priority INT NOT NULL,
created_at TIMESTAMP NOT NULL
);
INSERT INTO tasks (title, status, priority, created_at)
SELECT
CONCAT('Task ', n),
ELT(FLOOR(1 + RAND()*3),'open','in_progress','done'),
FLOOR(RAND()*5),
NOW() - INTERVAL FLOOR(RAND()*30) DAY
FROM (
SELECT (a.d + 10*b.d + 100*c.d + 1000*d.d) + 1 AS n
FROM (SELECT 0 d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
CROSS JOIN (SELECT 0 d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
CROSS JOIN (SELECT 0 d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
CROSS JOIN (SELECT 0 d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) d
) nums
WHERE n <= 10000;
SELECT COUNT(*) FROM tasks;
WITH RECURSIVE seq(n) AS (
SELECT 1
UNION ALL
SELECT n + 1 FROM seq WHERE n < 10000
)
INSERT INTO tasks (title, status, priority, created_at)
SELECT
'Task ' || n,
CASE (ABS(RANDOM()) % 3)
WHEN 0 THEN 'open'
WHEN 1 THEN 'in_progress'
ELSE 'done'
END,
ABS(RANDOM()) % 5,
DATETIME('now', '-' || (ABS(RANDOM()) % 30) || ' days')
FROM seq;
SELECT COUNT(*) FROM tasks;
INSERT INTO tasks (title, status, priority, created_at)
SELECT
'Task ' || n,
(ARRAY['open','in_progress','done'])[floor(random()*3)::int + 1],
floor(random()*5),
NOW() - (floor(random()*30) || ' days')::interval
FROM generate_series(1,10000) AS n;
SELECT COUNT(*) FROM tasks;
SELECT *
FROM tasks
WHERE status = 'done';
Baseline query: find all rows with status='done'.
| DB | Plan / Analyzer Command |
|---|---|
| MariaDB | EXPLAIN |
| PostgreSQL | EXPLAIN ANALYZE |
| SQLite | EXPLAIN QUERY PLAN |
-- MariaDB
CREATE INDEX idx_tasks_status
ON tasks(status);
-- PostgreSQL
CREATE INDEX idx_tasks_status
ON tasks(status);
-- SQLite
CREATE INDEX idx_tasks_status
ON tasks(status);
Confirm the status index is used for WHERE status = 'done':
-- MariaDB
EXPLAIN
SELECT *
FROM tasks
WHERE status = 'done';
-- PostgreSQL
EXPLAIN ANALYZE
SELECT *
FROM tasks
WHERE status = 'done';
-- SQLite
EXPLAIN QUERY PLAN
SELECT *
FROM tasks
WHERE status = 'done';
Even with idx_tasks_status, check whether the sort remains:
EXPLAIN ANALYZE
SELECT *
FROM tasks
WHERE status = 'done'
ORDER BY created_at DESC
LIMIT 20;
Even though idx_tasks_status exists, the DB may still need to sort. A composite index on (status, created_at) keeps rows grouped by status and ordered by created_at inside each group.
-- Think: ordered groups
(status='done', newest → oldest)
(status='open', newest → oldest)
(status='in_progress', newest → oldest)
| status | created_at | title |
|---|---|---|
| done | 2026-02-25 10:05 | Task A |
| open | 2026-02-25 09:00 | Task B |
| done | 2026-02-25 08:10 | Task C |
| done | 2026-02-24 20:00 | Task D |
With the composite index, done rows sit next to each other, so LIMIT 20 can stop early (fast).
-- MariaDB
CREATE INDEX idx_status_created
ON tasks(status, created_at DESC);
-- SQLite (DESC allowed in index; planner may still optimize)
CREATE INDEX idx_status_created
ON tasks(status, created_at);
-- PostgreSQL
CREATE INDEX idx_status_created
ON tasks(status, created_at DESC);
Re-check the plan for WHERE status=? ORDER BY created_at DESC:
EXPLAIN ANALYZE
SELECT *
FROM tasks
WHERE status = 'done'
ORDER BY created_at DESC
LIMIT 20;
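The same before/after comparison can be mirrored in SQLite (used here as a stand-in for the MariaDB/PostgreSQL steps): with the composite index in place, the plan uses the index and needs no temporary sort for the ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE tasks (
    id INTEGER PRIMARY KEY, title TEXT, status TEXT, created_at TEXT)""")
conn.execute("CREATE INDEX idx_status_created ON tasks(status, created_at)")

# The composite index serves both the status filter and the created_at ordering
plan = conn.execute("""EXPLAIN QUERY PLAN
    SELECT * FROM tasks
    WHERE status = 'done'
    ORDER BY created_at DESC
    LIMIT 20""").fetchall()
for row in plan:
    print(row[3])  # e.g. SEARCH tasks USING INDEX idx_status_created (status=?)
```

If the plan contained "USE TEMP B-TREE FOR ORDER BY", the index would not be covering the sort.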
Application
|
| SQLite library
|
flowers.db ← file on disk
Application Code
(Java · Python · C++ · JS)
│
│ SQLite library
│ parse · plan · execute
│
Database File
flowers.db
(tables · rows · indexes)
SQLite is included out of the box on many devices and operating systems.
Fix typos in climode.html
Open Parts ✔️
Closed or Protected Parts 🔒
-- rename a table
ALTER TABLE flowers
RENAME TO plants;
-- add a new column
ALTER TABLE flowers
ADD COLUMN origin TEXT;
-- rename a column
ALTER TABLE flowers
RENAME COLUMN price TO unit_price;
-- remove a column
ALTER TABLE flowers
DROP COLUMN color;
-- change column type (not supported in SQLite; PostgreSQL syntax shown)
ALTER TABLE flowers
ALTER COLUMN price TYPE REAL;
-- add multiple columns (SQLite needs one ADD COLUMN per statement)
ALTER TABLE flowers
ADD COLUMN size TEXT,
ADD COLUMN stock INTEGER;
DB Browser for SQLite offers a simple graphical interface to manage SQLite databases.
sqlite3 database_name.db
CREATE TABLE table_name (
column1 datatype,
column2 datatype
);
INSERT INTO table_name (column1, column2)
VALUES (value1, value2);
import sqlite3
conn = sqlite3.connect("db.db")
cur = conn.cursor()
cur.execute("SELECT * FROM table_name;")
val db = SQLiteDatabase.openOrCreateDatabase("db.db", null)
val cursor = db.rawQuery("SELECT * FROM table_name;", null)
#include "sqlite3.h"
sqlite3 *db;
sqlite3_open("db.db", &db);
sqlite3_exec(db, "SELECT * FROM table_name;", 0, 0, 0);
import SQLite3
var db: OpaquePointer?
sqlite3_open("db.db", &db)
sqlite3_exec(db, "SELECT * FROM table_name;", nil, nil, nil)
ACID Basics
Isolation
Multi-User Limits ⚠️
PRAGMA foreign_keys = ON;
PRAGMA foreign_keys;
Plant Table
CREATE TABLE Plant (
PlantID INTEGER PRIMARY KEY,
Name TEXT NOT NULL,
WaterThreshold FLOAT
);
SensorData Table
CREATE TABLE SensorData (
SensorDataID INTEGER PRIMARY KEY,
PlantID INTEGER NOT NULL,
WaterLevel FLOAT,
Timestamp DATETIME,
FOREIGN KEY (PlantID) REFERENCES Plant(PlantID)
);
Create
INSERT INTO Plant (Name, WaterThreshold)
VALUES ('Aloe Vera', 30);
INSERT INTO SensorData (PlantID, WaterLevel, Timestamp)
VALUES (1, 22, '2025-11-27 10:00');
Read
SELECT *
FROM Plant;
SELECT PlantID, WaterLevel
FROM SensorData
ORDER BY Timestamp DESC;
Update
UPDATE Plant
SET WaterThreshold = 40
WHERE PlantID = 1;
UPDATE SensorData
SET WaterLevel = 18
WHERE SensorDataID = 1;
Delete
DELETE FROM Plant
WHERE PlantID = 3;
DELETE FROM SensorData
WHERE Timestamp < '2025-01-01';
Dot Commands
PRAGMA Settings
PRAGMA foreign_keys = ON;
PRAGMA foreign_keys;
Foreign Key Violation
INSERT INTO SensorData (PlantID, WaterLevel, Timestamp)
VALUES (999, 50, '2025-11-27 11:00');
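The foreign key violation above can be reproduced from Python. Note that `PRAGMA foreign_keys = ON` must be set per connection, since SQLite leaves enforcement off by default; the schema is trimmed to the columns the example needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is OFF by default in SQLite
conn.executescript("""
CREATE TABLE Plant (PlantID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE SensorData (
    SensorDataID INTEGER PRIMARY KEY,
    PlantID INTEGER NOT NULL REFERENCES Plant(PlantID),
    WaterLevel FLOAT);
INSERT INTO Plant (PlantID, Name) VALUES (1, 'Aloe Vera');
""")

err = None
try:
    # PlantID 999 does not exist, so the FK constraint rejects this row
    conn.execute("INSERT INTO SensorData (PlantID, WaterLevel) VALUES (999, 50)")
except sqlite3.IntegrityError as e:
    err = str(e)
print(err)  # FOREIGN KEY constraint failed
```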
CREATE TABLE User (
Name VARCHAR(2)
);
INSERT INTO User (Name)
VALUES ('ABCDE'); -- Works!
-- STRICT enforces declared types; a CHECK enforces the length
CREATE TABLE UserStrict (
    Name TEXT CHECK (LENGTH(Name) <= 2)
) STRICT;
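The VARCHAR(2) surprise above is SQLite's type affinity in action: the declared length is ignored, and the five-character string is stored unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE User (Name VARCHAR(2))")
conn.execute("INSERT INTO User (Name) VALUES ('ABCDE')")  # accepted: length not enforced
row = conn.execute("SELECT Name, LENGTH(Name) FROM User").fetchone()
print(row)  # ('ABCDE', 5)
```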
PostgreSQL Database Management System
(formerly known as Postgres, then Postgres95)
Copyright (c) 1996–2024,
The PostgreSQL Global Development Group
Permission to use, copy, modify, and distribute this
software and its documentation for any purpose, without
fee, and without a written agreement is hereby granted,
provided that the above copyright notice and this
paragraph appear in all copies.
IN NO EVENT SHALL THE POSTGRESQL GLOBAL DEVELOPMENT
GROUP BE LIABLE FOR ANY DAMAGES ARISING OUT OF THE
USE OF THIS SOFTWARE.
CREATE TABLE users (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE
);
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(255) UNIQUE
);
product_id INTEGER
-- example: 101
product_id INTEGER
-- example: 101
product_name VARCHAR(100)
-- example: 'Oak Dining Table'
product_name VARCHAR(100)
-- example: 'Oak Dining Table'
description TEXT
-- example: 'Solid wood table with natural finish'
description TEXT
-- example: 'Solid wood table with natural finish'
in_stock BOOLEAN
-- example: 1
in_stock BOOLEAN
-- example: true
release_date DATE
-- example: '2025-03-01'
release_date DATE
-- example: '2025-03-01'
product_id INTEGER
PRIMARY KEY AUTOINCREMENT
-- example values: 1, 2, 3
product_id SERIAL PRIMARY KEY
-- example values: 1, 2, 3
price NUMERIC CHECK (price > 0)
-- example: 1299.99
price NUMERIC(10,2)
CHECK (price > 0 AND price < 20000)
-- example: 1299.99
SELECT date('now');
-- example result: '2026-02-04'
SELECT date('now','+7 days');
-- example result: '2026-02-11'
SELECT CURRENT_DATE;
-- example result: 2026-02-04
SELECT CURRENT_DATE + INTERVAL '7 days';
-- example result: 2026-02-11
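SQLite's date arithmetic can be verified from Python; a fixed input date (borrowed from the release_date example) keeps the result deterministic instead of depending on today's clock.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# date(<date>, '+7 days') adds an interval, like CURRENT_DATE + INTERVAL in PostgreSQL
next_week = conn.execute("SELECT date('2026-02-04', '+7 days')").fetchone()[0]
print(next_week)  # 2026-02-11
```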
created_at TIMESTAMP
-- example: '2025-03-01 10:30:00'
created_at TIMESTAMP
-- example: '2025-03-01 10:30:00'
created_at TIMESTAMP
-- example: '2025-03-01 10:30:00'
created_at TIMESTAMPTZ
-- example: '2025-03-01 18:30:00+00'
weight REAL
-- example: 45.5
weight REAL
-- example: 45.5
price NUMERIC
-- example: 1299.99
price NUMERIC(10,2)
-- example: 1299.99
specs JSON
-- example: '{"material":"oak","seats":6}'
specs JSONB
-- example: '{"material":"oak","seats":6}'
-- ENUM
status ENUM('draft','active','discontinued')
-- example: 'active'
-- ARRAY
tags TEXT[]
-- example: {'wood','dining','premium'}
-- JSONB
specs JSONB
-- example: {"material":"oak","seats":6}
-- UUID
product_uuid UUID
-- example: '550e8400-e29b-41d4-a716-446655440000'
-- INTERVAL
warranty INTERVAL
-- example: '2 years 6 months'
-- ENUM
SELECT * FROM products WHERE status = 'active';
-- ARRAY
SELECT * FROM products WHERE 'wood' = ANY(tags);
-- JSONB
SELECT * FROM products WHERE specs->>'material' = 'oak';
-- UUID
SELECT *
FROM products
WHERE product_uuid = '550e8400-e29b-41d4-a716-446655440000';
-- INTERVAL
SELECT * FROM products WHERE warranty > INTERVAL '2 years';
INSERT INTO users (name, email)
VALUES ('Alice', 'alice@example.com');
INSERT INTO users (name, email)
VALUES ('Alice', 'alice@example.com')
RETURNING id;
SELECT id, name, email
FROM users
WHERE name = 'Alice';
SELECT id, name, email
FROM users
WHERE name = 'Alice';
UPDATE users
SET email = 'new@example.com'
WHERE id = 1;
UPDATE users
SET email = 'new@example.com'
WHERE id = 1
RETURNING *;
DELETE FROM users
WHERE id = 1;
DELETE FROM users
WHERE id = 1
RETURNING id;
BEGIN;
UPDATE users SET name = 'Bob';
COMMIT;
BEGIN;
UPDATE users SET name = 'Bob';
COMMIT;
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
user_id INTEGER,
FOREIGN KEY (user_id)
REFERENCES users(id)
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INT REFERENCES users(id)
);
price TEXT
price INTEGER
price REAL
price NUMERIC(10,2)
data JSONB
id UUID
Application (Frontend / Backend)
|
| network connection (TCP)
|
PostgreSQL Server
| parse · plan · execute
|
Database Storage
(tables · rows · indexes)
Client App
(Python · Java · Node · Go)
│
│ driver / protocol
│ authenticate · send SQL
▼
PostgreSQL Server
parse · plan · execute
│
▼
Storage
tables · rows · indexes
SQLite
PostgreSQL
| Database | License | Restrictions |
|---|---|---|
| SQLite | Public Domain | No restrictions, free for any use |
| PostgreSQL | PostgreSQL License (BSD style) | Requires keeping copyright notice |
PostgreSQL Database Management System
Copyright (c) 1996-2024, The PostgreSQL Global Development Group
Permission to use, copy, modify, and distribute this software and its
documentation for any purpose, without fee, and without a written
agreement is hereby granted, provided that the above copyright notice
and this paragraph appear in all copies.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF, OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
pg_stat_statements extension
-- typical enable (often via postgresql.conf)
-- shared_preload_libraries = 'pg_stat_statements'
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
SELECT query, calls, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;
CREATE TABLE users (
id UUID PRIMARY KEY,
name TEXT
);
-- example value
'550e8400-e29b-41d4-a716-446655440000'
CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE TABLE orders (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
total NUMERIC(10,2)
);
INSERT INTO orders (total)
VALUES (99.99)
RETURNING id;
id SERIAL PRIMARY KEY
-- 1,2,3,4
id UUID PRIMARY KEY
-- random pattern
CREATE TABLE products (
id SERIAL PRIMARY KEY,
tags TEXT[]
);
INSERT INTO products (tags)
VALUES ('{wood,dining,premium}');
SELECT *
FROM products
WHERE 'wood' = ANY(tags);
-- overlap operator
SELECT *
FROM products
WHERE tags && '{wood,oak}';
&& checks overlap
data JSON
data JSONB
CREATE TABLE events (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
payload JSONB
);
INSERT INTO events (payload)
VALUES (
'{"type":"click","device":"mobile"}'
);
SELECT *
FROM events
WHERE payload->>'device' = 'mobile';
->> extracts text
SELECT *
FROM events
WHERE payload @>
'{"type":"click"}';
mysql.user table already exists!
Run mariadb-upgrade, not mariadb-install-db
ERROR 2002 (HY000):
Can't connect to local server
through socket '/tmp/mysql.sock' (2)
FATAL ERROR: Upgrade failed
brew services start mariadb
mysqladmin ping
ls /opt/homebrew/var/mysql/*.sock
sudo ln -sf /opt/homebrew/var/mysql/mysql.sock /tmp/mysql.sock
mariadb-upgrade -uroot
$ python --version
Python 3.11.x
$ pip --version
pip 23.x from ... (python 3.9)
# Result:
# Package installs to different Python
# ModuleNotFoundError
which python
python --version
python -m pip install package_name
python3 -m pip install package_name
# or explicitly:
python3.11 -m pip install package_name
python -m pip --version
MariaDB installed via Windows installer
mysql -u root -p
ERROR 2003 (HY000):
Can't connect to MySQL server
on 'localhost'
sc query MariaDB
net start MariaDB
Win + R
services.msc
→ Find "MariaDB"
→ Right-click → Start
mysql -u root -p
mysqladmin ping
This program is free software; you can redistribute it
and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation;
either version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY.
| Feature | SQLite | PostgreSQL | MariaDB |
|---|---|---|---|
| AutoPrimary | ROWID | SERIAL | AUTO_INCREMENT |
| Keyword | AUTOINCREMENT | GENERATED | AUTO_INCREMENT |
| Feature | SQLite | PostgreSQL | MariaDB |
|---|---|---|---|
| UniqueConstraint | YES | YES | YES |
| MultipleNULL | YES | YES | YES |
| Feature | SQLite | PostgreSQL | MariaDB |
|---|---|---|---|
| NotNull | YES | YES | YES |
| StrictTyping | NO | YES | YES |
| Feature | SQLite | PostgreSQL | MariaDB |
|---|---|---|---|
| DefaultValue | YES | YES | YES |
| NowFunction | datetime | NOW | NOW |
| Feature | SQLite | PostgreSQL | MariaDB |
|---|---|---|---|
| CheckConstraint | YES | YES | YES |
| Deferrable | NO | YES | NO |
| Feature | SQLite | PostgreSQL | MariaDB |
|---|---|---|---|
| ForeignKey | PRAGMA | ON | INNODB |
| Cascade | YES | YES | YES |
| TypeRule | SQLite | PostgreSQL | MariaDB |
|---|---|---|---|
| Typing | AFFINITY | STRICT | STRICT |
| TypeRule | SQLite | PostgreSQL | MariaDB |
|---|---|---|---|
| BooleanType | INTEGER | BOOLEAN | TINYINT |
| TypeRule | SQLite | PostgreSQL | MariaDB |
|---|---|---|---|
| DateStorage | TEXT | DATE | DATE |
| TimestampTZ | NO | YES | NO |
| Type | SQLite | PostgreSQL | MariaDB |
|---|---|---|---|
| JSON | TEXT | JSONB | JSON |
| ARRAY | NO | YES | NO |
| Database | Example Syntax | Notes |
|---|---|---|
| SQLite | FOREIGN KEY (user_id) REFERENCES users(id) | PRAGMA foreign_keys = ON required |
| PostgreSQL | user_id INT REFERENCES users(id) | Enforced by default; supports DEFERRABLE |
| MariaDB | FOREIGN KEY (user_id) REFERENCES users(id) | Requires InnoDB engine |
| Database | Engine Model | Default Engine | Multiple Engines |
|---|---|---|---|
| SQLite | Embedded | Single | NO |
| PostgreSQL | Integrated | Native | NO |
| MariaDB | Pluggable | InnoDB | YES |
| Concept | Meaning |
|---|---|
| Engine | Storage + Locking + Transaction implementation |
| InnoDB | Row-level locking + ACID |
| MyISAM | Table-level locking + No transactions |
| Use Case | SQLite | PostgreSQL | MariaDB |
|---|---|---|---|
| Prototyping | GOOD | GOOD | GOOD |
| SingleUser | GOOD | OK | OK |
| EmbeddedApp | GOOD | NO | NO |
| WebApp | OK | GOOD | GOOD |
| MultiUser | OK | GOOD | GOOD |
| LAMPStack | NO | OK | GOOD |
| Production | OK | GOOD | GOOD |
| Property | SQLite | MariaDB | PostgreSQL |
|---|---|---|---|
| Large ANSI / ISO | LOW | MEDIUM | HIGH |
| FullOuterJoin | NO | NO | YES |
| Deferrable | NO | NO | YES |
| CheckEnforced | BASIC | GOOD | FULL |
| FilterClause | NO | NO | YES |
| StrongTyping | NO | YES | YES |
CREATE TABLE flowers (
id INT,
name TEXT,
color TEXT,
price NUMERIC
);
INSERT INTO flowers VALUES
(1, 'Rose', 'Red', 5),
(2, 'Lily', 'White', 7),
(3, 'Tulip', 'Red', 4),
(4, 'Daisy', 'White', 3);
SELECT
COUNT(*) AS total_flowers,
COUNT(*) FILTER (WHERE color = 'Red') AS red_flowers,
AVG(price) FILTER (WHERE color = 'White') AS avg_white_price
FROM flowers;
Note: FILTER supported in PostgreSQL
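Since FILTER is presented here as PostgreSQL-specific, a portable rewrite uses CASE inside the aggregates; AVG ignores the NULLs produced when the CASE has no ELSE. Sketched with Python's `sqlite3` and the flower rows above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flowers (id INT, name TEXT, color TEXT, price NUMERIC)")
conn.executemany("INSERT INTO flowers VALUES (?,?,?,?)",
                 [(1, 'Rose', 'Red', 5), (2, 'Lily', 'White', 7),
                  (3, 'Tulip', 'Red', 4), (4, 'Daisy', 'White', 3)])

# CASE-based equivalent of the FILTER query
row = conn.execute("""
    SELECT COUNT(*) AS total_flowers,
           SUM(CASE WHEN color = 'Red' THEN 1 ELSE 0 END) AS red_flowers,
           AVG(CASE WHEN color = 'White' THEN price END) AS avg_white_price
    FROM flowers""").fetchone()
print(row)  # (4, 2, 5.0)
```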
Class Schedule:
Time: 08:00 AM – 09:15 AM on Monday, Wednesday, Friday
Location: John T Chambers Technology Center 114 (CTC 114)
A Database Management System (DBMS) is designed for the efficient storage, access, and update of data. This course explores database systems from two main perspectives.
| Teaching Assistant: | Mr. Qazi Haad |
| Email: | q_haad@u.pacific.edu |
| Office: | Student Support Center |
| Office Hours: | By appointment |
| Instructor: | Dr. Solomon Berhe |
| Email: | sberhe@pacific.edu |
| Zoom: | Zoom Meeting Link |
| Office: | CTC 117 |
| Office Hours: | Mon/Wed, 2:00–3:00 PM (or by appointment) |
Textbooks:
| Homework: | Includes written exercises and programming assignments. Submit via Canvas or GitHub. |
| Team Projects: | Two projects (database design, development, and application integration) |
| Exams: | One midterm and one final. Students may bring one handwritten note sheet (8.5x11) |
All course materials, announcements, and assignments will be posted on Canvas.
Check regularly
for updates
Students are expected to act with integrity, honesty, and responsibility. Violations of the Honor Code will be reported to the Office of Student Conduct and Community Standards.
For more details, see the University Academic Honesty Policy
Source Document: Course Syllabus
Due: February 13
Submission
What is data integrity and why is it important in a database?
Data integrity ensures data remains accurate and consistent by using constraints that prevent errors, duplicates, and invalid entries.
CREATE TABLE Bicycles (
bikeID INTEGER PRIMARY KEY,
model TEXT NOT NULL,
price REAL NOT NULL CHECK(price > 0)
);
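The CHECK(price > 0) rule can be seen rejecting bad data from Python's `sqlite3`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Bicycles (
    bikeID INTEGER PRIMARY KEY,
    model TEXT NOT NULL,
    price REAL NOT NULL CHECK(price > 0))""")

conn.execute("INSERT INTO Bicycles VALUES (1, 'Roadster', 499.99)")  # accepted

err = None
try:
    conn.execute("INSERT INTO Bicycles VALUES (2, 'Freebie', -10)")  # violates CHECK
except sqlite3.IntegrityError as e:
    err = str(e)
print(err)  # e.g. CHECK constraint failed: ...
```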
Provide examples of key data integrity use cases in a Bike system.
CREATE TABLE Stores (
storeID INTEGER PRIMARY KEY
);
CREATE TABLE Bicycles (
bikeID INTEGER PRIMARY KEY,
model TEXT NOT NULL,
stock INTEGER CHECK(stock >= 0),
storeID INTEGER,
FOREIGN KEY(storeID) REFERENCES Stores(storeID)
);
Explain key SQL constraints.
CREATE TABLE Bicycles (
bikeID INTEGER PRIMARY KEY,
serialNumber TEXT UNIQUE,
model TEXT NOT NULL,
bikeType TEXT DEFAULT 'road',
price REAL CHECK(price > 0)
);
Write SQLite tables demonstrating constraints.
CREATE TABLE Stores (
storeID INTEGER PRIMARY KEY,
storeName TEXT NOT NULL UNIQUE,
city TEXT NOT NULL DEFAULT 'Stockton'
);
CREATE TABLE Bicycles (
bikeID INTEGER PRIMARY KEY AUTOINCREMENT,
serialNumber TEXT NOT NULL UNIQUE,
model TEXT NOT NULL,
bikeType TEXT NOT NULL DEFAULT 'road',
storeID INTEGER NOT NULL,
FOREIGN KEY(storeID) REFERENCES Stores(storeID)
);
What does NOT NULL DEFAULT 'Unknown' mean?
The column cannot be NULL. If no value is given, it automatically becomes "Unknown."
CREATE TABLE Bicycles (
bikeID INTEGER PRIMARY KEY,
brand TEXT NOT NULL DEFAULT 'Unknown',
price REAL NOT NULL
);
Difference between PRIMARY KEY, UNIQUE, and NOT NULL?
CREATE TABLE Bicycles (
bikeID INTEGER PRIMARY KEY,
serialNumber TEXT UNIQUE,
model TEXT NOT NULL
);
Risk of too many constraints?
Excessive constraints can make valid inserts fail and reduce flexibility.
-- Overly strict example
CREATE TABLE Bicycles (
bikeID INTEGER PRIMARY KEY,
model TEXT NOT NULL UNIQUE,
price REAL NOT NULL CHECK(price > 5000),
stock INTEGER CHECK(stock = 1)
);
Why must each table have a PRIMARY KEY?
It uniquely identifies each row so records can be reliably updated or referenced.
CREATE TABLE Bicycles (
bikeID INTEGER PRIMARY KEY,
model TEXT,
bikeType TEXT
);
Why is AUTOINCREMENT useful?
It automatically generates the next unique ID for each new record.
CREATE TABLE Bicycles (
bikeID INTEGER PRIMARY KEY AUTOINCREMENT,
model TEXT NOT NULL
);
Compare PRIMARY KEY and FOREIGN KEY.
PRIMARY KEY uniquely identifies rows in its table. FOREIGN KEY links rows to another table.
CREATE TABLE Stores (
storeID INTEGER PRIMARY KEY
);
CREATE TABLE Bicycles (
bikeID INTEGER PRIMARY KEY,
storeID INTEGER,
FOREIGN KEY(storeID) REFERENCES Stores(storeID)
);
Style
Due: February 13
Create FlowerShop.db using:
.open FlowerShop.db
.databases
.schema
PRAGMA foreign_key_list(<table_name>);
Add a Stock column with default value 50, then run .schema after the alter.
PRAGMA foreign_keys = ON;
PRAGMA foreign_keys;
SELECT * FROM OrderDetails;
DROP TABLE Flowers;
.tables
Will dropping the table only delete the table, or will it also delete the data permanently?
Time: 08:00 AM - 09:15 AM
Table of Contents | Office Hours: Mon, Wed, 2:00 PM - 3:00 PM, CTC 117 | sberhe@pacific.edu | q_haad@u.pacific.edu | Syllabus | Join Zoom | SQL Docs | SQL OneCompiler | PlantUML ERD Editor | GitHub Repository