3 Hands-On with SQL Code

Remember, SQL is a language that can help answer questions about the data in a database.

3.1 Viewing Data

What if we want to know, who are the users in the database?

3.1.1 SELECT & FROM

Let’s see how we can view the users table with SQL by writing our first query! Type this into the SQL Editor Pane:

SELECT * FROM users;

Place your cursor anywhere before the semicolon and then click the Execute query button alt text or F9 key on your keyboard.

This is the most common query you’ll write as you start exploring your data. SELECT is used to display the columns of tables you want and FROM is used to specify which table to get the columns from.

The query asks the database to select everything (* means “everything”) from the table users. It ends with a semicolon to tell the database that this is the end of the request.

SQL ignores capitalization, spaces, and new lines in a query. Some tools which use SQL also ignore semicolons. However, it’s conventional to:

  • Write SQL keywords (SELECT, FROM, and so on) in UPPERCASE
  • Write table and column names in lowercase
  • Write a semicolon ; at the end of the query

You can use new lines to help organize queries and make them easier to read. For instance, this query does the same thing as the query above:

SELECT * 
FROM users;

3.1.2 Selecting Columns

What if we don’t want to see all of the columns in the table? We can ask for just the columns we want to see. Let’s get the first name, birth date, phone, city, and zip code.

SELECT
    first_name,
    birth_date,
    phone,
    city, 
    zip_code 
FROM users;

NOTE:
You can rearrange the columns however you’d like in the SELECT statement and select a column multiple times.

3.1.3 Unique Values

What if we now want to know, in what cities do the people live in? We could scroll through the results and try to keep track of unique values, but that is tedious.

Instead we can use the SELECT DISTINCT keywords on one or more columns to show all the unique values.

Let’s look at the users table again and see which cities our users live in.

SELECT DISTINCT city
FROM users;

If we wanted to see unique combinations across multiple columns, we just add the columns we’re interested in. The query below shows unique city and zip code combinations.

SELECT DISTINCT 
    city, 
    zip_code
FROM users;

3.1.4 Ordering Results

What if we want to know, are the users young or old? We could scroll through the table and look their birth dates.

While we continue our data exploration, we also might want to sort the results in a specific way.

With SQL, you can sort on one or more columns with a combination of ascending or descending order using the ORDER BY keyword.

Let’s sort the users table by birth date.

SELECT *
FROM users
ORDER BY birth_date;

By default, ORDER BY sorts in ascending order. We can sort in descending order to get users born more recently by adding DESC after the column.

SELECT *
FROM users
ORDER BY birth_date DESC;

CHALLENGE:
How would you sort users by zip code in descending order and then birth date in ascending order?

3.1.5 Limiting Number of Rows

Sometimes you’ll be working with a large table to analyze with lots of columns and rows. You can use LIMIT to reduce the number of rows the query returns to give you a snapshot of the data you’re working with.

SELECT *
FROM users
LIMIT 10;

3.1.6 Commenting

As we’re writing queries, sometimes we want to write helpful comments to ourselves and others. There’s two ways to write comments so that text won’t be interpreted as SQL.

  1. Single line comments: text following two dashses “”, like so -- comment here
  2. Multiline comments: text between the characters “/* */”, like so /* comment here */
/*
all users in the library system 
sorted by most recent birth date
*/
SELECT *
FROM users
ORDER BY birth_date DESC; -- sort most recent here

CHECKPOINT:
Try and use all viewing keywords together to give you insight to the dataset. Write your insights in comments.

3.2 Filtering Data

Now that we’ve seen some ways to view our data, let’s learn how we can filter our data. This is really the core of SQL, where we can start to answer our own questions about the data!

We use the WHERE clause to filter rows of a query by specifying one or more conditions. The table below shows comparison operators that can be used and combined with WHERE to create conditions, some of which you may have seen before in other programming languages.

Comparison Operator Description
= equals
> greater than
>= greater than or equal to
< less than
<= less than or equal to
<> or != not equal to

In general, the type of data on each side of the operator needs to be the same: compare numbers to numbers, text to text, and so on. When working with text or date data, it’s necessary to wrap values in single quotes. For instance:

  • 'text_value' is a text value
  • '2000-01-01' is a date

Now let’s write a query to find all users that live in Davis:

SELECT * 
FROM users
WHERE city = 'Davis';

SQL also provides a variety of arithmetic operators for working with numeric data:

Arithmetic Operators Description
+ addition
- subtraction
* multiplication
/ division
% modulus

3.2.1 AND & OR Operators

Often we’ll need to filter data based on more than one condition. We can ask WHERE to check multiple conditions with the keywords:

  • AND, meaning a row must satisfy both conditions
  • OR, meaning a row must satisfy at least one of two conditions

These are logical operators in SQL that evaluate to TRUE or FALSE.

Let’s find all users who live in Davis and were born before 1980.

SELECT * 
FROM users
WHERE city = 'Davis' AND
      zip_code < '1980-01-01';

Let’s now find all users who live in Davis or Sacramento.

SELECT * 
FROM users
WHERE city = 'Davis' OR
      city = 'Sacramento';

NOTE:
AND & OR will follow the order of operations. To use AND & OR in the same query, wrap parenthesis around the OR conditions.

3.2.2 IN Operator

Sometimes we find ourselves writing multiple OR conditions on the same column which can make our query look more complex than it really is. In this scenario we can condense multiple OR conditions using IN.

Let’s rewrite our previous query to use IN.

SELECT * 
FROM users
WHERE city IN ('Davis', 'Sacramento');

3.2.3 BETWEEN Operator

We can also also filter using ranges of values with BETWEEN. This is handy when you’re working with numerical or date values and you don’t want to list out all possible values to meet your conditions.

Let’s write a query to find all users born in the 1990’s.

SELECT * 
FROM users
WHERE birth_date BETWEEN '1990-01-01' AND '1999-12-31';

CHALLENGE:
Find all users with a zip code from 95000 - 96000.

3.2.4 LIKE Operator

The LIKE keyword tests whether text values match a given pattern. There are two different wildcard characters that you can use in the pattern:

  • _ matches exactly one character
  • % matches zero or more characters

Let’s look at an example to indicate we want to match the beginning of a string, but the end is allowed to vary. To show this, let’s find users that have a phone number area code that begins with 530 and can end with anything.

SELECT * 
FROM users
WHERE phone LIKE '530%';

NOTE:
The wildcard % can be used multiple times in one pattern.

You can also use regular expressions in SQLite to match in more complicated situations. Read more about using regular expressions here.

The LIKE keyword differs between dialects of SQL, so it’s also a good idea to check the documentation for your RDBMS before using LIKE.

CHALLENGE:
How would you write a query so that the wild card finds users with an email that ends with icloud.com?

3.2.5 IS NULL Operator

So far we’ve worked with complete data, but how do we work with missing data? In databases, NULL means missing data. IS NULL is used to test whether there is missing data in a column.

Let’s look at an example to find users with a missing address.

SELECT * 
FROM users
WHERE address IS NULL;

CHALLENGE:
How would you write a query to find users with a missing phone number or missing email?

3.2.6 NOT Operator

There will be times where we want to find only the rows that do not satisfy some condition. To do this, use NOT combined with other operators:

  • NOT IN
  • NOT BETWEEN
  • NOT LIKE
  • IS NOT NULL

Below is a query to find users that do not have a phone number area code that begins with 530.

SELECT * 
FROM users
WHERE phone NOT LIKE '530%';

3.3 Aggregating Data

We’ve just looked at a number of ways to filter data, but now let’s look at some ways to aggregate data.

3.3.1 Count

Suppose we want to find out how many books have been checked out.

We can count the total number of rows in a table using the COUNT function. The function takes a column or * as an argument, but the argument doesn’t actually affect the count.

Here’s how we can use COUNT to answer our question:

SELECT 
    COUNT(id)
FROM checkouts;

NOTE:
You can combine DISTINCT with COUNT using COUNT(DISTINCT column_name) to get a unique count of values in a column when duplicate values exist.

CHALLENGE:
Find the total number of users that have checked out a book.

3.3.1.1 Renaming/Aliasing Columns

In the previous query, notice that the name of the column in the result is COUNT(id), which isn’t easy to use in subsequent SQL queries or with other data programming tools.

We can use AS to rename or alias a column in the result of the query. This is handy if you’re planning to export the result for future use, especially if you’re sending it to someone else.

Let’s rename the column to total_checkouts:

SELECT 
    COUNT(id) AS total_checkouts
FROM checkouts;

We can also include WHERE. Let’s see how many books were checked out just today.

SELECT 
    COUNT(id) AS total_checkouts
FROM checkouts
WHERE checkout_date = '2022-04-11';

3.3.2 Average

The AVG function returns the average value of a numeric column. Let’s find the average number of days a book is checked out for:

SELECT 
    AVG(days_checking_out) AS avg_days_checking_out
FROM checkouts;

3.3.3 Sum

We can also sum the values in a numeric column with the SUM function. Let’s find the total number of books that were returned with damage:

SELECT 
    SUM(returned_with_damage) AS books_returned_with_damage
FROM checkouts;

Here’s another way to accomplish the same task using COUNT and WHERE:

SELECT 
    COUNT(id) AS books_returned_with_damage
FROM checkouts
WHERE returned_with_damage = 1;

3.3.4 Grouping Data

So now you’ve seen several functions working on a single column, but we sometimes want to summarize our data in more sophisticated ways. Let’s see what grouping can do for our data. Let’s write a query that counts the number of books checked out per day:

SELECT 
    checkout_date, 
    COUNT(book_id) AS books_checked_out
FROM checkouts
GROUP BY checkout_date;

Notice here how we asked for two columns - the checkout_date and the count of book_id.

CHALLENGE:
You can also GROUP BY more than one column. How would you find the total number of times a person checked out a book on a given day?

3.3.5 Having

HAVING is similar to WHERE, but it specifically works with GROUP BY. Perhaps we’re only interested in days that had more than 5 checkouts. Let’s see what that looks like:

SELECT 
    checkout_date, 
    COUNT(book_id) AS books_checked_out
FROM checkouts
GROUP BY checkout_date
HAVING COUNT(book_id) > 5;

Now we’ve seen how we can use functions to aggregate data and how grouping data can give us meaningful insights. There are, of course, other functions available in SQL and we can’t go over all of them here, but now you’ve seen how they work and can apply your knowledge to new functions you find.

3.4 Joining Data

Joining tables allows us to combine information from more than one table into a new table. The tables need to have a key column to be able to link the tables together. In our Library Checkouts ERD, the id column in books is a key column that links to

  • book_id in checkouts

    and

  • book_id in book_genre_link

3.4.1 JOIN Types

SQL has 4 main kinds of joins:

INNER JOIN LEFT JOIN

RIGHT JOIN FULL OUTER JOIN

NOTE:
The above images come from the W3Schools’ SQL join page, an excellent resource for learning more about SQL.

What kinds of joins are there?

  • INNER JOIN: Returns rows that have matching values in both tables; it gets you what’s in the middle of the venn diagram.

  • LEFT JOIN: Returns all rows from the left table, and the matched rows from the right table; the “left” table is the first table you write in the query.

  • RIGHT JOIN: Returns all rows from the right table, and the matched rows from the left table; the “right” table is the second table you write in the query or the “join” table.

    NOTE:
    This is not supported in SQLite, however switching the tables in a LEFT JOIN will emulate a RIGHT JOIN.

  • FULL OUTER JOIN: Returns all rows when there is a match in either left or right table; nulls are generated in the table when a row in one table doesn’t have a match in the other table.

    NOTE:
    This is not supported in SQLite, however you can emulate a FULL OUTER JOIN described here.

3.4.2 JOIN steps

The anatomy of a JOIN:

SELECT 
  left_table.column1,
  right_table.column1,
  ...
FROM left_table
  [INNER | LEFT] JOIN right_table ON left_table.key_column = right_table.key_column;

Below are the steps for writing a JOIN:

  1. SELECT the columns we want in the output (using column references).

NOTE:
Column referencing helps you distinguish from what table the columns come from. Use the table name followed by a period to reference the column: table_name.column_name.

  1. Then we have the FROM statement to tell it which table to start with (this is our “left” table).
  2. Then we need our JOIN statement to say which table should get joined (this is our “right” table)
  3. Finally, we have to say which columns the join should be based on with ON .

3.4.3 INNER JOIN

Let’s try an INNER JOIN to see how this works:

SELECT 
    books.title AS book,
    checkouts.checkout_date
FROM books
    INNER JOIN checkouts ON books.id = checkouts.book_id;

We interpret the INNER JOIN query as, “all books that have been checked out.”

3.4.4 LEFT JOIN

Now let’s try a LEFT JOIN:

SELECT 
    books.title AS book,
    checkouts.checkout_date
FROM books
    LEFT JOIN checkouts ON books.id = checkouts.book_id;

We interpret the LEFT JOIN query as, “all books and if they have been checked out or not.”

You might be thinking, what would happen if the tables in the LEFT JOIN were flipped? We would get the same result as the INNER JOIN query! That’s because there’s no instances where a checkout without a book could ever happen!

CHALLENGE:
Can you write a query that contains the title of the books and the names of the genres they’re categorized to?

3.5 Subqueries

So far we’ve been working with one SELECT statement, but we can actually combine multiple SELECT statements using subqueries. Subqueries are nested queries enclosed in parentheses that can be used with other keywords like JOIN and WHERE. Below are 2 examples of these use cases.

Let’s first look at a subquery in the WHERE clause:

-- main query
SELECT * 
FROM checkouts
WHERE book_id IN (
    -- subquery
    SELECT id
    FROM books
    WHERE format = 'Hardcover'
);

We retrieve hardcover books in a subquery. We then use the results of the subquery to filter the checkouts table using checkouts.book_id. In the end we get, “all checked out hardcover books.”

NOTE:
When writing a subquery with WHERE and IN, the subquery must select only one column for IN to filter on.

Now let’s look at a subquery with JOIN:

-- main query
SELECT 
    checkouts.user_id,
    hardcover_books.title AS book,
    checkouts.checkout_date,
    checkouts.return_date
FROM checkouts
    INNER JOIN (
        -- subquery
        SELECT *
        FROM books
        WHERE format = 'Hardcover'
    ) AS hardcover_books ON checkouts.book_id = hardcover_books.id;

We retrieve hardcover books in a subquery, just like before. Next, we write the subquery after JOIN and alias it hardcover_books. Finally, we join on the keys and return the columns we want in the main SELECT statement.

NOTE:
If you’re still a bit confused, just remember that every query results in a table. Subqueries give you the ability to create a “new” table on the fly even if that table didn’t exist in the database before.

Also, note that both subquery examples are just different ways to get to the same result, “checked out hardcover books.”

3.6 Saving Query Results

There will be times when we want to save the results of a query so we can reuse it later when needed. Two commands to save a query as a new database object follow:

  1. CREATE TEMPORARY TABLE
    • This is a new table added to the database, just like the tables you’ve been working with, except it is only available in the current session. You typically do this to break down a complex problem into intermediate steps and pass your saved results to the final query.
  2. CREATE VIEW
    • A view is simply a saved query that can be executed when called. The query you save will usually be made up of multiple tables with added conditions if needed. You can use it in pretty much the same way you would a table. The only major difference is that a view, because it is updating from other tables, is not able to be edited.

If we want to create a temporary table, we just need to add CREATE TEMPORARY TABLE our_new_table_name AS at the beginning of the query (adding in our own table name, of course). This is what it looks like:

CREATE TEMPORARY TABLE davis_resident_checkouts AS
    SELECT 
        users.id AS user_id,
        users.first_name,
        users.last_name,
        books.title AS book,
        checkouts.checkout_date,
        checkouts.return_date
    FROM users
        INNER JOIN checkouts ON users.id = checkouts.user_id
        INNER JOIN books ON checkouts.book_id = books.id
    WHERE users.city = 'Davis';

In much the same way we made the new table, we can make a view:

CREATE VIEW davis_resident_checkouts AS
    SELECT 
        users.id AS user_id,
        users.first_name,
        users.last_name,
        books.title AS book,
        checkouts.checkout_date,
        checkouts.return_date
    FROM users
        INNER JOIN checkouts ON users.id = checkouts.user_id
        INNER JOIN books ON checkouts.book_id = books.id
    WHERE users.city = 'Davis';

3.6.1 Export Query Results

SQLiteStudio has the ability to export out query results to CSV files for further analysis.

Click the “Export” icon at the top of the interface and follow the export dialog.

3.7 Data Management

3.7.1 Update Tables

You might have noticed at the beginning the users table has NULL values across different columns. We can fix this fairly easily, but we need to be careful. It’s challenging to undo something in a database so we want to be sure we’re doing it right. Let’s update the NULL values in the users.address to “N/A”.

It’s first helpful to write a query to be sure these are the rows you want to update:

SELECT *
FROM users
WHERE address IS NULL;

Once you’ve confirmed, the below statement updates the NULL values to “N/A

UPDATE users
SET address = 'N/A'
WHERE address IS NULL;

The SET keyword specifically targets just the address column and replaces NULL values with “N/A” when the condition is met in the WHERE clause. It leaves the other values alone. If the WHERE clause is removed, it will set all values in the whole column to “N/A” overwriting the users address, so proceed with caution!

3.7.2 Add & Populate a Column

Sometimes we want to make a new column and add data into it. Let’s make a new column called country in the users table and populate it with “USA” if there is a value in the column state.

First we’ll add the column and set the default value to “N/A”:

ALTER TABLE users
ADD country TEXT DEFAULT 'N/A';

NOTE:
The DEFAULT argument is optional, but if you leave it blank, it will make the default value NULL.

Now we update all values in the column to be “USA” where the state is not null.

UPDATE users
SET country = 'USA'
WHERE state IS NOT NULL;