SQL DML – The SQL Starter Series – Book 1

By Lewis Cunningham

Buy the book

Vendor agnostic, no nonsense, no fluff! This book offers a solid foundation in SQL DML. Intended for anyone who wants to start using SQL databases, with this book you can take that first step. This is an invaluable reference for any user of any SQL database. With this book, you will learn:

  • How to write SELECT statements, including sub-queries
  • Understand Transaction Control Commands
  • Select clauses: FROM, WHERE, ORDER BY, GROUP BY, HAVING, UNION, MINUS
  • How to Insert, Update and Delete data

The topics in this book have been garnered from almost four years of questions and answers on the author’s blog.  This book is written for everyone: managers who need to understand the technology, developers looking to improve their code and their skills, professionals seeking a career change and even hobbyists home learners.

This reference and tutorial shows that SQL is not brain surgery or rocket science.  Theory is great for engineers building new and better engines but a beginner should not be bogged down in minutia and irrelevant details.  This book will get you started.

In this book, readers will dive right into the differences between DDL and DML (see the excerpt below).  Short on theory and long on practicality, this plain english, no nonsense approach to SQL DML enables anyone to start using SQL today, right now.  This book is also vendor agnostic and concentrates on SQL basics rather than vendor specific syntax.

From Oracle ACE director and the author of An Expert’s Guide to Oracle Technology, Lewis Cunningham, this book also includes a chapter describing where to get software and documentation for 6 leading free databases: Oracle Express Edition, MySQL, Postgres, MS SQL Server Express, DB2 Express and Firebird.

Publication Date: Nov 20 2008
ISBN/EAN13: 1440464111 / 9781440464119
Page Count: 118
Type: Paperback
Page Size: 8″ x 10″
Language: English
Related Categories: Computers / Programming Languages / SQL

Buy the book

Excerpt: Chapter 1 – Introduction

This book is not intended to teach you everything you might ever want to know about Structured Query Language (SQL) coding. The intention of this book is to give someone new to the language a solid foundation to begin using SQL.

As the intention is a foundation, little time will be spent on theory. I will explain some of the concepts you must know to be successful but I will spend most of the book giving you information that you can immediately use.

In this book, I will concentrate on using SQL rather than covering relational database technology, database design or database structures. These topics will be found in my Intro to Relational Databases “T8″>and SQLStarter – Intro to DDL “T8″>books.

This book is also meant to be database agnostic; that is, I will not try to present any particular database’s approach over any other database’s approach. Most of the examples below use SQL*Plus in an Oracle 10g XE environment. That is just the choice I made when beginning the book (it was the database I already had installed on my EeePC laptop). The concepts presented within this book are pertinent to any SQL database.

SQL

SQL has been around since the 1970s. It was defined by IBM and has since become the standard query and manipulation language for relational databases. The standard is governed by several standards bodies but the most frequently cited is ANSI.

SQL commands are terminated with a semi-colon (;). Some tools do not require the semi-colon but, for the most part, when you write a SQL statement, expect to terminate it with a semi-colon.

There are many parts to the SQL standard but what this book will concentrate on is called the Data Manipulation Language (DML). A follow on book to this one will cover the Data Definition Language (DDL). To understand DML, I think it is important to understand how DML differs from DDL.

DML

The Data Manipulation Language, DML, allows a person to query data (view it), modify it (add new or change existing) and to delete it. DML has evolved into a very robust, set based, data management and manipulation language.

DML includes many keywords such as SELECT, INSERT, UPDATE, DELETE, FROM, etc. each of these keywords will be explained below and examples will be given. This book will not cover every possible DML keyword or even every possible syntactical variant of those that it does cover. This book is the foundation. A future book will cover more advanced topics.

The important thing to remember about DML is that it manages data, not the structures that data is stored in. The Data Definition Language is concerned with the underlying structures.

DDL

DDL, Data Definition Language, is concerned with creating, modifying and removing database objects. Creating tables and indexes, adding or removing columns from tables, creating or removing synonyms, creating storage areas to store data, etc. Even renaming a database object is considered DDL.

So, DML is concerned with managing data and DDL is concerned with managing database objects. A solid understanding of SQL requires a solid understanding of both DML and DDL. However, not every user of SQL really requires a knowledge of DDL.

DDL is a design time language. You create the database once, you create your tables and then use them. Your applications will constantly use DML but DDL should, in general, be complete before the developers or users begin using the database.

For this reason, I have split DML and DDL into two books. I will not cover any DDL commands or other design time issues. This book is for people ready to start writing queries and manipulating data.

Data Types

A data type defines the type of data that can be stored in a column or variable. In programming, everything starts with variables. In a database, everything starts with columns. Some languages let you change data types on the fly; in one case you might store date data and then ten lines later, store a number.

SQL is not so forgiving. Because the database’s role is the management of data and because data integrity is paramount, data types and type checking are major components of a relational database engine.

As you will find when using various databases, even though SQL is a standard and the SQL standard identifies data types, each vendor can choose what to implement, how to implement it and when to extend it. That is a long way for me to say that while you will find the same classes of data type in almost all databases, the exact names, features and constraints change per the vendor’s chosen implementation.

See Appendix C for a cross reference of data types by database vendor.

Some vendors support many custom data types, including object types and user defined types. Because the number of types can be so large, this book does not try to cover all of those. Most databases also support binary data types. I will not try to cover that at this point; I believe binary topics are better served in an advanced SQL book. This book will explain the four most common classes of data type: string, numeric, date and boolean.

String

String data types store character data. Name, address and descriptive text are all examples of character data. The legal values are usually the entire range of alphanumeric characters, including punctuation and sometimes special characters.

By default, string data is delimited in single quotes: ‘this is a text string’. A string literal is a chunk of text data, inside delimiters. Some databases offer additional delimiters such as [ or {. Some databases will even let you define your own delimiters. Having alternative delimiters is beneficial when you want to include the default delimiter within a string, such as:

v_char_field := [This is a single quote: ' and I don't get an error.];

Or:

v_char_field := ‘This is a single quote: ” and I don”t get an error.’;

I find the first example easier to read than the second.

There are two primary types of string data types: fixed length and variable length. A fixed length string is declared to be a certain size and will always be that size. If the data stored in the column or variable is less than the full length of the column or variable, the data will be padded with spaces to be the maximum length.

A variable length string may or may not be a declared maximum size. Data stored within the column or variable is no larger than it needs to be and is not padded with spaces unless the user or application pads it manually.

The classic fixed length string data type is the CHAR data type. A CHAR defaults to a single character in many databases. Parenthesis is used to declare a length: CHAR(10). If a column is declared as CHAR(10) and you store the string ‘Lewis’ in it, the database
will save ‘Lewis ‘.

CHARACTER is synonymous for CHAR.

Variable data types come by many names: VARCHAR, VARCHAR2, TEXT, CLOB, and many more. Like the CHAR (and any other data type) the length is declared within parenthesis: VARCHAR2(10). If a column is declared as VARCHAR2(10) and you store the string ‘Lewis’ in it, the database will save ‘Lewis’; no padding is added. If however, you were to store ‘Lewis ‘, the database would store it that way. It would not add any padding, but it would not remove any either.

When comparing strings, it is important to understand the differences between fixed and variable. Each vendor can implement this differently but in general, if a comparison of two strings uses two CHAR variables, they are padded to the same size, even if the variables are declared as different lengths.

For example, if:

Var1 CHAR(5);

Var2 CHAR(10);

Var1 := ‘Lewis’;

Var2 := ‘Lewis ‘;

Then Var1 = Var2 is true, and

Var1 CHAR(5);

Var2 CHAR(10);

Var1 := ‘Lewis’;

Var2 := ‘Lewis’;

Var1 = Var2 is also true.

If one of the variables is declared as a variable length data type, the padding does not happen:

If:

Var1 CHAR(10);

Var2 VARCHAR(10);

Var1 := ‘Lewis’;

Var2 := ‘Lewis ‘;

Then Var1 = Var2 is true, but:

Var1 CHAR(10);

Var2 VARCHAR(10);

Var1 := ‘Lewis’;

Var2 := ‘Lewis’;

Var1 = Var2 is false.

If we make the two variables the same size and fill them both up:

Var1 CHAR(5);

Var2 VARCHAR(5);

Var1 := ‘Lewis’;

Var2 := ‘Lewis’;

Then Var1 = Var2 is true.

Numeric

The Numeric data type holds numeric data, i.e. numbers. If you are going to add it, subtract it, count it, average it or do any other mathematical operation, it is a number. Most databases support integer and floating point data types.

Integer types may be named INT, INTEGER, SMALLINT, BIGINT, NUMBER or NUMERIC. You may be able to set the precision and scale within parenthesis.

Floating point data types may be called FLOAT, REAL, BINARY FLOAT, DOUBLE, NUMERIC or NUMBER.

Notice that NUMBER and NUMERIC are in both categories. I’ve seen them both ways.

An example using NUMBER:

Var1 NUMBER;

Var2 NUMBER(7);

Var3 NUMBER(7,2);

NUMBER(7) is the equivalent of NUMBER(7,0). NUMBER(7,2) means a number that is 7 digits long with 2 digits to the right of the decimal
point.

Date

In some databases, a DATE stores only a date but in others it can store date and time. In those databases where it stores date only, there is usually a separate data type called DATETIME (or something close to that).

Dates (and times) are usually stored internally as binary numbers. They really do not have a format until query time. When queried, the database or client will automatically format the date using a default format mask (ex. DD-MON-YYYY or MM/DD/YYYY). It is always best to explicitly format your dates and times when querying them to avoid misunderstandings. I show how to do that below in the sections on expressions and functions.

A related data type that stores dates and times is the Timestamp. While most DATETIME data types will stored time down to the seconds, a Timestamp stores minute fractions of time. Timestamps may also store timezone information.

Boolean

The Boolean data type stores TRUE or FALSE. Internally, it may store them as a 1 and 0 but when displaying them and referring to them, you interact with the TRUE and FALSE. Boolean is a conditional data type. If statements resolve to a boolean value and boolean values can be used in if statements.

Var1 CHAR(1) := ‘A’;

Var2 CHAR(1) := ‘B’;

Var3 BOOLEAN := TRUE;

If (var1 <> var2)

Equals True

If (var3)

Equals True

Some databases, Oracle most notably, do not support SQL booleans.

See Appendix C for a cross reference of data types by database vendor.

Expressions and Operators

In SQL (and other programming languages), an expression is nothing more than a statement that returns a value.

1 + 2

A + B

Column1 + column2

Column1 || Column2

SQLFunction(column1)

Column1 = Column2

The above are all expressions. When we talk about expressions, we usually talk about the type of expression, a character expression, boolean expression, numeric expression or date/time expression.

1 + 2 is a numeric expression. We know this because both input values are numeric and the results of the + operator on two numeric values is a numeric output. More on operators below.

In the case of A + B, we can’t be positive that we know what type of expression that is. We would either need to see the definitions of A and B or we would need to see the result of the expression. We can make a good guess based on the operator is uses, though. The + operator tells me that it is probably a numeric expression; it is adding two numeric values. Technically, A could be a date and B could be a numeric, in which case the expression would return a date value, making it a date expression. Less likely, it also might possibly be a string expression.

The case of Column1 = Column2 is a boolean expression. The = operator always returns a boolean result.

In SQL, you can use expressions almost anywhere in a statement. A column in a select list is an expression (more on that below). A function call is an expression. The components of a WHERE clause are expressions.

Through out the book, I will use the term expression. When I do, I mean a statement that will return a value.

On the topic of operators, there are many available to a SQL developer. All of the expected mathematical operators are available: +, *, -, /, %, (), [], {}, etc. Normal mathematical operator precedence is in effect in SQL.

Logical, i.e. boolean, operators include: AND, OR, =, <>, >, <, <=, and >=.

I speak more on some of these operators below in the section on viewing data.

Common SQL Functions

See Appendix B for a cross reference of SQL functions by database vendor.

SQL provides a very large list of built in functions. Each vendor also extends this list of functions with their own. In this section, I will describe some of the more commonly available SQL functions. It is important to remember that functions, like data types, are very vendor dependent. Vendors feel free to use different names for functions than what the SQL standard calls for. The best way to see if a particular function is available is to check your vendor’s documentation. If you don’t find a particular function, look for names similar.

The list below is not a list of all possible functions, nor is it a list of all SQL Standard functions. It is a list of functions that you will commonly run across. This list is to assist you and give you an idea about what kind of functionality you may have available to you. As always, check your vendor documentation for exact syntax. See Appendix B for a cross reference of SQL functions by database vendor.

String Functions

SUBSTR – Extracts parts of a string

ASCII – convert a string to its ASCII codes

CHR – Convert ASCII code to a string

COALESCE, NVL – Test for null and optionally return a non-null result

CONCATENATE, || – Concatenate two strings to a single string

INSTR – Finds the location of a specific character or substring within a string

UPPER, LOWER, INITCAP – Convert a string to uppercase, lowercase or initial caps

LPAD, RPAD – Pad a string with a character or string. Lpad puts the pad character(s) before the string, rpad puts the pad character(s) after the string

USER, CURRENT_USER, SYSTEM_USER – Return the current user

LENGTH – Return the length of a string

Numeric Functions

SQL provides plenty of standard math functions such as ACOS, COS, TAN, ATAN, ASIN, CEILING, FLOOR, etc. They work like the math concepts you may have studied in school. Most are self explanatory.

In addition, SQL provides ROUND and TRUNC. ROUND will round a number to the nearest integer (or decimal position), and TRUNC(ATE) will truncate a number at the chosen decimal position.

Date Functions

CURRDATE, CURRTIME, SYSDATE, NOW – Returns the current date or time.

ADDMONTHS, ADDYEARS – Add months or years to a date. You can add days to a Date field by using the + operator. SYSDATE + 1 adds 1 day to the current system date.

MONTHS_BETWEEN – Return a the number of days (fractional) between two dates. You can get the number of days between two dates by subtracting them.

SYSTIMESTAMP, CURRTIMESTAMP, TIMESTAMP – Returns current date and time as a timestamp.

Conversion Functions

TO_CHAR – Convert a number or date to a string. Usually takes a format mask to format the output to your preference.

TO_DATE – Convert a string to a date.

CAST, CONVERT – Cast (Convert) one data type to another

TO_NUMBER – Convert a string to a number

TO_TIMESTAMP – Convert a string to a timestamp

Aggregate Functions

AVG – Returns the average value of a column over a result set

SUM – Returns a sum of a column over a result set

COUNT – Returns a count of rows over a result set

FIRST – Returns the first value of a column in a result set

LAST – Returns the last value of a column in a result set

MIN – Returns the minimum value of a column over a result set

MAX – Returns the maximum value of a column over a result set

Remember that functions are expressions. You can use a function in most places where you can use a column or other type of expression.

Examples:

SELECT sysdate FROM dual;

SELECT current_user();

SELECT TO_CHAR(hire_date, ‘DD-MON-YYYY’) FROM employees;

SELECT COUNT(*) FROM employees;

SELECT department_id, SUM(salary), AVG(salary), MAX(salary)

FROM employees

GROUP BY department_id;

See Appendix B for a cross reference of SQL functions by database vendor.

Chapter 4, section “Calling SQL Functions” provides examples of calling functions within a select statement.

Buy the book