2.2 DATA REPRESENTATION
2.2.1 General Data Type Information
Every data value belongs to some data type such as INTEGER, CHARACTER, etc. Microsoft SQL Server Transact-SQL has a number of native data types, which are described in the next section.
In addition to the built-in, native data types, Transact-SQL provides for the definition of user-defined types, covered later in this chapter.
2.2.2 Domain and NULL Value
2.2.2.1 Domain
The domain of a column is the set of legal data type values for that column as determined by the data type assigned to the column and applicable constraints.
A domain value is any value within the domain. For example, a column called Age may have a data type of INT and a CHECK constraint that values must be between 0 and 200. Then the domain is the set of all integers between 0 and 200 inclusive. In this case, 21 would be a valid domain value for the Age column.
2.2.2.2 NULL
NULL means "no domain value assigned." It indicates that the value is unknown. NULL can also be read as "not available," "not applicable" or "unknown." If you want to add an employee to the employees table who does not have a telephone number, NULL would be assigned to the field value in the database. Note that NULL is very different from a value of zero for INT or an empty string for VARCHAR, both of which are domain values.
A constraint of NOT NULL may be placed on any column that is not allowed to have a NULL value. This is true for a primary key, for example. Every row must have a valid domain value for each column specified as NOT NULL. The NULL value is represented by the keyword NULL.
The ANSI SQL-99 Standard says the following about NULL.3
null value: |
A special value that is used to indicate the absence of any data value. Every data type includes a special value, called the null value, sometimes denoted by the keyword NULL. This value differs from other values in the following respects: Since the null value is in every data type, the data type of the null value implied by the keyword NULL cannot be inferred; hence NULL can be used to denote the null value only in certain contexts, rather than everywhere that a literal is permitted. Although the null value is neither equal to any other value nor not equal to any other valueit is unknown whether or not it is equal to any given valuein some contexts, multiple null values are treated together; for example, the <group by clause> treats all null values together. |
Every column of every row in a table must be assigned either a column domain value for that column or NULL. SQL Server Query Analyzer displays the word NULL when a query returns NULL for a column.
2.2.3 Constant (Literal) Defined
A constant or literal is a non-NULL specific data value of an indicated data type.
-
String Literal is one or more characters enclosed in single quotes, e.g., 'Mary'.
-
Unicode String Literal has capital N preceding the string, e.g., N'Mary'.
-
Date Literal is a date contained in single quotes, '03-May-2000'.
-
Numeric Literal is an integer (int) or floating point number (no quotation marks), e.g., 12 , 2.3.
The format for constants for each data type are given below.
2.2.4 IdentifiersNaming SQL Objects
An identifier is the name a database user assigns to a database object such as a table, column or view. This section describes the rules for creating identifier names. The next section describes how to use identifiers in one-, two-, three-, and four-part object names. The complete rules for forming legal Microsoft SQL Server identifiers are given below, but the safest and most portable subset of these rules is given here.
2.2.4.1 Identifier Format Rules, Short FormSuggested Form
-
The first character may be any alphabetic character in upper or lower case (a-z, A-Z).
-
Subsequent characters may be any upper or lower case alphabetic or numeric or underscore character ( a-z, A-Z, 0-9, _ ).
-
It must contain between 1 and 128 characters (max of 116 for local temporary table).
-
The identifier must not be a Transact-SQL reserved word (see Appendix A for list).
Examples
Table1, employees, hire_date
Every database object in SQL Server can have an identifier. Some objects, such as tables, are required to have an identifier; for other objects, such as constraints, identifiers are optional. Remember that identifiers are case-insensitive in SQL statements (unless installed case sensitive).
The maximum of 116 for a local temporary table name allows the system to add a unique suffix. This permits different users to concurrently call the same global stored procedure and each have a uniquely identified table.
2.2.4.2 Identifier Format Rules, Complete Form Regular and Delimited
Microsoft SQL Server has two classes of identifiers: regular and delimited.
Regular Identifier Rules
A regular identifier starts with a letter followed by an alphanumeric character or underscore, and it does not contain embedded spaces. A regular identifier is the most common and is the suggested form of name to give a database object.
Regular identifier characters must conform to the following rules.
-
The first character must be
-
an upper or lower case alphabetic character ( a-z, A-Z ), or
-
a Unicode Latin character of another language, or
-
underscore, "at" sign or number sign ( _, @, # ). See First Character below.
-
-
Subsequent characters must be
-
an upper or lower case alphabetic or numeric character ( a-z, A-Z, 0-9 ), or
-
a Unicode Latin character of another language, or
-
underscore, "at" sign, number sign or dollar sign ( _, @, #, $ ).
(Note that embedded spaces are not allowed by these rules.)
-
-
Identifiers must contain between 1 and 128 characters (max 116 for local temp table).
-
The identifier must not be a Transact-SQL reserved word (see Appendix A for list).
First Character identifiers starting with some characters have special meaning as shown in Table 2-2.
Table 2-2. First Characters
First Character(s) |
Meaning |
Examples |
---|---|---|
@ |
local variable or parameter name |
@variablename |
@@ |
system function (do not start your own object names with @@) |
@@version |
# |
temporary table (max of 116 characters) or a local temporary procedure (max of 128 characters) |
#localtable #localproc |
## |
global temporary object |
##globaltable, ##globalproc |
Example: Typical use of regular identifiers
SQL |
---|
-- Regular Identifiers: table1, column_av CREATE TABLE table1 ( column_a VARCHAR(40) ) |
SQL |
INSERT INTO table1 ( column_a ) VALUES ( 'String Data in single quotes' ) |
SQL |
SELECT * FROM table1 |
Result |
|
I recommend restricting your identifiers to regular identifiers such as table1 and column_a.
2.2.4.3 Delimited Identifiers
A delimited identifier is an identifier enclosed in brackets ([ ]) or double quotes (" ") and may include special characters such as embedded spaces or tabs in the identifier name.
Remember the following comments about delimited identifiers.
-
Many people, like me, don't recommend special characters such as spaces because they can cause problems.
-
If you must use spaces or other special characters in an identifier, it is suggested that you use brackets instead of double quotes since the latter require the session setting QUOTED_IDENTIFIER be set to ON, but brackets are always valid.
-
A regular identifier enclosed in brackets refers to the same object.
E.g., [table1] and table1 are the same object.
Note: When SQL Server generates scripts, it puts all identifiers in brackets, so don't be concerned that pubs.dbo.jobs looks a little funny when it appears as [pubs.[dbo].[jobs].
2.2.4.4 Delimited Identifier Rules
First and subsequent characters may be any alphanumeric, punctuation or special character (including space or tab) except for the delimiter characters themselves. Delimited identifiers may include basically any character on the keyboard (except the delimiters) including spaces, tabs and punctuation marks.
Identifier must contain between 1 and 128 characters not counting the delimiters (max of 116 for local temporary table, see page 299). Also, Transact-SQL keywords may be used as identifiers if delimited.
CREATE TABLE [table] ( column_a INT ) -- Keyword as table name is a bad idea!
Example: Using a delimited identifier using [ ] to delimit the identifier
SQL |
---|
CREATE TABLE [three word table] ( column_a VARCHAR(40) ) |
SQL |
INSERT [three word table] ( column_a ) VALUES ( 'String Data in single quotes' ) |
SQL |
SELECT * FROM [three word table] |
Result |
|
Underscore or capital letters can be used to avoid embedded spaces: three_word_table or ThreeWordTable.
I recommend adhering to ANSI SQL and good form as follows.
-
Use the ON setting for QUOTED_IDENTIFIER.
-
Use regular identifiers (no embedded spaces, see Identifier Format Rules, Short FormSuggested Form above).
-
If you must delimit an identifier use brackets as they are always valid.
-
Use single quotes to delimit all string literals.
If you follow these suggestions then you may skip Section 2.2.4.5.
2.2.4.5 QUOTED_IDENTIFIER
This section is applicable if you have identifiers, which are delimited by double quotes.
A QUOTED_IDENTIFIER is a database option that, when ON, causes adherence to ANSI SQL rules regarding quotation mark delimiting identifiers and literal strings. When the option QUOTED_IDENTIFIER is set to ON (usual and recommended) follow these recommendations.
-
Either brackets ([ ]) or double quotes (" ") may be used to delimit identifiers.
-
All strings delimited by double quotes are interpreted as object identifiers.
-
String literals must be delimited by single quotes and NOT by double quotes.
When database option QUOTED_IDENTIFIER is set to OFF follow these guidelines.
-
Only brackets ([ ]) may be used to delimit identifiers.
-
String literals may be delimited by either single or double quotes, though double quotes do not conform to the ANSI SQL and so single quotes are always recommended to enclose string literals.
The default setting for the QUOTED_IDENTIFIER is technically OFF, effectively ON. Although the default database setting for QUOTED_IDENTIFIER is OFF , both the ODBC driver and the OLE DB Provider for SQL Server automatically set QUOTED_IDENTIFIER to ON when connecting which overrides the default database setting.
This ON setting for QUOTED_IDENTIFIER is in effect unless the user explicitly executes
-
SET QUOTED_IDENTIFIER OFF
as is done in the following example. So clients using ODBC or OLE DB (almost all SQL Server clients today including Query Analyzer) see an ON setting for QUOTED_IDENTIFIER. (ON is good.)
See a complete discussion in Server, Database and Session Settings on p.174 and also see Which Session Setting Is in Effect? on p. 219.
Example: QUOTED_IDENTIFIER is ON, so either [ ] or " " may be used to delimit identifier.
SQL |
---|
CREATE TABLE [three word table] ( a VARCHAR (40) ) |
SQL |
INSERT INTO "three word table" ( a ) VALUES ( 'String Data in single quotes' ) |
SQL |
SELECT * FROM [three word table] |
Result |
|
Example: Setting QUOTED_IDENTIFIER is OFF allows only [ ] to delimit identifier.
SQL |
SET QUOTED_IDENTIFIER OFF |
SQL |
CREATE TABLE [multi word table name in brackets] ( a VARCHAR(40) ) |
SQL |
INSERT INTO "multi word table name in dbl quotes" ( a ) VALUES ( 'String Data in single quotes' ) |
Result |
Server: Msg 170, Level 15, State 1, Line 1 Line 1: Incorrect syntax near 'three word table'. |
2.2.5 Using Identifiers as Qualified Object Names
Names of objects used in examples in this book usually use a one-part name, the object name itself. This section describes SQL Server one-, two-, three- and four-part names.
The complete name of an object consists of four identifiers: the server name, database name, owner name, and object name. They appear in the following format:
[ [ [ server. ] [ database ] . ] [ owner_name ] . ] object_name
server defaults to the server of the current connection.
database defaults to the current database.
owner_name defaults to current login.
Qualifiers may be omitted as follows.
-
Leading default qualifiers may be omitted resulting in three-, two- or one-part names.
-
Intermediate default qualifier(s) may be replaced by a period. owner_name marked by a period defaults first to the current login if the object owned by that owner can be found and, if not, then to dbo.
Valid forms of object names are as follows.
-
server.database.owner_name.object_name: Fully Qualified Object Name
-
database.owner_name.object_name: Current server
-
database..object_name: Current server and current login or dbo
-
owner_name.object_name: Current server and current database
-
object_name: Current server and database and current login or dbo
Example: The local server is named amy.
Create a link to SQL Server instance CAT2 on host CAT and do a query to it.
SQL |
---|
EXEC sp_addlinkedserver @server = 'CAT2_Link' -- Specify Server Link Name , @srvproduct = ' ' , @provider = 'SQLOLEDB' , @datasrc = 'CAT\CAT2' -- hostname\SQLServerInstanceName USE pubs go SELECT * FROM authors -- OR: .authors SELECT * FROM dbo.authors -- OR: ..authors SELECT * FROM pubs.dbo.authors -- OR: pubs. .authors SELECT * FROM .pubs.dbo.authors -- OR: ...authors OR: .pubs..authors SELECT * FROM amy.pubs.dbo.authors SELECT * FROM northwind.dbo.orders -- Etc. for any database on the current server SELECT * FROM CAT2_Link.pubs.dbo.authors -- OR: CAT2_Link.northwind.dbo.orders |
The following forms do not work.
SQL |
---|
SELECT * FROM amy.pubs..authors -- OR: amy..dbo.authors OR: amy...authors SELECT * FROM CAT2_Link.pubs..authors -- OR: CAT2_Link..dbo.authors OR: CAT2_Link...authors |
2.2.5.1 Qualified Column Names
Column names of a table or view may be qualified using the form:
table_name.column_name, view_name.column_name, or table_alias. column_name
where table_name or view_name may be a one-, two-, three- or four-part name as described above.
Example: Create a link to SQL Server instance CAT2 on host CAT and do a query to it.
Run a distributed query from the local host to the linked server using tables from both.
SQL |
---|
USE pubs go |
SQL |
SELECT TOP 2 p.pub_name , e.lname + ', ' + e .fname EmpName FROM publishers p , CAT2_Link.pubs.dbo .employee e WHERE p.pub_id = e.pub_id |
Result |
|
Notice that pub_id column names p.pub_id and e.pub_id must be qualified by table alias (or table name if no table alias had been declared) to avoid ambiguity.
The SELECT list uses the form table_alias.column_name. The FROM clause uses the fully qualified four-part name for the employee table on CAT2.