Data Types
Implicit / automatic casting
VARCHAR(255) — or not
Suggested max len
First, I will mention some common strings that are always hex, or otherwise limited to ASCII. For these, you should specify CHARACTER SET ascii
(latin1
is ok) so that it will not waste space:
UUID CHAR(36) CHARACTER SET ascii -- or pack into BINARY(16)
country_code CHAR(2) CHARACTER SET ascii
ip_address CHAR(39) CHARACTER SET ascii -- or pack into BINARY(16)
phone VARCHAR(20) CHARACTER SET ascii -- probably enough to handle extension
postal_code VARCHAR(20) CHARACTER SET ascii -- (not 'zip_code') (don't know the max
city VARCHAR(100) -- This Russian town needs 91:
Poselok Uchebnogo Khozyaystva Srednego Professionalno-Tekhnicheskoye Uchilishche Nomer Odin
country VARCHAR(50) -- probably enough
name VARCHAR(64) -- probably adequate; more than some government agencies allow
Why not simply 255? There are two reasons to avoid the common practice of using (255) for everything.
- When a complex
SELECT
needs to create temporary table (for a subquery,UNION
,GROUP BY
, etc), the preferred choice is to use theMEMORY
engine, which puts the data in RAM. ButVARCHARs
are turned intoCHAR
in the process. This makesVARCHAR(255) CHARACTER SET utf8mb4
take 1020 bytes. That can lead to needing to spill to disk, which is slower. - In certain situations, InnoDB will look at the potential size of the columns in a table and decide that it will be too big, aborting a
CREATE TABLE
.
VARCHAR versus TEXT
Usage hints for *TEXT
, CHAR
, and VARCHAR
, plus some Best Practice:
- Never use
TINYTEXT
. - Almost never use
CHAR
— it is fixed length; each character is the max length of theCHARACTER SET
(eg, 4 bytes/character for utf8mb4). - With
CHAR
, useCHARACTER SET ascii
unless you know otherwise. VARCHAR(n)
will truncate at n characters;TEXT
will truncate at some number of bytes. (But, do you want truncation?)*TEXT
may slow down complexSELECTs
due to how temp tables are handled.
INT as AUTO_INCREMENT
Any size of INT
may be used for AUTO_INCREMENT
. UNSIGNED
is always appropriate.
Keep in mind that certain operations “burn” AUTO_INCREMENT
ids. This could lead to an unexpected gap. Examples: INSERT IGNORE
and REPLACE
. They may preallocate an id before realizing that it won’t be needed. This is expected behavior and by design in the InnoDB engine and should not discourage their use.
Others
There is already a separate entry for “FLOAT, DOUBLE, and DECIMAL” and “ENUM”. A single page on datatypes is likely to be unwieldy — I suggest “Field types” (or should it be called “Datatypes”?) be an overview, then split into these topic pages:
- INTs
- FLOAT, DOUBLE, and DECIMAL
- Strings (CHARs, TEXT, etc)
- BINARY and BLOB
- DATETIME, TIMESTAMP, and friends
- ENUM and SET
- Spatial data
- JSON type (MySQL 5.7.8+)
- How to represent Money, and other common ‘types’ that need shoehorning into existing datatypes
Where appropriate, each topic page should include, in addition to syntax and examples:
- Considerations when ALTERing
- Size (bytes)
- Contrast with non-MySQL engines (low priority)
- Considerations when using the datatype in a PRIMARY KEY or secondary key
- other Best Practice
- other Performance issues
(I assume this “example” will self-distruct when my suggestions have been satisfied or vetoed.)
Introduction (numeric)
Integer Types
Minimal unsigned value is always 0.
Type | Storage (Bytes) |
Minimum Value (Signed) |
Maximum Value (Signed) |
Maximum Value (Unsigned) |
---|---|---|---|---|
TINYINT |
1 | -27 -128 |
27-1 127 |
28-1 255 |
SMALLINT |
2 | -215 -32,768 |
215-1 32,767 |
216-1 65,535 |
MEDIUMINT |
3 | -223 -8,388,608 |
223-1 8,388,607 |
224-1 16,777,215 |
INT |
4 | -231 -2,147,483,648 |
231-1 2,147,483,647 |
232-1 4,294,967,295 |
BIGINT |
8 | -263 -9,223,372,036,854,775,808 |
263-1 9,223,372,036,854,775,807 |
264-1 18,446,744,073,709,551,615 |
Fixed Point Types
MySQL’s DECIMAL
and NUMERIC
types store exact numeric data values. It is recommended to use these types to preserve exact precision, such as for money.
Decimal
These values are stored in binary format. In a column declaration, the precision and scale should be specified
Precision represents the number of significant digits that are stored for values.
Scale represents the number of digits stored after the decimal
salary DECIMAL(5,2)
5 represents the precision
and 2 represents the scale
. For this example, the range of values that can be stored in this column is -999.99 to 999.99
If the scale parameter is omitted, it defaults to 0
This data type can store up to 65 digits.
The number of bytes taken by DECIMAL(M,N)
is approximately M/2
.
Floating Point Types
FLOAT
and DOUBLE
represent approximate data types.
Type | Storage | Precision | Range |
---|---|---|---|
FLOAT | 4 bytes | 23 significant bits / ~7 decimal digits | 10^+/-38 |
DOUBLE | 8 bytes | 53 significant bits / ~16 decimal digits | 10^+/-308 |
REAL
is a synonym for FLOAT
. DOUBLE PRECISION
is a synonym for DOUBLE
.
Although MySQL also permits (M,D) qualifier, do not use it. (M,D) means that values can be stored with up to M total digits, where D can be after the decimal. Numbers will be rounded twice or truncated; this will cause more trouble than benefit.
Because floating-point values are approximate and not stored as exact values, attempts to treat them as exact in comparisons may lead to problems. Note in particular that a FLOAT
value rarely equals a DOUBLE
value.
Bit Value Type
The BIT
type is useful for storing bit-field values. BIT(M)
allows storage of up to M-bit values where M is in the range of 1 to 64
You can also specify values with bit value
notation.
b'111' -> 7
b'10000000' -> 128
Sometimes it is handy to use ‘shift’ to construct a single-bit value, for example (1 << 7)
for 128.
The maximum combined size of all BIT columns in an NDB
table is 4096.
CHAR(n)
CHAR(n)
is a string of a fixed length of n
characters. If it is CHARACTER SET utf8mb4
, that means it occupies exactly 4*n
bytes, regardless of what text is in it.
Most use cases for CHAR(n)
involve strings that contain English characters, hence should be CHARACTER SET ascii
. (latin1
will do just as good.)
country_code CHAR(2) CHARACTER SET ascii,
postal_code CHAR(6) CHARACTER SET ascii,
uuid CHAR(39) CHARACTER SET ascii, -- more discussion elsewhere
DATE, DATETIME, TIMESTAMP, YEAR, and TIME
The DATE
datatype comprises the date but no time component. Its format is 'YYYY-MM-DD'
with a range of ‘1000-01-01’ to ‘9999-12-31’.
The DATETIME
type includes the time with a format of ‘YYYY-MM-DD HH:MM:SS’. It has a range from ‘1000-01-01 00:00:00’ to ‘9999-12-31 23:59:59’.
The TIMESTAMP
type is an integer type comprising date and time with an effective range from ‘1970-01-01 00:00:01’ UTC to ‘2038-01-19 03:14:07’ UTC.
The YEAR
type represents a year and holds a range from 1901 to 2155.
The TIME
type represents a time with a format of ‘HH:MM:SS’ and holds a range from ‘-838:59:59’ to ‘838:59:59’.
Storage Requirements:
|-----------|--------------------|----------------------------------------|
| Data Type | Before MySQL 5.6.4 | as of MySQL 5.6.4 |
|-----------|--------------------|----------------------------------------|
| YEAR | 1 byte | 1 byte |
| DATE | 3 bytes | 3 bytes |
| TIME | 3 bytes | 3 bytes + fractional seconds storage |
| DATETIME | 8 bytes | 5 bytes + fractional seconds storage |
| TIMESTAMP | 4 bytes | 4 bytes + fractional seconds storage |
|-----------|--------------------|----------------------------------------|
Fractional Seconds (as of Version 5.6.4):
|------------------------------|------------------|
| Fractional Seconds Precision | Storage Required |
|------------------------------|------------------|
| 0 | 0 bytes |
| 1,2 | 1 byte |
| 3,4 | 2 byte |
| 5,6 | 3 byte |
|------------------------------|------------------|
See the MySQL Manual Pages DATE, DATETIME, and TIMESTAMP Types, Data Type Storage Requirements, and Fractional Seconds in Time Values.