MySQL Storage Engine

MySQL Storage Engine

MySQL Server supports a plugin architecture in handling and manipulating DB data (Storage Engines). The default storage engine is MyISAM. A developer can use different storage engine for each table based on the features/trade off for each storage engine.

Syntax

mysql> CREATE TABLE tab1 (...) ENGINE=storage_engine

Example

mysql> CREATE TABLE tab1 (...) ENGINE=INNODB;

MySQL Storage Engine Commands

Show all installed Storage Engine

mysql> SHOW ENGINES\G

Set the default storage engine for the session

mysql> SET storage_engine=MYISAM;

Show INNODB information including deadlock

SHOW ENGINE INNODB STATUS

Convert tbl to a different storage engine

mysql> ALTER TABLE tbl type=INNODB;

MyISAM Pro & Con

  • The default storage engine
  • Support compression
  • Support concurrent INSERT *
  • Support auto recovery
  • Cache index but depends on OS to cache data
  • One data file per table: Easy to relocate a table to another directory/drive
  • NULL values are indexed also
  • No transaction support
  • Table level lock for UPDATES. Other concurrent UPDATES and SELECT will be blocked
  • Maximum key length is 1000 bytes
  • Good for read mostly applications

* Concurrent INSERT: Usually, INSERT and SELECT rows in a MyISAM table needs to be serialized. However, when there is no holes in the table's data file, MySQL allows concurrent INSERT and SELECT. This happens when no rows are deleted from a table ever or not after OPTIMIZE TABLE.

Each MyISAM table contains three files:

  • my_tbl.frm: Describe the table structure.
  • my_tbl.MYD: Data file
  • my_tbl.MYI: Index file

InnoDB Pro & Con

  • The default if our installation procedure is followed
  • Transaction support
  • Support auto recovery
  • Row level locking
  • Support foreign key
  • Oracle-like non-blocking SELECT
  • Cache both data and index
  • Store tables in tablespace
  • Hot backup tool is not free
  • An alternative to MyISAM for developer requiring Oracle-like capabilities

If you use READ COMMITTED, you must use row-based binary logging

binlog_format = 'ROW'

InnoDB Files

For meta data

  • my_tbl.frm : Describe the table structure.

Data can be stored in a share tablespace (default) or individual tablespace which each table has its own file

Share tablespace:

c:\mysql\data\ibdata*

 

  • Contain the data dictionary and hence used by individual tablespace also
  • By default, the files locate under the data root directory

Individual tablespace:

c:\mysql\data\*.ibd

 

  • One tablespace file per table
  • Enable by option: --innodb_file_per_table
  • Easier to relocate data file
  • Easier table level data recovery

Log:

ib_logfile0, ib_logfile1

Using Multiple Data Files for InnoDB's shared tablespace. Add more files if needed

# Auto-extend with max file size 100M
# Path relative to data root directory
[mysqld]
innodb_data_file_path=inno_data1:100M;inno_data2:100M:autoextend:max:200M
[mysqld]
# For absolute path, put an empty string in innodb_data_home_dir
innodb_data_home_dir =
innodb_data_file_path=/db/ibdata1:100M;/db/ibdata2:100M:autoextend

NOTE:

  • NFS system is NOT recommended for storing InnoDB files
  • InnoDB creates new data file automatically but not directory
    • Directory must be created manually before use
    • New data file must not exist
  • All data directory MUST belong to user mysql with group mysql (chown -R & chgrp -R)
  • All data directory MUST DENY read/write/execute access from other users (chmod -R og-rwx)
  • When data file configuration is changed, use mysqld --console to start the server for the first time and check any error message in the console
  • Use AUTO INCREMENT column as primary key if the alternative primary key is very long

InnoDB Sample Configuration

MySQL InnoDB Deployment: 512MB available memory and one hard disk

[mysqld]
innodb_data_file_path=ibdata1:100M:autoextend:max:200M

# Set buffer pool size to 50-80% of the available memory
innodb_buffer_pool_size=256M

#Store data dictionary and other internal data structures
innodb_additional_mem_pool_size=20M

# Set the log file size to about 25% of the buffer pool size
# The log buffer stores the transactions in memory before they are flushed
# Suggest 4 to 8M
innodb_log_file_size=64M
innodb_log_buffer_size=8M
innodb_flush_log_at_trx_commit=1

MySQL InnoDB Deployment: 2GB available memory and 3 hard disks

[mysqld]
innodb_data_home_dir =

# Data files must be able to hold your data and indexes
innodb_data_file_path = /db/ibdata1:2G;/dr2/db2/ibdata2:2G:autoextend

# Set buffer pool size to 50-80% of your computer's memory,
innodb_buffer_pool_size=1400M
innodb_additional_mem_pool_size=20M
innodb_log_group_home_dir = /dr3/iblogs

# Set the log file size to about 25% of the buffer pool size
innodb_log_file_size=350M
innodb_log_buffer_size=8M

innodb_flush_log_at_trx_commit=1
innodb_lock_wait_timeout=50

InnoDB Transaction

Start/End transaction

start transaction;
insert into ...;
insert into ...;
commit;
start transaction;
insert into ...;
insert into ...;
rollback;

MEMORY Storage Engine

  • Store in memory for transient data
  • Data deleted upon restart or crash
  • Use MAX_ROWS in the CREATE TABLE to limit the max memory size
  • Memory space reclaimed by dropping table but not row
  • Does not support BLOB or TEXT
  • Good for small but frequent accessed data
mysql> CREATE TABLE tbl1 ENGINE=MEMORY SELECT * FROM tbl2;
mysql> SELECT * FROM tbl1;
mysql> DROP TABLE tbl1;

With Hash or BTree index

CREATE TABLE tbl (id INT, INDEX USING HASH (id)) ENGINE=MEMORY;
CREATE TABLE tbl (id INT, INDEX USING BTREE (id)) ENGINE=MEMORY;

MERGE Storage Engine

  • A logical table composed of multiple MyISAM tables
  • aka Oracle view for multiple MyISAM tables

Each table contains:

  • my_tbl.frm : Describe the table structure
  • my_tbl.MRG: For each line, it names a MyISAM table

ARCHIVE Storage Engine

  • Allow large data with small footprint
  • Support INSERT and SELECT only. (No DELETE or UPDATE)
  • Good for large amounts of historical, archived or audit information

Compare MySQL Storage Engines

Feature MyISAM Memory InnoDB Archive NDB
Storage limits 256TB RAM 64TB None 384EB
Transactions No No Yes No Yes
Locking Level Table Table Row Row Row
Foreign key No No Yes No No
Oracle-like Data Snapshot No No Yes No No
B-tree indexes Yes Yes Yes No Yes
Hash indexes No Yes No No Yes
Cache Data No In memory Yes No Yes
Cache Index Yes In memory Yes No Yes
Full-text search indexes Yes No No No No
Clustered indexes No No Yes No No
Compressed data Yes No With plugin Yes No
Clustering No No No No Yes
Replication Yes Yes Yes Yes Yes
Backup/Recovery Yes Yes Yes Yes Yes
Query cache Yes Yes Yes Yes Yes
Statistics Yes Yes Yes Yes Yes
Encryption Yes Yes Yes Yes Yes

Transaction Support

A transaction session often includes multiple SQL statements. When the auto-commit is off, a transaction engine COMMIT/ROLLBACK all SQLs atomically. For a non-transaction engine, all executed statements are committed even a later SQLs fails. A transaction engine allows better concurrency in reading/writing tables.

For non-transactional engine, it offer lower disk space requirement, less memory requirement for updates. Without the transaction overhead, MyISAM claims faster engine. However, recent benchmark indicates competitive performance from the InnoDB engine.

Table Level Locking

Table level locking allows concurrent reads on a table. But if a session writes to a table, all other sessions will be blocked. MySQL also gives higher priority to UPDATE than SELECT.

MyISAM implements table level locking because of its simplicity and easier to write fast engine code. Hence, it is for read mostly application.

MyISAM is good for

  • Most SQLs are SELECT
  • Or SQLs are a mix of or
    • reads and
    • writes that updates or deletes a single row that can be fetched with one key read:
      UPDATE tbl SET col1=value WHERE unique_col=key_value;
      DELETE FROM tbl WHERE unique_col=key_value;
      
  • Or SELECT combined with
    • INSERT that can use Concurrent INSERT
    • And very few UPDATE or DELETE statements

MyISAM Best Practices

  • Use the same (or similar) type and size for the join columns
    mysql> SELECT * from tbl1, tbl2 where tb1.value1 = tbl2.value2;
    

    Perform best if value1 and value2 are say both VARCHAR(10), or value1 is CHAR(10) and value2 is VARCHAR(10)

    Change the schema
    ALTER TABLE tbl1 MODIFY value1 VARCHAR(10);
    
  • For MyISAM tables that change frequently, try to avoid any variable-length columns
  • Declare a column NOT NULL if it does not allow NULL value because MySQL will create better execution plan.

InnoDB Best Practices

  • Use small data type for the primary key since it is ALWAYS included in every secondary index. Create an AUTO_INCREMENT column for primary key if the original primary key is too long
  • Use transaction to wrap around multiple SQLs for transaction requirement and improve performance by reducing log flushing
  • If about 1 second of transaction lost during OS crashes is acceptable, set innodb_flush_log_at_trx_commit to 0
  • Use the VARCHAR data type instead of CHAR if you are storing variable-length strings
  • Rollback of massive INSERT is expensive
  • SELECT count(*) ... is expensive
  • InnoDB has a limit of 1023 concurrent transactions
  • Under certain condition, TRUNCATE tbl is translated into DELETE FROM tbl which can be expensive. Use Drop/Create table is a very good alternative

Views & Triggers

View is stored in my_view.frm. All triggers in a table is stored in my_tbl.trg