Idempotent MySQL DDL

Idea and collection of SQL snippets

Idempotent SQL are a way to maintain a database and adapt it to changing demands.

All bigger applications need a database to store all data. As the application gets developed and improved, the demands on the database change. Keeping a database updated requires some effort. Tools like Flyway will make it easier, but require the installation of a tool. However, this is not necessary. I am sketching the idea of a idempotent SQL script that can achieved the same goal.

Idempotency means that once a target state is reached, another application will not change it. It is well defined in Mathematics. Repeated applications have the same effect as the one-time application.

What kind of application #

One must adapt the maintenance procedure to the kind of application.

Some application are deployed in a cluster and the cluster must be operative at all times. Then a rolling update is required. Some nodes of the cluster get updated, while the other nodes are carying. If the system allows it, additional nodes are spun up to cover for the load. The database has to be functional at all time.

If down-time of the application is possible, then more options are available. Full locks on the database might be possible. Indices might be dropped and that database reorganized.

Wishful thinking #

It would be nice, if we could tell Mariadb/MySQL what we would like to achieve. However, the current DDL SQL is a sequence of modifications of the database. Instead of CREATE, DROP or ALTER, we could have had ENSURE statements that specify the final state.

Except, most DDL statements do specify what should be achieved. However, they will fail the scripts if the conditions are not met. Hence, one need to add safeguards. One needs to test the existence or value of a feature and dependend on that engage.

Requirements #

Lets find out, what has to be done. The final product could be a single SQL script that achieves the goal. This script needs to be extended and also modified, when the demands change.

Functions and procedures #

Functions and procedures can be dropped and recreated in a single transaction. Since there is no state to be maintained, this would solve the maintenance problem for the programmed functionality.

Since there is not much support for if exists in MySQL, those determination of state must be based on content in the information_schema database.

Tables #

Tables are the easiest, since MySQL supports create table if not exists. At every stage in development, missing tables should be created with their final state.

Fields #

Fields are more complicated.

Missing fields can be detected by checking the information_schema. That can be simplified by a function that carries out that check.

Fields that have the wrong data type are difficult. One would need to wrap the modification inside a transaction. One could add another field, carry out the conversion, deal with the failures and then drop the original and rename the field. That might be almost too much and the application should take care of the transition in those cases. Rather then changing the data type, one should add a new and drop the old. Maybe even working with both for a certain amount of time.

Some modification would just mean extending the length of the field. This requires the retrieval of field lengths.

Indices and keys #

It might not be possible to change the primary key. The primary key plays an important role in Microsoft SQL server where it determines the storage order of rows. Both Mariadb and MySQL might not have the same approach to the organization of datarows. Hence, it would not be necessary to change the primary key. Primary keys play the same role as unique indices.

Guidelines and approach #

Database part	condition	approach
Tables	`create table if not exist <tab>`	always create missing tables with the latest details
Fields	`if field_exist(name)`	add missing fields with latest details
Indices	`if index_exist(name)`	create missing indices with latest details
Functions		recreate within a transaction
Procedure		recreate within a transaction

Several functions need to be created that check the information_schema for the existence of related elements.

SQL helper functions #

For determining if a new column needs to be added, one needs to count the existing columns of the same name in the same table:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11


drop function if exists ColumnCount;

create function `ColumnCount` (tabName varchar(64), colName varchar(64)) returns int
begin
    declare result int;
    set result = (select count(1) from information_schema.columns  
         where TABLE_SCHEMA = database() and
         TABLE_NAME = tabName and COLUMN_NAME = colName 
        );
    return result; 
end;

Similar, one needs to count the number of indices with the same name in the same table:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11


drop function if exists IndexCount;

create function `IndexCount` (tabName varchar(64), idxName varchar(64)) returns int
begin
    declare result int;
    set result = (select count(1) from information_schema.statistics  
         where TABLE_SCHEMA = database() and
         TABLE_NAME = tabName and INDEX_NAME = idxName 
        );
    return result; 
end;

And if a column needs to be modified to accommodate larger data, one needs to retrieve the current data type:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11


drop function if exists ColumnType;

create function `ColumnType` (tabName varchar(64), colName varchar(64)) returns varchar(64)
begin
    declare result varchar(64);
    set result = (
        SELECT COLUMN_TYPE FROM information_schema.COLUMNS
         where TABLE_SCHEMA = database() and
         TABLE_NAME = tabName and COLUMN_NAME = colName );
    return result; 
end;

Example #

In the following example an initial table AppUser is created and filled with some data. Then a column is added, some indices created and a column modified.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22


create table if not exists `AppUser` (
    `UserId` int primary key not null,
    `Name` varchar(3) not null
);

insert into `AppUser`(`UserId`,`Name`) values( 47, 'xin');

if ColumnCount('AppUser','Key') < 1 then
alter table `AppUser` add column `Key` char(40);
end if;

if IndexCount('AppUser','IDX_Name') < 1 then
create unique index `IDX_Name` on `AppUser` (`Name`);
end if;

if IndexCount('AppUser', 'IDX_Key') < 1 then
create unique index `IDX_Key` on `AppUser` (`Key`); 
end if;

if ColumnType('AppUser', 'Name') <> 'varchar(50)' then
alter table `AppUser` modify column `Name` varchar(50) not null;
end if;

Final remarks #

If all the SQL is combined in one script and run several times, the outcome will not change.

Note, the naming of the functions revealed that there are internal native functions like column_exists that block the usage of similar user defined functions. Hence the name with CamelCase that follows a different scheme than the native one.

Note, the normal MySQL command uses ‘;’ as a delimiter. If one wants to run the script from the command line, one should use something like the following:

1

mysql -h hostname -uuser -ppassword --verbose  --delimiter=$$ <example.sql

Good luck!