MySQL schema version control
2019年5月5日 更新
开启更多功能,提升办公效能




1. Background


It's well known that SQL is schematic, meaning that

  • creating some tables in an empty database is easy
  • adding a new table to an non-empty database is easy
  • adding a new column to an existing table is easy
  • deleting a table which contains some old data in an existing database could be a pain in the ass
  • deleting a column in a table which contains some old data in an existing database could be a pain in the ass
  • deleting a row in a table which contains some old data in an existing database could be a pain in the ass when foreign key constraints are involved
  • renaming a column in a table which contains some old data in an existing database could be a pain in the ass when foreign key constraints are involved
  • renaming a table which contains some old data in an existing database could be a pain in the ass
  • ......(and whatever pains you might imagine)


There's no definite way to do this, when updating your MySQL schema you're at your own risk most of the time, ESPECIALLY for schema update of a live database server with existing data. And things become even worse when you have to use an ORM library for it -- because most ORM libraries are less capable than the native SQL query in terms of operation flexibility.


You'd better keep your schema as one or several organized "*.sql" files in a version control system, e.g. git. For simplicity, we assume that no foreign key constraint is used in this article.





2. Using MySQL CLI Client & SchemaSync


In case you don't know the abbreviation, "CLI" stands for "Command Line Interface".


This article assumes that you know how to

  • write a "*.sql" file which contains SQL queries for schema creation, data CRUD and all the painful schema updates listed above, as well as
  • execute such a "*.sql" file by a MySQL CLI client to take effect in your MySQL server via an automatically managed db connection.

When executing a "*.sql" file to update the schema of a "local MySQL server", you should be prepared for failures upon weird reasons.


A typical use of this approach is to

  • track your "latest_expected_schema.sql" in version control system, e.g. git or svn, then
  • after every checkout of centralized remote codes, apply "latest_expected_schema.sql" to a `preempted MySQL server`, then
  • stop your `live MySQL server` from accepting client queries, then
  • make use of SchemaSync to generate a "latest_diff_between_live_and_expected_schema.sql (SHOULDN'T be tracked in version control)" between your `live MySQL server` and the `preempted one`, then
  • execute "latest_diff_between_live_and_expected_schema.sql" for your `live MySQL server`, and finally
  • resume your `live MySQL server` from accepting client queries.

You should be aware of that SchemaSync is not perfect for the following reasons.

  • It only generates "latest_diff_between_live_and_expected_schema.sql" from 2 live MySQL servers.
  • It requires stopping the `live MySQL server` from accepting client queries, which is yet in many cases inevitable


.



3. Using MySQL Workbench


This is the recommended approach, please refer to https://dev.mysql.com/doc/workbench/en/wb-database-synchronization.html for details.


The advantage of using MySQL Workbench over SchemaSync is that it allows you to sync `live MySQL server` against all types of expected schema, e.g. "EER diagram", "*.sql file (PREFERRED)", and "preempted MySQL server".





4. Using `skeema`


A CLI alternative to MySQL Workbench is `skeema`, and you need Golang development toolchain to use it.


Please install/uninstall Golang development toolchain by your OS-specific steps, e.g. given in https://github.com/genxium/Go111ModulePrac.


The following screenshot shows by skeema how to generate "latest_diff_between_live_and_expected_schema.sql" from "live MySQL server localhost:3306" to "/path/to/<skeema-repo-root>/<dbname>/<individual-table>.sql", where "<skeema-repo-root>" contains a ".skeema" file which owns live database connection info.



The working scope of `skeema` for a "live MySQL server" is "all databases EXCEPT several reserved names 'information_schema', 'mysql', 'performance_schema', 'sys' and 'test'".


(All existing databases in "live MySQL server".)

(Initialize <skeema-repo-root>.)

(I added several new tables "food_*.sql" under "tunjing", i.e. updated my expected schema of the database named "tunjing".)

(A dry-run of "skeema diff" under <skeema-repo-root>. You might need "skeema diff -u root -p" if authentication is required.)

(I'm satisfied with the diff strategy and decided to execute "skeema push" to update the live MySQL server. You might need "skeema push -u root -p" if authentication is required.)


⚠️You might have to execute
-- `skeema push development`, or
-- `skeema push --allow-unsafe`,or
-- `skeema push development --allow-unsafe`, or even as verbose as
-- `skeema push development --allow-unsafe -u root -p`

depending on your specific setup, where "development" is an environment name written in "<skeema-repo-root>/.skeema" file.

(Now the schema of my "live MySQL server" is synchronized to "/path/to<skeema-repo-root>/<dbname>/<individual-table>.sql".)


(However I sometimes wants customization to the generated "latest_diff_between_live_and_expected_schema.sql".)

4.1 Mapping Skeema to Database from ".skeema" file

Sometimes you'll need a new database to be tracked by `skeema`, e.g. "tunjing_test" for testing environment setup.


Recommended steps are as follows.

  • Create a new directory "<skeema-repo-root>/tunjing_test/".
  • Put your preferred "*.sql" files for tables under the created directory, e.g. "<skeema-repo-root>/tunjing_test/".
  • Put your preferred "*.skeema" file containing "schema=tunjing_test" under "<skeema-repo-root>/tunjing_test/".


  • Execute `skeema push` from <skeema-repo-root>.


You can also add the a single expected dbname to `<skeema-repo-root>/.skeema` for convenience, e.g. if "travelpp" is the only dbname expected to be accessed by skeema.

```
[production]
socket=/tmp/mysql.sock
user=root
host=localhost
schema=travelpp

[development]
socket=/tmp/mysql.sock
user=root
host=localhost
allow-unsafe
skip-alter-wrapper
skip-alter-algorithm
skip-alter-lock
schema=travelpp
```

By such configuration all "*.sql" files associated with the tables under dbname "travelpp" would be maintained at `<skeema-repo-root>/`.


4.2 Required privileges

The MySQL user you configured in corresponding `.skeema` file must have to "privilege to create & drop database" of the "target live MySQL server", because automation will need create a database named "_skeema_tmp" in the "target live MySQL server".




5. Alternative tools

It's possible that `skeema` is not alone for its service. Check the following alternatives for your best fit.

  • SchemaSync
  • Introduced above in #2.
  • Only generates diff report for different live MySQLServer connections.
  • gh-ost
  • https://github.com/github/gh-ost
  • Pros & cons unknown yet.
  • DBDiff
  • https://github.com/DBDiff/DBDiff
  • Only generates diff report for different live MySQLServer connections.
  • The README of it shows many other alternatives too.
  • mysqldiff
  • https://dev.mysql.com/doc/mysql-utilities/1.6/en/mysqldiff.html
  • Only generates diff report for different live MySQLServer connections.
  • php-mysql-diff
  • https://github.com/camcima/php-mysql-diff
  • Only generates diff report for different *.sql files.
  • sequelize-cli
  • https://github.com/sequelize/cli
  • It's NOT a useful tool for db migration if you uses "sequelize/NodeJs" in backend because the model class files needs MANUAL SPECIFICATION TO INITIALIZE.
  • Note that a similar tool "node-db-migrate" lacks initialization guidance and thus not as competant as this one.


However most of the above are NOT truely alternatives to `skeema`, as they DON'T directly provide "file v.s. livedb" differentiation.


True alternatives are marked in purple.