Bookmark and Share

Keep replication in mind when writing SQL code.

Posted: Sunday, February 8th, 2009 at 8:02 pmUpdated: Monday, March 2nd, 2009 at 7:26 am

This is something I got "bitten" once, especially if you have a development server with its own database not replicated and production database is replicated. That slight difference is enough to cause production problems that may not be noticeable days after production release. As a wise man say:

When programming SQL statements, be mindful of replication.

Well, I gotta admit, it didn’t come from a wise man (I wish it did). It was what I learned after getting bitten by replication problem. Hey, it’s all part of life rite? You live, you make mistakes, you learn, you continue on living (hopefully without making the same mistakes again).

The Problems

Ok so for example, my other article Emulating nextval() function to get sequence in MySQL I mentioned that it is not safe for replication. Do glance through it and pay attention to the function if you haven’t done so.

OK below is the snippet of the code that has potential replication problems.

UPDATE
	sequence.sequence_data
SET
	sequence_cur_value = IF (
		(sequence_cur_value + sequence_increment) > sequence_max_value,
		IF (
			sequence_cycle = TRUE,
			sequence_min_value,
			NULL
		),
		sequence_cur_value + sequence_increment
	)
WHERE
	sequence_name = seq_name

Now the problem with this code is that there’s no guarantee that the update will produce the same result in master and in slave. If someone issue a nextval() for the sequence in slave, the value returned would be different. Let’s take an example like this.

  1. In master, you run query INSERT INTO some_table (id) VALUE (nextval(‘sq_my_sequence’));
  2. Let’s assume in master, nextval(‘sq_my_sequence’) returns 23
  3. Before the statement propagates to the slaves, someone issue a SELECT nextval(‘sq_my_sequence’) on a slave and got value 23.
  4. Let’s say now it’s time for the slave to execute the query that was done in master, it would insert a value 24 since value 23 is already been used by statement executed in #3.
  5. And now you got data inconsistency problem between master and slave.

Another example is when you’re using UUID() function from MySQL. The base idea for UUID() function is to return universally unique identifier. So if you have one of the statements like below executed in master

INSERT INTO some_table (id, col1) VALUE (UUID(), 'col1_value');
UPDATE some_table SET name = UUID() WHERE is_deleted = TRUE;

When the same statement is executed in slave, the value of UUID() is always different, even though you have a 0.000000s lag on slave (a.k.a. the statement is executed at the exact same time).

One way to solve this

Well, the problem on the statements above is that we have an unknown / unpredictable result from calling the same function with the same parameters. So if you can, break up the unpredictable statements outside of statements that modify data. If you’re working in PHP, here’s one way to rewrite the statement above

// Assume that connection to MySQL has been established.
$result = mysql_db_query("SELECT UUID()");
$row = mysql_fetch_row($result);

$result = mysql_db_query("
INSERT INTO
	some_table (id, col1)
VALUE
	(".$row[0].", 'col1_value')"
);

I hope this example illustrates an example of what needs to be noted and thought about when writing SQL statements. And as always, I welcome comments / questions / critics that will help me and other readers understand better

Leave a Reply