Monday, November 5, 2007

What is Innodb doing?

I've been running a series of sysbench tests against Innodb on MySQL 4.1. My tests include both reads and writes on the table.

I've noticed that after the test is complete, and while there are no processes in show processlist, I still see a fairly large number of buffer pool reads and writes, as well as a large number of Modified db pages. Now, I know that it can take a while for Innodb to flush out the dirty buffers after a heavy write load, but the Modified db pages count actually grows for a while before it gradually settles down and returns to 0.
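One way to watch this settle is to poll the status output. A rough sketch, using MySQL 4.1's SHOW INNODB STATUS syntax and assuming a mysql client that can log in without prompting (e.g. credentials in ~/.my.cnf):

```shell
# Poll InnoDB status every 10 seconds and pull out the buffer pool
# counters. "Modified db pages" is the dirty-page count that should
# eventually drain back to 0 once background flushing catches up.
while true; do
    mysql -e 'SHOW INNODB STATUS\G' \
        | grep -E 'Modified db pages|Pages read'
    sleep 10
done
```

Graphing those numbers over time makes it obvious whether the dirty-page count is still climbing or has started to drain.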

The only thing I can think of is that Innodb is doing some sort of table optimization, which raises the question of whether I should rerun my tests against the optimized table or not.

Has anyone else seen this?

UPDATE: I believe I have figured out what's happening. My sysbench test is working on a 100M row table, and the last run I do is 10M inserts with these options:

--oltp-test-mode=nontrx
--oltp-nontrx-mode=insert

I found out that this test deletes all of my rows before it runs, so I see around 50k deletes per second for quite a while before the actual test starts. After the test finishes, I suspect Innodb is cleaning up a lot of unused pages in the data file, hence all the read and write activity on the buffer pool.
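For reference, a run like the one described would look roughly like this with sysbench 0.4's oltp test. The table size and the two oltp options are from the post; --max-requests (the 10M inserts) and the engine option are my reconstruction, not a copy of the actual command line:

```shell
# Non-transactional insert run against a 100M-row InnoDB table.
# Note: this mode deletes the existing rows before the timed run,
# which explains the long burst of deletes beforehand.
sysbench --test=oltp \
         --mysql-table-engine=innodb \
         --oltp-table-size=100000000 \
         --oltp-test-mode=nontrx \
         --oltp-nontrx-mode=insert \
         --max-requests=10000000 \
         run
```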

2 comments:

Eric Bergen said...

Innodb also defers secondary index writes using the 'insert merge buffer'. This process can even continue after MySQL restarts. More details here: http://dev.mysql.com/doc/refman/5.0/en/innodb-insert-buffering.html

This is one of the reasons why doing a flush tables with read lock isn't enough to make innodb 'hold still' for a backup like myisam does.
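The insert buffer's fill level does show up in SHOW INNODB STATUS, under INSERT BUFFER AND ADAPTIVE HASH INDEX. A quick sketch of pulling the numbers out of a captured status — the Ibuf line here is a hardcoded sample, and its exact format varies slightly between versions:

```shell
# Sample Ibuf lines from SHOW INNODB STATUS. When "inserts" equals
# "merged recs" (and size is back to 1 page), the insert buffer has
# been fully merged into the secondary indexes.
status='Ibuf: size 1, free list len 5, seg size 7,
2431891 inserts, 2431891 merged recs, 687570 merges'
inserts=$(echo "$status" | grep -oE '[0-9]+ inserts' | grep -oE '[0-9]+')
merged=$(echo "$status" | grep -oE '[0-9]+ merged recs' | grep -oE '[0-9]+')
echo $((inserts - merged))   # prints 0: nothing left to merge
```

In practice you'd feed the live status output in instead of the sample string, and wait for the difference to hit 0.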

fields said...

This raises some questions, with respect to running backups.

In a replicated environment, you either want to:

a) take a backup off a slave by stopping the slave i/o, waiting for it to catch up, waiting for the buffers to flush, tarring up the data/log directories, and then restarting the slave.

or

b) take a backup off of the master with ibbackup/innobackup.
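Option (a) might be sketched like this — paths are illustrative, and the two "wait" steps are manual checks against the status output rather than real commands:

```shell
# Sketch of option (a): quiesce a slave, let InnoDB settle, tar the files.
mysql -e 'STOP SLAVE'
# ...wait for SHOW SLAVE STATUS to show the relay log fully applied...
# ...wait for "Modified db pages" in SHOW INNODB STATUS to reach 0...
tar czf /backups/mysql-$(date +%F).tar.gz /var/lib/mysql
mysql -e 'START SLAVE'
```

The safer variant is a clean mysqld shutdown before tarring, since a quiet buffer pool alone doesn't prove the data files are consistent — the insert buffer behavior mentioned above is one reason why.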

I think the following is likely:

1) In case a, the insert merge buffer is contained within the regular buffer pool, so 'Modified db pages' won't read 0 until that insert buffer is cleared.

2) In case b, the secondary index merge buffer is irrelevant, because transactions won't be marked as completed/flushed (not sure of the innodb terminology for this) until that buffer is fully merged, so in your backup an unmerged transaction will simply be replayed from the innodb transaction log.

Is #1 correct? If not, how do you tell if the insert merge buffer still has data in it, and where is that stored when the server is restarted?

Is #2 correct? If not, how does this maintain database integrity with respect to the innodb transaction log, given that transactions can be marked as completed in the transaction log but not yet fully flushed to disk?