Ok, I've got a real head scratcher... I'm going bald!
This is a pretty simple problem. Inserting data into the table normally works fine, except that occasionally an insert query takes a few seconds. This isn't acceptable, so I set up a simulation of the insert process. I am NOT trying to bulk insert data. I am trying to find out why the insert query occasionally takes more than 2 seconds to run. Joshua suggested that the index file may be being adjusted; I have removed the id (primary key) field, but the delay still happens.
I have a MyISAM table: daniel_test_insert (this table starts COMPLETELY empty):
create table if not exists daniel_test_insert (
    id int unsigned auto_increment not null,
    value_str varchar(255) not null default '',
    value_int int unsigned default 0 not null,
    primary key (id)
);
I insert data into it, and sometimes an insert query takes > 2 seconds to run. THERE ARE NO READS on this table. All writes, in serial, by a single-threaded program.
It's the same row each time: I run the exact same query 100,000 times, because once in a while the query takes a long time and I'm trying to find out why. So far it appears to be a random occurrence.
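My actual test is an external single-threaded program, but the serial-insert pattern boils down to something like this stored procedure (the procedure name and its parameter are my own invention for illustration, not part of my real setup):

```sql
DELIMITER //
CREATE PROCEDURE insert_loop(IN n INT)
BEGIN
  DECLARE i INT DEFAULT 0;
  -- Serially insert the identical row n times, mirroring the test program.
  WHILE i < n DO
    INSERT INTO daniel_test_insert
      SET value_int = 12345,
          value_str = 'afjdaldjsf aljsdfl ajsdfljadfjalsdj fajd as f';
    SET i = i + 1;
  END WHILE;
END //
DELIMITER ;

CALL insert_loop(100000);
```

With long_query_time set low enough, the occasional multi-second inserts show up in the slow query log, which is how they get caught.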
This query for example took 4.194 seconds (a very long time for an insert)
Query: INSERT INTO daniel_test_insert SET value_int=12345, value_str='afjdaldjsf aljsdfl ajsdfljadfjalsdj fajd as f' -- ran for 4.194 seconds

status               | duration | cpu_user  | cpu_system | context_voluntary | context_involuntary | page_faults_minor
starting             | 0.000042 | 0.000000  | 0.000000   |      0 |     0 |      0
checking permissions | 0.000024 | 0.000000  | 0.000000   |      0 |     0 |      0
Opening tables       | 0.000024 | 0.001000  | 0.000000   |      0 |     0 |      0
System lock          | 0.000022 | 0.000000  | 0.000000   |      0 |     0 |      0
Table lock           | 0.000020 | 0.000000  | 0.000000   |      0 |     0 |      0
init                 | 0.000029 | 0.000000  | 0.000000   |      1 |     0 |      0
update               | 4.067331 | 12.151152 | 5.298194   | 204894 | 18806 | 477995
end                  | 0.000094 | 0.000000  | 0.000000   |      8 |     0 |      0
query end            | 0.000033 | 0.000000  | 0.000000   |      1 |     0 |      0
freeing items        | 0.000030 | 0.000000  | 0.000000   |      1 |     0 |      0
closing tables       | 0.125736 | 0.278958  | 0.072989   |   4294 |   604 |   2301
logging slow query   | 0.000099 | 0.000000  | 0.000000   |      1 |     0 |      0
logging slow query   | 0.000102 | 0.000000  | 0.000000   |      7 |     0 |      0
cleaning up          | 0.000035 | 0.000000  | 0.000000   |      7 |     0 |      0
This is an abbreviated version of the SHOW PROFILE output; I threw out the columns that were all zero.
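For reference, the profile above came from per-session profiling, roughly like this (the Query_ID in the last statement will vary; check SHOW PROFILES for the right one):

```sql
SET profiling = 1;   -- enable per-session query profiling

INSERT INTO daniel_test_insert
  SET value_int = 12345,
      value_str = 'afjdaldjsf aljsdfl ajsdfljadfjalsdj fajd as f';

SHOW PROFILES;       -- lists recent statements with their Query_ID
SHOW PROFILE CPU, CONTEXT SWITCHES, PAGE FAULTS FOR QUERY 1;
```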
Now the update has an incredible number of context switches and minor page faults.
Opened_tables increases by about 1 per 10 seconds on this database (we are not running out of table_cache space).
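If anyone wants to verify the same thing on their system, these are the checks I mean (note that on 5.1 the cache size variable is table_open_cache; table_cache is the pre-5.1.3 name):

```sql
SHOW GLOBAL STATUS LIKE 'Opened_tables';        -- cumulative count since startup
SHOW GLOBAL VARIABLES LIKE 'table_open_cache';  -- 'table_cache' before MySQL 5.1.3
```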
Hardware: 32 GB of RAM / 8 cores @ 2.66 GHz; RAID 10 SCSI hard disks (SCSI II???). I have had the hard drives and RAID controller queried: no errors are being reported. The CPUs are about 50% idle.
iostat -x 5 reports less than 10% utilization for the hard disks.
top reports a load average of about 10 for 1 minute (normal for our db machine).
Swap space has 156k used (32 gigs of ram :)
I'm at a loss to find out what is causing this performance lag! Does anyone have any suggestions?
This does NOT happen on our low-load slaves, only on our high-load master. This also happens with MEMORY and InnoDB tables.
Warning: This is a production system, so nothing exotic!
-daniel (I'm going to have to use my dog's hair for a toupee!!!)
Updated: Sept 20th, 2010: I'm going bald!
I have noticed the same phenomenon on my systems. Queries which normally take a millisecond will suddenly take 1-2 seconds. All of my cases are simple, single-table INSERT/UPDATE/REPLACE statements --- never SELECTs. No load, locking, or thread build-up is evident.
I had suspected that it's due to clearing out dirty pages, flushing changes to disk, or some hidden mutex, but I have yet to narrow it down.
Also Ruled Out
- Server load -- no correlation with high load
- Engine -- happens with InnoDB/MyISAM/Memory
- MySQL Query Cache -- happens whether it's on or off
- Log rotations -- no correlation in events
The only other observation I have at this point is derived from the fact that I'm running the same db on multiple machines. I have a heavy read application, so I'm using an environment with replication -- most of the load is on the slaves. I've noticed that even though there is minimal load on the master, the phenomenon occurs more there. Even though I see no locking issues, maybe it's InnoDB/MySQL having trouble with (thread) concurrency? Recall that the updates on the slave will be single-threaded.
MySQL Version 5.1.48
I think I have a lead on the problem in my case. I noticed this phenomenon on some of my servers more than on others. Comparing what was different between the servers and tweaking things around, I was led to the MySQL InnoDB system variable innodb_flush_log_at_trx_commit.
I found the documentation a bit awkward to read, but innodb_flush_log_at_trx_commit can take the values 1, 2, or 0:
- For 1, the log buffer is flushed to the log file for every commit, and the log file is flushed to disk for every commit.
- For 2, the log buffer is flushed to the log file for every commit, and the log file is flushed to disk approximately every 1-2 seconds.
- For 0, the log buffer is flushed to the log file every second, and the log file is flushed to disk every second.
Effectively, in the order (1, 2, 0), as reported and documented, you're supposed to get increasing performance in trade for increased risk.
Having said that, I found that the servers with
innodb_flush_log_at_trx_commit=0 were performing worse (i.e. having 10-100 times more "long updates") than the servers with
innodb_flush_log_at_trx_commit=2. Moreover, things immediately improved on the bad instances when I switched it to 2 (note you can change it on the fly).
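Concretely, checking and changing it on the fly looks like this (it's a GLOBAL variable, so this needs the SUPER privilege; put it in my.cnf as well or it reverts on restart):

```sql
SHOW GLOBAL VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
SET GLOBAL innodb_flush_log_at_trx_commit = 2;  -- takes effect immediately, no restart
```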
So, my question is: what is yours set to? Note that I'm not blaming this parameter, but rather highlighting that its context is related to this issue.