Best mysql questions in November 2011

"where 1=1" statement

74 votes

Possible Duplicate:
Why would someone use WHERE 1=1 AND <conditions> in a SQL clause?

I saw some people use a statement to query a table in a MySQL database like the following:

select * from car_table where 1=1 and value="TOYOTA"

But what does 1=1 mean here?

It usually shows up when people build SQL statements dynamically.

When you append and value = "Toyota" you don't have to worry about whether a condition already precedes it or only the bare WHERE. The optimiser should ignore the 1=1.

No magic, just practical.


Example Code:

commandText = "select * from car_table where 1=1";

if (modelYear <> 0)     commandText += " and year="+modelYear
if (manufacturer <> "") commandText += " and value="+QuotedStr(manufacturer)
if (color <> "")        commandText += " and color="+QuotedStr(color)
if (california)         commandText += " and hasCatalytic=1"

Otherwise you would need a more complicated set of logic:

commandText = "select * from car_table";
whereClause = "";
if (modelYear <> 0)
{
   if (whereClause <> "") 
      whereClause = whereClause + " and ";
   whereClause += "year="+modelYear;
}
if (manufacturer <> "")
{    
   if (whereClause <> "") 
      whereClause = whereClause + " and ";
   whereClause += "value="+QuotedStr(manufacturer);
}
if (color <> "")
{
   if (whereClause <> "") 
      whereClause = whereClause + " and ";
   whereClause += "color="+QuotedStr(color);
}
if (california)
{
   if (whereClause <> "") 
      whereClause = whereClause + " and ";
   whereClause += "hasCatalytic=1";
}

if (whereClause <> "")
   commandText = commandText + " WHERE "+whereClause;
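
For comparison, here is a minimal Python sketch of the same query builder (Python is used only for illustration; the snippets above are pseudocode). Collecting the conditions in a list and joining them with AND makes the 1=1 placeholder unnecessary, and bound parameters replace manual quoting. The function name build_car_query and the %s placeholder style (the one common MySQL drivers use) are assumptions, not part of the original answer.

```python
def build_car_query(model_year=0, manufacturer="", color="", california=False):
    """Return (sql, params) for car_table using only the filters that were given."""
    conditions = []  # SQL fragments, each with a %s placeholder where needed
    params = []      # values bound to the placeholders, in the same order

    if model_year:
        conditions.append("year = %s")
        params.append(model_year)
    if manufacturer:
        conditions.append("value = %s")
        params.append(manufacturer)
    if color:
        conditions.append("color = %s")
        params.append(color)
    if california:
        conditions.append("hasCatalytic = 1")

    sql = "SELECT * FROM car_table"
    if conditions:
        # join() puts AND only between conditions, so no leading "1=1" is needed
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params
```

The returned pair would be passed to a driver call of the form cursor.execute(sql, params).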

mysql Query slow - IP lookup (banned or not)

12 votes

I have a function in my PHP file that checks whether an IP is banned. My site is very slow, and the cause is this check.

(When I removed the code that does the check, my site was fast again.)

Here's my code:

// index.php - everything redirects to this file in the .htaccess
<?php
include('config.php');
if(isIpBanned($db, $_SERVER['REMOTE_ADDR'])) { // pass $db: the function takes two arguments
 die('access denied');
}
// rest of the code

here's my function

// config.php
<?php
function isIpBanned($db, $ip) { // $db is declared correctly
 $goodIP = $db->getRecord("SELECT is_banned FROM security.ip WHERE ip = '$ip'"); // this function works and return 1 or 0
 return (bool)$goodIP;
}

This query takes about 2 to 3 seconds to run. Why? I don't have a LEFT JOIN or any other tables involved.

Thanks

  1. Put a (unique?) index on the ip column.
  2. Use the correct datatype by converting the textual representation to a "native" one (an IPv4 address fits in an INT UNSIGNED, an IPv6 address in two BIGINT UNSIGNED columns). This makes your tables smaller and requires less I/O during scans. For IPv4, MySQL's INET_ATON() and INET_NTOA() functions do the conversion.

And, as a side note, even if $_SERVER["REMOTE_ADDR"] should be safe, NEVER FORGET TO ESCAPE THE DATA IN SQL QUERIES!
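
A rough Python sketch of both points, for illustration only: the address is converted to the integer that would be stored in an INT UNSIGNED column, and the value is passed as a bound parameter rather than spliced into the SQL string. The helper names are hypothetical, and the %s placeholder style is an assumption about the driver in use.

```python
import ipaddress

def ipv4_to_int(ip_text):
    """Convert dotted-quad text to the integer stored in an INT UNSIGNED column."""
    return int(ipaddress.IPv4Address(ip_text))

def banned_lookup(ip_text):
    """Return (sql, params) for the ban check; the driver binds the value safely."""
    sql = "SELECT is_banned FROM security.ip WHERE ip = %s"
    return sql, (ipv4_to_int(ip_text),)
```

ipaddress.IPv4Address raises a ValueError for malformed input, which doubles as validation before the query runs.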

MYSQL - NOT vs var=false

11 votes

In the past few days I noticed something weird while optimizing my query. I have a simple query which does something like:

   SELECT id,name,amount FROM reservations WHERE NOT canceled ORDER BY name ASC

I noticed MySQL wasn't using any index, so I started experimenting. By accident I replaced "NOT canceled" with "canceled=false", and MySQL started using the index on "canceled". After that I tried the opposite:

   SELECT ... FROM reservations WHERE canceled ORDER BY ...

Same result! When I change that to "canceled=true" the index works again.

My question is: HOW COME?! Isn't using "NOT" the "elegant" way? In any case, I didn't expect it to make any difference.

I'm using InnoDB as the engine, but I get the same result using MyISAM. Can someone clear this up? Thanks.

Edit: Table structure

CREATE TABLE `reservations` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `trip_code` varchar(10) DEFAULT NULL,
  `departure_date` date DEFAULT NULL,
  `amount` float DEFAULT NULL,
  `name` varchar(45) DEFAULT NULL,
  `canceled` tinyint(1) NOT NULL DEFAULT '0',
  `created_date` date NOT NULL,
  `creator_user` int(11) NOT NULL DEFAULT '1',
  `last_update_user` int(11) NOT NULL DEFAULT '1',
  PRIMARY KEY (`id`),
  KEY `trip_code` (`trip_code`),
  KEY `departure_date` (`departure_date`),
  KEY `created_date` (`created_date`),
  KEY `canceled` (`canceled`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=123181 ;

I am not familiar with MySQL, but thinking logically, I understand it like this:
An index is like a phone book: when you search for "Cohen", you can find it right away.
But if you are looking for NOT "Cohen", you have to go over every entry and check whether it differs from "Cohen".
So when you look for a specific value, the index is searched for just that value. When you use NOT, it has to look for any other value that can fit inside tinyint(1) (as I understand it, that's not only 1 or 0, is it?).

coupon code statistics

10 votes

I have a system that allows users to enter coupons (as Dell or Amazon do, for example).

I want to know how many times a coupon has been used, so that after 500 uses the coupon is no longer valid.

My current plan is to insert each coupon use into a table and then select count(*); if the count has reached the limit (500), I don't insert and I return a message saying the coupon has expired.

I think this solution works but I wanted to validate and maybe check if there's a better way to do it.

I assume you have coupons stored in some table with code, description, and so on. Add a num_used column to it and increment it every time the coupon is used. This way you can still insert the coupon into orders if need be, but you don't need to do counts. You simply check how many times it has already been used when you load the coupon info. I assume you check validity by coupon code anyway, so add one more condition for the number of uses.
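
One way to sketch the num_used idea so that the check and the increment happen in a single statement (which avoids two concurrent requests both passing the check). This is an illustration under assumptions: Python's built-in sqlite3 stands in for MySQL, and the coupons table layout and the max_uses column are hypothetical additions to what the answer describes.

```python
import sqlite3

def redeem_coupon(conn, code):
    """Increment num_used only while it is below max_uses; return True on success."""
    cur = conn.execute(
        "UPDATE coupons SET num_used = num_used + 1 "
        "WHERE code = ? AND num_used < max_uses",
        (code,),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows changed: unknown code or limit reached

def make_demo_db():
    """Create an in-memory database with one coupon limited to 2 uses."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE coupons (code TEXT PRIMARY KEY, "
        "num_used INTEGER NOT NULL DEFAULT 0, max_uses INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO coupons (code, max_uses) VALUES ('SAVE10', 2)")
    return conn
```

The same UPDATE statement should work unchanged in MySQL, with the affected-row count playing the role of rowcount.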

Why does "_" (underscore) match "-" (hyphen)?

10 votes

I have to look for some PDF manuals using this query:

root@localhost:test> select * from a where name like '%taz_manual%.pdf%';
+--------------------+------------------+-------------+
| name               | description      |        size | 
+--------------------+------------------+-------------+
| taz-manual-1.1.pdf | Manual v1.0 TA-Z |    31351902 |
| taz-manual-0.2.pdf | Manual v1.0 T1-A |     3578278 |
| taz_manual-2.0.pdf | Manual v2.0 GA-X |   542578278 |
etc........
+--------------------+------------------+-------------+
132 row in set (0.00 sec)

Now, why am I seeing the ones with dashes when I specify the name as taz_manual%.pdf?

Thanks

Because the underscore (_) is a wildcard like the percent sign (%), except that it matches exactly one character.

SQL pattern matching enables you to use "_" to match any single character and "%" to match an arbitrary number of characters (including zero characters).

(From section 3.3.4.7. Pattern Matching in the MySQL documentation.)

If you want to use the underscore in LIKE as a literal, you have to escape it:

select * from a where name like '%taz\_manual%.pdf%';
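
When the search term comes from user input, the escaping can be done in application code before the pattern is bound as a query parameter. A small Python sketch, assuming the default backslash escape character; the helper names escape_like and like_contains are hypothetical.

```python
def escape_like(term, escape_char="\\"):
    """Escape %, _ and the escape character itself so LIKE treats them literally."""
    for ch in (escape_char, "%", "_"):  # the escape character must be handled first
        term = term.replace(ch, escape_char + ch)
    return term

def like_contains(term):
    """Build a 'contains' pattern intended to be passed as a bound parameter."""
    return "%" + escape_like(term) + "%"
```

For example, like_contains("taz_manual") yields a pattern in which the underscore only matches a literal underscore.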

Spawning a "Pokemon" type graph in php from MYSQL data

9 votes

If you remember when you check your pokemon's stats (In Red, Blue, Yellow, Silver, Gold etc.), the graph that it generates scales based on the HP, attack, defense, special attack, and special defense of your pokemon.

I want to generate a graph such as this using PHP and MYSQL for in-office personnel data. I thought it would be cool to create a stat card/sheet for each employee showing their strengths and weaknesses based on their experience with different subjects of their work.

Anyone have an idea how I may accomplish this? If it can't be done w/PHP what can I use?

These are called Radar/Spider Charts and there are a variety of libraries and tools you can use to make them.

Unique record mysql on inserts

9 votes

I want to implement a new system for referrals/promotions. In the new system, when a user registers, a specific source is assigned to him. A source can be any combination of a referrer (a client's domain) and a promotion code (e.g. 50% off). The source also has a date (y-m-d). Each source id is unique, and multiple users can have the same source id.

For example:

source id : 1 = www.domain1.com / 2011-11-20 / (empty referal code) 
source id : 2 = www.domain1.com / 2011-11-20 / referalcode1
source id : 3 = www.domain2.com / 2011-11-20 / referalcode1
source id : 4 = www.domain2.com / 2011-11-20 / referalcode2

The same referral code can be shared between client 1 and client 2.

How can I make sure that when someone registers (registration is free and we have over 1000 per hour now) we don't get duplicate records and risk MySQL generating errors?

From the MySQL documentation page:

The main idea is to insert records using the IGNORE keyword. If you use the IGNORE keyword, errors that occur while executing the INSERT statement are treated as warnings instead. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row still is not inserted, but no error is issued. Then retrieve the last ID using mysql_insert_id.

If the record does not exist, it is inserted and the last ID is returned.

If no row was inserted, the record already exists, so run a SELECT to fetch its ID and simply use it.

Example:

// make sure the values are safe (preventing SQL injection)
$sql = "INSERT IGNORE INTO promotions_referrals SET `field1` = 'value1' ... "; 
mysql_query($sql);

if(mysql_affected_rows() == 1) { // no record found and then the inserts worked
  $id = mysql_insert_id(); // id from the primary key
  // add to cache
}
else {
  // you can also cache them when they were inserted (faster to run than a select statement)
  $id = retreiveFromCache($field1, $field2 /* etc...*/);
  if(!$id) { // no record found in cache
    // now the select can be done using the fields received
    // make sure you use the right index, otherwise it can be slow
    $sql = "SELECT id FROM promotions_referrals WHERE `field1` = 'value1' ... "; // 
    $query = mysql_query($sql);
    $row = mysql_fetch_assoc($query);
    $id = $row['id'];
    // add to cache
  }
}

At this point you will have $id assigned and you are sure there's no duplicate/mysql errors.

Note: if a domain belongs to one client, use the client id instead of using the domain. This way you will use INT as index which is faster than VARCHAR. Same idea for referral code.
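
The insert-then-fall-back-to-select flow can be tried out end to end with a small Python sketch. It is an illustration under assumptions: the built-in sqlite3 module stands in for MySQL (SQLite spells the statement INSERT OR IGNORE), and the sources table, its columns, and the function names are hypothetical. The approach relies on a UNIQUE index over the columns that define a source.

```python
import sqlite3

def open_demo_db():
    """Create an in-memory table with a UNIQUE index over the source columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sources ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "domain TEXT NOT NULL, day TEXT NOT NULL, "
        # '' rather than NULL for "no code": unique indexes allow repeated NULLs
        "code TEXT NOT NULL DEFAULT '', "
        "UNIQUE (domain, day, code))"
    )
    return conn

def get_or_create_source(conn, domain, day, code=""):
    """Return the id of the matching source, inserting it if it is new."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO sources (domain, day, code) VALUES (?, ?, ?)",
        (domain, day, code),
    )
    if cur.rowcount == 1:  # the row was new and has just been inserted
        return cur.lastrowid
    row = conn.execute(
        "SELECT id FROM sources WHERE domain = ? AND day = ? AND code = ?",
        (domain, day, code),
    ).fetchone()
    return row[0]
```

Calling get_or_create_source twice with the same arguments returns the same id and leaves a single row in the table.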

php unique identifier for clients

8 votes

I have a unique identifier that I generate every time a client requests support.

Once the id is generated, it is inserted into the database.

Here is my code:

function makeUnique() {
 $start_time = uniqid(microtime(1));
 $duration = sprintf('%0.24f', $start_time);
 return date('dmYHis').$duration;
}

echo makeUnique();

This outputs: 071120112032291320715949.928639888763427734375000

For some reason I get 071120112032291320715949 as the number. What am I doing wrong?

You need to remove the "." (dot) in your function. You are probably storing the value in an INT column, which drops the digits after the dot.

return str_replace('.', '', date('dmYHis').$duration);

Also make sure the field is big enough, e.g. varchar(50).

My recommendation would be to simply hash the client id using md5:

md5($clientid)

and make the MySQL field a char(32).

How to find all the tags of each page and then concat the tags into one column?

8 votes

How can I find all the tags of each page and then concat the tags into one column?

For an example,

pages table,

page_id 
1
2
3
4

tags table,

tag_id      tag_name
1           A
2           B
3           C
4           D

tagged table,

page_id     tag_id
1           1
1           2
1           3
3           2
3           3

My working query,

SELECT
    p.page_id,
    GROUP_CONCAT(t.tag_name ORDER BY t.tag_name ASC SEPARATOR ",") AS tags

FROM root_pages AS p

LEFT JOIN root_mm_pages_tags AS m
ON p.page_id = m.page_id

LEFT JOIN root_tags AS t
ON t.tag_id = m.tag_id

WHERE p.page_id = t.page_id

This is the result I am after,

page_id         tags
1               A, B, C
2               NULL
3               B, C
4               NULL

Try it without WHERE p.page_id = t.page_id (the tags table has no page_id column) and add GROUP BY p.page_id. Keep the second join as a LEFT JOIN: a regular (inner) join there would drop pages 2 and 4, which have no tags and should show NULL.

SELECT
    p.page_id,
    GROUP_CONCAT(t.tag_name ORDER BY t.tag_name ASC SEPARATOR ",") AS tags

FROM root_pages AS p

LEFT JOIN root_mm_pages_tags AS m
ON p.page_id = m.page_id

LEFT JOIN root_tags AS t
ON t.tag_id = m.tag_id

GROUP BY p.page_id
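
To see how the type of the last join affects pages without tags, the sample data from the question can be loaded into a throwaway database and queried both ways. This is only a sketch: Python's sqlite3 stands in for MySQL, SQLite's two-argument GROUP_CONCAT replaces the MySQL ORDER BY/SEPARATOR form, and the function name tags_per_page is hypothetical.

```python
import sqlite3

QUERY = """
SELECT p.page_id, GROUP_CONCAT(t.tag_name, ',') AS tags
FROM root_pages AS p
LEFT JOIN root_mm_pages_tags AS m ON p.page_id = m.page_id
{join} root_tags AS t ON t.tag_id = m.tag_id
GROUP BY p.page_id
ORDER BY p.page_id
"""

def tags_per_page(join):
    """Run the query with the given join keyword for the tags table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE root_pages (page_id INTEGER PRIMARY KEY);
        CREATE TABLE root_tags (tag_id INTEGER PRIMARY KEY, tag_name TEXT);
        CREATE TABLE root_mm_pages_tags (page_id INTEGER, tag_id INTEGER);
        INSERT INTO root_pages VALUES (1), (2), (3), (4);
        INSERT INTO root_tags VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
        INSERT INTO root_mm_pages_tags VALUES (1, 1), (1, 2), (1, 3), (3, 2), (3, 3);
    """)
    rows = conn.execute(QUERY.format(join=join)).fetchall()
    # Sort in Python because the concatenation order is not fixed in this form.
    return {page: (sorted(tags.split(",")) if tags else None) for page, tags in rows}
```

With "LEFT JOIN" all four pages come back, and pages 2 and 4 carry no tags; with a plain "JOIN" only pages 1 and 3 are returned.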

MySQL, C++ - Programmatically, How does MySQL Autoincrement Work?

7 votes

From the latest source code (not certain if it's C or C++) of MySQL, how does it do an autoincrement? I mean, is it efficient in that it stores like a metadata resource on the table where it last left off, or does it have to do a table scan to find the greatest ID in use in the table? Also, do you see any negative aspects of using autoincrement when you look at how it's implemented versus, say, PostgreSQL?

That depends on which storage engine the table uses. InnoDB keeps the largest value in memory rather than on disk, which is very efficient. I would guess most engines do something similar, but I cannot guarantee it.

InnoDB's auto-increment runs the query below once when the database is loaded and stores the result in memory:

 SELECT MAX(ai_col) FROM t FOR UPDATE;

Comparing that to PostgreSQL, which has no auto_increment keyword, depends on how you would implement the field yourself. (At least it lacked one the last time I used it; that may have changed.) Most people would create a SEQUENCE, which appears to be stored in an in-memory pseudo-table. I take InnoDB's approach to be the simpler, better way, and I'd guess InnoDB is the more efficient of the two if they are not equal.

Updating massive number of records -- performance optimization

6 votes

I have a baseball tool that allows users to analyze a player's historical batting stats. For example, how many hits does A-Rod have over the past 7 days during night-time conditions? I want to expand the timeframe so a user can analyze a player's batting stats to as far back as 365 days. However, doing so requires some serious performance optimization. Here are my current set of models:

class AtBat < ActiveRecord::Base
  belongs_to :batter
  belongs_to :pitcher
  belongs_to :weather_condition

  ### DATA MODEL ###
  # id
  # batter_id
  # pitcher_id
  # weather_condition_id
  # hit (boolean)
  ##################
end

class BattingStat < ActiveRecord::Base
  belongs_to :batter
  belongs_to :recordable, :polymorphic => true # e.g., Batter, Pitcher, WeatherCondition

  ### DATA MODEL ###
  # id
  # batter_id
  # recordable_id
  # recordable_type
  # hits7
  # outs7
  # at_bats7
  # batting_avg7
  # ...
  # hits365
  # outs365
  # at_bats365
  # batting_avg365
  ##################
end

class Batter < ActiveRecord::Base
  has_many :batting_stats, :as => :recordable, :dependent => :destroy
  has_many :at_bats, :dependent => :destroy
end

class Pitcher < ActiveRecord::Base
  has_many :batting_stats, :as => :recordable, :dependent => :destroy
  has_many :at_bats, :dependent => :destroy
end

class WeatherCondition < ActiveRecord::Base
  has_many :batting_stats, :as => :recordable, :dependent => :destroy
  has_many :at_bats, :dependent => :destroy
end

For the sake of keeping my question at a reasonable length, let me narrate what I am doing to update the batting_stats table instead of copying a bunch of code. Let's start with 7 days.

  1. Retrieve all the at_bat records over the past 7 days.
  2. Iterate over each at_bat record…
  3. Given an at_bat record, grab the associated batter and associated weather_condition, find the correct batting_stat record (BattingStat.find_or_create_by_batter_and_recordable(batter, weather_condition)), then update the batting_stat record.
  4. Repeat Step 3 for batter and pitcher (recordables).

Steps 1-4 are repeated for other time periods as well -- 15 days, 30 days, etc.

Now imagine how laborious it would be to run a script every day to make these updates if I were to expand the time periods from a manageable 7/15/30 to 7/15/30/45/60/90/180/365.

So my question is how would you approach getting this to run at the highest levels of performance?

ActiveRecord isn't really meant for bulk processing like this. You're probably better off doing your batch updates by dropping into SQL proper and doing an INSERT ... SELECT (or perhaps using a gem that does this for you).
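
A rough sketch of what such a set-based update could look like, using conditional aggregation so that all time windows are computed in one pass over at_bats. Everything beyond the column names in the question is an assumption: Python's sqlite3 stands in for MySQL, played_on is a hypothetical date column on at_bats, only two windows and the weather-condition recordable are shown, and the stats rows are rebuilt from scratch rather than updated in place.

```python
import sqlite3
from datetime import date, timedelta

SCHEMA = """
CREATE TABLE at_bats (
    batter_id INTEGER, weather_condition_id INTEGER,
    hit INTEGER, played_on TEXT  -- played_on is a hypothetical date column
);
CREATE TABLE batting_stats (
    batter_id INTEGER, recordable_id INTEGER, recordable_type TEXT,
    hits7 INTEGER, at_bats7 INTEGER, hits365 INTEGER, at_bats365 INTEGER
);
"""

def rebuild_weather_stats(conn, today, windows=(7, 365)):
    """Recompute every batter/weather row with a single INSERT ... SELECT."""
    cols, exprs, params = [], [], []
    for days in windows:
        cutoff = (today - timedelta(days=days)).isoformat()
        cols += [f"hits{days}", f"at_bats{days}"]
        exprs += [
            "SUM(CASE WHEN played_on > ? THEN hit ELSE 0 END)",
            "SUM(CASE WHEN played_on > ? THEN 1 ELSE 0 END)",
        ]
        params += [cutoff, cutoff]
    oldest = (today - timedelta(days=max(windows))).isoformat()
    conn.execute("DELETE FROM batting_stats WHERE recordable_type = 'WeatherCondition'")
    conn.execute(
        "INSERT INTO batting_stats "
        f"(batter_id, recordable_id, recordable_type, {', '.join(cols)}) "
        f"SELECT batter_id, weather_condition_id, 'WeatherCondition', {', '.join(exprs)} "
        "FROM at_bats WHERE played_on > ? "
        "GROUP BY batter_id, weather_condition_id",
        params + [oldest],
    )
    conn.commit()
```

The database does the grouping, so the cost is one scan of the last 365 days of at_bats per recordable type instead of one find-or-create and one update per at-bat per window.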

Query producing unexpected results (sha1)

6 votes

I have a form for updating user data. It posts to this page:

<?php
    //Update user table
    session_start();
    include 'sql_connect_R.inc.php';

    $id = mysql_real_escape_string($_POST['userID']);
    $password = mysql_real_escape_string($_POST['user_passwrd']);

    $salt = time();
    $hash = sha1($password . $salt);

    mysql_query("UPDATE users SET user_passwrd = '$hash', stamp = '$salt', pending = 'yes'
    WHERE userID = '$id'");

    mysql_close($con);
?>

(I have edited out the things not pertinent to this question)

I believe what is happening is when the 'stamp' field is being populated with the $salt it is getting a different value than when the $hash is being calculated. Therefore, when a user signs in and is checked here:

$qry="SELECT * FROM users WHERE userlogin = '$login' AND user_passwrd = sha1(CONCAT('$password', stamp))";
    $result=mysql_query($qry);
    $row = mysql_fetch_assoc($result);
    $num = mysql_num_rows($result);

When I echo $num it returns a value of 0. I'm wondering if there is a way to ensure that the value of $salt remains the same when it is being used in $hash and then when it is updating the field 'stamp'. Can anyone help me with this or point me in the right direction? Thanks in advance. Cheers

Thank you for all comments. I want to report that I've 'solved' the problem. I had made a change in the name of the password input field late one night and neglected to change the $_POST value. What this did, of course, was not supply the $password value to the $hash. Though I'm embarrassed about this, I think it is important for me to share my oversight to exemplify how important it is to check ALL places where errors can occur. I failed to double-check everything and made incorrect assumptions about the nature of the problem. The code worked fine, it was the loose screw in front of the keyboard that caused the problems. Cheers

MySQL ALTER TABLE ADD COLUMN AFTER COLUMN - on a large table

6 votes

I want to use the following command to add a column to an existing table:

   ALTER TABLE foo ADD COLUMN bar AFTER COLUMN old_column;

Can this option take substantially longer than the same command without the AFTER COLUMN option, as follows?

   ALTER TABLE foo ADD COLUMN bar;

Will the first command use a greater amount of tmp table space during execution to perform the action?

Context: I have a very large table (think over a billion rows) and I want to add an additional column using the AFTER COLUMN option, but I don't want to be penalized too much.

While the other answers are useful as examples of the syntax required to add columns to a table, the answer to the actual question was provided by N.B.:


You'd get more CPU usage since records would have to be shifted.

From the memory usage point of view - it'd be the same with AFTER COLUMN option and without it.

In most cases, a tmp table is created. There are MySQL engines that support hot schema changes (TokuDB being one) that don't create the tmp table and waste tons of resources.

However, if you're doing this with MyISAM or InnoDB - I'd say that "AFTER COLUMN" option will take slightly more time due to record shifting.

– N.B.

Bash: how to easily edit one line in a 5.4G SQL dump

6 votes

We're running into issues where we need to work on recovering some lost data (yay improper use of maatkit on a master-to-master replication setup), and I'm trying to import an old dump of the database. The problem is that at the top of the file, it's explicitly specifying the database (mysqldump was run with the --all-databases option), and I need to change that DB to something else so I can get another in there alongside it for comparison. The line reads:

CREATE DATABASE /*!32312 IF NOT EXISTS*/ `dms` /*40100 DEFAULT CHARACTER SET latin1 */

I've been unsuccessful at opening the file in vi to edit due to the large size of the dump, and am a little hesitant to use sed because of how it's going to read line-by-line for any pattern matching. What is the easiest and most efficient way to change the above line in the sql dump to read:

CREATE DATABASE /*!32312 IF NOT EXISTS*/ `dms_old` /*40100 DEFAULT CHARACTER SET latin1 */

Or simply use some mysql black magic to import it into database dms_old?

sed should have no problems with that. Just do:

`sed '/CREATE DATABASE/s/dms/dms_old/'`

with appropriate redirections.
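
If sed is not at hand, the same streaming edit can be sketched in a few lines of Python. Like sed, it reads one line at a time, so the dump never has to fit in memory as a whole (a single very long line is still read whole). The function name rename_database is hypothetical, and rewriting the USE line as well is an assumption based on how mysqldump --all-databases normally writes its output.

```python
import io

def rename_database(src, dst, old=b"dms", new=b"dms_old"):
    """Copy a dump from src to dst line by line, renaming the database.

    Only CREATE DATABASE and USE lines are touched; data lines pass through.
    """
    old_quoted = b"`" + old + b"`"
    new_quoted = b"`" + new + b"`"
    for line in src:  # binary file objects iterate one line at a time
        if line.startswith((b"CREATE DATABASE", b"USE ")):
            line = line.replace(old_quoted, new_quoted)
        dst.write(line)

# Demo on an in-memory "dump"; for a real file, open both paths in binary mode.
demo_in = io.BytesIO(
    b"CREATE DATABASE /*!32312 IF NOT EXISTS*/ `dms` /*40100 DEFAULT CHARACTER SET latin1 */\n"
    b"USE `dms`;\n"
    b"INSERT INTO `t` VALUES ('`dms`');\n"
)
demo_out = io.BytesIO()
rename_database(demo_in, demo_out)
```

Matching on the backquoted name keeps the replacement from touching other identifiers that merely contain the letters dms.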

How do I know if a Prepared Statement is being Cached?

6 votes

I have been reusing the same variable $stmt in my PHP script to write prepared statements:

$stmt = $dbh->prepare("SELECT column_A FROM Table1 WHERE id=?");
$stmt->bindValue(1, $id, PDO::PARAM_INT);
$stmt->execute();
....

$stmt = $dbh->prepare("UPDATE Table2 SET column_B=? WHERE column_A=?");
$stmt->bindValue(1, $name);
$stmt->bindValue(2, $column_A);
$stmt->execute();

My question is, how do I know if the two statements are being written to cache and that the second statement did not overwrite the first statement though both statements are sharing the same variable name?

Statements are prepared by the database engine, not by PHP.

So reusing the same variable name in PHP won't invalidate the MySQL prepare "cache".

Difficulties with setting up a difficult mySQL query

6 votes

I am trying to make a query to get data from a database, but I can't get it done.

I have the following table with dummy data:

id | date  

0  | 2011-11-25 20:12:32
1  | 2011-11-15 20:12:32
2  | 2011-11-05 20:12:32
3  | 2011-10-25 20:12:32
4  | 2011-10-15 20:12:32
5  | 2011-10-05 20:12:32
6  | 2010-10-25 20:12:32
7  | 2010-04-25 20:12:32
8  | 2009-07-25 20:12:32

I want to make a query that:

  • Sorts the dates by year, then month, then day
  • Then counts how many rows there are for every month in each of these years.

If you don't get what I mean: It should give a result something like this (using the dummy data from the table):


Year | Month | amount (of rows with that month in that year)

2011 | 11    | 3
2011 | 10    | 3
2010 | 4     | 2
2009 | 7     | 1

I have some knowledge of MySQL, but this is too much for me. :)

Thanks in advance

SELECT YEAR(`date`)  AS `year`, 
       MONTH(`date`) AS `month`, 
       COUNT(*)      AS amount 
FROM   `table` 
GROUP  BY YEAR(`date`), 
          MONTH(`date`) 
ORDER  BY `year` DESC, 
          `month` DESC 
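
As a cross-check of what the grouped query should return for the dummy data, the same grouping can be reproduced in plain Python. This is only an illustration of the expected result, not a replacement for the SQL; the function name rows_per_month is hypothetical.

```python
from collections import Counter
from datetime import datetime

DATES = [
    "2011-11-25 20:12:32", "2011-11-15 20:12:32", "2011-11-05 20:12:32",
    "2011-10-25 20:12:32", "2011-10-15 20:12:32", "2011-10-05 20:12:32",
    "2010-10-25 20:12:32", "2010-04-25 20:12:32", "2009-07-25 20:12:32",
]

def rows_per_month(date_strings):
    """Count rows per (year, month), newest month first."""
    counts = Counter()
    for text in date_strings:
        d = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
        counts[(d.year, d.month)] += 1
    return sorted(((y, m, n) for (y, m), n in counts.items()), reverse=True)
```

For the dummy rows this gives one line per distinct month, so the two 2010 rows (October and April) appear as separate groups with a count of 1 each.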

can't update the attribute with ActiveRecord

5 votes

I want to swap the content in the answers table with ActiveRecord.
Code 1:

Archieve::Answer.find_each do |answer|
  str = answer.content
  dosomething() #change the value
  answer.update_attribute(:content,str)
end

But it doesn't change the value of content.

Code 2:

Archieve::Answer.find_each do |answer|
  str = answer.content
  dosomething() #change the value
  answer.reload
  answer.update_attributes(
    :content => str
  )
end

Before updating the :content attribute, I reload the record every time.
This does change the value.
Why?
What's the difference between code 1 and code 2?
Source Code

###1 Post Debug Message:
Updated Post:

Changed?: false 
valid?: true 
errors: #<ActiveModel::Errors:0xa687568> 
errors: #<ActiveModel::Errors:0xa687568 @base=#<Archieve::Answer id: 9997190932758339, user_id: 4163690810052834, question_id: 3393286738785869, content: "狗狗生病,好可怜呀,", is_correct: false, votes_count: 0, comments_count: 0, created_at: "2011-11-06 18:38:53", updated_at: "2011-11-06 18:38:53">, @messages={}>

possible ActiveRecord 3.1.1 bug

The OP mentioned to me that he uses require "active_record" in a stand alone script (not using rails runner). There is no separate Rails application for his task, he just uses a script. This is not necessarily bad, and has worked in earlier ActiveRecord versions, e.g. 2.x AFAIK -- maybe this is a regression in Rails 3.1 due to a new dependency?

# the OP's require statements:
require 'rubygems'
require 'logger'
require 'yaml'
require 'uuidtools'
require 'active_record'

complete code here: https://raw.github.com/Zhengquan/Swap_Chars/master/lib/orm.rb

maybe a dependency is missing, or problem with AR 3.1.1 when initialized stand alone?

It could be a bug actually

It could be that update_attribute() triggers a bug in the dirty-tracking of attributes, which then incorrectly assumes that the object has not changed, and as a result it will not be persisted, although the implementation of update_attribute() calls save() (see code fragment below).

I've seen something like this with an older version of Mongoid -- could be that there is a similar hidden bug in your ActiveRecord version for update_attribute()

In the Rails Console monkey-patch update_attribute like this:

class ActiveRecord::Base
  def update_attribute(name, value) # make sure you use the exact code of your Rails Version here
    send(name.to_s + '=', value)
    puts "Changed?: #{changed?}"        # this produced false in the OP's scenario
    puts "valid?: #{valid?}"
    puts "errors: #{errors.inspect}"
    save
  end
end

then try to run your Code 1 again...

You shouldn't see "Changed?: false". If it prints false although you changed the attribute, then there is a bug in your ActiveRecord version and you should report it.

Code 1:

NOTE: check the definition of update_attribute() (singular) here: (please read the fine-print regarding validations -- it doesn't sound like a good idea to use that method)

http://ar.rubyonrails.org/classes/ActiveRecord/Base.html#M000400

See also:

Rails: update_attribute vs update_attributes

The source code for update_attribute() looks like this:

      def update_attribute(name, value)
        send(name.to_s + '=', value)
        save
      end

it could fail if there is a bug with the dirty-tracking of attributes...

Code 2:

The second code looks correct.

There are a couple of things to also consider:

1) which attributes did you define as accessible, via attr_accessible ?

e.g. only accessible attributes will be updated via update_attributes()

http://apidock.com/rails/ActiveRecord/Base/update_attributes

2) which validations do you use?

are you sure the validations pass for the record when you call update_attribute?

See also:

http://guides.rubyonrails.org/active_record_querying.html

http://m.onkey.org/active-record-query-interface

http://api.rubyonrails.org/classes/ActiveRecord/Base.html

Why is mySQL query, left join 'considerably' faster than my inner join

5 votes

I've researched this, but I still cannot explain why:

SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
INNER JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = 23155

Is significantly slower than:

SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
LEFT JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = 23155

115ms Vs 478ms. They are both using InnoDB and there are relationships defined. The 'card_legality' contains approx 200k rows, while the 'legality' table contains 11 rows. Here is the structure for each:

CREATE TABLE `card_legality` (
  `card_id` varchar(8) NOT NULL DEFAULT '',
  `legality_id` int(3) NOT NULL,
  `cl_boolean` tinyint(1) NOT NULL,
  PRIMARY KEY (`card_id`,`legality_id`),
  KEY `legality_id` (`legality_id`),
  CONSTRAINT `card_legality_ibfk_2` FOREIGN KEY (`legality_id`) REFERENCES `legality` (`legality_id`),
  CONSTRAINT `card_legality_ibfk_1` FOREIGN KEY (`card_id`) REFERENCES `card` (`card_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

And:

CREATE TABLE `legality` (
  `legality_id` int(3) NOT NULL AUTO_INCREMENT,
  `l_name` varchar(16) NOT NULL DEFAULT '',
  PRIMARY KEY (`legality_id`)
) ENGINE=InnoDB AUTO_INCREMENT=12 DEFAULT CHARSET=latin1;

I could simply use LEFT-JOIN, but it doesn't seem quite right... any thoughts, please?

UPDATE: As requested, I've included the results of explain for each. I had run it previously, but I dont pretend to have a thorough understanding of it..

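+----+-------------+-------+--------+---------------+---------+---------+-------------------------------+--------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                           | rows   | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-------------------------------+--------+-------------+
|  1 | SIMPLE      | cl    | ALL    | PRIMARY       | NULL    | NULL    | NULL                          | 199747 | Using where |
|  1 | SIMPLE      | l     | eq_ref | PRIMARY       | PRIMARY | 4       | hexproof.co.uk.cl.legality_id |      1 |             |
+----+-------------+-------+--------+---------------+---------+---------+-------------------------------+--------+-------------+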

AND, inner join:

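+----+-------------+-------+------+---------------------+-------------+---------+------------------------------+-------+-------------+
| id | select_type | table | type | possible_keys       | key         | key_len | ref                          | rows  | Extra       |
+----+-------------+-------+------+---------------------+-------------+---------+------------------------------+-------+-------------+
|  1 | SIMPLE      | l     | ALL  | PRIMARY             | NULL        | NULL    | NULL                         |    11 |             |
|  1 | SIMPLE      | cl    | ref  | PRIMARY,legality_id | legality_id | 4       | hexproof.co.uk.l.legality_id | 33799 | Using where |
+----+-------------+-------+------+---------------------+-------------+---------+------------------------------+-------+-------------+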

It is because card_id is a varchar. MySQL can't use the index on card_id when the column is compared with a number, as described in the MySQL documentation on type conversion. The important part is:

For comparisons of a string column with a number, MySQL cannot use an index on the column to look up the value quickly. If str_col is an indexed string column, the index cannot be used when performing the lookup in the following statement:

SELECT * FROM tbl_name WHERE str_col=1;

The reason for this is that there are many different strings that may convert to the value 1, such as '1', ' 1', or '1a'.

If you change your queries to

SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
INNER JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = '23155'

and

SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
LEFT JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = '23155'

You should see a huge improvement in speed and also see a different EXPLAIN.

Here is a similar (but easier) test to show this:

> desc id_test;
+-------+------------+------+-----+---------+-------+
| Field | Type       | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
| id    | varchar(8) | NO   | PRI | NULL    |       |
+-------+------------+------+-----+---------+-------+
1 row in set (0.17 sec)

> select * from id_test;
+----+
| id |
+----+
| 1  |
| 2  |
| 3  |
| 4  |
| 5  |
| 6  |
| 7  |
| 8  |
| 9  |
+----+
9 rows in set (0.00 sec)

> explain select * from id_test where id = 1;
+----+-------------+---------+-------+---------------+---------+---------+------+------+--------------------------+
| id | select_type | table   | type  | possible_keys | key     | key_len | ref  | rows | Extra                    |
+----+-------------+---------+-------+---------------+---------+---------+------+------+--------------------------+
|  1 | SIMPLE      | id_test | index | PRIMARY       | PRIMARY | 10      | NULL |    9 | Using where; Using index |
+----+-------------+---------+-------+---------------+---------+---------+------+------+--------------------------+
1 row in set (0.00 sec)


> explain select * from id_test where id = '1';
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table   | type  | possible_keys | key     | key_len | ref   | rows | Extra       |
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------------+
|  1 | SIMPLE      | id_test | const | PRIMARY       | PRIMARY | 10      | const |    1 | Using index |
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------------+
1 row in set (0.00 sec)

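In the first case the access type is index with Using where; Using index (all 9 index entries are scanned), while in the second it is const with Using index (a single-row lookup). Also, ref is NULL in the first case and const in the second. Needless to say, the second one is better.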

Order of condition execution in MySQL

5 votes

Suppose I have a MySQL query with two conditions:

SELECT * FROM `table` WHERE `field_1` = 1 AND `field_2` LIKE '%term%';

The first condition is obviously going to be a lot cheaper than the second, so I'd like to be sure that it runs first, limiting the pool of rows which will be compared with the LIKE clause. Do MySQL query conditions run in the order they're listed or, if not, is there a way to specify order?

MySQL has an internal query optimizer that takes care of such things in most cases. So, typically, you don't need to worry about it.

But, of course, the query optimizer is not foolproof. So...

Sorry to do this to you, but you'll want to get familiar with EXPLAIN if you suspect that a query may be running less efficiently than it should.

http://dev.mysql.com/doc/refman/5.0/en/explain.html

How to find out if store open or close - dealing with hours?

4 votes

What is the best way to store shop opening and closing time in the database and also how to calculate the time in PHP?

I have come up with this table design:

+----+---------+----------+-----------+------------+
| id | shop_id | week_day | open_hour | close_hour |
+----+---------+----------+-----------+------------+
|  1 |       3 |        1 | 15:00:00  | 23:00:00   |
|  2 |       3 |        2 | 15:00:00  | 23:00:00   |
|  3 |       3 |        3 | 18:00:00  | 02:00:00   |
|  4 |       3 |        4 | 18:00:00  | 02:00:00   |
|  5 |       3 |        5 | 18:00:00  | 03:00:00   |
+----+---------+----------+-----------+------------+

+------------+---------+------+-----+---------+----------------+
| Field      | Type    | Null | Key | Default | Extra          |
+------------+---------+------+-----+---------+----------------+
| id         | int(11) | NO   | PRI | NULL    | auto_increment |
| shop_id    | int(11) | NO   |     | NULL    |                |
| week_day   | int(11) | NO   |     | NULL    |                |
| open_hour  | time    | NO   |     | NULL    |                |
| close_hour | time    | NO   |     | NULL    |                |
+------------+---------+------+-----+---------+----------------+

For example, on Tuesday (week_day = 2) it opens at 3PM and closes at 11PM (Tuesday).

On Wednesday (week_day = 3), it opens at 6PM and closes after midnight at 2AM, which would be Thursday. How should a closing time at or after midnight (00:00:00 or later) be stored in the database?

Let's say a customer wants to place an order (shop_id = 3) at 10PM on Tuesday; they should be able to do so according to the database data. However, if a customer wants to place an order at 1AM on Thursday, the database shows that the row covering that time is week_day = 3, closing at 02:00:00.

How do I write PHP to work out whether the shop is open or not? It seems complicated!

Do I need to change the table design so it would be much easier to write in PHP?

You can consider making the table information truly correct by splitting each overnight range at midnight into two rows:

+----+---------+----------+-----------+------------+
| id | shop_id | week_day | open_hour | close_hour |
+----+---------+----------+-----------+------------+
|  1 |       3 |        1 | 15:00:00  | 23:00:00   |
|  2 |       3 |        2 | 15:00:00  | 23:00:00   |
|  3 |       3 |        3 | 18:00:00  | 23:59:59   |
|  4 |       3 |        4 | 00:00:00  | 02:00:00   |
|  5 |       3 |        4 | 18:00:00  | 23:59:59   |
|  6 |       3 |        5 | 00:00:00  | 02:00:00   |
|  7 |       3 |        5 | 18:00:00  | 23:59:59   |
|  8 |       3 |        6 | 00:00:00  | 03:00:00   |
+----+---------+----------+-----------+------------+

Then use a query of the following kind (this one is for Tuesday 10:00PM, as you mentioned):

SELECT count(*) FROM `shop` 
 WHERE shop_id=3 
 and week_day=2 
 and open_hour<='22:00:00' 
 and close_hour>='22:00:00'
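
An alternative that keeps the original table design is to handle the overnight case in application code: a close_hour that is earlier than open_hour is read as "closes the next day". The sketch below is in Python rather than PHP, purely to illustrate the logic. It assumes week_day runs from 1 to 7 and wraps around, and the function name is_open is hypothetical.

```python
from datetime import time

# (week_day, open_hour, close_hour) for one shop, as in the question's table;
# a close_hour earlier than open_hour means "closes the next day".
HOURS = [
    (1, time(15), time(23)),
    (2, time(15), time(23)),
    (3, time(18), time(2)),
    (4, time(18), time(2)),
    (5, time(18), time(3)),
]

def is_open(hours, week_day, now):
    """Return True if the shop is open on week_day (1..7) at time-of-day now."""
    yesterday = 7 if week_day == 1 else week_day - 1
    for day, opens, closes in hours:
        overnight = closes <= opens
        if day == week_day:
            if not overnight and opens <= now < closes:
                return True
            if overnight and now >= opens:
                return True  # open from opening time until midnight
        if day == yesterday and overnight and now < closes:
            return True  # still inside yesterday's overnight session
    return False
```

With this data, Tuesday at 10PM and Thursday at 1AM both count as open, while Thursday at 3AM does not.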