Best mysql questions in October 2011

Why not specify every VARCHAR as VARCHAR (65535)?

20 votes

Since the storage requirements for a Varchar field are based on the actual length of the string entered, what would be the downside of specifying every Varchar field as the max possible: Varchar (65535)? Well, aside from 1 extra byte for max fields > 255 characters?

[Storage Reqts for strings of length L: L + 1 bytes if column values require 0 – 255 bytes, L + 2 bytes if values may require more than 255 bytes]

Thank you!

I think varchar column lengths are not only about storage. They're about data semantics as well.

I.E. specifying a name column as varchar(100) means that names stored on your system must not be longer than 100 characters.

On the storage side of things, they should be the same. Although, row size estimations would be more accurate with a specific length on varchar columns that without them (without needing a statistics gathering system keeping data distributions on varchar sizes).

Is a 'blackhole' table evil?

14 votes

Reading to this question i've just learned the existence of the blackhole table trick: basically consist in using a single table to insert data, and then a trigger that split the data in many other tables.

Im wondering if this could cause problems, once the developers whos working on the project are aware of that.

What are the pro and cons of this tecnique?

Edit: The blink I got in mind when I saw the example, is about transactions: if for some reason the transaction fail, you'll find the blackhole row with the original data, for historical purpose and maybe a help with debug - but this seems to be the only +1 i can see with blackholes. Ideas?

I don't think blackhole has any real pros.

Writing the trigger code to move data around is probably not noticably less work than writing the code to insert the data in the right place in the first place.

As Christian Oudard writes, it doesn't reduce complexity - just moves it to a place where it's really hard to debug.

On the downside:

"Side effects" are usually a bad idea in software development. Triggers are side effects - I intend to do one thing (insert data in a table), and it actually does lots of other things. Now, when I'm debugging my code, I have to keep all the side effects in my head too - and the side effects could themselves have side effects.

most software spends far more time in maintenance than it does in development. Bringing new developers into the team and explaining the black hole trick is likely to increase the learning curve - for negligible benefit (in my view).

Because triggers are side effects, and it's relatively easy to set off a huge cascade of triggers if you're not careful, I've always tried to design my databases without a reliance on triggers; where triggers are clearly the right way to go, I've only let my most experienced developers create them. The black hole trick makes triggers into a normal, regular way of working. This is a personal point of view, of course.

MySQL `FORCE INDEX` use cases?

10 votes

Almost everywhere I read the use of FORCE INDEX is highly discouraged and I perfectly understand and know why - there are immensely huge chances that MySQL knows better what indexes to choose than the (average) developer.

However, recently I have found a case where FORCE INDEX improved my execution times in the range of hundred of times:

  • JOIN on 4 tables
  • first table has about 500 000 records
  • 2 of the INNER JOINed table have more than 1 milion records
  • first table has a field called published_date, stored as varchar in YMD format (could not changed to datetime)
  • needed a range on published_date of at most 5 000 records
  • some other GROUP BY and ORDER BY clauses on the first table on different fields than published_date were needed for this query

Although I've rewritten the query in many ways, I was not able to get execution times smaller than 130 seconds (with highest being over 700 ). After using FORCE INDEX with published_date, the execution time dropped below 5 seconds.

It took me a few days to remember about the infamous FORCE INDEX option.

Questions:

  • What are other use cases you found where FORCE INDEX saved you?
  • Do you have some best practices when you consider using FORCE INDEX?

Edit - Obsevations: I created this blog post with the question here also. All the answer you provide will appear there too - whith credits and all the stuff you want.

Edit 2

I applied the suggestions I received in your comments (ANALYZE TABLE and OPTIMIZE TABLE), below is the output of EXPLAIN applied on query - unfortunately the index selection is not at all better:

1. without FORCE INDEX on table with alias a:

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   SIMPLE  am2 range   PRIMARY,idx_meta_article    idx_meta_article    4   NULL    275228  Using where; Using index; Using temporary; Using f...
1   SIMPLE  a   eq_ref  PRIMARY,serial_issue_date_productid,pub_date,idx_d...   PRIMARY 4   mydb_toto.am2.ArticleID 1   Using where
1   SIMPLE  ai  ref PRIMARY,idx_iso_article PRIMARY 4   mydb_toto.a.serial  11523   Using where; Using index
1   SIMPLE  m   range   PRIMARY,meta_articles_type  meta_articles_type  4   NULL    96  Using where
1   SIMPLE  am  eq_ref  PRIMARY,idx_meta_article    PRIMARY 8   mydb_toto.a.serial,mydb_toto.m.meta_id  1   Using where; Using index

2. with FORCE INDEX on table with alias a:

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   SIMPLE  a   range   pub_date    pub_date    11  NULL    17679   Using where; Using temporary; Using filesort
1   SIMPLE  am2 ref PRIMARY,idx_meta_article    PRIMARY 4   mydb_toto.a.serial  21930   Using where; Using index
1   SIMPLE  ai  ref PRIMARY,idx_iso_article PRIMARY 4   mydb_toto.a.serial  11523   Using where; Using index
1   SIMPLE  m   range   PRIMARY,meta_articles_type  meta_articles_type  4   NULL    96  Using where
1   SIMPLE  am  eq_ref  PRIMARY,idx_meta_article    PRIMARY 8   mydb_toto.am2.ArticleID,mydb_toto.m.meta_id 1   Using where; Using index

3. after ANALYZE TABLE, without FORCE INDEX:

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   SIMPLE  am2 range   PRIMARY,idx_meta_article    idx_meta_article    4   NULL    275228  Using where; Using index; Using temporary; Using f...
1   SIMPLE  a   eq_ref  PRIMARY,serial_issue_date_productid,pub_date,idx_d...   PRIMARY 4   mydb_toto.am2.ArticleID 1   Using where
1   SIMPLE  ai  ref PRIMARY,idx_iso_article PRIMARY 4   mydb_toto.a.serial  11523   Using where; Using index
1   SIMPLE  m   range   PRIMARY,meta_articles_type  meta_articles_type  4   NULL    96  Using where
1   SIMPLE  am  eq_ref  PRIMARY,idx_meta_article    PRIMARY 8   mydb_toto.a.serial,mydb_toto.m.meta_id  1   Using where; Using index

4. after OPTIMIZE TABLE, without FORCE INDEX:

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   SIMPLE  am2 range   PRIMARY,idx_meta_article    idx_meta_article    4   NULL    275228  Using where; Using index; Using temporary; Using f...
1   SIMPLE  a   eq_ref  PRIMARY,serial_issue_date_productid,pub_date,idx_d...   PRIMARY 4   mydb_toto.am2.ArticleID 1   Using where
1   SIMPLE  ai  ref PRIMARY,idx_iso_article PRIMARY 4   mydb_toto.a.serial  11523   Using where; Using index
1   SIMPLE  m   range   PRIMARY,meta_articles_type  meta_articles_type  4   NULL    96  Using where
1   SIMPLE  am  eq_ref  PRIMARY,idx_meta_article    PRIMARY 8   mydb_toto.a.serial,mydb_toto.m.meta_id  1   Using where; Using index

5. after OPTIMIZE TABLE and ANALYZE TABLE, with FORCE INDEX:

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   SIMPLE  a   range   pub_date    pub_date    11  NULL    17679   Using where; Using temporary; Using filesort
1   SIMPLE  am2 ref PRIMARY,idx_meta_article    PRIMARY 4   mydb_toto.a.serial  21930   Using where; Using index
1   SIMPLE  ai  ref PRIMARY,idx_iso_article PRIMARY 4   mydb_toto.a.serial  11523   Using where; Using index
1   SIMPLE  m   range   PRIMARY,meta_articles_type  meta_articles_type  4   NULL    96  Using where
1   SIMPLE  am  eq_ref  PRIMARY,idx_meta_article    PRIMARY 8   mydb_toto.am2.ArticleID,mydb_toto.m.meta_id 1   Using where; Using index

I have noticed that FORCE INDEX helps when you have multiple joins and sub-queries on VARCHAR fields where both the FK and the referenced value are not the primary key, while at the same time having where clause on a DATE field.

Something like:

SELECT NAME, a.reference_no, i.value, p.value FROM customers AS c
INNER JOIN accounts AS a ON c.id = a.customer_id
INNER JOIN invoices AS i ON i.reference_no = a.reference_no
INNER JOIN payments AS p ON p.invoice_no = i.invoice_no
WHERE payments.date >= '2011-09-01' AND DATE < '2011-10-01';

mysql will always use the PKs and FKs, where you would it to use the payment_date index on the payments table first as it is the largest one. So a FORCE INDEX(payment_date) on the payments table join would help a lot.

This an example from the third-party billing database that we use at work. We had huge issues with optimisation, and FORCE INDEX did the job most of the times. Usually we found the slow quires with mysqladmin, tested them with FORCE INDEX and send them to the vendors to rewrite them in the source code of the app.

Here are the four tables to get a better grip on the example:

CREATE TABLE `customers` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `name` varchar(100) NOT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=latin1;

CREATE TABLE `accounts` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `customer_id` int(11) NOT NULL,
  `reference_no` varchar(10) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `reference_no_uniq` (`reference_no`),
  KEY `FK_accounts` (`customer_id`),
  CONSTRAINT `FK_accounts` FOREIGN KEY (`customer_id`) REFERENCES `customers` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=latin1;

CREATE TABLE `invoices` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `reference_no` varchar(10) NOT NULL,
  `invoice_no` varchar(10) NOT NULL,
  `value` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `invoice_no_uniq` (`invoice_no`),
  KEY `FK_invoices` (`reference_no`),
  CONSTRAINT `FK_invoices` FOREIGN KEY (`reference_no`) REFERENCES `accounts` (`reference_no`)
) ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT CHARSET=latin1;

CREATE TABLE `payments` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `invoice_no` varchar(10) NOT NULL,
  `value` int(11) NOT NULL,
  `date` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `FK_payments` (`invoice_no`),
  KEY `payment_date` (`date`),
  CONSTRAINT `FK_payments` FOREIGN KEY (`invoice_no`) REFERENCES `invoices` (`invoice_no`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;

Optimising Database Structure

8 votes

I'm developing a reward system for our VLE which uses three separate technologies - JavaScript for most of the client-side/display processing, PHP to communicate with the database and MySQL for the database itself.

I've attached three screenshots of my "transactions" table. Its structure, a few example records and an overview of its details.

The premise is that members of staff award points to students for good behaviour etc. This can mean that classes of 30 students are given points at a single time. Staff have a limit of 300 points/week and there are around 85 staff currently accessing the system (this may rise).

The way I'm doing it at the moment, every "transaction" has a "Giver_ID" (the member of staff awarding points), a "Recipient_ID" (the student receiving the points), a category and a reason. This way, every time a member of staff issues 30 points, I'm putting 30 rows into the database.

This seemed to work early on, but within three weeks I already have over 12,000 transactions in the database.

At this point it gets a bit more complicated. On the Assign Points page (another screenshot attached), when a teacher clicks into one of their classes or searches for an individual student, I want the students' points to be displayed. The only way I can currently do this on my system is to do a "SELECT * FROM 'transactions'" and put all the information into an array using the following JS:

var Points = { "Recipient_ID" : "0", "Points" : "0" };

function getPoints (data) {
    for (var i = 0; i < data.length; i++) {
        if (Points[data[i].Recipient_ID]) {
            Points[data[i].Recipient_ID] = parseInt(Points[data[i].Recipient_ID]) + parseInt(data[i].Points);
        } else {
            Points[data[i].Recipient_ID] = data[i].Points;
        }
    }
}

When logging in to the system internally, this appears to work quickly enough. When logging in externally however, this process takes around 20 seconds, and thus doesn't display the students' points values until you've clicked/searched a few times.

I'm using the following code in my PHP to access these transactions:

function getTotalPoints() {
    $sql = "SELECT * 
        FROM `transactions`";

    $res = mysql_query($sql);
    $rows = array(); 
    while($r = mysql_fetch_assoc($res)) {
        $rows[] = $r;
    }

    if ($rows) {
        return $rows;
    } else {
        $err = Array("err_id" => 1);
        return $err;
    }
}

So, my question is, how should I actually be approaching this? Full-text indices; maybe a student table with their total points values which gets updated every time a transaction is entered; mass-transactions (i.e. more than one student receiving the same points for the same category) grouped into a single database row? These are all things I've contemplated but I'd love someone with more DB knowledge than myself to provide enlightenment.

Example records Example records

Table structure Table structure

Table overview Table overview

Assign Points interface Assign Points interface

Many thanks in advance.

Your problem is your query:

SELECT * FROM `transactions`

As your data set gets bigger, this will take longer to load and require more memory to store it. Rather determine what data you need specifically. If it's for a particular user:

SELECT SUM(points) FROM `transactions` WHERE Recipient_ID=[x]

Or if you want all the sums for all your students:

SELECT Recipient_ID, SUM(points) AS Total_Points FROM `transactions` GROUP BY Recipient_ID;

To speed up selections on a particular field you can add an index for that field. This will speed up the selections, especially as the table grows.

ALTER TABLE `transactions` ADD INDEX Recipient_ID (Recipient_ID);

Or if you want to display a paginated list of all the entries in transactions:

SELECT * FROM `transactions` LIMIT [page*num_records_per_page],[num_records_per_page];

e.g.: SELECT * FROM `transactions` LIMIT 0,25 ORDER BY Datetime; # First 25 records

Explicitly specify sort order for mysql query?

8 votes

If I have:

ID | Title
1  | Shirt
2  | CD
3  | Cap
4  | Mp3
5  | Badge

If I want to sort by this order: 4, 2, 5, 3,1. Is there a way to do an sql query where you explicitly specify this? Something like:

select * from TABLE order by ID(4,2,5,3,1) ??

Actually, you were surprisingly close. It's a simple as:

select * from TABLE order by field(ID,4,2,5,3,1)

What's wrong with the Foreign Key Constraint in this table

7 votes

MySQL 5.1.59 throws an error error with this create table:

CREATE  TABLE IF NOT EXISTS `genre` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `abv` CHAR(3) CHARACTER SET 'latin1' COLLATE 'latin1_bin' NULL DEFAULT NULL ,
  `name` VARCHAR(80) NOT NULL DEFAULT '' ,
  `parent_id` INT NULL DEFAULT NULL ,
  PRIMARY KEY (`id`) ,
  INDEX `fk_genre_genre1` (`parent_id` ASC) ,
  CONSTRAINT `fk_genre_genre1`
    FOREIGN KEY (`parent_id` )
    REFERENCES `genre` (`id` )
    ON DELETE SET NULL
    ON UPDATE CASCADE)
ENGINE = InnoDB;

Which was generated by MySQLWorkbench 5.2.33.

The error massage is:

ERROR 1005 (HY000) at line __: Can't create table 'mydb.genre' (errno: 150)

What's wrong with this create table?

The manual says:

If MySQL reports an error number 1005 from a CREATE TABLE statement, and the error message refers to error 150, table creation failed because a foreign key constraint was not correctly formed.

It also says that foreign key references to the same table are allowed:

InnoDB supports foreign key references within a table. In these cases, “child table records” really refers to dependent records within the same table.

The relationship I want is a non-identifying parent-child to represent a hierarchy of genres and sub-genres. A genre doesn't have to have a parent, hence parent_id is nullable.

It may be relevant that MySQLWorkbench set the following:

SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0;
SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0;
SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='TRADITIONAL';

Your column id is int unsigned; your column parent_id is int. Those don't match. The solution is to change parent_id to be int unsigned as well.

If you run the SHOW ENGINE InnoDB STATUS I put in a comment, you see this:

11005 17:18:38 Error in foreign key constraint of table test/genre:

    FOREIGN KEY (`parent_id` )
    REFERENCES `genre` (`id` )
    ON DELETE SET NULL
    ON UPDATE CASCADE)
ENGINE = InnoDB:
Cannot find an index in the referenced table where the
referenced columns appear as the first columns, or column types
in the table and the referenced table do not match for constraint.
Note that the internal storage type of ENUM and SET changed in
tables created with >= InnoDB-4.1.12, and such columns in old tables
cannot be referenced by such columns in new tables.
See http://dev.mysql.com/doc/refman/5.1/en/innodb-foreign-key-constraints.html
for correct foreign key definition.

Note the "column types in the table and the referenced table do not match" part.

How to get the difference in years from two different dates?

7 votes

I want to get the difference in years from two different dates using MySQL database.

for example:

  • 2011-07-20 - 2011-07-18 => 0 year
  • 2011-07-20 - 2010-07-20 => 1 year
  • 2011-06-15 - 2008-04-11 => 2 3 years
  • 2011-06-11 - 2001-10-11 => 9 years

How about the SQL syntax? Is there any built in function from MySQL to produce the result?

Here's the expression that also caters for leap years:

YEAR(date1) - YEAR(date2) - (DATE_FORMAT(date1, '%m%d') < DATE_FORMAT(date2, '%m%d'))

This works because the expression (DATE_FORMAT(date1, '%m%d') < DATE_FORMAT(date2, '%m%d')) is true if date1 is "earlier in the year" than date2 and because in mysql, true = 1 and false = 0, so the adjustment is simply a matter of subtracting the "truth" of the comparison.

This gives the correct values for your test cases, except for test #3 - I think it should be "3" to be consistent with test #1:

create table so7749639 (date1 date, date2 date);
insert into so7749639 values
('2011-07-20', '2011-07-18'),
('2011-07-20', '2010-07-20'),
('2011-06-15', '2008-04-11'),
('2011-06-11', '2001-10-11'),
('2007-07-20', '2004-07-20');
select date1, date2,
YEAR(date1) - YEAR(date2)
    - (DATE_FORMAT(date1, '%m%d') < DATE_FORMAT(date2, '%m%d')) as diff_years
from so7749639;

Output:

+------------+------------+------------+
| date1      | date2      | diff_years |
+------------+------------+------------+
| 2011-07-20 | 2011-07-18 |          0 |
| 2011-07-20 | 2010-07-20 |          1 |
| 2011-06-15 | 2008-04-11 |          3 |
| 2011-06-11 | 2001-10-11 |          9 |
| 2007-07-20 | 2004-07-20 |          3 |
+------------+------------+------------+

Show table name where a value is present

7 votes

Is it possible to show the name of a table in a db where a specific value is present. I have different tables and i want to show only the table names that contains a specific value in any of the fields.

This will return lots of empty result sets, but the non-empty ones correspond to table/column combinations that fit your search. It only works for text, and detects columns that contain the value (as opposed to a full column match.)

DELIMITER |

DROP PROCEDURE IF EXISTS `SearchAllTables`|
CREATE PROCEDURE `SearchAllTables` (
    IN _search varchar(256)
    )
LANGUAGE SQL
DETERMINISTIC
SQL SECURITY DEFINER
BEGIN
    -- declare stuff
    declare _tableName varchar(64);
    declare _columnName varchar(64);
    declare _done tinyint(1) default 0;

    -- we will examine every string column in the database
    declare _columnCursor cursor for
        select TABLE_NAME, COLUMN_NAME
            from INFORMATION_SCHEMA.COLUMNS
        where TABLE_SCHEMA = database()
        and (DATA_TYPE like '%char%' 
            or DATA_TYPE like 'text');
    declare CONTINUE handler for NOT FOUND
        SET _done = 1;

    OPEN _columnCursor;
    LOOP1: LOOP
            -- get the next table/column combination
        FETCH _columnCursor INTO _tableName,_columnName;
        IF _done = 1 THEN
            CLOSE _columnCursor;
            LEAVE LOOP1;
        END IF;

            -- query the current column to see if it holds the value
        SET @query = concat(
            "select '",_tableName,"' as TableName, '",
                      _columnName,"' as ColumnName 
            from ",_tableName," 
            where ",_columnName," like concat('%',?,'%') 
            group by 1;"
        );
        SET @search = _search;
        PREPARE _stmt FROM @query;
        EXECUTE _stmt USING @search;
        DEALLOCATE PREPARE _stmt;
    END LOOP LOOP1;
END|

DELIMITER ;

Oh, yeah, and it's ugly... Maybe it'll help you, though!

Slow cronjobs on Cent OS 5

7 votes

I have 1 cronjob that runs every 60 minutes but for some reason, recently, it is running slow.

Env: centos5 + apache2 + mysql5.5 + php 5.3.3 / raid 10/10k HDD / 16gig ram / 4 xeon processor

Here's what the cronjob do:

  1. parse the last 60 minutes data

    a) 1 process parse user agent and save the data to the database

    b) 1 process parse impressions/clicks on the website and save them to the database

  2. from the data in step 1

    a) build a small report and send emails to the administrator/bussiness

    b) save the report into a daily table (available in the admin section)

I see now 8 processes (the same file) when I run the command ps auxf | grep process_stats_hourly.php (found this command in stackoverflow)

Technically I should only have 1 not 8.

Is there any tool in Cent OS or something I can do to make sure my cronjob will run every hour and not overlapping the next one?

Thanks

Your hardware seems to be good enough to process this.

1) Check if you already have hanging processes. Using the ps auxf (see tcurvelo answer), check if you have one or more processes that takes too much resources. Maybe you don't have enough resources to run your cronjob.

2) Check your network connections: If your databases and your cronjob are on a different server you should check whats the response time between these two machines. Maybe you have network issues that makes the cronjob wait for the network to send the package back.

You can use: Netcat, Iperf, mtr or ttcp

3) Server configuration Is your server is configured correctly? Your OS, MySQL are setup correctly? I would recommend to read these articles:

http://www3.wiredgorilla.com/content/view/220/53/

http://www.vr.org/knowledgebase/1002/Optimize-and-disable-default-CentOS-services.html

http://dev.mysql.com/doc/refman/5.1/en/starting-server.html

http://www.linux-mag.com/id/7473/

4) Check your database: Make sure your database has the correct indexes and make sure your queries are optimized. Read this article about the explain command

If a query with few hundreds thousands of record takes times to execute that will affect the rest of your cronjob, if you have a query inside a loop, even worse.

Read these articles:

http://dev.mysql.com/doc/refman/5.0/en/optimization.html

http://20bits.com/articles/10-tips-for-optimizing-mysql-queries-that-dont-suck/

http://blog.fedecarg.com/2008/06/12/10-great-articles-for-optimizing-mysql-queries/

5) Trace and optimized PHP code? Make sure your PHP code runs as fast as possible.

Read these articles:

http://phplens.com/lens/php-book/optimizing-debugging-php.php

http://code.google.com/speed/articles/optimizing-php.html

http://ilia.ws/archives/12-PHP-Optimization-Tricks.html

A good technique to validate your cronjob is to trace your cronjob script: Based on your cronjob process, put some debug trace including how much memory, how much time it took to execute the last process. eg:

<?php

echo "\n-------------- DEBUG --------------\n";
echo "memory (start): " . memory_get_usage(TRUE) . "\n";

$startTime = microtime(TRUE);
// some process
$end = microtime(TRUE);

echo "\n-------------- DEBUG --------------\n";
echo "memory after some process: " . memory_get_usage(TRUE) . "\n";
echo "executed time: " . ($end-$start) . "\n";

By doing that you can easily find which process takes how much memory and how long it takes to execute it.

6) External servers/web service calls Is your cronjob calls external servers or web service? if so, make sure these are loaded as fast as possible. If you request data from a third-party server and this server takes few seconds to return an answer that will affect the speed of your cronjob specially if these calls are in loops.

Try that and let me know what you find.

How should this Many-To-Many doctrine2 association be defined?

6 votes

I have two Entities - Users & Challenges. A User can participate in many challenges and a challenge can have many participants (users). I began approaching this problem by creating a Many-To-Many relationship on my Users class:

/**
     * @ORM\ManytoMany(targetEntity="Challenge")
     * @ORM\JoinTable(name="users_challenges",joinColumns={@ORM\JoinColumn(name="user_id",referencedColumnName="id")},
     * inverseJoinColumns={@ORM\JoinColumn(name="challenge_id",referencedColumnName="id")})
     *
     */

    protected $challenges;

However, I then realised that I need to store a distance attribute against a user/challenge combination (how far the user has travelled in their challenge). The Doctrine2 docs state:

"Why are many-to-many associations less common? Because frequently you want to associate additional attributes with an association, in which case you introduce an association class. Consequently, the direct many-to-many association disappears and is replaced by one-to-many/many-to-one associations between the 3 participating classes."

So my question is what should these associations be between User, Challenge and UsersChallenges?

UPDATE

See comment to first answer for links to the Entity code. I have a controller method below which always creates a new UsersChallenges record rather than updating an existing one (which is what I want)

public function updateUserDistanceAction()
    {

      $request = $this->getRequest();
      $distance = $request->get('distance');
      $challenge_id = $request->get('challenge_id');


      if($request->isXmlHttpRequest()) {

        $em = $this->getDoctrine()->getEntityManager();
        $user = $this->get('security.context')->getToken()->getUser();
        $existingChallenges = $user->getChallenges();


        $challengeToUpdate = $em->getRepository('GymloopCoreBundle:Challenge')
                                ->find( (int) $challenge_id);

        if(!$challengeToUpdate) {

          throw $this->createNotFoundException('No challenge found');
        }

//does the challengeToUpdate exist in existingChallenges? If yes, update UsersChallenges with the distance
//if not, create a new USersChallenges object, set distance and flush

        if ( !$existingChallenges->isEmpty() && $existingChallenges->contains($challengeToUpdate)) {

          $userChallenge = $em->getRepository('GymloopCoreBundle:UsersChallenges')
                              ->findOneByChallengeId($challengeToUpdate->getId());

          $userChallenge->setDistance( $userChallenge->getDistance() + (int) $distance );
          $em->flush();

        } else {

          $newUserChallenge = new UsersChallenges();
          $newUserChallenge->setDistance($distance);
          $newUserChallenge->setChallenge($challengeToUpdate);
          $newUserChallenge->setUser($user);
          $user->addUsersChallenges($newUserChallenge);
          $em->persist($user);
          $em->persist($newUserChallenge);
          $em->flush();

        }

        //if success
        return new Response('success');

       //else

      }

    }

i belive you want a $em->persist($userChallenge); before $em->flush();

$userChallenge->setDistance( $userChallenge->getDistance() + (int) $distance );
$em->persist($userChallenge);
$em->flush();

However i am not sure if it solves your problem.

Can you post isEmpty and contains function in existingChallenges

Why do I need to flush the connection pool each time I redeploy?

6 votes

SOLVED

enter image description here

Made my own dual familiar with the one on Oracle.

CREATE TABLE dual 
(
    x VARCHAR(1)
);

INSERT INTO dual(x) VALUES('y');

I have successfully made a connection to a remote MySQL server and I use it for my Glassfish educational purpose application (not homework). However each time I make a change to the code or XHTML files I need to open the administrator panel of Glassfish and flush the connection pool otherwise I get this when I just refresh the page. Have anybody experienced this? I can post code or other information if it is needed.

HTTP Status 500 -

type Exception report

message

descriptionThe server encountered an internal error () that prevented it from fulfilling this request.

exception

javax.servlet.ServletException: WELD-000049 Unable to invoke [method] @PostConstruct public com.myapp.QuestionController.initialize() on com.myapp.QuestionController@4635bd2a

root cause

org.jboss.weld.exceptions.WeldException: WELD-000049 Unable to invoke [method] @PostConstruct public com.myapp.interfaces.QuestionController.initialize() on com.myapp.interfaces.QuestionController@4635bd2a

root cause

java.lang.reflect.InvocationTargetException

root cause

javax.ejb.EJBException

root cause

javax.persistence.PersistenceException: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.3.0.v20110604-r9504): org.eclipse.persistence.exceptions.DatabaseException Internal Exception: java.sql.SQLException: Error in allocating a connection. Cause: java.lang.RuntimeException: Got exception during XAResource.start: Error Code: 0

root cause

Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.3.0.v20110604-r9504): org.eclipse.persistence.exceptions.DatabaseException Internal Exception: java.sql.SQLException: Error in allocating a connection. Cause: java.lang.RuntimeException: Got exception during XAResource.start: Error Code: 0

root cause

java.sql.SQLException: Error in allocating a connection. Cause: java.lang.RuntimeException: Got exception during XAResource.start:

root cause

javax.resource.spi.ResourceAllocationException: Error in allocating a connection. Cause: java.lang.RuntimeException: Got exception during XAResource.start:

root cause

com.sun.appserv.connectors.internal.api.PoolingException: java.lang.RuntimeException: Got exception during XAResource.start:

root cause

com.sun.appserv.connectors.internal.api.PoolingException: java.lang.RuntimeException: Got exception during XAResource.start:

root cause

java.lang.RuntimeException: Got exception during XAResource.start:

root cause

javax.transaction.xa.XAException: com.sun.appserv.connectors.internal.api.PoolingException: javax.resource.spi.LocalTransactionException: Communications link failure

The last packet successfully received from the server was 435�409 milliseconds ago. The last packet sent successfully to the server was 7 milliseconds ago.

Image of config

Persistence XML

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.0" xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd">
    <persistence-unit name="SertifikatPU" transaction-type="JTA">
        <jta-data-source>jdbc/sertifikatdb</jta-data-source>
    </persistence-unit>
</persistence>

In the "Additional properties" in Glassfish connection pool settings I have just configured: servername, URL, user and password.

Searching on your root cause, PoolingException: javax.resource.spi.LocalTransactionException: Communications link failure

I found this bug on jira for Glassfish (check the comments tab at the bottom), which explains that you may need to refresh your invalid connections. Read the comment at the bottom by Jagadish, which says to check your connection validation type. If it is set to "autocommit" (default), jdbc-drivers may cache the connection validation data and no actual database interaction will happen during connection validation.

Another article which also explains this

To fix this, set your connection-validation-method="table" and validation-table-name="any_table_you_know_exists" and provide any existing table-name, which forces connections to talk to the database instead of the cache. If the connection is invalid, it should be dropped and recreated. You may also need to specify is-connection-validation-required="true".

EDIT: Two excellent articles I found to help with additional configuration:

Jagadish's Oracle Blog Article on this topic with even more info

Glassfish JDBC Connection Validation explained in detail

From Jagadish's Blog:

AS_INSTALL_ROOT/bin/asadmin set domain.resources.jdbc-connection-pool.DerbyPool.is-connection-validation-required=true
domain.resources.jdbc-connection-pool.DerbyPool.is-connection-validation-required = true

AS_INSTALL_ROOT/bin/asadmin set domain.resources.jdbc-connection-pool.DerbyPool.connection-validation-method=table
domain.resources.jdbc-connection-pool.DerbyPool.connection-validation-method = table

bin/asadmin set domain.resources.jdbc-connection-pool.DerbyPool.validation-table-name=sys.systables
domain.resources.jdbc-connection-pool.DerbyPool.validation-table-name = sys.systables

Note that sys.systables is a guaranteed MSSQL table and the other article references dual, which is a guaranteed ORACLE table. You will need to use a guaranteed MySQL table, and I do not know one offhand. Your best bet is to just create a table solely for validation purposes, possibly with one column with one row of data.

How to query count over 2 columns

6 votes

I'm trying to write a sql query that shows how often two teams have played against each other.

Id | Team1 | Team2 | Date 
1  |   A   |   B   | 25/5/11
2  |   B   |   A   | 26/5/11
3  |   A   |   C   | 27/5/11
4  |   C   |   B   | 28/5/11
5  |   A   |   B   | 28/5/11

result should be:

A vs B => 3
A vs C => 1
C vs B => 1 

Counting A-B and B-A as different is an easy query. But I can't get them to be counted together.

Any suggestions?

SELECT Team1, Team2, SUM(num) FROM (
   SELECT Team1, Team2, COUNT(*) num
   FROM table_name
   GROUP BY Team1, Team2
  UNION ALL
   SELECT Team2, Team1, COUNT(*) num
   FROM table_name
   GROUP BY Team2, Team1
) combined
WHERE Team1 < Team2
GROUP BY Team1, Team2

Edit: Updated to reverse teams when necessary.

Note: This will run a lot faster than the versions using CASE you have been given in other answers because it will make full use of indexes.

Edit2: Moved the where to be even faster with indexes.

Are mysql statements atomic?

5 votes

For example, I have a row with a column C1 value = 'clean', and two different computers run this query:

UPDATE T1
  SET C1 = 'dirty'
WHERE id = 1

at the same time.

Without using transactions, is it guaranteed that the value of mysql_affected_rows() would be 1 for one client and 0 for the other?

Yes and No :-)

In both cases, the access is serialised (assuming you're using a transactional engine like InnoDB) since they hit the same row, so they won't interfere with each other. In other words, the statements are atomic.

However, the affected row count actually depends on your configuration set when you open the connection. The page for mysql_affected_rows() has this to say (my bold):

For UPDATE statements, the affected-rows value by default is the number of rows actually changed. If you specify the CLIENT_FOUND_ROWS flag to mysql_real_connect() when connecting to mysqld, the affected-rows value is the number of rows "found"; that is, matched by the WHERE clause.

And from the mysql_real_connect page:

CLIENT_FOUND_ROWS: Return the number of found (matched) rows, not the number of changed rows.

So, in terms of what happens with CLIENT_FOUND_ROWS being configured, the affected rows for:

UPDATE T1 SET C1 = 'dirty' WHERE id = 1

have nothing to do with whether the data is changed, only what rows matched. This would be 1 for both queries.

On the other hand, if CLIENT_FOUND_ROWS was not set, the second query would not actually be changing the row (since it's already populated with 'dirty') and would have a row count of zero.

If you wanted the same behaviour regardless of that setting (only showing changes), you could use something like:

UPDATE T1 SET C1 = 'dirty' WHERE id = 1 AND C1 <> 'dirty'

Remove values in comma separated list from database

4 votes

I have a table in my MySQL database called 'children'. In that table is a row called 'wishes' (a comma separated list of the child's wishlist items). I need to be able to update that list so that it only removes one value. i.e. the list = Size 12 regular jeans, Surfboard, Red Sox Baseball Cap; I want to remove Surfboard.

My query right now looks like this

$select = mysql_query('SELECT * FROM children WHERE caseNumber="'.$caseNum.'" LIMIT 1 ');
    $row = mysql_fetch_array($select);

    foreach ($wish as $w) {
        $allWishes = $row['wishes'];
        $newWishes = str_replace($w, '', $allWishes);
        $update = mysql_query("UPDATE children SET wishes='$newWishes' WHERE caseNum='".$caseNum."'");
}

But the UPDATE query isn't removing anything. How can I do what I need?

Using these user-defined REGEXP_REPLACE() functions, you may be able to replace it with an empty string:

UPDATE children SET wishes = REGEXP_REPLACE(wishes, '(,(\s)?)?Surfboard', '') WHERE caseNum='whatever';

Unfortunately, you cannot just use plain old REPLACE() because you don't know where in the string 'Surfboard' appears. In fact, the regex above would probably need additional tweaking if 'Surfboard' occurs at the beginning or end.

Perhaps you could trim off leading and trailing commas left over like this:

UPDATE children SET wishes = TRIM(BOTH ',' FROM REGEXP_REPLACE(wishes, '(,(\s)?)?Surfboard', '')) WHERE caseNum='whatever';

So what's going on here? The regex removes 'Surfboard' plus an optional comma & space before it. Then the surrounding TRIM() function eliminates a possible leading comma in case 'Surfboard' occurred at the beginning of the string. That could probably be handled by the regex as well, but frankly, I'm too tired to puzzle it out.

Note, I've never used these myself and cannot vouch for their effectiveness or robustness, but it is a place to start. And, as others are mentioning in the comments, you really should have these in a normalized wishlist table, rather than as a comma-separated string.

Update

Thinking about this more, I'm more partial to just forcing the use of built-in REPLACE() and then cleaning out the extra comma where you may get two commas in a row. This is looking for two commas side by side, as though there had been no spaces separating your original list items. If the items had been separated by commas and spaces, change ',,' to ', ,' in the outer REPLACE() call.

UPDATE children SET wishes = TRIM(BOTH ',' FROM REPLACE(REPLACE(wishes, 'Surfboard', ''), ',,', ',')) WHERE caseNum='whatever';