Best php questions in June 2011

Is memory encrypted?

24 votes

I want to store some data in a variable (and I know variables are stored in memory). Does that data in memory get encrypted? Also, is it possible for software to be able to read the variable names stored in memory and be able to actually extract the data from it?

Memory is not encrypted on any platform I know about. It would be of limited value anyway, because the processor must, in general, operate on plaintext data, so the data must be in plaintext on the machine somewhere.

Instead, modern operating systems (and most historical ones) use memory protection to allow only certain processes access to certain memory pages. Every memory page comes with read, write, and (sometimes) execute permissions. The operating system kernel is in charge of handling those permissions on context switch to grant or deny access to memory pages per-process as needed.

Saltzer and Schroeder's 1975 paper The Protection of Information in Computer Systems describe a mechanism using segments, rather than pages, but the principle has remained unchanged for decades.

Typically, any process-owned memory page is readable by a process with high-enough privilege; the OS kernel certainly can modify any page of memory, and it can choose to delegate that privilege to user processes too. The ptrace(2) system call on Linux provides a debugger-backdoor that can be used to implement read-only memory inspection systems such as strace(1) or ltrace(1) or gdb(1), or memory-modification systems such as gdb(1) and ptrace-based sandbox environments.

Or, a core file can be dumped, under certain situations (see core(5) and setrlimit(2) manpages), containing the contents of the process's memory. This is one reason why it is important to clear memory of important data before release.

I was part of a team that worked on encrypting pointers (non-PTO link) in running programs. The overhead was amazing, and the number of corner cases was even more astonishing. Using these techniques for common programs is probably not practical, though I could imagine a restricted environment where encrypted memory or control structures is a feasible approach. (Though probably other techniques would be more appropriate.)

Why do PHP and MySQL unix timestamps diverge on 1983-10-29?

23 votes

I've been using PHP's strtotime and MySQL's UNIX_TIMESTAMP functions in my app, to convert dates into timestamps. PHP and MySQL are running on my local machine, and these functions generally return the same result, as I expect them to:

$ php
<?php echo strtotime("2011-06-02"); ?>
1307001600

mysql> SELECT UNIX_TIMESTAMP("2011-06-02") ts;
+------------+
| ts         |
+------------+
| 1307001600 |
+------------+

But, sorta by chance, I happened to notice that when I entered 1983-01-01 as the date, the results were no longer equal:

$ php
<?php echo strtotime("1983-01-01"); ?>
410263200

mysql> SELECT UNIX_TIMESTAMP("1983-01-01") ts;
+-----------+
| ts        |
+-----------+
| 410256000 |
+-----------+

As you can see, PHP returned 410263200, while MySQL returned 410256000 - a difference of 7200 seconds.

This got me curious, and I wanted to know on what date the timestamps were no longer equivalent, so I wrote a little program that starts with today's date (in Y-m-d format), uses PHP's strtotime and MySQL's UNIX_TIMESTAMP and compares the results. It then subtracts 1 day from each value and loops until they're no longer equal.

The result:

1983-10-29

On October 29, 1983, for some reason, strtotime and UNIX_TIMESTAMP return values that differ by 7200 seconds.

Any ideas?

Thanks for reading.

You say you're on Alaskan time. From http://www.statoids.com/tus.html:

1983-10-30 02:00: All parts of Alaska except the Aleutians and Saint Lawrence Island switched to AT. Prior to the change, Alaska east of 138°W (Juneau) had been on PT; between 138°W and 141°W (Yakutat) had been on Yukon Time, which was UTC-9 with DST; west of 162°W (Nome) had been on Bering Time, which was UTC-11 with DST. Name of Alaska-Hawaii zone changed to Hawaii-Aleutian.

So, I'd guess that this is because your timezone changed by two hours (7200 seconds = 2 hours) on 30th October 1983.

I'd guess that MySQL's UNIX_TIMESTAMP is using a different timezone from your Alaskan setting in PHP, which may be either the server default, or dependent on your connection settings, so these diverge when you hit that date.

You may be able to glean more information by asking MySQL what its timezone setting is on that connection with SELECT @@global.time_zone, @@session.time_zone;; see this answer for more info on the output and how to interpret it.

The grand, unified theory of PHP error handling

16 votes

aka, Seeking generic Error Handler (ΟΚ to use commercially)

I doubt that I am the best PHP programmer around, so, although I have my own generic error handler for set_error_handler(), I wondered what others do and if there is a "best" (sorry if that sounds subjective - I just want to draw out general approaches (but even the 'best practices' tag has been removed from SO)).

To be objective about it, here's what I think is needed. Πlease correct me if I am wrong & point me to some good code if you agree.

  • I want to capture as much information as possible - without knowing what the error was.

  • so, for instance, it makes sense to dump the call stack.

  • and $_GET, $_POST and $_SESSION.

  • and I want the call stack & Globals to be pretty-printed

  • I want some 'plain-text' layout, not CSS & fancy JS to expand/collapse the information. My users may have to cut/paste into email or even print out & fax.

  • I would like to be able to add a header of my own devising, preferably as a parameter, but I can hack the code if need be. The header might include the program version, timestamp, etc (and, in my case, I have an audit track, so I can include the user's last few actions, which led to the crash).

  • some users may allow my code to auto-email the report, some may wish to preview it forst & them email it and some may not want me to send email.

I suggest to go the "Exceptions" way.

Throw exceptions when there's a user error, and you can convert php errors into exceptions, like this:

function exception_error_handler($errno, $errstr, $errfile, $errline ) {
    throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
}
set_error_handler("exception_error_handler");

Albeit that this kind of behaviour works best in an OOP kind of environment. If you don't have a single point of entry (like a frontcontroller), you may also catch loose exceptions with this:

function myException($exception)
{
    echo "<b>Exception:</b> " , $exception->getMessage();
}

set_exception_handler('myException');

Simple debugging with exceptions would go something a bit like this:

function parseException($e) {
    $result = 'Exception: "';
    $result .= $e->getMessage();
    $trace = $e->getTrace();
    foreach (range(0, 10) as $i) {
        $result .= '" @ ';
        if (!isset($trace[$i])) {
            break;
        }
        if (isset($trace[$i]['class'])) {
            $result .= $trace[$i]['class'];
            $result .= '->';
        }
        $result .= $trace[$i]['function'];
        $result .= '(); ';
        $result .= $e->getFile() . ':' . $e->getLine() . "\n\n";
    }

    return $result;
}

From there evaluating globals etc. is a walk in the park. You might look for inspiration to the Symfony Framework Debug Toolbar, which offers many of these requests.

Integrating openID and oauth as website login, signin and authentication system

14 votes

First of all let me start by saying that this question is not about different openID and oAuth implementations. There are many classes about these.

My question is what to do after authenticating a user:

  • How to add this user to the user table in the database?
  • How to handle different logins for the same user? (Remy Sharp's example suggests something for openID)
  • How to combine oAuth and openID in the database?

Any ideas?

Your question has to main parts to it:

  1. Authentication
  2. Authorization

Usually the two are not treated differently if the identity provider (IP) is your own, which has been the most common setup in web apps until now.

When using an OpenId Provider such as Google, the authentication part is seperated from your control. You will get a token back telling you if the user is authenticated or not. The token will normally contain the following claims: Name, Email and Named Identity where the last is the unique id of the identity at the IP.

So far so good.

The trick is now as you ask, how do I authorize this user?

well, there are a couple of approaches to this.

First off, when you create a local user in your system, you can prepopulate the Name and Email values based off the claims you get from the IP. In this process, you can start and say that all users that have a profile stored in your system are authorized, or you can develop further processes that will add whatever details you need to know about the user.

Then, how do you avoid that the user is not re-registered if they switch from google to facebook as the IP?

This is where things get tricky. The most common claim that Google, Yahoo, Facebook will provide to you is the email address and Name. So what you can do, is try to match the incomming claim with existing customers in your app. This is not failsafe however, as people can have different emails in different systems.

The name value is also not safe.

In our setup, we start by matching emails, as we know that most IPs validate email addresses. This will reduce duplicates a lot. After that check, we start our own validation process where the goal is to see if the person is already registered. This process looks for the customers mobile number in our database, and if a match is found, we send a one-time-password to the customer to verify correct ownership of the phone number.

Since login is a time sensitive setup, we are created a simple SQL table that maps external identities to our customer numbers. This allows us to implement this kind of validation logic outside all our web apps (and thereby reduce code redundancy)

Explain this fragment of PHP for me: returning array and immediately reference an index

13 votes
<?php
function ReturnArray() {
    return array('a' => 'f', 'b' => 'g', 'c' => 'h', 'd' => 'i', 'e' => 'j');
}

echo ${!${!1}=ReturnArray()}['a']; // 'f'
?>

Please explain what's the logic and step of compute with those ${!1} in the above resolution that works well.

Let's start with some basics. In PHP, something like hello will evaluate to the string "hello". To reference a variable, you can use this syntax: ${expr}. There's also a shorthand for this, $foo, which will roughly evaluate to this: ${"foo"}.

Also, you probably know that you can assign multiple variables at once: $a=$b=$c='hello';, for example. This will assign $a, $b, and $c to 'hello'. This is actually represented as $a=($b=($c='hello')));. $foo=value is an expression which, after $foo is set, will evaluate to value.

Your code statement looks like this:

echo ${!${!1}=ReturnArray()}['a'];

The first thing it does, obviously, is call ReturnArray. It then evaluates !1, which evaluates to false. The ${!1} therefore makes a variable with the name false, though not a string(?!). After that, it applies a not operation to the array. All non-empty arrays are truthy, so the not operation changes it to false. It then uses that ${} syntax again to retrieve the variable named false. It then uses an array access to retrieve the value in the array for key 'a'.

I hope that made sense.

Mysql count rows using filters on high trafic database

10 votes

Let's say you have a search form, with multiple select fields, let's say a user selects from a dropdown an option, but before he submits the data I need to display the count of the rows in the database .

So let's say the site has at least 300k(300.000) visitors a day, and a user selects options from the form at least 40 times a visit, that would mean 12M ajax requests + 12M count queries on the database, which seems a bit too much .

The question is how can one implement a fast count (using php(Zend Framework) and MySQL) so that the additional 12M queries on the database won't affect the load of the site .

One solution would be to have a table that stores all combinations of select fields and their respective counts (when a product is added or deleted from the products table the table storing the count would be updated). Although this is not such a good idea when for 8 filters (select options) out of 43 there would be +8M rows inserted that need to be managed.

Any other thoughts on how to achieve this?

p.s. I don't need code examples but the idea itself that would work in this scenario.

p.s. I'll update the question with a bounty tomorrow.

I would suggest a separate table that caches the counts, combined with triggers.

In order for it to be fast you make it a memory table and you update it using triggers on the inserts, deletes and updates.

pseudo code:

CREATE TABLE counts (
  id unsigned integer auto_increment primary key
  option integer indexed using hash key
  user_id integer indexed using hash key
  rowcount unsigned integer
  unique key user_option (user, option)
) engine = memory

DELIMITER $$

CREATE TRIGGER ai_tablex_each AFTER UPDATE ON tablex FOR EACH ROW
BEGIN
  IF (old.option <> new.option) OR (old.user_id <> new.user_id) THEN BEGIN
    UPDATE counts c SET c.rowcount = c.rowcount - 1 
      WHERE c.user_id = old.user_id and c.option = old.option; 
    INSERT INTO counts rowcount, user_id, option  
      VALUES (1, new.user_id, new.option)
      ON DUPLICATE KEY SET c.rowcount = c.rowcount + 1; 
  END; END IF;
END $$

DELIMITER ;

Selection of the counts will be instant, and the updates in the trigger should not take very long either because you're using a memory table with hash indexes which have O(1) lookup time.

Links:
Memory engine: http://dev.mysql.com/doc/refman/5.5/en/memory-storage-engine.html
Triggers: http://dev.mysql.com/doc/refman/5.5/en/triggers.html

generating an sequential five digit alphanumerical ID

10 votes

General Overview:

The function below spits out a random ID. I'm using this to provide a confirmation alias to identify a record. However, I've had to check for collision(however unlikely), because we are only using a five digit length. With the allowed characters listed below, it comes out to about 33 million plus combinations. Eventually we will get to five million or so records so collision becomes an issue.

The Problem:

Checking for dupe aliases is inefficient and resource heavy. Five million records is a lot to search through. Especially when this search is being conducted concurrently by different users.

My Question:

Is there a way to 'auto increment' the combinations allowed by this function? Meaning I only have to search for the last record's alias and move on to the next combination?

Acknowledged Limitations:

I realize the code would be vastly different than the function below. I also realize that mysql has an auto increment feature for numerical IDs, but the project is requiring a five digit alias with the allowed characters of '23456789ABCDEFGHJKLMNPQRSTUVWXYZ'. My hands are tied on that issue.

My Current Function:

 public function random_id_gen($length)
 {
     $characters = '23456789ABCDEFGHJKLMNPQRSTUVWXYZ';
     $max = strlen($characters) - 1;
     $string = '';

     for ($i = 0; $i < $length; $i++) {
         $string .= $characters[mt_rand(0, $max)];
     }

     return $string;
 }

Why not just create a unique index on the alias column?

CREATE UNIQUE INDEX uniq_alias ON MyTable(alias);

at which point you can try your insert/update and if it returns an error, generate a new alias and try again.

Track php emails with php script?

10 votes

I am sending email newsletters using php mail() function.

I have to track the newsletter emails status.

The status would be
1. Num.Of Sent.
2. Num.Of Delivered.
3. Delivered date.
4. Total Num.Of Read.
5. Unique Num.Of Read.
6. Read date.
7. Num.Of Bounced.
8. Total Num.Of users clicked the links in the email.
9. Unique Num.Of users clicked the links in the email.

From the above status i could track the following:
1. Is Sent. // This is tracked as it is sent from coding.
8. Total Num.Of clicked the links in the email. // This is tracked by adding additional parameters in url.
9. Unique Num.Of clicked the links in the email. // This is tracked by adding additional parameters in url.

How to track the other status of the emails sent from mail() function?
I have to send and track emails from same server.

You can't directly track the other status from the mail() function. Technically Is Sent only tells you that the mail was passed over to the server's outbound mail queue successfully - you can't actually tell if it left your server.

1,. You will need to check your mail server logs to see exactly when the email left the server.

2,3. Num of delivered and delivered date - again you would need to check your mail server logs to see when the mail was handed over (successfully) to a third party mail server. However it would depend on your definition of delivered (into the end-users mailbox? Into their email client?) as to how reliable these stats would be.

4,5,6. Total number read, unique number read, read date. You can't accurately track this. However if you sent HTML email you could embed an image into the email whereby the source of the image was your webserver. If each image URL had a unique identifier for the person you sent the email to then you could track this from your server logs (or via php if the url was a php script that returned an image). However this relies on the end user allowing the loading of images from external webservers (Outlook and gmail for example have this turned off by default).

7,. If you sent the from address to be a script on your server it could parse the bounce message and determine how many bounced. Alternatively you can just have the from address be a mailbox that you go into and check manually.

8, 9. Each link in the email would need to be a link to a url on your webserver. That URL could be a script that would track who clicked (by the use of a query variable) and what they want to see (another query variable) and then redirect them (header function in php) to where you want them to end up.

Alternative to Captcha?

10 votes

I have a simple reg form (Name, Email, Password) on my website and im looking to implement some sort of anti-bot/spam protection, Captcha seems like a very long winded way, plus they really annoy me when I see them on sites. Has anybody an alternative method to protecting against spam which is lightweight and simply integrated?

One alternative is to use a hidden form field as a honeypot for bots. This field can be filled using an appropriate value from Javascript, or it can just be left blank. Either way, if the value isn't what you expect, then you can treat the submission as spam. This won't stop bots that are specifically targeting your site, but it will stop most of the common spam bots that just see a form and fill it out.

10 votes

I have two particular cases where I disagree with a coworker, whether constants should be used or not.

We use a homemade framework working roughly like Symfony 1.x.

  1. Initial code was, in a routing PHP config file for routing, like this one:

    $router->map('/some_url', array('module' => 'some_module', 'action' => 'some_action'));
    $router->map('/some_other_url', array('module' => 'some_module', 'action' => 'some_action'));
    // etc.
    

    The coworker changed it to:

    $router->map('/some_url', array(MODULE => 'some_module', ACTION => 'some_action'));
    $router->map('/some_other_url', array(MODULE => 'some_module', ACTION => 'some_action'));
    
    // + in constants.php file:
    define('MODULE', 'module');
    define('ACTION', 'action');
    

    IMO this is constant overuse: if the concept of "module" or "action" is ever renamed, it would have to be renamed in the entire code, either written as a string or a constant. Plus, the defined constant names above have no very specific meaning, favoring naming collisions/confusions.

  2. Initial code example:

    if (isset($_SESSION['unid']) && isset($_SESSION['login'])) { ... }
    

    Modified by the coworker:

    if (isset($_SESSION[UNID]) && isset($_SESSION[LOGIN])) { ... }
    
    // + in a constants.php file:
    define('UNID', 'unid');
    define('LOGIN', 'login');
    

    In our application, those session vars names unid and login are clearly unlikely to change. Nonetheless, if declaring constants was really a good practice here, I would suggest at least more precise names, for example FIELDNAME_UNID and FIELDNAME_LOGIN...

Is introducing those constants really relevant (that is, naming should just be improved), or (as I guess) completely useless ?

Thanks.

There is a valid argument for using constants like that.

If you accidentally do something like:

$router->map('/some_url', array('moduel' => 'some_module', 'action' => 'some_action'));

it will fail in some undefined way (note the misspelled "moduel").

If you make a spelling mistake or typo when a constant is involved, PHP emits a Notice, and you catch it right away.

How often that actually saves you is a matter for debate. Personally, I usually don't think it's worth the trouble.

What is the most accepted method for hiding password for 'connect.php' file?

8 votes

As my server is getting a bit bigger, and more users are getting access to it, I don't want them to see the password that MySQL is using to connect to PHP, which is stored in my 'connect.php' file and required by every page. However, it is just sitting in the same directory as the rest of the php files.

I've considered using a second 'connect.php'-like file with access to only one table, that stores the encrypted passwords to connect to MySQL, but then I would have the problem of hiding the key to it.

Changing permissions won't work either, if you chmod o-r or something similar, nobody will be able to access the web application, obviously.

Is there an accepted method to get around this problem, or should I just solve it on my own? The problem is that I don't want it to be too convoluted if there is an accepted method.

All the answers have good advice but fail to address the fact that any user with server access can just snoop around and open the config.php in an editor.

Set your config files in a directory outside of public webspace , the webserver should be the owner of this directory and it should have permissions set to 700. All files it contains should be 644. This way no one can even read the file contents apart from webserver user or root.

This is a common approach, but there is a lot more to the subject as security is a very vast topic, but is better than 90% of the setups out there.

Are ^$ and $^ in PHP regex the same?

8 votes

I know it's a stupid question, im learning how the engine works, i wanna know if someone can explain it.

if(preg_match_all('/$^/m',"",$array))
echo "Match";

if(preg_match_all('/$^\n$/m',"\n",$array))
echo "Match";

Both match!

$ and ^ are zero-width meta-characters. Unlike other meta-characters like . which match one character at a time (unless used with quantifiers), they do not actually match literal characters. This is why ^$ matches an empty string "", even though the regex (sans delimiters) contains two characters while the empty string contains zero.

It doesn't matter that an empty string contains no characters. It still has a starting point and an ending point, and since it's an empty string both are at the same location. Therefore no matter the order or number of ^ and $ you use, all of their permutations should match the empty string.


Your second case is slightly trickier but the same principles apply.

The m modifier (PCRE_MULTILINE) just tells the PCRE engine to feed in the entire string at one go, regardless of newlines, but the string still comprises "multiple lines". It then looks at ^ and $ as "the start of a line" and "the end of a line" respectively.

The string "\n" is essentially logically split into three parts: "", "\n" and "" (because the newline is surrounded by emptiness... sounds poetic).

Then these matches follow:

  1. The first empty string is matched by the starting $^ (as I explain above).
  2. The \n is matched by the same \n in your regex.
  3. The second empty string is matched by the last $.

And that's how your second case results in a match.

Why does PDO print my password when the connection fails?

8 votes

I have a simple website where I establish a connection to a Mysql server using PDO.

$dbh  =  new PDO('mysql:host=localhost;dbname=DB;port=3306', 'USER', 
'SECRET',array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));

I had some traffic on my site and the servers connection limit was reached, and the website throw this error, with my PLAIN password in it!

Fatal error: Uncaught exception 'PDOException' with message 'SQLSTATE[08004] [1040] Too many connections' in /home/premiumize-me/html/index.php:64 Stack trace: #0 /home/premiumize-me/html/index.php(64): PDO->__construct('mysql:host=loca...', 'USER', 'SECRET', Array) #1 {main} thrown in /home/premiumize-me/html/index.php on line 64

Ironically I switched to PDO for security reasons, this really shocked me.

Because this exact error is something you can provoke very easily on most sites using simple http flooding.

I now wrapped my conenction into a try/catch clause, but still. I think this is catastrophic!

So I am new to PDO and my questino is: What do I have to consider to be safe! How to I establish a connection in a secure way? Are there other known security holes like this one that I have to be aware of?

You should have display_errors = off in your PHP.ini anyway to avoid this problem. Errors that reveal details like these come from many places, in addition to PDO.

Yes, you should also have it in a try/catch block.

You can also $pdo->setAttribute(PDO::ERRMODE_SILENT), but then you need to be checking the error codes manually rather than using a try/catch block. See http://php.net/manual/en/pdo.setattribute.php for more error constants.

Possible collision of two ajax requests?!

7 votes

I'm having trouble with one of my sites on which two ajax requests are executed when the page loads. I'm using jQuery in combination with an PHP application based on the zend framework.

The relevant HTML (simplified) looks like:

<select id="first">
     <option value="1">First Option</option>
     <option value="2">Second Option</option>
</select>
<div class="first_block"></div>

<select id="second">
     <option value="1">First Option</option>
     <option value="2">Second Option</option>
</select>
<div class="second_block"></div>

Here is what my jQuery looks like:

$(document).ready(function(){

// function to update the first block
var updateFirstBlock = function(){
    var param = $(this).val();
    $.ajax('module/controller/action/param/' + param, {
        'success': function(data){
            $('.first_block').html(data);
        }
    });
};

// bind and trigger the first update function
$('select#first').bind('change', updateFirstBlock );
$('select#first').trigger('change');


// function to update the second block
var updateSecondBlock = function(){
    var param= $(this).val();
    $.ajax('module/controller/another-action/param/' + param, {
        'success': function(data){
            $('.second_block').html(data);
        }
    });
};

// bind and trigger the second update function
$('select#second').bind('change', updateSecondBlock );
$('select#second').trigger('change');

});

The PHP Application just returns some content dependent on which value is distributed.

Now what happens when the pages is loaded, is that in nine of ten cases one of the two requests gets no answer. The other one gets 200 OK and the failing one times out. There's no regularity, which request fails.

Is it possible that there's something wrong in the web-servers (Apache 2.2) configuration, so that two simultaneously fired requests constrain each other?

EDIT

If I set both requests to async: false, they are always executed properly. So there must be a collision, I think.

EDIT 2

A possible reason for this behavior, could be php's session lock. I will examine this further.

It seems your definitely on the right rack with the PHP session lock:

Session locking (concurrency) notes

The default PHP session model locks a session until the page has finished loading. So if you have two or three frames that load, and each one uses sessions, they will load one at a time. This is so that only one PHP execution context has write access to the session at any one time.

Some people work around this by calling session_write_close() as soon as they've finished writing any data to the $_SESSION - they can continue to read data even after they've called it. The disadvantage to session_write_close() is that your code still will lock on that first call to session_start() on any session'ed page, and that you have to sprinkle session_write_close() everywhere you use sessions, as soon as you can. This is still a very good method, but if your Session access follows some particular patterns, you may have another way which requires less modification of your code.

The idea is that if your session code mostly reads from sessions, and rarely writes to them, then you can allow concurrent access. To prevent completely corrupted session data, we will lock the session's backing store (tmp files usually) while we write to them. This means the session is only locked for the brief instant that we are writing to the backing store. However, this means that if you have two pages loading simultaneously, and both modify the session, the Last One Wins. Whichever one loads first will get its data overwritten by the one that loads second. If this is okay with you, you may continue - otherwise, use the session_write_close method, above.

If you have complicated bits of code that depend on some state in the session, and some state in a database or text file, or something else - again, you may not want to use this method. When you have two simultaneous pages running, you might find that one page runs halfway through, modifying your text file, then the second one runs all the way through, further modifying your text file, then the first one finishes - and your data might be mangled, or completely lost.

So if you're prepared to debug potentially very, very nasty race conditions, and your access patterns for your sessions is read-mostly and write-rarely (and not write-dearly), then you can try the following system.

Copy the example from session_set_save_handler() into your include file, above where you start your sessions. Modify the session write() method:

function write($id, $sess_data)
{
  global $sess_save_path, $sess_session_name;

  $sess_file = "$sess_save_path/sess_$id";
  if ($fp = @fopen($sess_file, "w")) {
   flock($fp,LOCK_EX);
   $results=fwrite($fp, $sess_data);
   flock($fp,LOCK_UN);
   return($results);
  } else {
   return(false);
  }

}

You will probably also want to add a GC (Garbage Collection) method for the sessions, as well.

And of course, take this advice with a grain of salt - We currently have it running on our testing server, and it seems to work OK there, but people have reported terrible problems with the Shared Memory session handler, and this method may be as unsafe as that.

You can also consider implementing your own locks for scary concurrency-sensitive bits of your code.

Ref: http://ch2.php.net/manual/en/ref.session.php#64525


It might be worthwhile implementing a database session handler. Using a database eliminates this problem and can actually improve performance slightly, if a good database structure is used.

Dealing with nested sets in mysql?

7 votes

I have decided to follow http://www.artfulsoftware.com/mysqlbook/sampler/mysqled1ch20.html

So now I am looking for some help with the code.

I am using their data for my testing, So, I visualized the tree being like so:

    array('value' => 'Richard Shakespeare',
        array('value' => 'Henry',
            array('value' => 'Joan'),
            array('value' => 'Margaret'),
            array('value' => 'William',
                array('value' => 'Susana',
                    array('value' => 'Elizabeth Hall',
                        array('value' => 'John Bernard'))),
                array('value' => 'Hamnet'),
                array('value' => 'Judith',
                    array('value' => 'Shakespeare Quiney'),
                    array('value' => 'Richard Quiney'),
                    array('value' => 'Thomas Quiney'))),
            array('value' => 'Gilbert'),
            array('value' => 'Joan',
                array('value' => 'William Hart'),
                array('value' => 'Mary Hart'),
                array('value' => 'Thomas Hart'),
                array('value' => 'Micheal Hart')),
            array('value' => 'Anne'),
            array('value' => 'Richard'),
            array('value' => 'Edmond')),
        array('value' => 'John'));

So if we want to insert that into the database we want to end up with

Array
(
    [0] => Array
        (
            [value] => Richard Shakespeare
            [left] => 1
            [right] => 46
        )

    [1] => Array
        (
            [value] => Henry
            [left] => 2
            [right] => 43
        )

    [2] => Array
        (
            [value] => Joan
            [left] => 3
            [right] => 4
        )

    [3] => Array
        (
            [value] => Margaret
            [left] => 5
            [right] => 6
        )

    [4] => Array
        (
            [value] => William
            [left] => 7
            [right] => 24
        )

    [5] => Array
        (
            [value] => Susana
            [left] => 8
            [right] => 13
        )

    [6] => Array
        (
            [value] => Elizabeth Hall
            [left] => 9
            [right] => 12
        )

    [7] => Array
        (
            [value] => John Bernard
            [left] => 10
            [right] => 11
        )

    [8] => Array
        (
            [value] => Hamnet
            [left] => 14
            [right] => 15
        )

    [9] => Array
        (
            [value] => Judith
            [left] => 16
            [right] => 23
        )

    [10] => Array
        (
            [value] => Shakespeare Quiney
            [left] => 17
            [right] => 18
        )

    [11] => Array
        (
            [value] => Richard Quiney
            [left] => 19
            [right] => 20
        )

    [12] => Array
        (
            [value] => Thomas Quiney
            [left] => 21
            [right] => 22
        )

    [13] => Array
        (
            [value] => Gilbert
            [left] => 25
            [right] => 26
        )

    [14] => Array
        (
            [value] => Joan
            [left] => 27
            [right] => 36
        )

    [15] => Array
        (
            [value] => William Hart
            [left] => 28
            [right] => 29
        )

    [16] => Array
        (
            [value] => Mary Hart
            [left] => 30
            [right] => 31
        )

    [17] => Array
        (
            [value] => Thomas Hart
            [left] => 32
            [right] => 33
        )

    [18] => Array
        (
            [value] => Micheal Hart
            [left] => 34
            [right] => 35
        )

    [19] => Array
        (
            [value] => Anne
            [left] => 37
            [right] => 38
        )

    [20] => Array
        (
            [value] => Richard
            [left] => 39
            [right] => 40
        )

    [21] => Array
        (
            [value] => Edmond
            [left] => 41
            [right] => 42
        )

    [22] => Array
        (
            [value] => John
            [left] => 44
            [right] => 45
        )

)

So the issue comes to mind of, How best to do this?

My solution was:

$container = array();

function children($item){
  $children = 0;
  foreach($item as $node)
    if(is_array($node))
      $children += children($node)+1;
    return $children;
}

function calculate($item, &$container, $data = array(0,0)){
  //althought this one is actually of no use, it could be useful as it contains a count 
  $data[0]++; //$left

  $right = ($data[0]+(children($item)*2))+1;

  //store the values in the passed container
  $container[] = array(
    'value' => $item['value'],
    'left'  => $data[0],
    'right' => $right,
  );

  //continue looping
  $level = $data[1]++;
  foreach($item as &$node)
    if(is_array($node))
      $data = calculate($node, $container, $data);

    $data[1] = $level;
    $data[0]++;
    return $data;
}

calculate($tree, $container);

How efficient it is I do not know.

But now onto the queries.

To select all descendants of a node we can use

SELECT child.value AS 'Descendants of William', COUNT(*) AS `Level`
FROM tester AS parent
JOIN tester AS child ON child.`left` BETWEEN parent.`left` AND parent.`right`
WHERE parent.`left` > 7 AND parent.`right` < 24
GROUP BY child.value ORDER BY `level`;

To select all descendants of a node, to a specific depth we can use
Note that we are selecting Descendants of William to a depth of 2
Williams left: 7, Williams right: 24, Levels: 2

SELECT child.value AS 'Descendants of William', COUNT(*) AS `Level`
FROM tester AS parent
JOIN tester AS child ON child.`left` BETWEEN parent.`left` AND parent.`right`
WHERE parent.`left` > 7 AND parent.`right` < 24
GROUP BY child.value HAVING `level` <= 2 ORDER BY `level`;

So that's easy enough.

But now I want to know a few things,
Note that in the actual database as well as left/right all rows have a unique id, and a "parent" column containing their inviteers id, or null if not invited

  • Lets say I want to insert David as a child of Judith, How do I do that?
  • Lets say I want to get Mary Hart's Parent, and the Parents Parent (array('Henery', 'Joan', 'Mary Hart')), How do I do that?
  • Lets say I want to delete William Hart from Joan How so I do that?

To update/delete you will need to increase/decrease left/right values of all elements of branch.
Examples of queries you can find here.

How efficient it is I do not know.

Nested sets works VERY slowly with big trees on update/insert/delete. And very fast to select.
So use this model only with static data, which will be stored without changes most of the time, and this tree will not contain thousands of nodes (or any update will take minutes to complete). Materialized path works much faster.

MySQL Injection Problem

7 votes

I've been coding my website in PHP lately and I was pretty proud of myself for my good practices of sanitizing my input before I used it in a query. It was all going great until my friend said I need to sanitize my input. When I tried to explain to him that it was sanitized, he showed me that he had found everything in 'users' table in my database. I didn't know how, so I thought I would post what I was doing wrong that made my sanitizing not work. Here is the PHP code he was exploiting:

start_mysql(); // Starts the databases stuff, etc.

$id = mysql_real_escape_string($_GET['id']);
$game = mysql_query("SELECT * FROM `games` WHERE `id` = $id LIMIT 0, 1");

All he was doing was changing the id parameter, making him able to use SQL injection on my database. I thought mysql_real_escape_string escaped all characters like that, but apparently I was wrong. I did some tests with a normal string to see what would happen, and this is what it said

URL: /game.php?id=' OR '' = '

echo($_GET['id']); // This echo'd: \' OR \'\' = \'
echo(mysql_real_escape_string($_GET['id'])); // This echo'd: \\\' OR \\\'\\\' = \\\'

So, my simple question is, what am I doing wrong?

Matt,

mysql_real_escape_string() will only filter for certain characters, if you truly want to prevent injection attacks check out this other Stack Overflow article that suggests you use Prepared statements:

Prepared Statements

PHP Manual entry on Prepared statements

Edit: Also check out Slaks and Michael's postings about wrapping your variable in single quotes.

Good luck!

H

Dealing with commas in CSV

7 votes

I get a CSV data from a SOAP call in php. Unfortunately, the data may have commas in it. It is formatted correctly as in

1,name,2,lariat,3,"first, last",5,NMEA,...

I need to parse it to individual values in either php or javascript. I have browsed through threads on stack overflow and elsewhere but have not found a specific solution in php / javascript.

The approach I am currently using is

$subject = '123,name,456,lryyrt,123213,"first,last",8585,namea3';
$pattern = '/,|,"/';
$t2=preg_replace ('/,|(".*")/','$0*',$subject);
$t2=str_replace(',','*',$t2);
$t2=str_replace('*',',',$t2);

Where * is the deliminator, but the preg_replace generates an extra *. I have tried a couple of other approaches involving preg_match and other preg_ functions but did not succeed in having any kind of a clean split.

Any suggestion on how to split up CSV data that contains commas in it?

Don't attempt to do this with a regular expression. Just use str_getcsv()! The third parameter informs str_getcsv() to look for quote-enclosed fields.

$subject = '123,name,456,lryyrt,123213,"first,last",8585,namea3';
$array = str_getcsv($subject, ",", '"');

print_r($array);

Prints the following:

Array
(
    [0] => 123
    [1] => name
    [2] => 456
    [3] => lryyrt
    [4] => 123213
    [5] => first,last
    [6] => 8585
    [7] => namea3
)

Why does mysql_query() return TRUE with a SELECT statement?

6 votes

According to the manual of mysql_query() and to everything I know about this function that I used so many times, it can either return a resource or FALSE if the query is a SELECT. Yet it returns TRUE from time to time.

How can this be? It never ever happened before. Is this a bug in PHP 5.3.2? Does anyone know anything about this?

The code is something like:

if (!$resource = mysql_query($query, $handle)) {
    throw some exception;
}

var_dump($query);
if ($resource === true && strpos($query, 'SELECT') !== false) {
    throw new Exception('mysql_query() returned TRUE for SELECT');
}

It's pretty hard to reproduce, too. It happens only from time to time. I also noticed that it's likely this happens at the same time the server interrupts the connection suddenly, in which case it should return FALSE...

If webbiedave isn't on the right track, there's only one codepath that allows for this situation in the php source:

#if MYSQL_VERSION_ID < 32224
#define PHP_MYSQL_VALID_RESULT(mysql)       \
    (mysql_num_fields(mysql)>0)
#else
#define PHP_MYSQL_VALID_RESULT(mysql)       \
    (mysql_field_count(mysql)>0)
#endif

...

if (!mysql_result) {
    if (PHP_MYSQL_VALID_RESULT(mysql->conn)) { /* query should have returned rows */
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Unable to save result set");
        RETURN_FALSE;
    } else {
        RETURN_TRUE; // <<< this case
    }
}

I would consider this a bug. Especially since there's no real way to verify this - mysql_num_fields in the PHP code uses the resource that you're not getting, not the connection.

Although it's still weird that the C version of mysql_query returns zero on lost connection - if you're able to, try the following patch and reinstall the mysql extension:

Index: ext/mysql/php_mysql.c
===================================================================
--- ext/mysql/php_mysql.c       (revision 311719)
+++ ext/mysql/php_mysql.c       (working copy)
@@ -1485,6 +1485,9 @@
                if (PHP_MYSQL_VALID_RESULT(mysql->conn)) { /* query should have returned rows */
                        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Unable to save result set");
                        RETURN_FALSE;
+               } else if( mysql_errno(mysql->conn) != 0 ) {
+                       php_error_docref("http://www.mysql.com/doc" TSRMLS_CC, E_WARNING, "%s", mysql_error(mysql->conn));
+                       RETURN_FALSE;
                } else {
                        RETURN_TRUE;
                }

E-commerce from scratch or not

5 votes

I need to develop a new site with the e-commerce part. I'm here to get some hints from you on which road I should go on.

The site will have a static part which include some static pages and the e-commerce part for sell the products. I'm a Django and PHP developer but this is the first time I need to develop an e-commerce.

I see there's some solutions like Satchmo for Django and Magento for PHP. There's also some services like Shopify that provide a pre cooked solution.

These are my requirements:

  • I'll need to give to my site a custom layout/style, all written by me
  • I'll need a nice admin interface for view the incoming orders, the average orders for day and other nice stuff (I see Magento have a very nice panel for this)
  • I'll need a feature that reports me the products that are outgoing out of stock
  • Maybe I'll need to implement a custom payment method
  • Maybe I'll need to implement a custom shipping courier

Of course I prefer Django instead PHP. Anyone can give me an hint? I've never used anyone of this framework/products.

Thank you.

-- EDIT

Deadline is 3 months. Budget ~7500 $. I've three months for finish it.

I've no experience with Django. I develop websites using Wordpress and a year ago I tested Magento (not tested it since).

I was surprised of how user friendly Magento was. Both front end and backend. It was really easy to set up. You can easilly customize the layout as well using Magento template API.

Here are some pros and cons:

Pros

  • Don't develop your own e-commerce site. It's a lot of work to re-invent the wheel.
  • By using a well know e-commers product, you will also find support.
  • E-commerse like Magento is easy to set up and offers the option for change look and feel to match the rest of the web site.
  • You can easilly combine CMS and e-commerce
  • E-commerse apps like Magento offers the option for custom static / dynamic pages (like a mini cms)
  • You will save money (development time) by using something that is already developed
  • Most e-commerce sites handles payment towards different banking providers.

Cons

  • Magento does not work on all servers (server configuration). This however might have changed the past year.
  • If free e-commers products like Magento does not sattisfy your needs, you need to buy licenses. And that costs a lot of money.
  • It takes a lot of time developing your own e-commerce site if you want quality and usability. There is no way you can develop a good e-commerce site from scratch in just 3 months.