Best php questions in July 2011

Parentheses altering semantics of function call result

26 votes

It was noted in another question that wrapping the result of a PHP function call in parentheses can somehow convert the result into a fully-fledged expression, such that the following works:

<?php
error_reporting(E_ALL | E_STRICT);

function get_array() {
   return Array();
}

function foo() {
   // return reset(get_array());
   //              ^ error: "Only variables should be passed by reference"

   return reset((get_array()));
   //           ^ OK
}

foo();
?>

I'm trying to find anything in the documentation to explicitly and unambiguously explain what is happening here. Unlike in C++, I don't know enough about the PHP grammar and its treatment of statements/expressions to derive it myself.

Is there anything hidden in the documentation regarding this behaviour? If not, can somebody else explain it without resorting to supposition?


Update

I first found this EBNF purporting to represent the PHP grammar, and tried to decode my scripts myself, but eventually gave up.

Then, using phc to generate a .dot file of the two foo() variants, I produced the following images:

[root@lolphin:~] $ yum install phc graphviz
[root@lolphin:~] $ phc --dump-ast-dot test1.php > test1.dot
[root@lolphin:~] $ dot -Tpng test1.dot > test1.png

Parse tree of snippet 1

[root@lolphin:~] $ phc --dump-ast-dot test2.php > test2.dot
[root@lolphin:~] $ dot -Tpng test2.dot > test2.png

Parse tree of snippet 2

Notice how they are identical. >.<

This behavior could be classified as a bug, so you should definitely not rely on it.

The (simplified) conditions for the message not to be thrown on a function call are as follows (see the definition of the opcode ZEND_SEND_VAR_NO_REF):

  • the argument is not a function call (or if it is, it returns by reference), and
  • the argument is either a reference or it has reference count 1 (if it has reference count 1, it's turned into a reference).

Let's analyse these in more detail.

First point is verified (not a function call)

By adding the parentheses, you prevent the argument from being detected as a function call.

When parsing a non empty function argument list there are three possibilities for PHP:

  • An expr_without_variable
  • A variable
  • A & followed by a variable (for the deprecated call-time pass by reference)

When writing just get_array() PHP sees this as a variable.

(get_array()) on the other hand does not qualify as a variable. It is an expr_without_variable.

This ultimately affects the way the code compiles too, namely the extended value of the opcode SEND_VAR_NO_REF will no longer include the flag ZEND_ARG_SEND_FUNCTION, which is the way the function call is detected in the opcode implementation.

Second point is verified (the reference count is 1)

At several points, the Zend Engine allows non-references with reference count 1 where references are expected. These details should not be exposed to the user, but unfortunately they are here.

In your example you're returning an array that's not referenced from anywhere else. If it were, you would still get the message, i.e., this second point would not be verified.

So the following very similar example does not work:

<?php

$a = array();
function get_array() {
   return $GLOBALS['a'];
}

return reset((get_array()));
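A way to sidestep the question entirely, whatever the engine does with the parentheses, is to give the result a name before handing it to reset(). This is only a minimal sketch: get_array() is redefined here with sample data so the snippet runs on its own, and first_element() is a made-up helper name. As far as I know, PHP 7 and later ignore the redundant parentheses, so the trick no longer suppresses the notice there anyway.

```php
<?php
// Sample data for illustration; the question's version returned an empty array.
function get_array() {
    return array(3, 4);
}

// Hypothetical helper: store the returned array in a real variable first,
// so reset() receives something it can legitimately take by reference.
function first_element() {
    $array = get_array();
    return reset($array);   // first value, or false for an empty array
}

var_dump(first_element());
```

The extra variable costs nothing measurable and does not depend on how the parser classifies the argument.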

Should one minify server code when it's in production?

23 votes

When it comes to the frontend code you always minify it (remove white spaces, comments etc) in production.

Should one do the same with server code? I usually have a lot of comments in my server files. But I have never heard about people doing so.

Wouldn't the server run faster if the code was optimized in the same way?

You're not going to see any improvement, as whitespace and formatting are lost when your server-side code is translated to machine code (or interpreted). It's also not sent over the wire; it's read from the local filesystem. So while having fewer characters would lead to a faster startup, it would not make any difference in the long run, and the startup speed gain would be marginal (or even unnoticeable).

So, no, minifying your server-side code is basically useless. Worse, it's probably going to make stack traces completely useless, as there's going to be a lot of code on the same line (and not necessarily with the same formatting you used).

Is there anything that can be put after the "ORDER BY" clause that can pose a security risk?

18 votes

Basically, what I want to do is this:

mysql_query("SELECT ... FROM ... ORDER BY $_GET[order]")

They can obviously easily create a SQL error by putting nonsense in there, but mysql_query only allows you to execute 1 query, so they can't put something like 1; DROP TABLE ....

Is there any damage a malicious user could do, other than creating a syntax error?

If so, how can I sanitize the query?

There's a lot of logic built on the $_GET['order'] variable being in SQL-like syntax, so I really don't want to change the format.


To clarify, $_GET['order'] won't just be a single field/column. It might be something like last_name DESC, first_name ASC.

Yes, SQL injection attacks can use an unescaped ORDER BY clause as a vector. There's an explanation of how this can be exploited and how to avoid this problem here:

http://josephkeeler.com/2009/05/php-security-sql-injection-in-order-by/

That blog post recommends using a white list to validate the ORDER BY parameter against, which is almost certainly the safest approach.


To respond to the update, even if the clause is complex, you can still write a routine that validates it against a whitelist, for example:

function validate_order_by($order_by_parameter) {
    $columns = array('first_name', 'last_name', 'zip', 'created_at');

    // Split on commas only, so "last_name DESC" stays together as one part.
    $parts = preg_split('/\s*,\s*/', trim($order_by_parameter));

    foreach ($parts as $part) {
        $subparts = preg_split('/\s+/', $part);

        if (count($subparts) < 1 || count($subparts) > 2) {
           // Too many or too few parts.
           return false;
        }

        if (!in_array($subparts[0], $columns, true)) {
           // Column name is invalid.
           return false;
        }

        if (count($subparts) == 2
            && !in_array(strtoupper($subparts[1]), array('ASC', 'DESC'), true)) {
          // ASC or DESC is invalid
          return false;
        }
    }

    return true;
}

Even if the ORDER BY clause is complex, it's still made only out of values you supply (assuming you're not letting users edit it by hand). You can still validate using a white list.

I should also add that I normally don't like to expose my database structure in URLs or other places in the UI and will often alias the stuff in the parameters in the URLs and map it to the real values using a hash.
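The aliasing idea could look roughly like the sketch below. The alias names, the default, and the function name order_by_from_alias() are all made up for illustration; only the column names are borrowed from the whitelist example.

```php
<?php
// Hypothetical alias table: public names on the left, real SQL on the right.
function order_by_from_alias($alias) {
    $map = array(
        'name'   => 'last_name ASC, first_name ASC',
        'newest' => 'created_at DESC',
        'zip'    => 'zip ASC',
    );

    // Unknown aliases fall back to a default instead of reaching the query.
    return isset($map[$alias]) ? $map[$alias] : $map['name'];
}

echo order_by_from_alias('newest'), "\n";
echo order_by_from_alias('1; DROP TABLE users'), "\n";
```

Since the only strings that can end up in the query are the ones written in the map, nothing the user sends is ever interpolated into SQL.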

Are PHP global constants a good modern development practice?

13 votes

I'm working on a new project with a sizeable PHP codebase. The application uses quite a few PHP constants ( define('FOO', 'bar') ), particularly for things like database connection parameters. These constants are all defined in a single configuration file that is require_once()'d directly by basically every class in the application.

A few years ago this would have made perfect sense, but since then I've gotten the Unit Testing bug and this tight coupling between classes is really bothering me. These constants smell like global variables, and they're referenced directly throughout the application code.

Is this still a good idea? Would it be reasonable to copy these values into an object and use this object (i.e. a Bean - there, I said it) to convey them via dependency injection to the classes that interact with the database? Am I defeating any of the benefits of PHP constants (say speed or something) by doing this?

Another approach I'm considering would be to create a separate configuration PHP script for testing. I'll still need to figure a way to get the classes under test to use the sandbox configuration script instead of the global configuration script. This still feels brittle, but it might require less outright modification to the entire application.

In my opinion, constants should be used only in two circumstances:

  • Actual constant values (i.e. things that will never change, SECONDS_PER_HOUR).
  • OS-dependent values, as long as the constant can be used transparently by the application, in every situation possible.

Even then, I'd reconsider whether class constants would be more appropriate so as not to pollute the constants space.

In your situation, I'd say constants are not a good solution because you will want to provide alternative values depending on where they're used.
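The dependency-injection route raised in the question might look something like this. It is only a sketch: DbConfig, ReportStore and the connection values are invented names, not anything from the asker's codebase.

```php
<?php
// Hypothetical value object holding what the define()'d constants held.
class DbConfig {
    public $dsn;
    public $user;
    public $password;

    public function __construct($dsn, $user, $password) {
        $this->dsn = $dsn;
        $this->user = $user;
        $this->password = $password;
    }
}

// Hypothetical consumer: it receives its settings instead of reading globals.
class ReportStore {
    private $config;

    public function __construct(DbConfig $config) {
        $this->config = $config;
    }

    public function describeConnection() {
        return $this->config->user . '@' . $this->config->dsn;
    }
}

// Production code and tests can now hand in different values.
$store = new ReportStore(new DbConfig('sqlite::memory:', 'tester', 'secret'));
echo $store->describeConnection(), "\n";
```

A test can build its own DbConfig pointing at a sandbox database, with no second configuration script and no require_once of the global one.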

using |= in php

12 votes

I was reading some php code source in Joomla and found the following:

$failed |= is_numeric( $key );

Other than if $key is numeric , what does |= mean?

$x |= $y; is the same as $x = $x | $y;

$x | $y is the bitwise OR operator: each bit of the result is set if that bit is set in either of the two operands.

In the context of the question, it allows $failed to store failure statuses for several actions in a single variable (each bit position representing an individual action).

If you need to know more about what this does, I suggest reading the PHP manual page for bitwise operators: http://www.php.net/manual/en/language.operators.bitwise.php
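To see what the Joomla line does in practice, here is a small sketch with made-up keys. Once any key is numeric, $failed stays non-zero for the rest of the loop.

```php
<?php
$failed = false;
foreach (array('name', 'email', 7) as $key) {
    // Same as: $failed = $failed | is_numeric($key);
    $failed |= is_numeric($key);
}
var_dump($failed);   // int(1): the | operator converts booleans to integers
```

Note that the result is an integer rather than a boolean, which is fine for a later if ($failed) check.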

Efficient method to find collision free random numbers

12 votes

I have a users table, the user ID is public. But I want to obfuscate the number of registered users and the trends of the project, so I don't want to have public incrementing IDs.

When a new user is created I want to find a random integer number that is greater than a certain number and that is not yet in the database.

Naive code:

<?php
    $found = false;
    while (!$found) {
      $uid = rand(1000000000, 4294967295); // find random number between minimum and maximum
      $dbh->beginTransaction();
      // check if user id is in use, and if not insert it
      // (query() returns a statement object even for zero rows, so look at the row)
      $inUse = $dbh->query("SELECT 1 FROM users WHERE uid = $uid")->fetchColumn();
      if ($inUse === false) {
        $dbh->exec("INSERT INTO users (uid) VALUES ($uid)");
        $found = true;
      }
      $dbh->commit();
    }
    // we just got our new uid ...
?>

This will work, but it may become inefficient. True, the range is big and the probability of hitting an unused uid is high. But what if I want to use a smaller range, because I don't want such long user ids?

Example of my concerns:

  • 60% of all user ids are in use
  • the chance of hitting an unused uid is 0.4
  • the first attempt has a 0.4 (40%) success rate
  • if the 1st is not successful, the second attempt succeeds with probability 0.6*0.4
  • so with a maximum of two tries I have a 0.4 + 0.6*0.4 probability (is that right??)

So one way to optimize this that came to my mind is the following:

  • find a random number, check if its free, if not, increment it by 1 and try again and so on
  • if the maximum number is hit, continue with the minimum number

That should give me a number with a maximum runtime of O(range)

That sounds pretty bad, but I think it is not, because I submit random numbers to the database, and it is very unlikely that they are all at the beginning. So how good/bad is it really?

I think this would work just fine but I want it BETTER

So what about this?

  • find a random number
  • query the database for how many numbers are occupied in the whole range, starting from that number (this first step is trivial...)
  • if there are numbers occupied in that range, divide the range in half and try again, starting with the initial number
  • repeat: while there are numbers occupied, divide the range in half and try again, starting with the initial number

If I am thinking correctly, this will give me a number in at most O(log(range)) time.

That is pretty satisfying because log() is pretty good. However I think this method will often be as bad as possible. Because with our random numbers we will probably always hit numbers in the large intervals.

So at the beginning our pure random method is probably better.

So what about having a limit like this

  • select current number of used numbers
  • is it greater than X, logarithmic range approach
  • if it is not, use pure random method

What would X be and why?

So final question:

This is pretty easy and pretty complicated at the same time.

I think this is a standard problem because lots and lots of systems use random ids (support tickets etc), so I cannot imagine I am the first one to stumble across this.

How would you solve this? Any input is appreciated!

Is there maybe an existing class / procedure for this I can use?

Or maybe some database functions that I can use?

I would like to do it in PHP/Mysql

IMPORTANT EDIT:

I just thought about the range/logarithmic solution. It seems to be complete bullshit (sorry for my wording) because:

  • what if I hit an occupied number at the start?

Then I keep dividing my range until its size is only 1. And even then the number is occupied.

So it's exactly the same as the pure random method from the start, only worse....

I am a bit embarrassed I made this up, but I will leave it in because I think it's a good example of overcomplicated thinking!

If p is the proportion of ids in use, your "naive" solution will, on average, require 1/(1-p) attempts to find an unused id. (See Geometric distribution.) In the case of 60% occupancy, that is a mere 1/0.4 = 2.5 queries ...

Your "improved" solution requires about log(n) database calls, where n is the number of ids in use. That is quite a bit more than the "naive" solution. Also, your improved solution is incomplete (for instance, it does not handle the case where all numbers in a subrange are taken, and does not say which subrange you recurse into) and is more complex to implement to boot.

Finally, note that your implementation will only be thread safe if the database provides very strict transaction isolation, which scales poorly, and might not be the default behaviour of your database system. If that turns out to be a problem, you could speculatively insert with a random id, and retry in the event of a constraint violation.
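The "insert and retry" idea could be sketched as follows. The function allocate_uid(), its parameters and the attempt limit are my own invention, and a plain array stands in for the users table so that the snippet runs without a database; it assumes the uid column has a unique constraint.

```php
<?php
/**
 * Hypothetical helper: keep drawing random ids until $tryInsert accepts one.
 * $tryInsert must return true if the row was inserted and false if the id
 * was already taken (i.e. the unique constraint was violated).
 */
function allocate_uid($tryInsert, $min, $max, $maxAttempts = 50) {
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $uid = mt_rand($min, $max);
        if (call_user_func($tryInsert, $uid)) {
            return $uid;
        }
    }
    throw new RuntimeException("No free id found after $maxAttempts attempts");
}

// Stand-in for the database: an array playing the role of a unique column.
// With PDO in exception mode, the callback would instead run the INSERT and
// return false when a PDOException reports a duplicate key.
$taken = array(1 => true, 2 => true, 3 => true);
$tryInsert = function ($uid) use (&$taken) {
    if (isset($taken[$uid])) {
        return false;
    }
    $taken[$uid] = true;
    return true;
};

$uid = allocate_uid($tryInsert, 1, 5);
echo "allocated $uid\n";
```

Because the existence check and the insert are a single step, two concurrent requests cannot both claim the same id, and no explicit transaction is needed around the loop.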

How is this valid php code?

11 votes

I'm modifying a wordpress template and I'm curious as to how this is a valid control structure in PHP. Anyone have any insight?

<?php if(condition): ?>
<?php if(!condition || !function()) ?>
<?php elseif(condition): ?>
<?php if(!condition || !function()) ?>
<?php endif; ?>

If I remove all the tags, I get (with more sane indentation):

<?php
if(condition):
    if(!condition || !function())
elseif(condition):
    if(!condition || !function())
endif;
?>

which is invalid because the indented if statements don't end. So how/why is this code valid if there are opening and closing php tags everywhere?


Edit for Kerrek SB. Make a php file and run it. It's valid:

<?php if(true): ?>
<?php if(true) ?>
<?php endif; ?>
<?php echo 'here'; ?>

Your (reduced) example code is equivalent to this:

<?php
if(condition):
    if(!condition || !function()) { }
endif;
?>

Or even:

<?php
if(condition):
    if(!condition || !function());
endif;
?>

By closing off the <?php tag, you appear to get an "empty statement" for free.

Your real example could be one-lined like so:

<?php if(true): if(true); endif; echo 'here'; ?>

But note that an elseif makes this ambiguous!

<?php
if(condition):
    if(!condition || !function());
elseif(condition):   // Bogus! which block does this belong to?
    if(!condition || !function());
endif;
?>

We'd have to disambiguate this:

<?php
if(condition):
{
    if(!condition || !function());
}
elseif(condition):
{
    if(!condition || !function());
}
endif;
?>

Now it's clear, but now we could have spared ourselves the colon syntax altogether.

Thanks to Lekensteyn for pointing this out!

See the discussion below for further oddities.

String is not equal to itself

Asked on Sun, 10 Jul 2011 by Qiao php
11 votes

But why?

if ('i' == 'і')
    echo 'good';
else
    echo 'bad';  

echoes:

>> bad

You should copy this snippet. If you write it by hand, it will work.
It drives me crazy.

You are sneaky! The second i is not a lowercase Latin i. I hexdumped it:

hexdump -C check
00000000  69 66 20 28 27 69 27 20  3d 3d 20 27 d1 96 27 29  |if ('i' == '..')|
00000010  0a 20 20 20 20 65 63 68  6f 20 27 67 6f 6f 64 27  |.    echo 'good'|
00000020  3b 0a 65 6c 73 65 0a 20  20 20 20 65 63 68 6f 20  |;.else.    echo |
00000030  27 62 61 64 27 3b 20 20  0a 0a                    |'bad';  ..|
0000003a

I'll let you look up D1 96 :-) Awesome tricksy riddle. +1
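For anyone who wants to check without a hexdump tool, the same comparison can be made from PHP itself. A small sketch, with the suspicious character written as byte escapes so it survives copy and paste. D1 96 is the UTF-8 encoding of U+0456, a Cyrillic letter that looks just like a Latin i.

```php
<?php
$latin    = 'i';          // U+0069 LATIN SMALL LETTER I
$cyrillic = "\xD1\x96";   // the two bytes from the hexdump, written as escapes

echo bin2hex($latin), ' ', strlen($latin), "\n";        // 69 1
echo bin2hex($cyrillic), ' ', strlen($cyrillic), "\n";  // d196 2
var_dump($latin == $cyrillic);                          // bool(false)
```

The == operator compares the bytes, so two strings that merely look alike on screen are still different.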

Inverse htmlentities / html_entity_decode

11 votes

Basically I want to turn a string like this:

<code> &lt;div&gt; blabla &lt;/div&gt; </code>

into this:

&lt;code&gt; <div> blabla </div> &lt;/code&gt;

How can I do it?


The use case (bc some people were curious):

A page like this with a list of allowed HTML tags and examples. For example, <code> is an allowed tag, and this would be the sample:

<code>&lt;?php echo "Hello World!"; ?&gt;</code>

I wanted a reverse function because there are many such tags with samples, and I store them all in an array which I iterate over in one loop, instead of handling each one individually...

My version using regular expressions:

$string = '<code> &lt;div&gt; blabla &lt;/div&gt; </code>';
$new_string = preg_replace(
    '/(.*?)(<.*?>|$)/se', 
    'html_entity_decode("$1").htmlentities("$2")', 
    $string
);

It tries to match every tag and textnode and then apply htmlentities and html_entity_decode respectively.
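One caveat: the e modifier was deprecated in PHP 5.5 and removed in PHP 7. On newer versions the same idea can be written with preg_replace_callback(); the following is a sketch of that variant, not the original answer's code.

```php
<?php
$string = '<code> &lt;div&gt; blabla &lt;/div&gt; </code>';

// Same pattern as above, minus the "e" modifier; the callback does the work.
$new_string = preg_replace_callback(
    '/(.*?)(<.*?>|$)/s',
    function ($m) {
        // $m[1] is the text before a tag, $m[2] the tag itself (or '' at the end).
        return html_entity_decode($m[1]) . htmlentities($m[2]);
    },
    $string
);

echo $new_string, "\n";
```

For the sample input this prints &lt;code&gt; <div> blabla </div> &lt;/code&gt;, and it also avoids the quoting problems that come with evaluating the replacement as PHP code.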

Is it secure to place uploaded images in a public folder?

10 votes

I've just had a discussion with my teammate about the location of user uploaded images in an image gallery. I would like a broader insight on the methods we suggest.

My teammate wrote a controller + action that calls file_get_contents on an image file placed in a folder that's not available for public browsing (i.e., outside public_html on the server), and echoes it via a header. This is secure, but since we use Zend Framework, it's also crawling slow - each call to the image controller costs us approx 500ms of lag due to the bootstrap's queries being executed. It's annoying since the picture gallery view displays over 20 images at the same time.

In short, the relevant code would be:

class ImageController extends Zend_Controller_Action {
    public function showAction () {
        $filename = addslashes($this->_getParam('filename'));
        if(!is_file($filename)) {
            $filename = APPLICATION_PATH.'/../public/img/nopicture.jpg';
        }
        $this->_helper->viewRenderer->setNoRender(true);
        $this->view->layout()->disableLayout();
        $img = file_get_contents($filename);
        header('Content-Type: image/jpeg');
        $modified = new Zend_Date(filemtime($filename));
        $this->getResponse()
             ->setHeader('Last-Modified',$modified->toString(Zend_Date::RFC_1123))
             ->setHeader('Content-Type', 'image/jpeg')
             ->setHeader('Expires', '', true)
             ->setHeader('Cache-Control', 'public', true)
             ->setHeader('Cache-Control', 'max-age=3800')
             ->setHeader('Pragma', '', true);
        echo $img;
    }
}

Then, in a view, we just call:

<img src="<?php echo $this->url(array('controller' => 'image', 'action' => 'show', 'filename' => PATH_TO_HIDDEN_LOCATION.'/filename.jpg')); ?>" />

I have a different approach: I prefer to keep the original images in a hidden location, but as soon as they are requested, copy them to a public location and provide a link to it (with an extra mechanism, run by cron, to wipe the public images directory every now and then in order not to waste space, and a robots.txt telling Google not to index the directory). The solution places files (a few at every given moment) in a publicly accessible directory (provided one knows the filename), but also requires only a view helper, thus not launching the bootstrap:

class Zend_View_Helper_ShowImage extends Zend_View_Helper_Abstract {
    public function showImage ($filename) {
        if (!file_exists(PUBLIC_PATH."/img/{$filename}")) {
            if (!copy(PATH_TO_HIDDEN_FILES."/{$filename}",PUBLIC_PATH."/img/{$filename}"))
                $url = PUBLIC_PATH.'/img/nopicture.jpg';
            else
                $url = PUBLIC_PATH."/img/{$filename}";
        } else {
            $url = PUBLIC_PATH."/img/{$filename}";
        }
        return "{$url}";
    }
}

With the aid of this helper, the call is very simple in the view:

<img src="<?php echo $this->showImage('filename.jpg'); ?>" />

Question: Does my approach pose a security threat, as my colleague states? What are the potential risks of this? And, most importantly, do the security threats, if any, outweigh the 10 seconds gain on page load?

In case it matters: we're working on a community portal with around 15K registered users, with the galleries being a very frequently used feature.

*The code I pasted is an edited, simplified version of what each of us has come up with - just to show the mechanics of both approaches.

I have a different approach: I prefer to keep the original images in a hidden location, but as soon as they are requested, copy them to a public location and provide a link to it

+1 for creativity.

Does my approach pose a security threat, as my colleague states? What are the potential risks of this? And, most importantly, do the security threats, if any, outweigh the 10 seconds gain on page load?

Sort of. Yes, if you have images only some people are allowed to see, and you're putting them into a publicly accessible directory, there is a chance other people can see that image, which appears to be undesirable. I also don't think (might be wrong) that it will gain 10 seconds on a page load, as you'll have to copy the images, which is a rather intensive operation, more than using file_get_contents or readfile( ).

This is secure, but since we use Zend Framework, it's also crawling slow - each call to the image controller costs us approx 500ms of lag due to the bootstrap's queries being executed.

If I may suggest; nuke Zend Framework for this specific case. I'm using Zend Framework for a rather large website as well, so I know the bootstrap can take longer than you want. If you circumvent Zend Framework, opting for vanilla PHP, this would improve the performance significantly.

Also, use readfile( ), not file_get_contents( ). There's a big difference in that file_get_contents will load the whole file in memory before outputting, where readfile does this more efficiently.
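A framework-free script along those lines might look like the sketch below. The directory constant, the function names and the filename parameter are assumptions for illustration, and whatever access check your application needs (who may see which image) is deliberately left out.

```php
<?php
// Hypothetical location of the non-public image directory.
const HIDDEN_IMAGE_DIR = '/var/www/private/images';

// Map a requested name to a file inside $dir, or null if it is not there.
// basename() drops any directory components such as "../".
function resolve_image($dir, $requested) {
    $path = rtrim($dir, '/') . '/' . basename($requested);
    return is_file($path) ? $path : null;
}

// Stream the file straight to the client without loading it into memory.
function send_image($path) {
    header('Content-Type: image/jpeg');
    header('Content-Length: ' . filesize($path));
    readfile($path);
}

$name = isset($_GET['filename']) ? $_GET['filename'] : '';
$path = resolve_image(HIDDEN_IMAGE_DIR, $name);
if ($path === null) {
    header('HTTP/1.1 404 Not Found');
} else {
    send_image($path);
}
```

Since nothing from the framework is loaded, the per-image cost is roughly that of a single small PHP script, and passing only a bare file name keeps the hidden path out of the URL.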

Are most PHP frameworks actually MVA instead of MVC?

10 votes

Many PHP frameworks claim that they implement the MVC design pattern. However, in their implementation, the model and view don't know each other, and all communication between them must go through the controller. As I read in Wikipedia, this is MVA (Model View Adapter) instead of the MVC design pattern, because in MVC, model and view communicate directly.

Are those frameworks' claims wrong, or did I miss something?

Frameworks like CodeIgniter are MVA, yes. However, their claims are not wrong since MVA is basically a different type of MVC deployment. Mediating controllers are hit by users which handle the business logic; they also call to the model to get the data and prepare the view.

This is not wholly diverged from strict MVC where the Model and View can talk to each other, so to say it's "wrong" is a bit harsh. I would say it's a different take on MVC.

EDIT:

See CodeIgniter's take on it:

http://codeigniter.com/user_guide/overview/mvc.html

Models are not required as everything can be done in the controller (not advised, obviously). Note that CI (and most other frameworks) say they are based on MVC principles.

PHP 5.4's simplified string offset reading

9 votes

As many of you already know, PHP 5.4 alpha has been released. I have a question regarding the following.

Simplified string offset reading. $str[1][0] is now a legal construct.

How exactly does $str[1][0] work?

EDIT: http://php.net/releases/NEWS_5_4_0_alpha1.txt

This is a side effect, and was mentioned in the proposal here: http://php.markmail.org/thread/yiujwve6zdw37tpv

The feature is speed/optimization of string offsets.

Hi,

Recently I noticed that reading of string offset is performed in two steps. At first special string_offset variant of temporary_variable is created in zend_fetch_dimension_address_read() and then the real string value is created in _get_zval_ptr_var_string_offset().

I think we can create the real string in the first place. This makes 50% speed-up on string offset reading operation and allows to eliminate some checks and conditional branches in VM.

The patch is attached (don't forget to regenerate zend_vm_execute.h to test it). However it changes behavior in one bogus case. The following code now will emit "b" (currently it generates a fatal error - cannot use string offset as an array).

$str = "abs";
var_dump($str[1][0]);

I think it's not a problem at all. "b" makes sense because "abs"[1] -> "b" and "b"[0] -> "b".

I'm going to commit the patch in case of no objections.

Thanks. Dmitry.

No require, no include, no url rewriting, yet the script is executed without being in the url

9 votes

I am trying to trace the flow of execution in some legacy code. We have a report being accessed with

http://site.com/?nq=showreport&action=view

This is the puzzle:

  • in index.php there is no $_GET['nq'] or $_GET['action'] (and no $_REQUEST either),
  • index.php, or any sources it includes, do not include showreport.php,
  • in .htaccess there is no url-rewriting

yet, showreport.php gets executed.

I have access to cPanel (but no apache config file) on the server and this is live code I cannot take any liberty with.

What could be making this happen? Where should I look?

Update
Funny thing - sent the client a link to this question in a status update to keep him in the loop; minutes later all access was revoked and client informed me that the project is cancelled. I believe I have taken enough care not to leave any traces to where the code actually is ...

I am relieved this has been taken off me now, but I am also itching to know what it was!

Thank you everybody for your time and help.

There are hundreds of ways to parse a URL, in various layers (system, httpd server, CGI script). So it's not possible to answer your question specifically with the information you have provided.

You leave a quite distinct hint: "legacy code". I assume what you mean is that you don't want to read the code in full, or even understand it well enough to locate the piece of the application that is parsing that parameter.

It would be good however if you leave some hints "how legacy" that code is: Age, PHP version targeted etc. This can help.

It was not always that $_GET was used to access these values (same is true for $_REQUEST, they are cousins).

Let's take a look in a mirror of the PHP 3 manual:

HTTP_GET_VARS

An associative array of variables passed to the current script via the HTTP GET method.

Is the script perhaps making use of this array? That's just a guess, but this was a valid way to access these parameters for quite some time.

Anyway, this need not be what you are looking for. There was this often misunderstood and misused (literally abused) feature in PHP called register globals (see the PHP manual). So you might just be searching for $nq.

Next to that, there's always the request URI and the Apache / environment / CGI variables. See the link to the PHP 3 manual above; it lists many of those. Compare this with the current manual to get a broad understanding.

In any case, you might have grep or a multi-file search available (Eclipse has a nice built-in one if you need to inspect legacy code inside an IDE).

So at the end of the day you might just look for a string like nq, 'nq', "nq" or $nq. Then check what this search brings up. String based search is a good entry into a codebase you don't know at all.
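If neither grep nor an IDE is at hand (on a cPanel-only host, say), the same string search can be done from a throwaway PHP script. A minimal sketch; find_mentions() is a made-up name, and it only looks at .php files, so legacy extensions such as .inc or .php3 would need to be added to the check.

```php
<?php
// Hypothetical helper: list every line under $root that mentions a needle.
function find_mentions($root, array $needles) {
    $hits = array();
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if (!$file->isFile() || $file->getExtension() !== 'php') {
            continue;
        }
        foreach (file($file->getPathname()) as $i => $line) {
            foreach ($needles as $needle) {
                if (strpos($line, $needle) !== false) {
                    // Record "path:line: source" once per matching line.
                    $hits[] = sprintf('%s:%d: %s', $file->getPathname(), $i + 1, trim($line));
                    break;
                }
            }
        }
    }
    return $hits;
}

// Example call (the path is a placeholder):
// print_r(find_mentions('/path/to/site', array('$nq', "'nq'", '"nq"', 'HTTP_GET_VARS')));
```

Each hit comes back as path:line: source, which is usually enough to find where the parameter is picked up.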

Is there a way to simplify this case statement?

9 votes

I have this PHP case statement

switch ($parts[count($parts) - 1]) {
    case 'restaurant_pos':
        include($_SERVER['DOCUMENT_ROOT'] . '/pages/restaurant_pos.php');
        break;
    case 'retail_pos':
        include($_SERVER['DOCUMENT_ROOT'] . '/pages/retail_pos.php');
        break;
    .....

}

This works great, but I have many, many files (around 190), and I would love to know if there is a way to make this case statement work with anything, so I don't have to write 190 case conditions. I was thinking I could use the condition in the case and maybe check whether that file is present: if so, display it, and if not, maybe show a 404 page. But I was not sure of a good way to do this... any ideas would help a lot.

You can predefine the file names in an array and then use in_array to check that the requested name exists:

$files = array('restaurant_pos', 'retail_pos', ......);
$file = $parts[count($parts) - 1];
if (in_array($file, $files)) {
    include($_SERVER['DOCUMENT_ROOT'] . "/pages/$file.php");
}
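If keeping a list of 190 names is not appealing, the "check whether the file is present" idea from the question can also be made safe by restricting what a name may look like. A sketch under that assumption; resolve_page() is a made-up helper, and the allowed character set is a guess based on the two names shown.

```php
<?php
// Hypothetical helper: turn the last URL part into a page file, or null.
function resolve_page($pagesDir, $name) {
    // Allow only simple names, so "../" or "/" can never reach include().
    if (!preg_match('/^[a-z0-9_]+\z/', $name)) {
        return null;
    }
    $path = rtrim($pagesDir, '/') . "/$name.php";
    return is_file($path) ? $path : null;
}

// Usage in place of the switch:
// $page = resolve_page($_SERVER['DOCUMENT_ROOT'] . '/pages', $parts[count($parts) - 1]);
// if ($page !== null) { include $page; } else { header('HTTP/1.1 404 Not Found'); }
```

The name check matters: passing user input to include() without it would let a request pull in files outside the pages directory.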

Are there any issues with always preparing SQL statements with PHP?

8 votes

Is there any issue with always preparing SQL statements with PHP instead of executing them directly?

Not sure if database system matters, but it's DB2 on System i.

You might take a slight performance hit, if they are real prepared statements and not just emulated in the driver. This is because you will have to make two calls to the database, rather than just one.

Consolidate repeating pattern

8 votes

I am working on a script that develops certain strings of alphanumeric characters, separated by a dash -. I need to test the string to see if there are any sets of characters (the characters that lie in between the dashes) that are the same. If they are, I need to consolidate them. The repeating chars would always occur at the front in my case.

Examples:

KRS-KRS-454-L
would become:
KRS-454-L

DERP-DERP-545-P
would become:
DERP-545-P

<?php
$s = 'KRS-KRS-454-L';
echo preg_replace('/^(\w+)-(?=\1)/', '', $s);
// KRS-454-L
?>

This uses a positive lookahead (?=...) to check for repeated strings.

Note that \w also contains the underscore. If you want to limit to alphanumeric characters only, use [a-zA-Z0-9].

Also, I've anchored with ^ as you've mentioned: "The repeating chars would always occur at the front [...]"
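One edge case worth knowing: with the pattern as written, the lookahead only checks that the second set starts with the first, so a string like AB-ABC-1 would lose its AB- prefix. If that can occur in your data, a slightly stricter variant (my own tweak, wrapped in a hypothetical consolidate() function) also requires a dash or the end of the string after the repeat:

```php
<?php
// Hypothetical variant: the lookahead also requires a dash or the end of the
// string after the repeated set, so "AB-ABC-1" is left alone.
function consolidate($s) {
    return preg_replace('/^(\w+)-(?=\1(?:-|$))/', '', $s);
}

echo consolidate('KRS-KRS-454-L'), "\n";    // KRS-454-L
echo consolidate('DERP-DERP-545-P'), "\n";  // DERP-545-P
echo consolidate('AB-ABC-1'), "\n";         // AB-ABC-1
```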

Php upload and bandwidth/traffic question

7 votes

I have set upload limit to 3M in php.ini. If someone uploads a file that is 50 mb, then does the upload stop when it hits 3Mb or does it continue until the upload is complete, then reads the filesize and deletes the file?

If you're using Apache as your web server, then PHP doesn't get a chance to start until the request completes. Thus, the upload limit only comes into action after the whole upload finishes. Apache first receives the entire request, and only then does it invoke the appropriate handler (in this case, PHP). Since there is no server-side mechanism to abort an HTTP request in progress and return a response, you'll need to wait until the whole request is complete.

So, to answer your question: NO, the upload will go through in full; PHP's internal logic will then check the uploaded file size, see that it's larger than the limit, and reject the file. Your script only learns about this afterwards (the file's entry in $_FILES carries the error code UPLOAD_ERR_INI_SIZE), so don't rely on runtime checks to save bandwidth: by the time they run, the data has already been transferred.

PHP - Expose own size to client (so the client knows how much it is downloading)

7 votes

My PHP script is outputting the contents of a .sql file, after it has been called by a POST request from my Delphi Desktop Client.

Here is what is happening:

  1. My Desktop Client sends a POST request to my PHP Script.
  2. The Script then calls mysqldump and generates a file - xdb_backup.sql
  3. The Script then include "xdb_backup.sql"; which will print and return it to the Desktop Client, whereafter it deletes the SQL file.

The problem is that the size of the SQL file can vary (for testing, I generated one that is 6 MB). I would like my desktop client to be able to show the progress; however, the PHP script does not expose its size, so I have no Progressbar.Max value to assign.

How can I make my PHP script let the Client know how big the output is before the whole thing is over?

Note: Downloading the SQL file is not an option, as the script has to destroy it. :)

Get the size of the generated file on the server first:

$fsize = filesize($file_path); 

where $file_path is the path to the generated file xdb_backup.sql. Then send that size to the client as a header, before any of the file's contents are output:

header("Content-Length: " . $fsize);

Take a look at http://www.hotscripts.com/forums/php/47774-download-script-not-sending-file-size-header-corrupt-files-since-using-remote-file-server.html, which discusses a PHP download script.
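Put together, the two lines above might look like the following minimal sketch. The function name send_and_delete and the temporary placeholder file are assumptions made so the example runs on its own; it uses readfile() rather than include, which sends the bytes without PHP trying to interpret them:

```php
<?php
// Hypothetical helper: streams a file with a Content-Length header, then deletes it.
function send_and_delete(string $file_path): void
{
    $fsize = filesize($file_path);

    // Headers can only be set before any body output has been sent.
    if (!headers_sent()) {
        header('Content-Type: application/octet-stream');
        header('Content-Length: ' . $fsize);
    }

    readfile($file_path); // writes the raw bytes to the output
    unlink($file_path);   // the script destroys the dump afterwards
}

// Stand-in for the mysqldump output so the sketch can run by itself.
$file_path = tempnam(sys_get_temp_dir(), 'xdb');
file_put_contents($file_path, "-- dump placeholder\n");
send_and_delete($file_path);
```

If output compression is enabled on the server, the number of bytes on the wire may no longer match the header, so this presumably works best with compression switched off for the script.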

How to create website APIs

7 votes

I get a lot of clients asking me about making mobile apps that connect to their websites to retrieve data, allow users to login, etc. Most of them have PHP-based sites, but don't have any clue about making APIs to interface with them. They ask me why I can't just connect directly to their SQL databases. I don't think that's a good thing to do from a mobile app. I would prefer they have some sort of API in place.

As far as PHP-based sites go, what are the best options when it comes to implementing an API for this purpose?

Thanks, gb

You want to look into RESTful web services. Have a look at the wiki for this here. Your clients essentially need to build PHP applications that serve the underlying data of their websites in a standard data format, e.g. JSON or XML (SOAP is a separate, non-REST protocol built on XML). PHP has a number of built-in functions that quickly convert PHP data structures into these formats. This would enable you to build mobile apps that make HTTP requests to get the data, which each app can then display in its own way.

An example for a JSON powered service could be as follows:

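// Read the requested action; fall back to an empty string if none was given.
$action = isset($_GET['action']) ? $_GET['action'] : '';
switch ($action) {
  case 'get-newest-products':
     echo json_encode(getNewestProducts());
     break;
  case 'get-best-products':
     echo json_encode(getBestProducts());
     break;
  // ... further actions ...
  default:
     echo json_encode(array());
     break;
}

// Note: the mysql_* functions belong to the legacy mysql extension (removed in
// PHP 7) and assume a connection has already been opened.
function getNewestProducts($limit = 10) {
  $limit = (int) $limit; // keep the LIMIT value numeric
  $rs = mysql_query("SELECT * FROM products ORDER BY created DESC LIMIT $limit");

  $products = array();
  if ($rs && mysql_num_rows($rs) > 0) {
    while ($obj = mysql_fetch_object($rs)) {
      $products[] = $obj;
    }
  }
  return $products;
}

function getBestProducts($limit = 10) {
  $limit = (int) $limit; // keep the LIMIT value numeric
  $rs = mysql_query("SELECT * FROM products ORDER BY likes DESC LIMIT $limit");

  $products = array();
  if ($rs && mysql_num_rows($rs) > 0) {
    while ($obj = mysql_fetch_object($rs)) {
      $products[] = $obj;
    }
  }
  return $products;
}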

You could query the API as follows (with mod_rewrite on, so that the path is mapped to the action parameter): http://myapi.mywebsite.com/get-newest-products
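To see the shape of the JSON such a service would return without setting up a database, here is a small self-contained sketch. The sample products, the handle_action function, and its sort-key table are all hypothetical stand-ins for the queries above:

```php
<?php
// Hypothetical in-memory stand-in for the products table.
$products = array(
    array('name' => 'Lamp',  'created' => '2011-07-01', 'likes' => 12),
    array('name' => 'Chair', 'created' => '2011-07-15', 'likes' => 40),
    array('name' => 'Desk',  'created' => '2011-06-20', 'likes' => 7),
);

// Returns the JSON body for an action; unknown actions yield an empty list.
function handle_action(string $action, array $products, int $limit = 10): string
{
    // Each supported action sorts by one column, highest value first.
    $sortKeys = array(
        'get-newest-products' => 'created',
        'get-best-products'   => 'likes',
    );
    if (!isset($sortKeys[$action])) {
        return json_encode(array());
    }
    $key = $sortKeys[$action];
    usort($products, function ($a, $b) use ($key) {
        return $b[$key] <=> $a[$key]; // descending order
    });
    return json_encode(array_slice($products, 0, $limit));
}

echo handle_action('get-best-products', $products, 2), "\n";
```

A mobile client would then decode the response body with whatever JSON parser its platform provides.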

php sql injection

7 votes

I have been reading lately and learned about SQL injection attacks. I tried to carry one out on my local machine to see how it is done, so that I can prevent it in my own system.

I have written code like this:

PHP Code :

if(count($_POST) > 0){
    $con = mysql_connect("localhost","root","") or die(mysql_error());
    mysql_select_db('acelera',$con) or die(mysql_error());
    echo $sql = 'SELECT * FROM acl_user WHERE user_email = "'.$_POST['email'].'" AND user_password = "'.$_POST['pass'].'"';
    $res_src = mysql_query($sql);
    while($row = mysql_fetch_array($res_src)){
        echo "<pre>";print_r($row);echo "</pre>";
    }
}

HTML CODE :

<html>
<head></head>
<body>
<form method="post" action="">
    EMAIL : <input type="text" name="email" id="email" /><br />
    PASWD : <input type="text" name="pass" id="pass" /><br />
    <input type="submit" name="btn_submit" value="submit email pass" />
</form>
</body>
</html>

With this code, if I give " OR ""=" as input, the SQL injection should succeed, but it is not working properly. When I enter the above input in the password field, the POST data contains additional slashes.

Can anyone show me how an SQL injection attack can actually be done? (Code would be much appreciated.)

You probably have magic quotes enabled. Check the return value of get_magic_quotes_gpc().

"Magic quotes" is an antiquated attempt by PHP to auto-magically prevent SQL injection. It has been deprecated (as of PHP 5.3, and removed entirely in PHP 5.4), and you are encouraged to use prepared statements to avoid SQL injection.

See here how to disable them so you can experiment with SQL injection.
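The effect can be seen without a database by building the query string the same way the question does. This is only a sketch: build_login_query and find_user are hypothetical names, addslashes() stands in for what magic quotes did to incoming data, and $pdo is assumed to be an already-connected PDO handle:

```php
<?php
// Builds the query exactly as the question's code does (deliberately unsafe).
function build_login_query(string $email, string $pass): string
{
    return 'SELECT * FROM acl_user WHERE user_email = "' . $email
         . '" AND user_password = "' . $pass . '"';
}

$payload = '" OR ""="';

// Without magic quotes the payload closes the string literal, and the
// WHERE clause ends in: user_password = "" OR ""=""  (always true).
echo build_login_query('a@example.com', $payload), "\n";

// Magic quotes escaped incoming data like addslashes() does, so the quotes
// arrive escaped and stay inside the string literal.
echo build_login_query('a@example.com', addslashes($payload)), "\n";

// Prepared-statement version: the values are sent separately from the SQL text.
function find_user(PDO $pdo, string $email, string $pass): array
{
    $stmt = $pdo->prepare(
        'SELECT * FROM acl_user WHERE user_email = ? AND user_password = ?'
    );
    $stmt->execute(array($email, $pass));
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

Comparing the two printed queries shows why the attack appeared to fail: with the escaped input, the whole payload is treated as the password value.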