Best php questions in October 2011

What is normalized UTF-8 all about?

39 votes

The ICU project (which also now has a PHP library) contains the classes needed to help normalize UTF-8 strings to make it easier to compare values when searching.

However, I'm trying to figure out what this means for applications. For example, in which cases do I want "Canonical Equivalence" instead of "Compatibility Equivalence", or vice versa?

Everything You Never Wanted to Know about Unicode Normalization

Canonical Normalization

Unicode includes multiple ways to encode some characters, most notably accented characters. Canonical normalization changes the code points into a canonical encoding form. The resulting code points should appear identical to the original ones barring any bugs in the fonts or rendering engine.

When To Use

Because the results appear identical, it is always safe to apply canonical normalization to a string before storing or displaying it, as long as you can tolerate the result not being bit for bit identical to the input.

Canonical normalization comes in 2 forms: NFD and NFC. The two are equivalent in the sense that one can convert between these two forms without loss. Comparing two strings under NFC will always give the same result as comparing them under NFD.

NFD

NFD has the characters fully expanded out. This is the faster normalization form to calculate, but it results in more code points (i.e. uses more space).

If you just want to compare two strings that are not already normalized, this is the preferred normalization form unless you know you need compatibility normalization.

NFC

NFC recombines code points when possible after running the NFD algorithm. This takes a little longer, but results in shorter strings.
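As a minimal sketch of the two canonical forms, the snippet below uses the Normalizer class from PHP's intl extension (it assumes that extension is loaded). The variable names are illustrative only; the strings are written as escaped UTF-8 bytes so the example does not depend on the file's encoding.

```php
<?php
// Two encodings of the same visible character "é".
$precomposed = "\xC3\xA9";   // U+00E9 LATIN SMALL LETTER E WITH ACUTE
$decomposed  = "e\xCC\x81";  // U+0065 followed by U+0301 COMBINING ACUTE ACCENT

// A byte-wise comparison treats them as different strings.
var_dump($precomposed === $decomposed);

// After canonical normalization both inputs yield the same bytes.
$nfcA = Normalizer::normalize($precomposed, Normalizer::FORM_C);
$nfcB = Normalizer::normalize($decomposed, Normalizer::FORM_C);
$nfdA = Normalizer::normalize($precomposed, Normalizer::FORM_D);
$nfdB = Normalizer::normalize($decomposed, Normalizer::FORM_D);

var_dump($nfcA === $nfcB);   // equal under NFC
var_dump($nfdA === $nfdB);   // equal under NFD

// NFC is the shorter form: 2 bytes versus 3 bytes for NFD here.
var_dump(strlen($nfcA), strlen($nfdA));
```

Whichever form you pick, apply the same one to both sides of a comparison.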

Compatibility Normalization

Unicode also includes many characters that really do not belong, but were used in legacy character sets. Unicode added these to allow text in those character sets to be processed as Unicode, and then be converted back without loss.

Compatibility normalization converts these to the corresponding sequence of "real" characters, and also performs canonical normalization. The results of compatibility normalization may not appear identical to the originals.

Characters that include formatting information are replaced with ones that do not. For example the character ⁹ gets converted to 9. Others don't involve formatting differences. For example the roman numeral character Ⅸ is converted to the regular letters IX.

Obviously, once this transformation has been performed, it is no longer possible to losslessly convert back to the original character set.

When to use

The Unicode Consortium suggests thinking of compatibility normalization like a ToUpperCase transform. It is something that may be useful in some circumstances, but you should not just apply it willy-nilly.

An excellent use case would be a search engine, since you would probably want a search for 9 to match ⁹.

One thing you should probably not do is display the result of applying compatibility normalization to the user.

NFKC/NFKD

Compatibility normalization comes in two forms: NFKD and NFKC. They have the same relationship as NFD and NFC.

Any string in NFKC is inherently also in NFC, and the same for the NFKD and NFD. Thus NFKD(x)=NFD(NFKC(x)), and NFKC(x)=NFC(NFKD(x)), etc.
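The difference between canonical and compatibility normalization can be sketched with the same intl Normalizer class (again assuming the intl extension is available). The helper name searchKey is hypothetical; it only illustrates the search-engine use case mentioned above.

```php
<?php
$superscriptNine = "\xE2\x81\xB9";  // U+2079 SUPERSCRIPT NINE
$romanNine       = "\xE2\x85\xA8";  // U+2168 ROMAN NUMERAL NINE

// Canonical normalization leaves compatibility characters untouched.
$canonical = Normalizer::normalize($superscriptNine, Normalizer::FORM_C);

// Compatibility normalization replaces them with plain characters.
$plainNine = Normalizer::normalize($superscriptNine, Normalizer::FORM_KC);
$plainIX   = Normalizer::normalize($romanNine, Normalizer::FORM_KC);

// Hypothetical helper: build a key for matching, never for display.
function searchKey($text)
{
    return Normalizer::normalize($text, Normalizer::FORM_KC);
}

var_dump($canonical === $superscriptNine);
var_dump($plainNine, $plainIX);
var_dump(searchKey('9') === searchKey($superscriptNine));
```

Keeping the original string for display and storing the NFKC key separately avoids showing the lossy form to users.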

Conclusion

If in doubt, go with canonical normalization. Choose NFC or NFD based on the space/speed tradeoff applicable, or based on what is required by something you are interoperating with.

PHP 5.1 Online Codepad

20 votes

I'm looking for a PHP codepad like codepad.org or ideone.com with support for PHP 5.1.

From a question/answer here on SO some months ago I remember that such a thing exists (existed?), but I was not able to find the info again and my internet search did not turn up anything.

Does somebody remember the name/address of a PHP 5.1 codepad?

(Codepad: Execute a PHP snippet and share it online on a webserver)

Although I like the PHP codepad listing on your blog a lot, I am really sorry to say that there is currently no codepad out there for PHP 5.1. Someone has to tell you the truth.

Create an A* search with PHP

15 votes

I have a map stored as a multidimensional array ($map[row][col]) and I'd like to create a path from point A to point B.

Since there can be obstacles with turns, corners and so on, I'd like to use A* search to calculate the fastest path.
So the general function is
f(x) = g(x) + h(x)

and I have all of these values. g(x) is the cost of the move (it's saved on the map); h(x) is the linear distance between A and B.

So I have everything I need, but I have a question: how can I organize everything?
I have no need to test for alternative paths, since a square on the map is either passable or not, so when I reach the target it should be by the shortest path.

I tried with a multidimensional array, but I get lost.. :(

EDIT
I worked out some code; it's pretty much a wall of text :)

//$start = array(28, 19), $end = array(14, 19)
//$this->map->map is a multidimensional array, everything has a cost of 1, except for 
//blocking squares that cost 99
//$this->map->map == $this->radar
//blocking square at 23-17, 22-18, 22-19, 22-20, 23-21, 19-17, 20-18,20-19,20-20,19-21
//they are like 2 specular mustache :P
function createPath($start, $end)
{
    $found = false;
    $temp  = $this->cost($start, $end);

    foreach($temp as $t){
        if($t['cost'] == $this->map->map[$end[0]][$end[1]]) $found = true;
        $this->costStack[$t['cost']][] = array('grid' => $t['grid'], 'dir' => $t['dir']);
    }

    ksort($this->costStack);

    if(!$found) {
        foreach($this->costStack as $k => $stack){
            foreach($stack as $kn => $node){
                $curNode = $node['grid'];
                unset($this->costStack[$k][$kn]);
                break;
            }

            if(!count($this->costStack[$k])) unset($this->costStack[$k]);
            break;
        }
        $this->createPath($curNode, $end);
    }
}

function cost($current, $target)
{
    $return = array();

    //$AIM  = array('n' => array(-1,  0),'e' => array( 0,  1),'s' => array( 1,  0),'w' => array( 0, -1));
    foreach($this->AIM as $direction => $offset){
        $position[0] = $current[0] + $offset[0];
        $position[1] = $current[1] + $offset[1];

        //radar is a copy of the map
        if ( $this->radar[$position[0]][$position[1]] == 'V') continue;
        else $this->radar[$position[0]][$position[1]] =  'V';

        $h = (int) $this->distance($position, $target);
        $g = $this->map->map[$position[0]][$position[1]];

        $return[] = array('grid' => $position,
                          'dir'  => $direction,
                          'cost' => $h + $g);
    }

    return $return;
}

I hope you can understand everything; I tried to be as clear as possible.
I can finally get to my destination, expanding only the cheaper nodes, but now I have a problem.
How can I turn it into directions? I have to store a stack of orders (i.e. n, n, e and so on), so how can I identify a path inside these values?

My structure was:

  • Have a Grid class for holding all possible nodes (probably your array goes here)
  • Have a Node class representing the nodes. Nodes will also calculate costs and store the predecessor/g-values set by AStar
  • Have an AStar class, which will only get two nodes (e.g. startNode, endNode)
  • Have a PriorityQueue as your open list
  • When a Node is asked (by AStar) about its neighbors, delegate that call to Grid

I'll try to collect some code samples from a prior project, could take a while though.


Update

(found my old project ;))

It's probably not exactly what you're looking for, but maybe it's a start.

So using the files below, and mazes defined like:

00000000000000000000000
00000000000000000000000
0000000000W000000000000
0000000000W000000000000
0000000000W000000000000
0000000000W00000WWWWWWW
0000000000W000000000000
S000000000W00000000000E

(test/maze.txt)

You'll get something like this:

00000000000000000000000
0000000000X000000000000
000000000XWXX0000000000
00000000X0W00X000000000
000000XX00W000X00000000
00000X0000W0000XWWWWWWW
0000X00000W00000XXX0000
SXXX000000W00000000XXXE

index.php

error_reporting(E_ALL ^ E_STRICT);
ini_set('display_errors', 'on');

header('Content-Type: text/plain; charset="utf-8"');

// simple autoloader (spl_autoload_register replaces __autoload, which PHP 8 removed)
spl_autoload_register(function ($className) {
  $path = '/lib/' . str_replace('_', '/', $className) . '.php';
  foreach (explode(PATH_SEPARATOR, get_include_path()) as $prefix) {
    if (file_exists($prefix . $path)) {
      require_once $prefix . $path;
      return;
    }
  }
});

// init maze
$maze = new Maze_Reader('test/maze.txt');

$startNode = $maze->getByValue('S', true);
$endNode = $maze->getByValue('E', true);

$astar = new AStar;
if ($astar->getPath($startNode, $endNode)) {
  do {
    if (!in_array($endNode->value(), array('S', 'E'))) {
      $endNode->value('X');
    }
  } while ($endNode = $endNode->predecessor());
}

echo $maze;

/lib/AStar.php

/**
 * A simple AStar implementation
 */
class AStar
{
  protected $openList;
  protected $closedList;

  /**
   * Constructs the astar object
   */
  public function __construct() {
    $this->openList = new PriorityQueue;
    $this->closedList = new SplObjectStorage;
  }

  public function getPath($startNode, $endNode) {
    $this->openList->insert(0, $startNode);
    while (!$this->openList->isEmpty()) {
      $currentNode = $this->openList->extract();

      if ($currentNode->equals($endNode)) {
        return $currentNode;
      }

      $this->expandNode($currentNode, $endNode);
      $this->closedList[$currentNode] = true;
    }

    return false;
  }

  protected function expandNode($currentNode, $endNode) {
    foreach ($currentNode->successors() as $successor) {
      if (isset($this->closedList[$successor])) {
        continue;
      }

      $tentative_g = $currentNode->g() + $currentNode->distance($successor);

      if ($this->openList->indexOf($successor) !== false && $tentative_g >= $successor->g()) {
        continue;
      }

      $successor->predecessor($currentNode);
      $successor->g($tentative_g);

      $f = $tentative_g + $successor->distance($endNode);

      if ($this->openList->indexOf($successor) !== false) {
        // changeKey() expects the new key first, then the value
        $this->openList->changeKey($f, $successor);
        continue;
      }

      $this->openList->insert($f, $successor);
    }
  }
}

/lib/PriorityQueue.php

class PriorityQueue
{
  protected $keys = array();
  protected $values = array();

  /**
   * Helper function to swap two <key>/<value> pairs
   * 
   * @param Integer a
   * @param Integer b
   * @return Integer b
   */
  protected function swap($a, $b) {
    // swap keys
    $c = $this->keys[$a];
    $this->keys[$a] = $this->keys[$b];
    $this->keys[$b] = $c;

    // swap values
    $c = $this->values[$a];
    $this->values[$a] = $this->values[$b];
    $this->values[$b] = $c;

    return $b;
  }


  /**
   * Heapify up
   * 
   * @param Integer pos
   * @return void
   */
  protected function upHeap($pos) {
    while ($pos > 0) {
      $parent = ($pos - 1) >> 1; // binary heap: children sit at 2 * $pos + 1 and 2 * $pos + 2
      if ($this->compare($this->keys[$pos], $this->keys[$parent]) >= 0) {
        break;
      }

      $pos = $this->swap($pos, $parent);
    }
  }

  /**
   * Heapify down
   * 
   * @param Integer pos
   * @return void
   */
  protected function downHeap($pos) {
    $len = sizeof($this->keys);
    $max = ($len - 1) / 2;

    while ($pos < $max) {
      $child = 2 * $pos + 1;
      if ($child < $len - 1 && $this->compare($this->keys[$child], $this->keys[$child + 1]) > 0) {
        $child += 1;
      }
      if ($this->compare($this->keys[$pos], $this->keys[$child]) <= 0) {
        break;
      }
      $pos = $this->swap($pos, $child);
    }
  }

  /**
   * Insert an <key>/<value> pair into the queue
   * 
   * @param Object key
   * @param Object value
   * @return this
   */
  public function insert($key, $value) {
    $this->keys[] = $key;
    $this->values[] = $value;

    $this->upHeap(sizeof($this->keys) - 1);
    return $this;
  }

  /**
   * Extract the top <value>
   * 
   * @return Object
   */
  public function extract() {
    $resultValue = $this->values[0];
    $lastValue = array_pop($this->values);
    $lastKey = array_pop($this->keys);

    if (sizeof($this->keys) > 0) {
      $this->values[0] = $lastValue;
      $this->keys[0] = $lastKey;
      $this->downHeap(0);
    }

    return $resultValue;
  }

  /**
   * Changes the <key> of a <value>
   * 
   * @param Object key
   * @param Object value
   * @return this
   */
  public function changeKey($key, $value) {
    $pos = $this->indexOf($value);
    if ($pos !== false) {
      $this->keys[$pos] = $key;
      $this->upHeap($pos);
    }
    return $this;
  }


  /**
   * Returns the index of <value> or false if <value> is not in the queue
   * 
   * @return false|Int
   */
  public function indexOf($value) {
    return array_search($value, $this->values, true);
  }

  /**
   * Used to compare two <key>s.
   * 
   * @param Object a
   * @param Object b
   * @return Number
   */
  protected function compare($a, $b) {
    return $a - $b;
  }


  /**
   * Returns true if the queue is empty
   * 
   * @return Boolean
   */
  public function isEmpty() {
    return sizeof($this->keys) === 0;
  }
}

/lib/Maze/Reader.php

class Maze_Reader implements IteratorAggregate
{
  /**
   * The initial maze
   * @var string
   */
  protected $rawMaze;


  /**
   * A two-dimensional array holding the parsed maze
   * @var array
   */
  protected $map = array();


  /**
   * A flat array holding all maze nodes
   * @var array
   */
  protected $nodes = array();


  /**
   * A value map for easier access
   * @var array
   */
  protected $valueMap = array();


  /**
   * Constructs a maze reader
   * 
   * @param string $file A path to a maze file
   */
  public function __construct($file) {
    $this->rawMaze = file_get_contents($file);
    $this->parseMaze($this->rawMaze);
  }


  /**
   * Parses the raw maze into usable Maze_Nodes
   * 
   * @param string $maze
   */
  protected function parseMaze($maze) {
    foreach (explode("\n", $maze) as $y => $row) {
      foreach (str_split(trim($row)) as $x => $cellValue) {
        if (!isset($this->map[$x])) {
          $this->map[$x] = array();
        }

        if (!isset($this->valueMap[$cellValue])) {
          $this->valueMap[$cellValue] = array();
        }

        $this->nodes[] = new Maze_Node($x, $y, $cellValue, $this);
        $this->map[$x][$y] =& $this->nodes[sizeof($this->nodes) - 1];
        $this->valueMap[$cellValue][] =& $this->nodes[sizeof($this->nodes) - 1];
      }
    }
  }

  /**
   * Returns the neighbors of $node
   * 
   * @return array
   */
  public function getNeighbors(Maze_Node $node) {
    $result = array();

    $top = $node->y() - 1;
    $right = $node->x() + 1;
    $bottom = $node->y() + 1;
    $left = $node->x() - 1;


    // top left
    if (isset($this->map[$left], $this->map[$left][$top])) {
      $result[] = $this->map[$left][$top];
    }

    // top center
    if (isset($this->map[$node->x()], $this->map[$node->x()][$top])) {
      $result[] = $this->map[$node->x()][$top];
    }

    // top right
    if (isset($this->map[$right], $this->map[$right][$top])) {
      $result[] = $this->map[$right][$top];
    }

    // right
    if (isset($this->map[$right], $this->map[$right][$node->y()])) {
      $result[] = $this->map[$right][$node->y()];
    }

    // bottom right
    if (isset($this->map[$right], $this->map[$right][$bottom])) {
      $result[] = $this->map[$right][$bottom];
    }

    // bottom center
    if (isset($this->map[$node->x()], $this->map[$node->x()][$bottom])) {
      $result[] = $this->map[$node->x()][$bottom];
    }

    // bottom left
    if (isset($this->map[$left], $this->map[$left][$bottom])) {
      $result[] = $this->map[$left][$bottom];
    }

    // left
    if (isset($this->map[$left], $this->map[$left][$node->y()])) {
      $result[] = $this->map[$left][$node->y()];
    }

    return $result;
  }


  /**
   * @IteratorAggregate
   */
  public function getIterator() {
    return new ArrayIterator($this->nodes);
  }


  /**
   * Returns a node by value
   * 
   * @param mixed $value
   * @param boolean $returnOne
   * @param mixed $fallback
   * @return mixed
   */
  public function getByValue($value, $returnOne = false, $fallback = array()) {
    $result = isset($this->valueMap[$value]) ? $this->valueMap[$value] : $fallback;
    if ($returnOne && is_array($result)) {
      $result = array_shift($result);
    }

    return $result;
  }


  /**
   * Simple output 
   */
  public function __toString() {
    $result = array();

    foreach ($this->map as $x => $col) {
      foreach ($col as $y => $node) {
        $result[$y][$x] = (string)$node;
      }
    }

    return implode("\n", array_map('implode', $result));
  }
}

/lib/Maze/Node.php

class Maze_Node
{
  protected $x;
  protected $y;
  protected $value;
  protected $maze;

  protected $g;
  protected $predecessor;

  /**
   * @param Integer $x
   * @param Integer $y
   * @param mixed $value
   * @param Maze_Reader $maze
   */
  public function __construct($x, $y, $value, $maze) {
    $this->x = $x;
    $this->y = $y;
    $this->value = $value;
    $this->maze = $maze;
  }


  /**
   * Getter for x
   * 
   * @return Integer
   */
  public function x() {
    return $this->x;
  }


  /**
   * Getter for y
   * 
   * @return Integer
   */
  public function y() {
    return $this->y;
  }


  /**
   * Setter/Getter for g
   * 
   * @param mixed $g
   * @return mixed
   */
  public function g($g = null) {
    if ($g !== null) {
      $this->g = $g;
    }

    return $this->g;
  }


  /**
   * Setter/Getter for value
   * 
   * @param mixed $value
   * @return mixed
   */
  public function value($value = null) {
    if ($value !== null) {
      $this->value = $value;
    }

    return $this->value;
  }


  /**
   * Setter/Getter for predecessor
   * 
   * @param Maze_Node $predecessor
   * @return Maze_Node|null
   */
  public function predecessor(Maze_Node $predecessor = null) {
    if ($predecessor !== null) {
      $this->predecessor = $predecessor;
    }

    return $this->predecessor;
  }


  /**
   * simple distance getter
   * 
   * @param Maze_Node $that
   * @return Float
   */
  public function distance(Maze_Node $that) {
    if ($that->value() === 'W') {
      return PHP_INT_MAX;
    }

    return sqrt(pow($that->x() - $this->x, 2) + pow($that->y() - $this->y, 2));
  }


  /**
   * Test for equality
   * 
   * @param Maze_Node $that
   * @return boolean
   */
  public function equals(Maze_Node $that) {
    // identity check; a loose == would recurse through the maze <-> node references
    return $this === $that;
  }


  /**
   * Returns the successors of this node
   * 
   * @return array
   */
  public function successors() {
    return $this->maze->getNeighbors($this);
  }


  /**
   * For debugging
   * 
   * @return string
   */
  public function __toString() {
    return (string)$this->value;
  }
}
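The question also asked how to turn the result into a stack of orders (n, n, e and so on). One way, sketched below, is to remember for every cell which cell it was reached from and in which direction, then walk those links back from the target. This is a compact stand-alone illustration working directly on a $map[row][col] cost array, not a drop-in for the classes above. The function name findDirections, the $blocked threshold of 99 (taken from the question's comments) and the Manhattan heuristic for 4-way movement are all assumptions. It uses SplPriorityQueue from the SPL, which returns the highest priority first, so priorities are negated.

```php
<?php
function findDirections(array $map, array $start, array $end, $blocked = 99)
{
    $aim = array('n' => array(-1, 0), 'e' => array(0, 1),
                 's' => array(1, 0),  'w' => array(0, -1));

    $open = new SplPriorityQueue();      // max-heap, hence the negated f below
    $open->insert($start, 0);

    $startKey = $start[0] . ',' . $start[1];
    $g      = array($startKey => 0);     // cheapest known cost per cell
    $from   = array();                   // cell key => array(previous key, direction)
    $closed = array();

    while (!$open->isEmpty()) {
        $cur    = $open->extract();
        $curKey = $cur[0] . ',' . $cur[1];
        if (isset($closed[$curKey])) {
            continue;                    // stale entry for an already expanded cell
        }
        $closed[$curKey] = true;

        if ($cur == $end) {
            // Walk the predecessor links back to the start.
            $dirs = array();
            while ($curKey !== $startKey) {
                $step = $from[$curKey];
                array_unshift($dirs, $step[1]);
                $curKey = $step[0];
            }
            return $dirs;
        }

        foreach ($aim as $dir => $off) {
            $r = $cur[0] + $off[0];
            $c = $cur[1] + $off[1];
            if (!isset($map[$r][$c]) || $map[$r][$c] >= $blocked) {
                continue;                // off the map or impassable
            }
            $key       = $r . ',' . $c;
            $tentative = $g[$curKey] + $map[$r][$c];
            if (isset($g[$key]) && $tentative >= $g[$key]) {
                continue;                // a path at least as cheap is already known
            }
            $g[$key]    = $tentative;
            $from[$key] = array($curKey, $dir);
            $h = abs($end[0] - $r) + abs($end[1] - $c);
            $open->insert(array($r, $c), -($tentative + $h));
        }
    }

    return null;                         // target not reachable
}

$map = array(
    array(1,  1,  1),
    array(99, 99, 1),
    array(1,  1,  1),
);
echo implode(',', findDirections($map, array(0, 0), array(2, 0))), "\n"; // e,e,s,s,w,w
```

Instead of updating keys inside the queue, this sketch simply inserts a cell again when a cheaper route is found and skips the outdated entry later, which keeps the code short at the cost of a slightly larger queue.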

Is it safe to assume decoded percent-encoded URIs turn into UTF-8?

11 votes

RFC 3986 states that new URI schemes should encode characters as UTF-8 before percent-encoding them. However, this does not apply to previously defined URI schemes.

Is it safe to assume that every multibyte, percent-encoded URI turns into a UTF-8 encoded string after being passed through urldecode()?

For example, if the content of $_SERVER['REQUEST_URI'] is percent-encoded like this:

/b%C3%BCch/w%C3%B6rterb%C3%BCch

After I pass this string to urldecode(), I should have a multibyte string. But how do I know in what encoding the string is? In the above example, it's UTF-8, but is it safe to always assume so?

If it's not safe to assume so, is there a way (other than mb_detect_encoding) to detect the encoding of the string? I've checked request headers, they don't seem to have anything helpful.

Thank you for all the comments and answers! I have done some digging myself after I posted the question and would like to write it down here as a reference. Please let me know if this answer is wrong.

Skip to the end to go directly to the conclusion.

From the JETTY Docs on International Characters and Character Encoding, from the section "International characters in URLs", I found these paragraphs:

Due to the lack of a standard, different browsers took different approaches to the character encoding used. Some use the encoding of the page and some use UTF-8. Some drafts were prepared by various standards bodies suggesting that UTF-8 would become the standard encoding. Older versions of jetty (eg 4.0.x series) used UTF-8 as the default in anticipation of a standard being adopted. As a standard was not forthcoming, jetty-4.1.x reverted to a default encoding of ISO-8859-1.

The W3C organization's HTML standard now recommends the use of UTF-8: http://www.w3.org/TR/html40/appendix/notes.html#non-ascii-chars and accordingly jetty-6 series uses a default of UTF-8.

On the linked HTML 4.0 spec, there is indeed a recommendation for clients to encode non-ASCII characters into UTF-8 first before percent-encoding it, so we know it has been a recommendation from W3C since HTML 4.0.

The example used on the page is this:

<A href="http://foo.org/Håkon">...</A>

While it later states that the same encoding should be applied to the fragment part, it doesn't say whether it also applies to the query string.

Typing URLs into browsers

Firefox

As Pekka already mentioned, based on this link Firefox was still sending ISO-8859-1 encoded URIs as late as 2007. Reading the link, this seems to be the default behavior for Firefox < 3.0. I'm not sure if this also applies to Firefox < 3.0 on Mac OS X, since the default encoding on Mac is UTF-8.

I've tested Firefox 3.6.13 on Windows XP and Firefox 6 on both Windows 7 and Mac OS X. The Mac version sends everything in UTF-8, so it's nothing to worry about.

Firefox 3.6.13 and 6 on Windows encode query strings as ISO-8859-1 by default, but when you type characters that don't exist in ISO-8859-1 into the query string (α, for example), Firefox 3 switches the encoding of the entire query string to UTF-8. I'm pretty sure later versions behave the same way.

In the Firefox 3.6.13 and 6 builds on Windows that I tested, the path part of the URI is always encoded as UTF-8.

If you type this URL into Firefox 3.6/6 on Windows:

http://localhost/test/ü/ä/index.php?chär=ü

The query string gets encoded as ISO-8859-1, but the 'path' part gets encoded as UTF-8:

http://localhost/test/%C3%BC/%C3%A4/index.php?ch%E4r=%FC

Also worth noting: according to this blog post, Firefox 3.0 converts the katakana character ア into &#12450; before percent-encoding it. When I tried this in Firefox 3.6.13, in both the query string and the path, the katakana character was encoded in UTF-8 correctly.

Opera

Opera 10.10 on Mac encodes the query string part of the URI into ISO-8859-1, even though the default encoding for Mac OS X is UTF-8. The 'path' part gets encoded into UTF-8, just like Firefox.

If you try to type the Greek letter α into the query string, it gets sent as a question mark.

The same behavior is exhibited by Opera 11.51 in Windows XP.

Safari

Safari 5.1 on Mac always sends everything as UTF-8. Safari 5.1 on Windows exhibits the same behavior.

Chrome

Version 13 on Windows encodes both query string and path as UTF-8. I don't have Chrome on Mac, but it seems safe to assume that Chrome always sends UTF-8, like Safari.

Internet Explorer

DISCLAIMER: I use IECollection to install multiple versions of IE on one machine, so this may not be IE's natural behavior (can anyone confirm this?).

IE 6, 7, and 8 on Windows XP encode the 'path' part of the URI as UTF-8 correctly. Umlauts and Greek letters typed into the query string do not get percent-encoded, though. The query string typed into the address bar seems to be sent in ISO-8859-1, and the Greek letter alpha 'α' in the query string gets transliterated into 'a'.

Conclusion

This is short and incomplete, and I cannot guarantee its correctness, but it seems that the most common encodings for URIs are ISO-8859-1 and UTF-8 (I have no idea what encodings are common in East Asia, and it would be too exhausting for me to try and find out).

Since it has been a recommendation since HTML 4.0, I guess it's safe to assume the 'path' part of the URI is always encoded in UTF-8. Firefox 2.0 might still be around, so you must check if the encoding is ISO-8859-1 too. If it's not UTF-8 or ISO-8859-1, most likely it's a bad request.

It's theoretically impossible to correctly detect the encoding of a string (see here, and here). You can guess, but you can get the wrong result. So don't rely on encoding detection.

Safe Multibyte Routing

The safest way is just to choose one encoding (UTF-8 is the safest bet) for your entire application. Then you have to:

  1. Make sure that all your strings are encoded in UTF-8 before using them to build your URI. Properly percent-encode your URI after that.
  2. Make sure all your URL-encoded (GET) forms send their data in the proper encoding. See this FAQ by Kore Nordmann for more information about making sure your forms send the correct encoding.

Also see this great answer from bobince.

After this, you shouldn't have any problems parsing the URI. If the decoded URI is not valid UTF-8, then it's a bad request, and you can respond with a 404 or 400 page.
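A minimal sketch of that last check, assuming the application has settled on UTF-8: decode the URI, then reject anything that is not valid UTF-8. The function name decodeUtf8Uri is hypothetical. It relies on the PCRE /u modifier, which makes preg_match() fail on subjects that are not valid UTF-8, so no extra extension is needed.

```php
<?php
/**
 * Percent-decodes a URI and returns it only if the result is valid UTF-8.
 * Returns null otherwise, so the caller can answer with 400 or 404.
 */
function decodeUtf8Uri($rawUri)
{
    // rawurldecode() does not turn "+" into a space, unlike urldecode().
    $decoded = rawurldecode($rawUri);

    // An empty pattern with /u matches any valid UTF-8 string and
    // returns false when the subject contains invalid byte sequences.
    if (preg_match('//u', $decoded) !== 1) {
        return null;
    }

    return $decoded;
}

var_dump(decodeUtf8Uri('/b%C3%BCch/w%C3%B6rterb%C3%BCch')); // valid UTF-8 path
var_dump(decodeUtf8Uri('/index.php?ch%E4r=%FC'));            // ISO-8859-1 bytes: NULL
```

This validates rather than detects: a string that passes is well-formed UTF-8, which is all the routing code needs to know once the application only ever generates UTF-8 URIs.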

PHP Garbage Collection clarification

11 votes

From the PHP manual, session.gc_probability and session.gc_divisor state that gc will occur based on this probability. I get that.

What I'm not clear on is whether this probability is on a session by session basis or overall.

So if my probability is 1% (1/100) that GC will occur, does that mean that if one session keeps getting extended, each time there is a 1% chance that specific session will be cleaned up? Or does this mean that 1% of all existing sessions (as well as new ones) will trigger GC for all other existing sessions?

I'm pretty sure it's the latter, I just want to make sure.

The purpose of this question is that on our site, I want users to have long-term sessions (6 months). If 1% of all sessions trigger GC, then that effectively removes the purpose of having that long-term session, as GC will end up occurring every hour or two.

Every time a PHP script executes and starts a session, there is a probability that it will sweep through the session folder, killing off old sessions.

Cleanup will only delete sessions that have not been accessed within a certain time (session.gc_maxlifetime). However, PHP does not guarantee that a session WILL be destroyed within that time.

Your long-term session strategy should work just fine, but you might want to reduce 1% to something like 0.1%.

Another thing to look out for is that the operating system might clean up your /tmp folder during a reboot, so sessions stored there can disappear even if PHP doesn't remove them.
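As a rough sketch, the settings involved could be set like this before session_start() is called. The six-month figure comes from the question; the 1-in-1000 ratio mirrors the 0.1% suggestion above. The cookie lifetime and the commented-out save path are assumptions added for illustration, and the same directives can equally go into php.ini.

```php
<?php
// Must run before session_start() and before any output is sent.
$sixMonths = 60 * 60 * 24 * 180;   // 180 days in seconds

// Sessions idle for longer than this become eligible for cleanup.
ini_set('session.gc_maxlifetime', (string) $sixMonths);

// Keep the session cookie in the browser for the same period (assumption:
// the default of 0 would drop the cookie when the browser closes).
ini_set('session.cookie_lifetime', (string) $sixMonths);

// Run the cleanup on roughly 1 in 1000 session starts (0.1%).
ini_set('session.gc_probability', '1');
ini_set('session.gc_divisor', '1000');

// Hypothetical path: a dedicated, writable directory outside /tmp keeps
// session files away from operating-system cleanup.
// ini_set('session.save_path', '/var/lib/php/myapp-sessions');
```

Note that the probability only controls how often the sweep runs; which sessions it removes is decided by session.gc_maxlifetime.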

Generate random coordinates around a location

10 votes

I'd like to have a function that accepts a geo location (Latitude, Longitude) and generates random sets of coordinates around it but also takes these parameters as a part of the calculation:

  • Number Of Random Coordinates To Make
  • Radius to generate in
  • Min distance between the random coordinates in meters
  • The root coordinates to generate the locations around it.

Example of how the generation would be:

Example

What's a good approach to achieve this?

A brute force method should be good enough.

for each point to generate "n"
  find a random angle
  get the x and y from the angle * a random radius up to max radius
  for each point already generated "p"
     calculate the distance between "n" and "p"
  if "n" satisfies the min distance
     add new point "n"

In PHP, generating a new point is easy

$angle = deg2rad(mt_rand(0, 359));
$pointRadius = mt_rand(0, $radius);
$point = array(
   'x' => sin($angle) * $pointRadius,
   'y' => cos($angle) * $pointRadius
);

Then calculating the distance between two points

$distance = sqrt(pow($n['x'] - $p['x'], 2) + pow($n['y'] - $p['y'], 2));

** Edit **

For the sake of clarifying what others have said, and after doing some further research (I'm not a mathematician, but the comments did make me wonder), here is the simplest definition of a gaussian distribution:

If you were in 1 dimension, then $pointRadius = $x * mt_rand(0, $radius); would be OK since there is no distinction between $radius and $x when $x has a gaussian distribution.

In 2 or more dimensions, however, if the coordinates ($x,$y,...) have gaussian distributions then the radius $radius does not have a gaussian distribution.

In fact the distribution of $radius^2 in 2 dimensions [or k dimensions] is what is called the "chi-squared distribution with 2 [or k] degrees of freedom", provided the ($x,$y,...) are independent and have zero means and equal variances.

Therefore, to spread the points evenly over the circle's area instead of clustering them near the center, you'd have to change the line that generates the radius to

$pointRadius = sqrt(mt_rand(0, $radius*$radius));

as others have suggested.
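Putting the pieces together, the brute-force approach might look like the sketch below. The function names, the $maxAttempts safety limit and the conversion helper are assumptions. Offsets are in meters, and the conversion to latitude/longitude uses a flat-earth approximation (about 111,320 m per degree) that is only reasonable for small radii away from the poles.

```php
<?php
/**
 * Generates up to $count random offsets (in meters) within $radius of the
 * origin, keeping every pair at least $minDistance apart.
 * May return fewer points if $maxAttempts is exhausted.
 */
function randomPointsAround($count, $radius, $minDistance, $maxAttempts = 10000)
{
    $points = array();

    for ($try = 0; $try < $maxAttempts && count($points) < $count; $try++) {
        $angle = 2 * M_PI * mt_rand() / mt_getrandmax();
        // sqrt() of a uniform value spreads points evenly over the disk's area.
        $r = $radius * sqrt(mt_rand() / mt_getrandmax());
        $candidate = array('x' => cos($angle) * $r, 'y' => sin($angle) * $r);

        $farEnough = true;
        foreach ($points as $p) {
            if (hypot($candidate['x'] - $p['x'], $candidate['y'] - $p['y']) < $minDistance) {
                $farEnough = false;
                break;
            }
        }

        if ($farEnough) {
            $points[] = $candidate;
        }
    }

    return $points;
}

/**
 * Converts a meter offset into coordinates around ($lat, $lon).
 * Approximation: one degree of latitude is taken as 111320 m.
 */
function offsetToLatLon($lat, $lon, array $offset)
{
    $metersPerDegree = 111320.0;

    return array(
        'lat' => $lat + $offset['y'] / $metersPerDegree,
        'lon' => $lon + $offset['x'] / ($metersPerDegree * cos(deg2rad($lat))),
    );
}

foreach (randomPointsAround(5, 1000, 100) as $offset) {
    $geo = offsetToLatLon(48.8566, 2.3522, $offset);   // example root coordinates
    printf("%.6f, %.6f\n", $geo['lat'], $geo['lon']);
}
```

The attempt limit matters because a large minimum distance in a small radius can make the requested number of points impossible to place.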

php foreach, why using pass by reference of a array is fast?

10 votes

Below is a test of a PHP foreach loop over a big array. I thought that if $v doesn't change, no real copy would happen because of copy on write, so why is it faster when the array is passed by reference?

Code 1:

function test1($a){
  $c = 0;
  foreach($a as $v){ if($v=='xxxxx') ++$c; }
}

function test2(&$a){
  $c = 0;
  foreach($a as $v){ if($v=='xxxxx') ++$c; }
}

$x = array_fill(0, 100000, 'xxxxx');

$begin = microtime(true);
test1($x);
$end1 = microtime(true);
test2($x);
$end2 = microtime(true);

echo $end1 - $begin . "\n";   //0.03320002555847
echo $end2 - $end1;           //0.02147388458252

But this time, using pass by reference is slow.

Code 2:

function test1($a){
  $cnt = count($a); $c = 0;
  for($i=0; $i<$cnt; ++$i)
    if($a[$i]=='xxxxx') ++$c;
}
function test2(&$a){
  $cnt = count($a); $c = 0;
  for($i=0; $i<$cnt; ++$i)
    if($a[$i]=='xxxxx') ++$c;
}
$x = array_fill(0, 100000, 'xxxxx');

$begin = microtime(true);
test1($x);
$end1 = microtime(true);
test2($x);
$end2 = microtime(true);

echo $end1 - $begin . "\n";   //0.024326801300049
echo $end2 - $end1;           //0.037616014480591

Can someone explain why passing by reference is fast in code1 but slow in code2?

Edit: With Code 2, count($a) makes the main difference; the time the loop itself takes is almost the same.

I thought that if the $v don't change [foreach($a as $v)], the real copy will not happen because of copy on write, but why it is fast when pass by reference?

The impact is not on $v but on $a, the huge array. You either pass it as value or as reference to the function. Inside the function it's then value (test1) or reference (test2).

You have two code samples (code 1 and code 2).

Code 1 uses foreach. With foreach you've got two options: iterate over a value or over a reference (Example). When you iterate over a value, the iteration is done on a copy of the value. If you iterate over a reference, no copy is made.

Because test2 uses the reference, it's faster: the values do not need to be copied. In test1, you pass the array by value, so the array gets copied.

Code 2 uses for. The for loop itself does nothing special here, in either case. You access the variable and read a value from the array. That's pretty much the same regardless of whether it's a reference or a copy (thanks to the copy-on-write optimization in PHP).

You might now wonder why there is a difference in code 2. The difference is not because of for but because of count. If you pass a reference to count, PHP internally creates a copy of it because count needs a copy, not a reference.
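Copy on write itself can be observed without timing anything, by watching memory usage. The sketch below is an illustration under assumptions: the function names are made up, and the exact byte counts depend on the PHP version, so only the difference between the two cases is meaningful. An array passed by value is shared until the function writes to it.

```php
<?php
// Reads from the by-value array; returns the bytes allocated while doing so.
function readOnly(array $a)
{
    $before = memory_get_usage();
    $first = $a[0];               // reading does not separate the array
    return memory_get_usage() - $before;
}

// Writes to the by-value array; returns the bytes allocated while doing so.
function writes(array $a)
{
    $before = memory_get_usage();
    $a[0] = 'changed';            // the first write forces a private copy
    return memory_get_usage() - $before;
}

$x = array_fill(0, 100000, 'xxxxx');

$readDelta  = readOnly($x);
$writeDelta = writes($x);

printf("read:  %d bytes\nwrite: %d bytes\n", $readDelta, $writeDelta);
var_dump($x[0]);                  // the caller's array is unchanged
```

The write case allocates room for a second copy of the array, while the read case allocates next to nothing.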

Read as well: Do not use PHP references by Johannes Schlüter


I've compiled a set of tests as well, but I put more specific code into the test functions.

  • Blank - What's the difference in calling the function?
  • Count - Does count make a difference?
  • For - What happens with for only (no count)?
  • Foreach - Just foreach - even breaking on first element.

Every test is in two versions, one called _copy (passing the array as copy into the function) and one called _ref (passing the array as reference).

These micro-benchmarks don't always tell you the truth, but if you're able to isolate specific points, you can make a reasonably educated guess, for example that not for but count had the impact:

function blank_copy($a){
}
function blank_ref(&$a){
}
function foreach_copy($a){
    foreach($a as $v) break;
}
function foreach_ref(&$a){
    foreach($a as $v) break;
}
function count_copy($a){
  $cnt = count($a);
}
function count_ref(&$a){
  $cnt = count($a);
}
function for_copy($a){
    for($i=0;$i<100000;$i++)
        $a[$i];
}
function for_ref(&$a){
    for($i=0;$i<100000;$i++)
        $a[$i];
}

$tests = array('blank_copy', 'blank_ref', 'foreach_copy', 'foreach_ref', 'count_copy', 'count_ref', 'for_copy', 'for_ref');


$x = array_fill(0, 100000, 'xxxxx');
$count = count($x);
$runs = 10;

ob_start();

for($i=0;$i<10;$i++)
{
    shuffle($tests);
    foreach($tests as $test)
    {
        $begin = microtime(true);
        for($r=0;$r<$runs;$r++)
            $test($x);
        $end = microtime(true);
        $result = $end - $begin;
        printf("* %'.-16s: %f\n", $test, $result);
    }
}

$buffer = explode("\n", ob_get_clean());
sort($buffer);
echo implode("\n", $buffer);

Output:

* blank_copy......: 0.000011
* blank_copy......: 0.000011
* blank_copy......: 0.000012
* blank_copy......: 0.000012
* blank_copy......: 0.000012
* blank_copy......: 0.000015
* blank_copy......: 0.000015
* blank_copy......: 0.000015
* blank_copy......: 0.000015
* blank_copy......: 0.000020
* blank_ref.......: 0.000012
* blank_ref.......: 0.000012
* blank_ref.......: 0.000014
* blank_ref.......: 0.000014
* blank_ref.......: 0.000014
* blank_ref.......: 0.000014
* blank_ref.......: 0.000015
* blank_ref.......: 0.000015
* blank_ref.......: 0.000015
* blank_ref.......: 0.000015
* count_copy......: 0.000020
* count_copy......: 0.000022
* count_copy......: 0.000022
* count_copy......: 0.000023
* count_copy......: 0.000024
* count_copy......: 0.000025
* count_copy......: 0.000025
* count_copy......: 0.000025
* count_copy......: 0.000026
* count_copy......: 0.000031
* count_ref.......: 0.113634
* count_ref.......: 0.114165
* count_ref.......: 0.114390
* count_ref.......: 0.114878
* count_ref.......: 0.114923
* count_ref.......: 0.115106
* count_ref.......: 0.116698
* count_ref.......: 0.118077
* count_ref.......: 0.118197
* count_ref.......: 0.123201
* for_copy........: 0.190837
* for_copy........: 0.191883
* for_copy........: 0.193080
* for_copy........: 0.194947
* for_copy........: 0.195045
* for_copy........: 0.195944
* for_copy........: 0.198314
* for_copy........: 0.198878
* for_copy........: 0.200016
* for_copy........: 0.227953
* for_ref.........: 0.191918
* for_ref.........: 0.194227
* for_ref.........: 0.195952
* for_ref.........: 0.196045
* for_ref.........: 0.197392
* for_ref.........: 0.197730
* for_ref.........: 0.201936
* for_ref.........: 0.207102
* for_ref.........: 0.208017
* for_ref.........: 0.217156
* foreach_copy....: 0.111968
* foreach_copy....: 0.113224
* foreach_copy....: 0.113574
* foreach_copy....: 0.113575
* foreach_copy....: 0.113879
* foreach_copy....: 0.113959
* foreach_copy....: 0.114194
* foreach_copy....: 0.114450
* foreach_copy....: 0.114610
* foreach_copy....: 0.118020
* foreach_ref.....: 0.000015
* foreach_ref.....: 0.000016
* foreach_ref.....: 0.000016
* foreach_ref.....: 0.000016
* foreach_ref.....: 0.000018
* foreach_ref.....: 0.000019
* foreach_ref.....: 0.000019
* foreach_ref.....: 0.000019
* foreach_ref.....: 0.000019
* foreach_ref.....: 0.000020

Is it possible to reference an anonymous function from within itself in PHP?

10 votes

I'm trying to do something like the following:

// assume $f is an arg to the wrapping function
$self = $this;
$func = function() use($f, $ctx, $self){

    $self->remove($func, $ctx); // I want $func to be a reference to this anon function

    $args = func_get_args();
    call_user_func_array($f, $args);
};

Is it possible to reference the function assigned to $func from within the same function?

Try doing

$func = function() use (/*your variables,*/ &$func) {
    var_dump($func);
    return 1;
};

http://codepad.viper-7.com/cLd3Fu
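A small runnable sketch of the same idea, using a recursive factorial purely as an illustration (it is not part of the original question). Because the variable is captured by reference, it already holds the closure by the time the closure is first called:

```php
<?php
// $factorial is captured by reference (&), so inside the body it refers to
// the closure that was assigned to the variable after the definition.
$factorial = function ($n) use (&$factorial) {
    return $n <= 1 ? 1 : $n * $factorial($n - 1);
};

echo $factorial(5), "\n"; // prints 120
```

Capturing by value (without the &) would not work here, since $factorial is still undefined at the moment the closure is created.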

jquery Autocomplete working with older versions of the browsers but not new ones?

9 votes

here is the JSON data for my auto complete

{ "list" : [ {
    "genericIndicatorId" : 100,
    "isActive" : false,
    "maxValue" : null,
    "minValue" : null,
    "modificationDate" : 1283904000000,
    "monotone" : 1,
    "name":"Abbau",
    "old_name" : "abbau_change_delete_imac",
    "position" : 2,
    "systemGraphics" : "000000",
    "unitId" : 1,
    "valueType" : 1,
    "description" : "Abbau",
    "weight" : 1
}]}

and the code which i wrote is

$("#<portlet:namespace />giName").autocomplete({
            source : function( request, response ) {
                $.post(
                    "<%=AJAXgetGIs%>",
                    {
                        "<%=Constants.INDICATOR_NAME%>" : request.term,
                        "<%=Constants.SERVICE_ID%>" : <%=serviceId%>
                    },
                    function( data ) {
                        response( $.map( data.list, function( item ) {
                                //alert(item.name + " || " + item.genericIndicatorId);
                                item.value = item.name;
                            return item;
                        }));
                    },
                    "json"
                );
            },
            minLength : 2
});

I am using the jquery-ui-1.8.14.autocomplete.min.js plugin for autocomplete. The problem is that it does not show all the matched results in new browsers. For example, if I type "an", which should match the "anzahl" keyword, Firebug shows an error like "bad control character literal in a string". Results do show for other letters such as "as", "sa", and so on. Any help would be appreciated, thank you.

The error message means you have control characters in your JSON response (something like \n, \t, etc). Newlines and the other control characters are not allowed in JSON strings, according to ECMA262 5ed. You can fix it rather easily by escaping or removing those characters, either from PHP or from Javascript.

Here you can find an example of how you can fix it from PHP, as the problem most likely comes from json_encode (which I assume you're using): http://codepad.org/Qu7uPt0E As you can see, json_encode doesn't escape the \n so you have to do it manually before outputting.

Now for the mystery related to older browsers. If you look at jQuery's parseJSON function you'll notice that it first tries to parse the string with the browser's built-in JSON object, and if it doesn't find one, it will just do a (sort of) eval (which will work even with newlines). So it probably works for you on Firefox < 3.5 or IE < 8, which don't have a native JSON object. Also, it probably works with other search terms (like as, etc) simply because they don't include a result which has control characters.
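As a rough sketch of the "removing those characters from PHP" option: the helper name stripControlChars is made up for this example, and whether dropping the characters (rather than escaping them) is acceptable depends on your data.

```php
<?php
// Hypothetical helper: drops ASCII control characters (0x00-0x1F and 0x7F)
// from a string before it is placed in the JSON response.
function stripControlChars($value)
{
    return preg_replace('/[\x00-\x1F\x7F]/', '', $value);
}

$name = "Anzahl\n\t"; // a value with stray control characters at the end
echo json_encode(array('name' => stripControlChars($name))), "\n";
// prints {"name":"Anzahl"}
```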

Consequences of not using server side validation?

Asked on Wed, 05 Oct 2011 by kht php
9 votes

What are the consequences of not validating a simple email form on the server?

Keep in mind that:

  • javascript validation is being carried out
  • there is no database in question, this is a simple email form

The PHP code I would like to use is this:

<?php
    $post_data = filter_input_array( INPUT_POST, FILTER_SANITIZE_SPECIAL_CHARS );

    $full_name = $post_data["full_name"];
    $email_address = $post_data["email_address"];
    $gender = $post_data["gender"];
    $message = $post_data["message"];

    $formcontent = "Full Name: $full_name \nEmail Address: $email_address \nGender: $gender \nMessage: $message \n";
    $formcontent = wordwrap($formcontent, 70, "\n", true);

    $recipient = "myemail@address.com"; $subject = "Contact Form"; $mailheader = "From: $email_address \r\n";

    mail($recipient, $subject, $formcontent, $mailheader);

    echo 'Thank You! - <a href="#"> Return Home</a>'; 
?>

Would a simple captcha solve the issue of security?

UPDATE:

A few questions I would really like answered: If I am not worried about invalid data being sent, what is the absolute minimum I can do to improve security. Basically avoid disasters.

I should probably mention that this code is being generated in a form generator and I would like to avoid my users getting attacked. Spamming might be sorted by adding Captcha.

UPDATE: What is the worst case scenario?

UPDATE: Really appreciate all the answers!

A couple of things I plan to do:

  • add this as Alex mentioned: filter_var($post_data['email_address'], FILTER_VALIDATE_EMAIL);

  • add simple captcha

If I did add simple server side validation, what should I validate for? Can't the user still send invalid data even if I am validating it?

Also, will the above stop spam?

In general if you are just playing around and don't care, you don't need validation at all. Having client-side validation is pointless and you will just be wasting your time. The client-side only approach will get you in trouble. You can't trust your users that much.

If you plan to actually release this or really use it in a live environment, you must have server-side validation. It is well worth the time since this is a simple form now, but it may grow to be much more than that. In addition, if you take care of your validation now, you can reuse it later with other components of your application/site. If you try thinking in terms of reusability you will save yourself countless hours of development.

There are also obvious issues such as injections and JavaScript issues, as mentioned by other users. In addition, a simple CAPTCHA does not cut it anymore. There are some nice resources regarding CAPTCHA.

Take a look at those.

Coding Horror

Decapther

So the simple answer of your questions is that you are certainly vulnerable in your current situation. I know that more development takes more time, but if you follow good development practices such as reusability and orthogonal/modular design you can save yourself a lot of time and still produce robust applications.

Good luck!

UPDATE: You can add FILTER_VALIDATE_EMAIL to take care of the email validation and you can read more about the email injection and how to take care of it here: damonkohler. As for the CAPTCHA, it could solve the problem, but it really depends on how valuable of a target your form/site is. I would recommend using non-linear transforms or something that is widely used and proven. If you are writing your own you may get yourself in trouble.

Summary:

  1. Validate Email
  2. Still make sure you are safe from injections
  3. Make sure the CAPTCHA is strong enough
  4. Really Consider server-side validation

UPDATE: @kht Did you get your questions answered? Let us know if something was unclear. Good Luck!

UPDATE: OK, I think we have made you a bit confused here with this whole client-side/server-side fiasco. I will try to break it down now so it makes more sense. The first part explains some basic concepts, and the second answers your questions.

First, PHP is a server-side language. It runs on the server and when a page request is sent, the server will "run" the PHP script, make any changes to the requested page, and then send it to the user who is requesting the page. The user has no access/control over that PHP script. On the contrary, as discussed earlier, the client-side scripts, such as JavaScript can be manipulated. However, just because you have some PHP script running and checking something on a form, that does not mean that the form is secure. It only means that you are doing some server-side processing of the form. Having it there, and making it secure are two different things as I am sure you have already figured out.

Now when we say that you need server-side validation we mean that you need a good one. Also, in this hectic Q&A format nobody really mentioned that there is a difference between validating data and sanitizing data.

sanitizing - making the data meet some criteria

validating - checking if the data meets a criteria
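To make the distinction concrete, here is a minimal sketch using PHP's filter extension. The sample input is invented for the illustration; FILTER_SANITIZE_EMAIL and FILTER_VALIDATE_EMAIL are the standard filter constants.

```php
<?php
// Invented sample input with a leading space and a trailing line break.
$raw = " user@example.com\r\n";

// Sanitizing: remove characters that are not allowed in an email address.
$clean = filter_var($raw, FILTER_SANITIZE_EMAIL);

// Validating: check the value; filter_var() returns the string on success
// and false on failure.
$isValid = filter_var($clean, FILTER_VALIDATE_EMAIL) !== false;
$isBogusValid = filter_var('not-an-email', FILTER_VALIDATE_EMAIL) !== false;

var_dump($clean, $isValid, $isBogusValid);
```

Sanitizing changes the data, validating only answers yes or no; a form handler will often want both.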

Take a look at phpnightly for a better explanation and examples. There are also some nice simple tutorials describing how to create basic validation of a form.

nettuts

Very basic, but you should get the idea.

So how do you approach your current problem?

  1. To begin with, you should keep what you have in terms of client-side validation and add the CAPTCHA as you mentioned (check my post, or you can research some good ones).

  2. What should you validate?

    a. you should validate the data: all fields such as email, name, subject...

    • check if the data matches what you expected: is the field empty?; is it an email?; does it contain numbers?; etc. You can validate the data on the server side for the same things you are validating it on the user side. The only difference is that the client cannot manipulate that validation.

    b. you could sanitize the data as well

    • make it lower case and compare it, trim it, or even cast it into a type if you need to. If you have time to check it out, the article from phpnightly has a decent explanation of the two and when not to use both.
  3. Can the users still send invalid data?

    • sure they can, but now they have no access to the validation algorithm; they can't just disable it or go around it (strictly speaking).
    • when the data is invalid or malicious, just inform the user that there has been an error and make them do it again. That is the point of the server-side validation, you can prevent the user from circumventing the rules, and you can alert them that their input is not valid
    • be very careful with the error messages too; don't reveal too much of the rules you are using for validation to your user, just inform them what you are expecting
  4. Also, will the above stop spam? If you make sure the form is not vulnerable to email injections, and you have client-side validation, CAPTCHA, and server-side validation of some form (it does not have to be super complex), it will stop spam. (Keep in mind that today's great solution is not so great tomorrow.)

  5. Why the hell do I need that server-side bull* when my client-side validation works just fine?* Think of it as having a safety net. If a spammer goes around the client-side security, the server-side security will still be there.

This validation thing sounds like a lot of work, but it is actually pretty simple. Take a look at the tutorial I included and I am sure the code will make things click. If you make sure no unwanted information is being sent through the form, and the clients cannot manipulate the form to send to more than one email, then you are pretty much safe.
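For the "cannot send to more than one email" part, a minimal sketch of a guard for the value that ends up in the From header. The function name safeFromAddress is hypothetical, and this is one possible check, not a complete defence:

```php
<?php
// Hypothetical helper for the form handler: returns a value that can be
// placed in the From header, or false if the input must be rejected.
function safeFromAddress($input)
{
    // A CR or LF would let an attacker append extra headers such as Bcc.
    if (preg_match('/[\r\n]/', $input)) {
        return false;
    }
    // filter_var() returns the address on success and false otherwise.
    return filter_var(trim($input), FILTER_VALIDATE_EMAIL);
}

var_dump(safeFromAddress('user@example.com'));
var_dump(safeFromAddress("user@example.com\r\nBcc: victim@example.org"));
```

If the helper returns false, the script would show an error instead of calling mail().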

I just wrote this one off the top of my head, so if it is confusing just post some more questions or shoot me a message. Good Luck!

PHP pointer and variable conflict

9 votes

I have a question about PHP and the use of pointers and variables.

The following code produces something I wouldn't have expected:

<?php
$numbers = array('zero', 'one', 'two', 'three');

foreach($numbers as &$number)
{
  $number = strtoupper($number);
}

print_r($numbers);

$texts = array();
foreach($numbers as $number)
{
  $texts[] = $number;
}

print_r($texts);
?>

The output is the following

Array
(
    [0] => ZERO
    [1] => ONE
    [2] => TWO
    [3] => THREE
)
Array
(
    [0] => ZERO
    [1] => ONE
    [2] => TWO
    [3] => TWO
)

Notice the 'TWO' appearing twice in the second array.

It seems that there is a conflict between the two foreach loops, each declaring a $number variable (once by reference and the second by value).

But why? And why does it affect only the last element in the second foreach?

The key point is that PHP does not have pointers. It has references, which is a similar but different concept, and there are some subtle differences.

If you use var_dump() instead of print_r(), it's easier to spot:

$collection = array(
    'First',
    'Second',
    'Third',
);

foreach($collection as &$item){
    echo $item . PHP_EOL;
}

var_dump($collection);

foreach($collection as $item){
    var_dump($collection);
    echo $item . PHP_EOL;
}

... prints:

First
Second
Third
array(3) {
  [0]=>
  string(5) "First"
  [1]=>
  string(6) "Second"
  [2]=>
  &string(5) "Third"
}
array(3) {
  [0]=>
  string(5) "First"
  [1]=>
  string(6) "Second"
  [2]=>
  &string(5) "First"
}
First
array(3) {
  [0]=>
  string(5) "First"
  [1]=>
  string(6) "Second"
  [2]=>
  &string(6) "Second"
}
Second
array(3) {
  [0]=>
  string(5) "First"
  [1]=>
  string(6) "Second"
  [2]=>
  &string(6) "Second"
}
Second

Please note the & symbol that's left in the last array item.

To sum up, whenever you use references in a loop, it's good practice to remove them at the end:

<?php

$collection = array(
    'First',
    'Second',
    'Third',
);

foreach($collection as &$item){
    echo $item . PHP_EOL;
}
unset($item);

var_dump($collection);

foreach($collection as $item){
    var_dump($collection);
    echo $item . PHP_EOL;
}
unset($item);

... prints the expected result every time.
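Applied to the arrays from the question, the effect and the fix can be compared side by side. This is a sketch; the helper name copyAfterReferenceLoop is made up for the comparison. After the first loop, $number is still a reference to the last array element, so every assignment to $number in the second loop also overwrites that element.

```php
<?php
// Hypothetical helper: uppercases by reference, then copies by value.
// $cleanup controls whether the reference is removed between the loops.
function copyAfterReferenceLoop(array $numbers, $cleanup)
{
    foreach ($numbers as &$number) {
        $number = strtoupper($number);
    }
    if ($cleanup) {
        unset($number); // break the link between $number and the last element
    }

    $texts = array();
    foreach ($numbers as $number) {
        // Without the unset(), this loop variable still aliases the last
        // element, so each iteration writes the current value into it.
        $texts[] = $number;
    }
    return $texts;
}

$numbers = array('zero', 'one', 'two', 'three');
print_r(copyAfterReferenceLoop($numbers, false)); // ZERO, ONE, TWO, TWO
print_r(copyAfterReferenceLoop($numbers, true));  // ZERO, ONE, TWO, THREE
```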

Optimising Database Structure

8 votes

I'm developing a reward system for our VLE which uses three separate technologies - JavaScript for most of the client-side/display processing, PHP to communicate with the database and MySQL for the database itself.

I've attached three screenshots of my "transactions" table. Its structure, a few example records and an overview of its details.

The premise is that members of staff award points to students for good behaviour etc. This can mean that classes of 30 students are given points at a single time. Staff have a limit of 300 points/week and there are around 85 staff currently accessing the system (this may rise).

The way I'm doing it at the moment, every "transaction" has a "Giver_ID" (the member of staff awarding points), a "Recipient_ID" (the student receiving the points), a category and a reason. This way, every time a member of staff issues 30 points, I'm putting 30 rows into the database.

This seemed to work early on, but within three weeks I already have over 12,000 transactions in the database.

At this point it gets a bit more complicated. On the Assign Points page (another screenshot attached), when a teacher clicks into one of their classes or searches for an individual student, I want the students' points to be displayed. The only way I can currently do this on my system is to do a "SELECT * FROM 'transactions'" and put all the information into an array using the following JS:

var Points = { "Recipient_ID" : "0", "Points" : "0" };

function getPoints (data) {
    for (var i = 0; i < data.length; i++) {
        if (Points[data[i].Recipient_ID]) {
            Points[data[i].Recipient_ID] = parseInt(Points[data[i].Recipient_ID]) + parseInt(data[i].Points);
        } else {
            Points[data[i].Recipient_ID] = data[i].Points;
        }
    }
}

When logging in to the system internally, this appears to work quickly enough. When logging in externally however, this process takes around 20 seconds, and thus doesn't display the students' points values until you've clicked/searched a few times.

I'm using the following code in my PHP to access these transactions:

function getTotalPoints() {
    $sql = "SELECT * 
        FROM `transactions`";

    $res = mysql_query($sql);
    $rows = array(); 
    while($r = mysql_fetch_assoc($res)) {
        $rows[] = $r;
    }

    if ($rows) {
        return $rows;
    } else {
        $err = Array("err_id" => 1);
        return $err;
    }
}

So, my question is, how should I actually be approaching this? Full-text indices; maybe a student table with their total points values which gets updated every time a transaction is entered; mass-transactions (i.e. more than one student receiving the same points for the same category) grouped into a single database row? These are all things I've contemplated but I'd love someone with more DB knowledge than myself to provide enlightenment.

[Screenshot: Example records]

[Screenshot: Table structure]

[Screenshot: Table overview]

[Screenshot: Assign Points interface]

Many thanks in advance.

Your problem is your query:

SELECT * FROM `transactions`

As your data set gets bigger, this will take longer to load and require more memory to store it. Rather determine what data you need specifically. If it's for a particular user:

SELECT SUM(points) FROM `transactions` WHERE Recipient_ID=[x]

Or if you want all the sums for all your students:

SELECT Recipient_ID, SUM(points) AS Total_Points FROM `transactions` GROUP BY Recipient_ID;
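On the PHP side, the grouped query means the script only has to return one small row per student. A hedged sketch follows: it uses PDO with an in-memory SQLite database so that it runs on its own, whereas the original code uses the mysql_* functions against MySQL. The function name getTotalPointsByStudent and the sample rows are assumptions made for this illustration.

```php
<?php
// Hypothetical replacement for getTotalPoints(): let the database do the
// summing and return a map of Recipient_ID => total points.
function getTotalPointsByStudent(PDO $pdo)
{
    $sql = 'SELECT Recipient_ID, SUM(Points) AS Total_Points
            FROM transactions
            GROUP BY Recipient_ID';

    $totals = array();
    foreach ($pdo->query($sql) as $row) {
        $totals[(int) $row['Recipient_ID']] = (int) $row['Total_Points'];
    }
    return $totals;
}

// Demo data in an in-memory SQLite database (assumption: pdo_sqlite is
// available; in the real system this would be the MySQL connection).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE transactions (Giver_ID INTEGER, Recipient_ID INTEGER, Points INTEGER)');
$pdo->exec('INSERT INTO transactions VALUES (1, 10, 5), (1, 11, 5), (2, 10, 3)');

print_r(getTotalPointsByStudent($pdo)); // student 10 => 8, student 11 => 5
```

The result can be JSON-encoded and used directly as the lookup table that the JavaScript getPoints function currently builds by hand.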

To speed up selections on a particular field you can add an index for that field. This will speed up the selections, especially as the table grows.

ALTER TABLE `transactions` ADD INDEX Recipient_ID (Recipient_ID);

Or if you want to display a paginated list of all the entries in transactions:

SELECT * FROM `transactions` LIMIT [page*num_records_per_page],[num_records_per_page];

e.g.: SELECT * FROM `transactions` ORDER BY Datetime LIMIT 0,25; # First 25 records

need assistance with sql injection

8 votes

First, I'm not trying to hack or do anything illegal; I thought I'd let you guys know. I have a client that wants me to do some modifications on his system. When I was looking at it, I noticed that NOTHING was escaped. I'm not joking, nothing is being escaped. I explained to him that it's insecure to have a system like that. He then proceeded to tell me that he's had his system like this for a few years and nothing has happened. I need to show him that his system is not safe, but I really don't know how to perform an SQL injection. Here are a few queries that use $_GET and are not escaped.

SELECT *,DATE_FORMAT(joined,'%M %d, %Y') as \"Joined\" FROM `members` WHERE `name` LIKE '".$ltr."%' ORDER BY points DESC LIMIT $page,50

Here's another one:

SELECT * FROM groups WHERE id=$thisladder[grid]

The only thing that I see that "might" clean the $_GET is this function:

if (!ini_get('register_globals')) {
   $superglobals = array($_SERVER, $_ENV,
       $_FILES, $_COOKIE, $_POST, $_GET);
   if (isset($_SESSION)) {
       array_unshift($superglobals, $_SESSION);
   }
   foreach ($superglobals as $superglobal) {
       extract($superglobal, EXTR_SKIP);
   }
}

It's possible that the function above may be sanitizing the variables. And yes, the system also uses register globals, which is also bad.

I also made a backup, just in case.

Can't say it better than http://xkcd.com/327/.

But then again, as Marc B says, forget SQL injection, register_globals is much, much worse. Never thought I'd actually see it emulated, just in case it's off.
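To show the client what "not escaped" means without touching the database, it may be enough to print the SQL the script would build. A sketch based on the second query from the question; the variable $grid and the sample URL value are assumptions for the illustration:

```php
<?php
// Hypothetical value taken from the URL, e.g. ?grid=0 OR 1=1
$grid = '0 OR 1=1';

// How the existing code builds the query: the input becomes part of the SQL,
// so the WHERE clause now matches every row.
$unsafe = "SELECT * FROM groups WHERE id=$grid";

// Minimal mitigation for a numeric id: force the value to an integer first.
$safe = 'SELECT * FROM groups WHERE id=' . (int) $grid;

echo $unsafe, "\n"; // SELECT * FROM groups WHERE id=0 OR 1=1
echo $safe, "\n";   // SELECT * FROM groups WHERE id=0
```

For string values such as the LIKE query, a cast is not an option; those need proper escaping or prepared statements.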

Reverse image archive : stacking images from bottom to top with CSS / Javascript?

7 votes

Wondering if anyone has a solution for this.
I would like to present an archive of thumbnail images oldest at the bottom and newest at the top. I would also like the flow itself to be reversed... something like this:

reverse archive

The page should be right aligned, with future images added to the top of the page. I am creating the page dynamically with PHP pulling image filenames from a MySQL DB. The catch here is I would love this layout to be fluid, meaning most PHP tricks for counting images and building the HTML accordingly go out the window.

Is there a way to do this with Javascript or even just CSS?

See: http://jsfiddle.net/thirtydot/pft6p/

This uses float: right to order the divs as required, then transform: scaleY(-1) flips the entire container, and lastly transform: scaleY(-1) again flips each individual image back.

It will work in IE9 and greater and all modern browsers.

CSS:

#container, #container > div {
    -webkit-transform: scaleY(-1);
       -moz-transform: scaleY(-1);
        -ms-transform: scaleY(-1);
         -o-transform: scaleY(-1);
            transform: scaleY(-1);
}

#container {
    background: #ccc;
    overflow: hidden;
}
#container > div {
    float: right;
    width: 100px;
    height: 150px;
    border: 1px solid red;
    margin: 15px;
    font-size: 48px;
    line-height: 150px;
    text-align: center;
    background: #fff;
}

HTML:

<div id="container">
    <div>1</div>
    <div>2</div>
    <div>3</div>
    ..
</div>

Slow cronjobs on Cent OS 5

7 votes

I have 1 cronjob that runs every 60 minutes but for some reason, recently, it is running slow.

Env: centos5 + apache2 + mysql5.5 + php 5.3.3 / raid 10/10k HDD / 16gig ram / 4 xeon processor

Here's what the cronjob does:

  1. parse the last 60 minutes data

    a) 1 process parse user agent and save the data to the database

    b) 1 process parse impressions/clicks on the website and save them to the database

  2. from the data in step 1

    a) build a small report and send emails to the administrator/business

    b) save the report into a daily table (available in the admin section)

I see now 8 processes (the same file) when I run the command ps auxf | grep process_stats_hourly.php (found this command in stackoverflow)

Technically I should only have 1 not 8.

Is there any tool in Cent OS, or something I can do, to make sure my cronjob runs every hour without overlapping the next one?

Thanks

Your hardware seems to be good enough to process this.

1) Check if you already have hanging processes. Using ps auxf (see tcurvelo's answer), check if you have one or more processes that take too many resources. Maybe you don't have enough resources to run your cronjob.

2) Check your network connections: If your database and your cronjob are on different servers, you should check the response time between the two machines. Maybe you have network issues that make the cronjob wait for the network to send the packets back.

You can use: Netcat, Iperf, mtr or ttcp

3) Server configuration: Is your server configured correctly? Are your OS and MySQL set up correctly? I would recommend reading these articles:

http://www3.wiredgorilla.com/content/view/220/53/

http://www.vr.org/knowledgebase/1002/Optimize-and-disable-default-CentOS-services.html

http://dev.mysql.com/doc/refman/5.1/en/starting-server.html

http://www.linux-mag.com/id/7473/

4) Check your database: Make sure your database has the correct indexes and make sure your queries are optimized. Read this article about the explain command

If a query over a few hundred thousand records takes a long time to execute, that will affect the rest of your cronjob; if you have a query inside a loop, even worse.

Read these articles:

http://dev.mysql.com/doc/refman/5.0/en/optimization.html

http://20bits.com/articles/10-tips-for-optimizing-mysql-queries-that-dont-suck/

http://blog.fedecarg.com/2008/06/12/10-great-articles-for-optimizing-mysql-queries/

5) Trace and optimize your PHP code: Make sure your PHP code runs as fast as possible.

Read these articles:

http://phplens.com/lens/php-book/optimizing-debugging-php.php

http://code.google.com/speed/articles/optimizing-php.html

http://ilia.ws/archives/12-PHP-Optimization-Tricks.html

A good technique to validate your cronjob is to trace the script: for each step of the cronjob, add some debug output that includes how much memory it used and how long the last step took to execute, e.g.:

<?php

echo "\n-------------- DEBUG --------------\n";
echo "memory (start): " . memory_get_usage(TRUE) . "\n";

$start = microtime(TRUE);
// some process
$end = microtime(TRUE);

echo "\n-------------- DEBUG --------------\n";
echo "memory after some process: " . memory_get_usage(TRUE) . "\n";
echo "executed time: " . ($end - $start) . "\n";

By doing that you can easily find which process takes how much memory and how long it takes to execute it.

6) External servers/web service calls: Does your cronjob call external servers or web services? If so, make sure they respond as fast as possible. If you request data from a third-party server and that server takes a few seconds to return an answer, it will affect the speed of your cronjob, especially if these calls are in loops.

Try that and let me know what you find.
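As for keeping two runs from overlapping, one common approach (not part of the checklist above) is to take a non-blocking file lock at the start of the script and exit if it is already held. A minimal sketch, assuming a Linux host; the helper name acquireCronLock and the lock file path are made up for the example:

```php
<?php
// Hypothetical helper: tries to take an exclusive, non-blocking lock.
// Returns the open file handle on success, or false if another run holds it.
function acquireCronLock($lockFile)
{
    $handle = fopen($lockFile, 'c'); // 'c' creates the file without truncating it
    if ($handle === false) {
        return false;
    }
    if (!flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle);
        return false;
    }
    return $handle;
}

$lockFile = sys_get_temp_dir() . '/process_stats_hourly.lock'; // assumed path
$lock = acquireCronLock($lockFile);
if ($lock === false) {
    echo "Previous run still active, skipping.\n";
} else {
    // ... the hourly processing would go here ...
    flock($lock, LOCK_UN);
    fclose($lock);
    echo "Run finished.\n";
}
```

The lock is released automatically if the process dies, so a crashed run does not block the following ones.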

How should this Many-To-Many doctrine2 association be defined?

6 votes

I have two Entities - Users & Challenges. A User can participate in many challenges and a challenge can have many participants (users). I began approaching this problem by creating a Many-To-Many relationship on my Users class:

/**
     * @ORM\ManyToMany(targetEntity="Challenge")
     * @ORM\JoinTable(name="users_challenges",joinColumns={@ORM\JoinColumn(name="user_id",referencedColumnName="id")},
     * inverseJoinColumns={@ORM\JoinColumn(name="challenge_id",referencedColumnName="id")})
     *
     */

    protected $challenges;

However, I then realised that I need to store a distance attribute against a user/challenge combination (how far the user has travelled in their challenge). The Doctrine2 docs state:

"Why are many-to-many associations less common? Because frequently you want to associate additional attributes with an association, in which case you introduce an association class. Consequently, the direct many-to-many association disappears and is replaced by one-to-many/many-to-one associations between the 3 participating classes."

So my question is what should these associations be between User, Challenge and UsersChallenges?

UPDATE

See comment to first answer for links to the Entity code. I have a controller method below which always creates a new UsersChallenges record rather than updating an existing one (which is what I want)

public function updateUserDistanceAction()
    {

      $request = $this->getRequest();
      $distance = $request->get('distance');
      $challenge_id = $request->get('challenge_id');


      if($request->isXmlHttpRequest()) {

        $em = $this->getDoctrine()->getEntityManager();
        $user = $this->get('security.context')->getToken()->getUser();
        $existingChallenges = $user->getChallenges();


        $challengeToUpdate = $em->getRepository('GymloopCoreBundle:Challenge')
                                ->find( (int) $challenge_id);

        if(!$challengeToUpdate) {

          throw $this->createNotFoundException('No challenge found');
        }

//does the challengeToUpdate exist in existingChallenges? If yes, update UsersChallenges with the distance
//if not, create a new USersChallenges object, set distance and flush

        if ( !$existingChallenges->isEmpty() && $existingChallenges->contains($challengeToUpdate)) {

          $userChallenge = $em->getRepository('GymloopCoreBundle:UsersChallenges')
                              ->findOneByChallengeId($challengeToUpdate->getId());

          $userChallenge->setDistance( $userChallenge->getDistance() + (int) $distance );
          $em->flush();

        } else {

          $newUserChallenge = new UsersChallenges();
          $newUserChallenge->setDistance($distance);
          $newUserChallenge->setChallenge($challengeToUpdate);
          $newUserChallenge->setUser($user);
          $user->addUsersChallenges($newUserChallenge);
          $em->persist($user);
          $em->persist($newUserChallenge);
          $em->flush();

        }

        //if success
        return new Response('success');

       //else

      }

    }

I believe you want a $em->persist($userChallenge); before $em->flush();

$userChallenge->setDistance( $userChallenge->getDistance() + (int) $distance );
$em->persist($userChallenge);
$em->flush();

However, I am not sure if it solves your problem.

Can you post the isEmpty and contains functions used on existingChallenges?
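On the original mapping question, the association class that the quoted Doctrine docs describe might look roughly like the sketch below. This is an assumption-laden outline, not the asker's actual entities: User and Challenge are reduced to stand-ins, plain arrays replace Doctrine's collection classes, and the accessor names (including addDistance) are invented. The annotations live in comments, so the file runs as plain PHP without Doctrine installed.

```php
<?php
use Doctrine\ORM\Mapping as ORM;

/** @ORM\Entity (stand-in for the real User entity) */
class User
{
    /** @ORM\OneToMany(targetEntity="UsersChallenges", mappedBy="user") */
    protected $usersChallenges = array(); // real code would use ArrayCollection
}

/** @ORM\Entity (stand-in for the real Challenge entity) */
class Challenge
{
    /** @ORM\OneToMany(targetEntity="UsersChallenges", mappedBy="challenge") */
    protected $usersChallenges = array();
}

/**
 * Association class: one row per user/challenge pair, carrying the distance.
 * @ORM\Entity
 * @ORM\Table(name="users_challenges")
 */
class UsersChallenges
{
    /** @ORM\Id @ORM\GeneratedValue @ORM\Column(type="integer") */
    protected $id;

    /**
     * @ORM\ManyToOne(targetEntity="User", inversedBy="usersChallenges")
     * @ORM\JoinColumn(name="user_id", referencedColumnName="id")
     */
    protected $user;

    /**
     * @ORM\ManyToOne(targetEntity="Challenge", inversedBy="usersChallenges")
     * @ORM\JoinColumn(name="challenge_id", referencedColumnName="id")
     */
    protected $challenge;

    /** @ORM\Column(type="integer") */
    protected $distance = 0;

    public function setUser(User $user) { $this->user = $user; }
    public function setChallenge(Challenge $challenge) { $this->challenge = $challenge; }
    public function getDistance() { return $this->distance; }
    // Hypothetical convenience method for the "add to existing distance" case.
    public function addDistance($distance) { $this->distance += (int) $distance; }
}
```

With this shape, the controller would look up the UsersChallenges row for the current user and challenge together, rather than by challenge id alone.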

Remove <p><strong><br /> &nbsp;</strong></p> with XPATH

6 votes

I use xpath to remove <p>&nbsp;</p>

    $nodeList = $xpath->query("//p[text()=\"\xC2\xA0\"]"); # &nbsp;
    foreach($nodeList as $node) 
    {
        $node->parentNode->removeChild($node);
    }

but it does not remove this,

<p><strong><br /> &nbsp;</strong></p>

or this kind,

<p><strong>&nbsp;</strong></p>

How can I remove them?

Or maybe a regex that I should use?

Try with

$nodeList = $xpath->query("//p[normalize-space(.)=\"\xC2\xA0\"]"); # &nbsp;
foreach($nodeList as $node) 
{
    $node->parentNode->removeChild($node);
}

Quoting from the docs

The normalize-space function returns the argument string with whitespace normalized by stripping leading and trailing whitespace and replacing sequences of whitespace characters by a single space.
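The difference between the two queries is easier to see on a small document. The following is a minimal sketch, assuming the DOM extension is available; the sample markup and variable names are made up for illustration, and the non-breaking space is written as raw UTF-8 bytes because loadXML() does not know the &nbsp; entity. text()="..." only compares text nodes that are direct children of <p>, whereas normalize-space(.) works on the string value of the whole element, including text nested inside <strong>.

```php
<?php
// Hypothetical sample markup; "\xC2\xA0" is the UTF-8 encoding of U+00A0 (&nbsp;).
$nbsp = "\xC2\xA0";
$xml  = "<div>"
      . "<p>keep me</p>"
      . "<p>{$nbsp}</p>"
      . "<p><strong><br /> {$nbsp}</strong></p>"
      . "<p><strong>{$nbsp}</strong></p>"
      . "</div>";

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Matches only a <p> that has a direct text-node child equal to a non-breaking space.
$direct = $xpath->query("//p[text()=\"{$nbsp}\"]")->length;

// Matches every <p> whose whole string value normalizes to a non-breaking space.
$nodes = $xpath->query("//p[normalize-space(.)=\"{$nbsp}\"]");

$removed = 0;
// Copy the result into an array first so removal cannot disturb the iteration.
foreach (iterator_to_array($nodes) as $node) {
    $node->parentNode->removeChild($node);
    $removed++;
}

$remaining = $doc->saveXML($doc->documentElement);
echo "text() matched: {$direct}, normalize-space() removed: {$removed}\n";
echo $remaining, "\n";
```

A non-breaking space is not XML whitespace, so normalize-space() strips the ordinary space after <br /> but leaves the U+00A0 in place, which is why the comparison succeeds.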

Viewing PHP File Output as XML in Firefox (and other browsers)

5 votes

I have a PHP file which echoes an XML output. However, whenever I view that PHP file in Firefox (I haven't tried IE or Chrome), I get raw text output like the one in this image:

http://numberonekits.com/Screenshot.png

It seems to me that, since the file is a PHP file, Firefox interprets it as such and does not try to display it as an XML tree, despite its XML heading. I am aware that one solution would be to send the output to a separate file with a .xml extension, but I know there must be an easier way. I guess what I am really trying to find out is how to get Firefox to recognize the XML format and display it as it should. Any help is greatly appreciated.

Tell the browser you are sending it an XML file:

header('Content-Type: text/xml');
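As a rough sketch of where that call belongs in a script: the header has to be sent before any output, so it helps to build the XML first and echo it last. The element names and data below are hypothetical, and the charset parameter is an optional addition, not part of the original answer.

```php
<?php
// Hypothetical data for illustration only.
$kits = ['Home', 'Away'];

// Build the document in memory so nothing is echoed before the header.
$doc  = new DOMDocument('1.0', 'UTF-8');
$root = $doc->appendChild($doc->createElement('kits'));
foreach ($kits as $name) {
    $kit = $doc->createElement('kit');
    $kit->appendChild($doc->createTextNode($name)); // text nodes are escaped on output
    $root->appendChild($kit);
}
$xml = $doc->saveXML();

// header() only works if no output (not even whitespace) has been sent yet.
if (!headers_sent()) {
    header('Content-Type: text/xml; charset=UTF-8');
}
echo $xml;
```

The headers_sent() guard is just a defensive choice for the sketch; in a real script it is usually enough to make sure nothing precedes the opening PHP tag.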

Are MySQL statements atomic?

5 votes

For example, I have a row with a column C1 value = 'clean', and two different computers run this query:

UPDATE T1
  SET C1 = 'dirty'
WHERE id = 1

at the same time.

Without using transactions, is it guaranteed that the value of mysql_affected_rows() would be 1 for one client and 0 for the other?

Yes and No :-)

In both cases, the access is serialised (assuming you're using a transactional engine like InnoDB) since they hit the same row, so they won't interfere with each other. In other words, the statements are atomic.

However, the affected row count actually depends on the configuration set when you open the connection. The page for mysql_affected_rows() has this to say:

For UPDATE statements, the affected-rows value by default is the number of rows actually changed. If you specify the CLIENT_FOUND_ROWS flag to mysql_real_connect() when connecting to mysqld, the affected-rows value is the number of rows "found"; that is, matched by the WHERE clause.

And from the mysql_real_connect page:

CLIENT_FOUND_ROWS: Return the number of found (matched) rows, not the number of changed rows.

So, with CLIENT_FOUND_ROWS set, the affected-rows count for:

UPDATE T1 SET C1 = 'dirty' WHERE id = 1

has nothing to do with whether the data is changed, only with which rows matched. It would be 1 for both queries.

On the other hand, if CLIENT_FOUND_ROWS was not set, the second query would not actually be changing the row (since it's already populated with 'dirty') and would have a row count of zero.

If you wanted the same behaviour regardless of that setting (only showing changes), you could use something like:

UPDATE T1 SET C1 = 'dirty' WHERE id = 1 AND C1 <> 'dirty'
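The effect of the extra condition can be tried without a MySQL server. The sketch below is only an illustration: it assumes the pdo_sqlite extension is installed and uses an in-memory SQLite database as a stand-in for MySQL, running the guarded UPDATE twice in sequence to mimic the two clients after their access has been serialised.

```php
<?php
// Assumption: pdo_sqlite is available; SQLite stands in for MySQL here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec("CREATE TABLE T1 (id INTEGER PRIMARY KEY, C1 TEXT)");
$pdo->exec("INSERT INTO T1 (id, C1) VALUES (1, 'clean')");

// The guard makes the WHERE clause match only rows that still need changing,
// so "matched" and "changed" counts are the same thing.
$sql = "UPDATE T1 SET C1 = 'dirty' WHERE id = 1 AND C1 <> 'dirty'";

// PDO::exec() returns the number of rows affected by the statement.
$first  = $pdo->exec($sql); // plays the role of the first client
$second = $pdo->exec($sql); // plays the role of the second client

echo "first: {$first}, second: {$second}\n";
```

Because the guard is part of the statement itself, the client that gets a count of 1 can treat itself as the one that actually flipped the value.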

How does preg_match_all() process strings?

5 votes

I'm still learning a lot about PHP and string alteration is something that is of interest to me. I've used preg_match before for things like validating an email address or just searching for inquiries.

I just came from the post "What's wrong in my regular expression?" and was curious why the preg_match_all function produces 2 sets of strings, one with some of the characters stripped and the other with the desired output.

From what I understand, the function goes over the string character by character, using the regex to evaluate what to do with it. Could this regex have been structured in such a way as to bypass the first array entry and just produce the desired result?

So you don't have to go to the other thread, here is the code:

$str = 'text^name1^Jony~text^secondname1^Smith~text^email1^example-free@wpdevelop.com~';

preg_match_all('/\^([^^]*?)\~/', $str, $newStr);

for($i=0;$i<count($newStr[0]);$i++)
{
    echo $newStr[0][$i].'<br>';
}

echo '<br><br><br>';

for($i=0;$i<count($newStr[1]);$i++)
{
    echo $newStr[1][$i].'<br>';
} 

This will output

^Jony~
^Smith~
^example-free@wpdevelop.com~


Jony
Smith
example-free@wpdevelop.com

I'm curious whether the reason for the 2 array entries was the original syntax of the string or whether it is the normal processing response of the function. Sorry if this shouldn't be here, but I'm really curious as to how this works.

thanks, Brodie

It's standard behavior for preg_match and preg_match_all - the first string in the "matched values" array is the FULL string that was caught by the regex pattern. The subsequent array values are the 'capture groups', whose existence depends on the placement/position of () pairs in the regex pattern.

In your regex's case, /\^([^^]*?)\~/, the full matching string would be

^   Jony    ~
|     |     |
^  ([^^]*?) ~   -> $newStr[0] = ^Jony~
                -> $newStr[1] = Jony (due to the `()` capture group).
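To see the whole result structure at once, here is a small sketch using the string from the question. The second pattern is not from the original thread; it is one possible way, offered as an assumption, to make the full match itself the bare value by using lookarounds, which check for the ^ and ~ delimiters without including them in the match. Index 0 is still always present, it just already contains what you want.

```php
<?php
$str = 'text^name1^Jony~text^secondname1^Smith~text^email1^example-free@wpdevelop.com~';

// Index 0 holds the full matches, index 1 the first capture group.
preg_match_all('/\^([^^]*?)~/', $str, $matches);
$full   = $matches[0]; // delimiters included
$groups = $matches[1]; // only what the parentheses captured

// Hypothetical alternative: the lookbehind (?<=\^) and lookahead (?=~) assert the
// delimiters without consuming them, so no capture group is needed.
preg_match_all('/(?<=\^)[^^~]*(?=~)/', $str, $bare);

print_r($full);
print_r($groups);
print_r($bare[0]);
```

With the lookaround version the result array has a single entry, and its contents are the same values the capture group produced in the original pattern.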