Best flask questions in March 2012

What is the best way to get a semi-long, unique (non-sequential) ID key for database objects?

7 votes

I am building a web app and I would like my URL scheme to look something like this:

someurl.com/object/FJ1341lj

Currently I just use the primary key from my SQLAlchemy objects, but the problem is that I don't want the URLs to be sequential or low numbers. For instance, my URLs look like this:

someurl.com/object/1
someurl.com/object/2

Encoding the integers

You could use a reversible encoding for your integers:

def int_str(val, keyspace):
    """ Turn a non-negative integer into a string. """
    assert val >= 0
    if val == 0:
        return keyspace[0]  # otherwise 0 would encode to an empty string
    out = ""
    while val > 0:
        val, digit = divmod(val, len(keyspace))
        out += keyspace[digit]
    return out[::-1]

def str_int(val, keyspace):
    """ Turn a string into a positive integer. """
    out = 0
    for c in val:
        out = out * len(keyspace) + keyspace.index(c)
    return out

Quick testing code:

keyspace = "fw59eorpma2nvxb07liqt83_u6kgzs41-ycdjh" # Can be anything you like - this was just shuffled letters and numbers, but...
assert len(set(keyspace)) == len(keyspace) # each character must occur only once

def test(v):
    s = int_str(v, keyspace)
    w = str_int(s, keyspace)
    print("OK? %r -- int_str(%d) = %r; str_int(%r) = %d" % (v == w, v, s, s, w))

test(1064463423090)
test(4319193500)
test(495689346389)
test(2496486533)

outputs

OK? True -- int_str(1064463423090) = 'antmgabi'; str_int('antmgabi') = 1064463423090
OK? True -- int_str(4319193500) = 'w7q0hm-'; str_int('w7q0hm-') = 4319193500
OK? True -- int_str(495689346389) = 'ev_dpe_d'; str_int('ev_dpe_d') = 495689346389
OK? True -- int_str(2496486533) = '1q2t4w'; str_int('1q2t4w') = 2496486533
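
One thing to watch when the string comes from a URL: it may contain characters that are not in the keyspace, in which case keyspace.index() raises a ValueError. Here is a minimal sketch of a tolerant decoder. The helper name decode_fragment and the choice to return None (which a view could turn into a 404) are my own assumptions, not part of the code above.

```python
KEYSPACE = "fw59eorpma2nvxb07liqt83_u6kgzs41-ycdjh"  # same shuffled alphabet as above

def str_int(val, keyspace):
    """Turn a string into a non-negative integer (same logic as above)."""
    out = 0
    for c in val:
        out = out * len(keyspace) + keyspace.index(c)
    return out

def decode_fragment(fragment, keyspace=KEYSPACE):
    """Hypothetical helper: return the decoded integer, or None if the
    fragment is empty or contains a character outside the keyspace."""
    if not fragment:
        return None
    try:
        return str_int(fragment, keyspace)
    except ValueError:  # raised by str.index() for unknown characters
        return None

print(decode_fragment("1q2t4w"))  # 2496486533, matching the test output above
print(decode_fragment("1Q2T4W"))  # None: uppercase letters are not in this keyspace
```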

Obfuscating them and making them non-contiguous

To make the IDs non-contiguous, you could multiply the original value by some arbitrary constant and add random "chaff" in the low-order digits, which are discarded again on decoding. In my example the chaff is always a multiple of a small modulus, which gives a simple validity check:

import random

def chaffify(val, chaff_size = 150, chaff_modulus = 7):
    """ Add chaff to the given positive integer.
    chaff_size defines how large the chaffing value is; the larger it is, the larger (and more unwieldy) the resulting value will be.
    chaff_modulus defines the modulus value for the chaff integer; the larger this is, the lower the chance that the chaff validation in dechaffify() yields a false "okay".
    """
    # Integer division keeps the chaff a multiple of chaff_modulus and strictly below chaff_size.
    chaff = random.randint(0, (chaff_size - 1) // chaff_modulus) * chaff_modulus
    return val * chaff_size + chaff

def dechaffify(chaffy_val, chaff_size = 150, chaff_modulus = 7):
    """ Dechaffs the given chaffed value. The chaff_size and chaff_modulus parameters must be the same as given to chaffify() for the dechaffification to succeed.
    If the chaff value has been tampered with, then a ValueError will (probably - not necessarily) be raised. """
    val, chaff = divmod(chaffy_val, chaff_size)
    if chaff % chaff_modulus != 0:
        raise ValueError("Invalid chaff in value")
    return val

for x in range(1, 11):
    chaffed = chaffify(x)
    print(x, chaffed, dechaffify(chaffed))

outputs (with randomness):

1 262 1
2 440 2
3 576 3
4 684 4
5 841 5
6 977 6
7 1197 7
8 1326 8
9 1364 9
10 1528 10
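
To get a feel for how much the modulus check actually catches, the sketch below splits one value from the output above by hand and then counts how many of the possible remainders dechaffify() would accept. This is plain arithmetic on the default parameters (chaff_size = 150, chaff_modulus = 7), nothing more.

```python
chaff_size, chaff_modulus = 150, 7

# Worked example: 262 from the output above splits into value 1 and chaff 112.
val, chaff = divmod(262, chaff_size)
print(val, chaff, chaff % chaff_modulus == 0)  # 1 112 True

# Remainders in range(chaff_size) that the modulus check would accept.
accepted = [r for r in range(chaff_size) if r % chaff_modulus == 0]
print(len(accepted), "of", chaff_size)  # 22 of 150
```

So with the defaults roughly one arbitrary number in seven still passes the check. It catches most typos, but it is not a defence against someone deliberately guessing.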

EDIT: On second thought, random chaff may not be a good idea, because each ID then no longer has a single canonical obfuscated form. The following variant drops the randomness but still validates its input (changing one digit will likely invalidate the whole number if chaff_val is large enough).

def chaffify2(val, chaff_val = 87953):
    """ Add chaff to the given positive integer. """
    return val * chaff_val

def dechaffify2(chaffy_val, chaff_val = 87953):
    """ Dechaffs the given chaffed value. chaff_val must be the same as given to chaffify2(). If the value does not seem to be correctly chaffed, raises a ValueError. """
    val, chaff = divmod(chaffy_val, chaff_val)
    if chaff != 0:
        raise ValueError("Invalid chaff in value")
    return val
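
A quick check of the deterministic variant, as a sketch: the two functions are repeated so the snippet runs on its own, and the sample IDs are arbitrary.

```python
def chaffify2(val, chaff_val=87953):
    """Add chaff to the given positive integer."""
    return val * chaff_val

def dechaffify2(chaffy_val, chaff_val=87953):
    """Reverse chaffify2(); raise ValueError if the value is not a multiple of chaff_val."""
    val, chaff = divmod(chaffy_val, chaff_val)
    if chaff != 0:
        raise ValueError("Invalid chaff in value")
    return val

for doc_id in (1, 2, 3):
    print(doc_id, chaffify2(doc_id))  # 87953, 175906, 263859

try:
    dechaffify2(chaffify2(2) + 1)  # off by one, so no longer a multiple of 87953
except ValueError as exc:
    print("rejected:", exc)
```

Note that the chaffed values are still evenly spaced multiples of chaff_val, so this obscures the sequence rather than hiding it.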

Putting it all together

document_id = random.randint(0, 1000000)
url_fragment = int_str(chaffify(document_id), keyspace)  # keyspace as defined in the testing code
print("URL for document %d: http://example.com/%s" % (document_id, url_fragment))
request_id = dechaffify(str_int(url_fragment, keyspace))
print("Requested: Document %d" % request_id)

outputs (with randomness)

URL for document 831274: http://example.com/w840pi
Requested: Document 831274
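
If you go with the deterministic variant from the EDIT, every document gets exactly one URL. The sketch below puts that version together in one runnable piece. The names KEYSPACE, CHAFF_VAL, encode_id and decode_id are hypothetical wrappers of my own, and int_str() here handles 0 explicitly, which is also my addition.

```python
KEYSPACE = "fw59eorpma2nvxb07liqt83_u6kgzs41-ycdjh"  # shuffled alphabet from above
CHAFF_VAL = 87953                                    # multiplier from chaffify2()

def int_str(val, keyspace=KEYSPACE):
    """Turn a non-negative integer into a string."""
    if val == 0:
        return keyspace[0]
    out = ""
    while val > 0:
        val, digit = divmod(val, len(keyspace))
        out += keyspace[digit]
    return out[::-1]

def str_int(val, keyspace=KEYSPACE):
    """Turn a string into a non-negative integer."""
    out = 0
    for c in val:
        out = out * len(keyspace) + keyspace.index(c)
    return out

def encode_id(document_id):
    """Hypothetical helper: primary key -> URL fragment."""
    return int_str(document_id * CHAFF_VAL)

def decode_id(fragment):
    """Hypothetical helper: URL fragment -> primary key.
    Raises ValueError for unknown characters or a failed chaff check."""
    val, remainder = divmod(str_int(fragment), CHAFF_VAL)
    if remainder != 0:
        raise ValueError("Invalid chaff in value")
    return val

fragment = encode_id(831274)
print("http://example.com/%s -> document %d" % (fragment, decode_id(fragment)))
print(encode_id(831274) == fragment)  # True: the same ID always gives the same fragment
```

In a view you would catch the ValueError from decode_id() and respond with a 404 instead of looking anything up.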