Best language-agnostic questions in October 2011

Clean and type-safe state machine implementation in a statically typed language?

20 votes

I implemented a simple state machine in Python:

import time

def a():
    print("a()")
    return b  # each state returns the next state function

def b():
    print("b()")
    return c

def c():
    print("c()")
    return a


if __name__ == "__main__":
    state = a
    while True:
        state = state()
        time.sleep(1)

I wanted to port it to C because it wasn't fast enough, but C doesn't let me declare a function that returns a function of the same type. I tried declaring the type as typedef *fn(fn)(), but it doesn't work, so I had to wrap the function pointer in a structure instead. Now the code is very ugly!

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

typedef struct fn {
    struct fn (*f)(void);
} fn_t;

fn_t a(void);
fn_t b(void);
fn_t c(void);

fn_t a(void)
{
    fn_t f = {b};

    (void)printf("a()\n");

    return f;
}

fn_t b(void)
{
    fn_t f = {c};

    (void)printf("b()\n");

    return f;
}

fn_t c(void)
{
    fn_t f = {a};

    (void)printf("c()\n");

    return f;
}

int main(void)
{
    fn_t state = {a};

    for(;; (void)sleep(1)) state = state.f();

    return EXIT_SUCCESS;
}

I figured this was a problem with C's broken type system, so I tried a language with a real type system (Haskell), but the same problem happens. I can't just do something like:

type Fn = IO Fn
a :: Fn
a = print "a()" >> return b
b :: Fn
b = print "b()" >> return c
c :: Fn
c = print "c()" >> return a

I get the error, Cycle in type synonym declarations.

So I have to make some wrapper the same way I did for the C code like this:

import Control.Monad
import System.Posix

data Fn = Fn (IO Fn)

a :: IO Fn
a = print "a()" >> return (Fn b)

b :: IO Fn
b = print "b()" >> return (Fn c)

c :: IO Fn
c = print "c()" >> return (Fn a)

run = foldM (\(Fn f) () -> sleep 1 >> f) (Fn a) (repeat ())

Why is it so hard to make a state machine in a statically typed language? I have to add unnecessary overhead in statically typed languages as well. Dynamically typed languages don't have this problem. Is there an easier way to do it in a statically typed language?

If you use newtype instead of data, you don't incur any overhead. Also, you can wrap each state's function at the point of definition, so the expressions that use them don't have to:

import Control.Monad

newtype State = State { runState :: IO State }

a :: State
a = State $ print "a()" >> return b

b :: State
b = State $ print "b()" >> return c

c :: State
c = State $ print "c()" >> return a

runMachine :: State -> IO ()
runMachine s = runMachine =<< runState s

main = runMachine a

Edit: it struck me that runMachine has a more general form; a monadic version of iterate:

iterateM :: Monad m => (a -> m a) -> a -> m [a]
iterateM f a = do { b <- f a
                  ; as <- iterateM f b
                  ; return (a:as)
                  }

main = iterateM runState a

Edit: Hmm, iterateM causes a space-leak. Maybe iterateM_ would be better.

iterateM_ :: Monad m => (a -> m a) -> a -> m ()
iterateM_ f a = f a >>= iterateM_ f

main = iterateM_ runState a

Edit: If you want to thread some state through the state machine, you can use the same definition for State, but change the state functions to:

a :: Int -> State
a i = State $ do{ print $ "a(" ++ show i ++ ")"
                ; return $ b (i+1)
                }

b :: Int -> State
b i = State $ do{ print $ "b(" ++ show i ++ ")"
                ; return $ c (i+1)
                }

c :: Int -> State
c i = State $ do{ print $ "c(" ++ show i ++ ")"
                ; return $ a (i+1)
                }

main = iterateM_ runState $ a 1
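
For comparison, here is a minimal sketch of the same wrapper idea carried back to the Python version from the question, this time with type hints. The names State, run, trace and run_machine are my own choices, not from the original code. The sleep is left out and the loop is bounded so the example terminates.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class State:
    """Named wrapper that breaks the type cycle, like the newtype above."""
    run: Callable[[], "State"]


trace: List[str] = []  # collects the output so the run can be inspected


def a(i: int) -> State:
    def step() -> State:
        trace.append(f"a({i})")
        return b(i + 1)
    return State(step)


def b(i: int) -> State:
    def step() -> State:
        trace.append(f"b({i})")
        return c(i + 1)
    return State(step)


def c(i: int) -> State:
    def step() -> State:
        trace.append(f"c({i})")
        return a(i + 1)
    return State(step)


def run_machine(state: State, steps: int) -> None:
    """Advance the machine a fixed number of steps (the original loops forever)."""
    for _ in range(steps):
        state = state.run()


run_machine(a(1), 4)
print(trace)  # ['a(1)', 'b(2)', 'c(3)', 'a(4)']
```

The counter is threaded through the closures the same way the Int argument is threaded through the Haskell state functions.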

Calculate if two infinite regex solution sets don't intersect

7 votes

How can I calculate whether two arbitrary regular expressions have any overlapping solutions (assuming it's possible)?

For example, these two regular expressions can be shown by brute force to have no intersection, because both solution sets are finite and can be enumerated.

^1(11){0,1000}$ ∩     ^(11){0,1000}$        = {}
{1,111, ..., ..111} ∩ {11,1111, ..., ...11} = {}
{}                                          = {}
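
A minimal sketch of that brute-force check, assuming Python's re module and relying on the fact that both patterns only ever match runs of the character 1. The helper name unary_solutions is hypothetical. Each solution is represented by its length.

```python
import re
from typing import Set


def unary_solutions(pattern: str, max_len: int) -> Set[int]:
    """Lengths n (0..max_len) for which a run of n '1' characters fully matches."""
    regex = re.compile(pattern)
    return {n for n in range(max_len + 1) if regex.fullmatch("1" * n)}


# The longest possible match is 1 + 2 * 1000 = 2001 characters.
odd = unary_solutions(r"1(11){0,1000}", 2001)
even = unary_solutions(r"(11){0,1000}", 2001)

overlap = odd & even
print(len(odd), len(even), overlap)
```

Note that the second pattern also matches the empty string, which does not change the result.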

But replacing the {0,1000} with * removes the possibility of a brute-force solution, so a smarter algorithm is needed.

^1(11)*$ ∩ ^(11)*$ = {}
{1,^1(11)*$} ∩ {^(11)*$} = {}
{1,^1(11)*$} ∩ {11,^11(11)*$} = {}
{1,111,^111(11)*$} ∩ {11,^(11)*$} = {}
.....

In another, similar question, one answer was to calculate the intersection regex. Is that possible? If so, how would one write an algorithm to do it?

I think this problem might be in the domain of the halting problem.

EDIT:

I've used the accepted solution to create the DFAs for the example problem. It's fairly easy to see how you can use a BFS or DFS on the graph of states for M_3 to determine if a final state from M_3 is reachable.

DFA solution

It is not in the domain of the halting problem; deciding whether the intersection of regular languages is empty or not can be solved as follows:

  1. Construct a DFA M1 for the first language.
  2. Construct a DFA M2 for the second language. Hint: Kleene's Theorem and Power Set machine construction
  3. Construct a DFA M3 for M1 intersect M2. Hint: Cartesian Product Machine construction
  4. Determine whether L(M3) is empty. Hint: If M3 has n states, and M3 doesn't accept any strings of length no greater than n, then L(M3) is empty... why?

Each of those things can be algorithmically done and/or checked. Also, naturally, once you have a DFA recognizing the intersection of your languages, you can construct a regex to match the language. And if you start out with a regex, you can make a DFA. This is definitely computable.
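
As a rough sketch of step 4, emptiness can be checked with a breadth-first search from the start state. The representation here (a dict keyed by (state, symbol)) and the function name shortest_accepted are my own assumptions. The search never revisits a state, so any witness it finds is shorter than the number of states, which is one way to see the bound in the hint.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, Optional, Set, Tuple

Transition = Dict[Tuple[Hashable, str], Hashable]


def shortest_accepted(start: Hashable, accepting: Set[Hashable],
                      delta: Transition, alphabet: Iterable[str]) -> Optional[str]:
    """Return a shortest accepted string, or None if the language is empty."""
    symbols = sorted(alphabet)
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        state, word = queue.popleft()
        if state in accepting:
            return word
        for symbol in symbols:
            nxt = delta[(state, symbol)]
            if nxt not in seen:  # each state is enqueued at most once
                seen.add(nxt)
                queue.append((nxt, word + symbol))
    return None


# DFA for ^1(11)*$ over the alphabet {1}: track the parity of the input length.
parity = {("even", "1"): "odd", ("odd", "1"): "even"}
print(shortest_accepted("even", {"odd"}, parity, "1"))

# A DFA whose only accepting state is unreachable: the language is empty.
stuck = {("p", "1"): "p", ("q", "1"): "q"}
print(shortest_accepted("p", {"q"}, stuck, "1"))
```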

EDIT:

So to build a Cartesian Product Machine, you need two DFAs. Let M1 = (E, q0, Q1, A1, f1) and M2 = (E, q0', Q2, A2, f2). In both cases, E is the input alphabet, q0 is the start state, Q is the set of all states, A is the set of accepting states, and f is the transition function. Construct M3 where...

  1. E3 = E
  2. Q3 = Q1 x Q2 (ordered pairs)
  3. q0'' = (q0, q0')
  4. A3 = {(x, y) | x in A1 and y in A2}
  5. f3(s, (x, y)) = (f1(s, x), f2(s, y))

Provided I didn't make any mistakes, L(M3) = L(M1) intersect L(M2). Neat, huh?
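
Here is a small sketch of that construction applied to the example from the question, assuming a simple DFA record of my own design (the DFA class, accepts and product_dfa are hypothetical names). The transition table is keyed by (state, symbol) rather than the (symbol, state) order used above.

```python
from itertools import product
from typing import Dict, FrozenSet, Hashable, NamedTuple, Tuple


class DFA(NamedTuple):
    alphabet: FrozenSet[str]
    start: Hashable
    states: FrozenSet[Hashable]
    accepting: FrozenSet[Hashable]
    delta: Dict[Tuple[Hashable, str], Hashable]  # (state, symbol) -> state


def accepts(m: DFA, word: str) -> bool:
    """Run the DFA on word and report whether it ends in an accepting state."""
    state = m.start
    for symbol in word:
        state = m.delta[(state, symbol)]
    return state in m.accepting


def product_dfa(m1: DFA, m2: DFA) -> DFA:
    """Cartesian product machine recognizing L(m1) intersect L(m2)."""
    assert m1.alphabet == m2.alphabet
    states = frozenset(product(m1.states, m2.states))
    accepting = frozenset((x, y) for x, y in states
                          if x in m1.accepting and y in m2.accepting)
    delta = {((x, y), s): (m1.delta[(x, s)], m2.delta[(y, s)])
             for (x, y) in states for s in m1.alphabet}
    return DFA(m1.alphabet, (m1.start, m2.start), states, accepting, delta)


ONE = frozenset("1")
PARITY = frozenset({"even", "odd"})
toggle = {("even", "1"): "odd", ("odd", "1"): "even"}

odd_ones = DFA(ONE, "even", PARITY, frozenset({"odd"}), toggle)    # ^1(11)*$
even_ones = DFA(ONE, "even", PARITY, frozenset({"even"}), toggle)  # ^(11)*$

m3 = product_dfa(odd_ones, even_ones)

# M3 has 4 states, so checking every string up to length 4 decides emptiness.
is_empty = not any(accepts(m3, "1" * n) for n in range(len(m3.states) + 1))
print(len(m3.states), is_empty)
```

With a one-letter alphabet there is only one string per length, so the enumeration stays tiny; for larger alphabets a graph search over the states of M3 would be the more practical check.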

How to read title and id from Blu-ray disc?

6 votes

Is it somehow possible to fetch a Blu-ray Disc's id and title programmatically on Windows 7+?

If you can programmatically open the following files you'll probably get what you need:

/AACS/mcmf.xml - This file is the Managed Copy manifest file and will contain a 'contentID' attribute (in the mcmfManifest tag) that can be used to identify the disc. Typically it is a 32 hexadecimal digit string.
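
A minimal sketch of pulling that attribute out with Python's standard xml.etree.ElementTree. The function name read_content_id and the sample document are made up for illustration. I am assuming the tag may or may not carry an XML namespace, so the code compares local tag names only.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_content_id(xml_text: str) -> Optional[str]:
    """Return the contentID attribute of the mcmfManifest tag, if present."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # drop "{namespace}" if any
        if local_name == "mcmfManifest":
            return element.get("contentID")
    return None


# Hypothetical sample; a real manifest contains more attributes and children.
SAMPLE = '<mcmfManifest contentID="00000000000000000000000000000001"/>'
print(read_content_id(SAMPLE))
```

On a real disc the text would come from reading /AACS/mcmf.xml under the drive's root.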

Sometimes there is also a /CERTIFICATE/id.bdmv file, which contains a 4-byte disc organization id (at byte offset 40) followed by a 16-byte disc id.
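
The layout described above could be read roughly like this. It is only a sketch based on the offsets given here: the names parse_id_bdmv and DiscIds are hypothetical, and the test bytes are synthetic rather than taken from a real disc.

```python
from typing import NamedTuple

ORG_ID_OFFSET = 40  # byte offset of the disc organization id
ORG_ID_LEN = 4
DISC_ID_LEN = 16


class DiscIds(NamedTuple):
    organization_id: str  # hex string, 8 digits
    disc_id: str          # hex string, 32 digits


def parse_id_bdmv(data: bytes) -> DiscIds:
    """Extract both ids from the raw contents of an id.bdmv file."""
    disc_start = ORG_ID_OFFSET + ORG_ID_LEN
    disc_end = disc_start + DISC_ID_LEN
    if len(data) < disc_end:
        raise ValueError("file too short to contain both ids")
    return DiscIds(data[ORG_ID_OFFSET:disc_start].hex(),
                   data[disc_start:disc_end].hex())


# Synthetic input: 40 zero bytes, then a 4-byte and a 16-byte id.
fake = bytes(40) + bytes.fromhex("deadbeef") + bytes(range(16))
print(parse_id_bdmv(fake))
```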

Sometimes there is metadata in the /BDMV/META/DL directory, in the XML file bdmt_eng.xml (replace eng with another 3-letter language code for other languages). For example, on the supplementary disc of The Dark Knight, this file contains:

<di:title><di:name>The Dark Knight Bonus Disc</di:name></di:title>
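
A small sketch of extracting that title with Python's standard xml.etree.ElementTree. The function name read_disc_title is hypothetical, and the sample wraps the fragment above in made-up outer elements with a placeholder namespace URI, since the real file's full structure is not shown here. The code matches on local tag names so it does not depend on the actual namespace.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def local_name(tag: str) -> str:
    """Strip the "{namespace}" prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def read_disc_title(xml_text: str) -> Optional[str]:
    """Return the text of the first name element nested in a title element."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if local_name(element.tag) == "title":
            for child in element:
                if local_name(child.tag) == "name" and child.text:
                    return child.text.strip()
    return None


# Placeholder wrapper and namespace URI; only the inner fragment is from the disc.
SAMPLE = (
    '<disclib xmlns:di="urn:example:discinfo"><di:discinfo>'
    "<di:title><di:name>The Dark Knight Bonus Disc</di:name></di:title>"
    "</di:discinfo></disclib>"
)
print(read_disc_title(SAMPLE))
```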