find me on
 
twitter del.icio.us
linkedin flickr
goodreads flixster
facebook aim
dopplr stackoverflow
tweets
 
del.icio.us
 
recent posts
 
history
 
September 2010
M T W T F S S
« Jul    
 12345
6789101112
13141516171819
20212223242526
27282930  
 
jason.prado (@) gmail.com real tangible
 
Rules? Where we’re going…
 

we don’t need rules.

I was super impressed today upon reading about the SEC requiring Python source code for securities disclosure to represent a description of some complex calculation. They require that funds publish a program that takes input and produces an answer.

The SEC could ask these funds to just write down a bunch of assumptions and rules that, when put into practice, would allow a person who understands the rules to apply them to their input values and arrive at an answer. I argue: that would suck.

Clearly a program that does the work for the human is awesome, but there’s something more important here. The SEC requires that the source code of the program be made available also, so that (and I may be extrapolating here a little) the source code itself can represent the rules and policies of the financial product. This just makes so much sense. The code is the specification. It is authoritative. But, even better, it is naturally understandable.

I’m thinking that imperative programs are the best way to specify a lot of complex behaviors.  Humans seem to intuitively grok linear flow and simple branching. Imperative descriptions even help us see order even when there isn’t really order to see. The cereal decision in the attached example of the heavily-memed “mundane process in long flowchart” is not necessarily sequential, but when we see the options lined up in a branched structure we naturally follow it. The author could have just written “Golden Grahams are great if you’re on food stamps” in a list of a dozen other such declarations, but that would be unfunny, lost in the jumble. I suspect all sets of rules have similar problems.

Declarative information dumps remind me of the nearly-defunct programming language Prolog, wherein, to grossly oversimplify, a programmer considers a problem and writes a set of declarations that the answer to a problem must satisfy. Then a solver reads the declarations and arrives at the answer. The Prolog model is “tell me about the problem and I’ll go solve it”, whereas the imperative method is “solve a problem then tell me how you solved it”. Sounds magical and great, right? Then why don’t any of my friends write Prolog? There were efficiency problems and other factors, including just fashion, but Prolog never took off because declarations are not how people think. People think in steps and conditions and branches and loops. We write checklists and run through them linearly. We think about our thought process explicitly, seeing it as conditional tests we perform in sequential fashion.

I don’t know if this is fundamental to human thought, or if I’ve been infected with imperative thought since I picked up QBasic as a kid. We certainly live in an imperative world, and I can barely imagine how else it could work. Every time I’ve worked with programming tools that deal with logical declarations, they arrive upon results far from what I intended; the rules I input combine in unexpected ways to produce unpredictable behavior. Imperative programs, on the other hand, seem straightforward and almost effortlessly traceable. I’d rather write a spaghetti-code, heavily-branching 200-line state machine than logic out 20 rules that cover the same domain.

But, we’re potentially bumping up against the limits of what imperative solutions to problems can accomplish in computing. Software projects today are more wildly large and complex than ever, and today’s software has to be written to work across multiple CPU cores and across countless machines in a datacenter. All signs point to human reasoning alone not being enough to hack it in this complicated landscape. And that’s bad news, because I tend to think that imperative programming is a natural behavior that easily flows with our way of doing things.

I don’t have any kind of an answer to this dilemma. Maybe I’m a 24-year old dinosaur already, but I’m pretty sold on the important of imperative thought. I have some ideas about a  productive balance that I’ll try to explicate further, but the main idea is to use imperative description for as much problem-solving as possible, then reason about larger elements declaratively (and, in programming  terms, functionally).

Programming is like wishing on a genie
 

I’m reading Guy Steele’s interview in Coders at Work, and I found one of his analogies to be pretty neat:

And if you look at the fairy tales, people want to be able to just think in their minds what they want, wave their hands, and it happens. And of course the fairy tales are full of cautionary tales where you forgot to cover the edge case and then something bad happens.

As a pretty accurate real world example, he offers:

…suppose I were to tell my smart computer, “OK, I’ve got this address book and I want the addresses to always be in sorted order,” and it responds by throwing away everything but the first entry. Now the address book is sorted. But that’s not what you wanted.

Yeah, that’s how programming feels. I remember running into situations exactly like that when programming in Epilog and when trying to annotate code to pass through a static analysis tool. Those problems made me think that the logical/declarative paradigm is just not cut out for being written by humans, whereas functional and certainly imperative models of computation are actually more intuitive, despite their complexity.

I’m not sure Steele would draw the same conclusion, but his list of more than a dozen languages he has worked in seriously does not include Prolog.

% 2 or & 1 ?
 

Should you test an integer for evenness by n % 2 == 0 or n & 1 == 0? I always assumed they compiled down to the same thing, so today I checked. All code compiled in gcc4 -O3.


(gdb) x/8i foo1
0x1fa0 <foo1>: push %ebp
0x1fa1 <foo1+1>: mov %esp,%ebp
0x1fa3 <foo1+3>: mov 0x8(%ebp),%eax
0x1fa6 <foo1+6>: leave
0x1fa7 <foo1+7>: xor $0x1,%eax
0x1faa <foo1+10>: and $0x1,%eax
0x1fad <foo1+13>: ret
0x1fae <foo1+14>: xchg %ax,%ax


(gdb) x/8i foo2
0x1fb0 <foo2>: push %ebp
0x1fb1 <foo2+1>: mov %esp,%ebp
0x1fb3 <foo2+3>: mov 0x8(%ebp),%eax
0x1fb6 <foo2+6>: leave
0x1fb7 <foo2+7>: xor $0x1,%eax
0x1fba <foo2+10>: and $0x1,%eax
0x1fbd <foo2+13>: ret
0x1fbe <foo2+14>: xchg %ax,%ax

Yep, same thing. This worked the same with signed vs. unsigned ints on my intel mac. According to this thread, that isn’t the case on all architectures.

Premature optimizers should stop prematurely optimizing. If taking the remainder when divided by 2 is clearer to you, then do it that way. The compiler doesn’t care.

Convex Hull Algorithm in Scala
 

I’ve decided to forreals learn Scala and forreals learn some computational geometry (I’ve been faking it at my job for a year now). I’m using Processing to draw pixels to the screen, and it turns out to be amazingly easy to access the Java library from Scala. My first algorithm was the naive implementation of convex hull generation (given a set of points, draw the tightest convex polygon possible that contains all points). This implementation runs in O(n^2) but I plan to reimplement it using the O(nlogn) algorithm that’s only slightly more complex.

I think graphics is an interesting example of combining functional and imperative aspects of Scala. The program is all about the side effect of putting pixels on the screen, but functional tools help me get there. Check out this method I used for testing:

  def drawLine(pts: List[Point2D]): Unit = {
      def drawLines(p1: Point2D, p2: Point2D) : Point2D = { line(p1.x, p1.y, p2.x, p2.y); p2 }
      pts.reduceLeft(drawLines);
  }

This is a little hacky but, in my opinion, fairly elegant. Reduce does the work of giving me consecutive pairs and then I use line() to draw to the screen.

I’ve put up all the code here.

Download the Daily Show/Colbert Automagically
 

iTunes has been pissing me off lately– when you download the HD version of something it forces you to download the SD version also to play on an iPod. There is no getting around this other than to let it download. It’s also getting slow and clunky and I’m done buying video episodes off of it.

I tried using an app called ted to automatically download torrents, but it needed constant tinkering and ultimately wasn’t worth the hassle. So yesterday between 9:01am and 9:11am I hacked out a python script to fetch me the Daily Show and Colbert Report off of mininova. To use the script you need python and a torrent client that can watch a directory for incoming torrents; I use Transmission.

To use the script, download it and edit it to put torrents where you want them. Configure your client to watch the same directory, and run the script. I recommend you schedule the script to run every morning using launchd. Notice that the script has no error handling whatsoever and might totally not work for you. It’s worked the past two days for me, so, you know, it works on my machine.

Download script

Diff two XML documents
 

I came across the need to compare two XML trees for equality in unit tests I’m writing for a project you’ll hopefully hear about soon. I started to write a tree-walking comparer then thought I’d google around to see if anyone had done this for me. simplexmldiff is a clever script that just reads in two XML docs, pretty prints them, and then diffs their output. Brilliant. It isn’t perfectly accurate due to whitespace, but I don’t care about that for this project. I reimplemented what I needed of it in 4 lines of python. hellah.