You don't realize you're addicted to variables...

We started off the PyGTA meeting with brainstorming for fun projects that we could do in a "pure functional" manner. Lots of complex and involved things that we'd whip off in a few minutes. But first, just to be sure we were all on the same page, let's whip off a quicky...

  • given a file
  • output the file on stdout sorted by the 5th character of each line
  • use as pure a functional form as possible, with no variable assignments

It is *amazing* how addicted to variables a Python programmer is. All of us could trivially write the code as something like this:

import sys
print "".join(
    [
        x[1]
        for x in sorted(
            [(line[4:5],line) for line in open(sys.argv[1])]
        )
    ]
)

But sorted is just syntactic sugar for creating and sorting a list, we wanted something that didn't use any variables/state, and we wanted to implement the sort ourselves to see how to do it purely functionally. We eventually came up with something... but OMG it was a mess, iterating over the whole file N times (via itertools.tee) searching for the N+1th item. Worse, the implementation of itertools.tee would mean that it was just creating N copies of the lists in dequeues.

So I looked up the Haskel qsort algorithm when I got home... the Python equivalent looks something like this:

def qsort( seq ):
    try:
        n = iter(seq).next()
    except StopIteration, err:
        pass
    else:
        seqs = itertools.tee(seq)
        for i in qsort( x for x in seqs[0] if x < n ):
            yield i
        yield n
        for i in qsort( x for x in seqs[1] if x > n ):
            yield i

Note the presence of variable assignments... which is one of the things we were considering a disqualification for "pure" functional style. The Haskel version uses "lazy lists" for everything, so they don't have the itertools.tee() call, and the n variable is part of the matching-head "unpack".

Which suggests we were being *way* too rigorous in our functional purity to work within the Python context. Sprinkling a bit of functionalism here and there might be useful, but the utility of eliminating all variable assignments is... suspect. That said, it was a very fun exercise to work on as a brain teaser, you don't realize how addicted you are to variable assignments until you declare that you cannot use them.

Comments

  1. Gary Capell

    Gary Capell on 12/16/2009 1:57 a.m. #

    print ''.join(sorted(fileinput.input(), key=lambda x: x[4:5]))

    (I know, using 'sorted' was declared as cheating)

  2. Nick

    Nick on 12/16/2009 4:46 a.m. #

    You're looking at the syntax, not the semantics. There is no "assignment" happening in the Haskell code, I think :)

    http://en.literateprograms.org/Quicksort_%28Haskell%29

    I do not know an awful lot about the language implementation, but I'm under the impression that 'let' and 'where' are pretty much syntactic sugar

  3. grzywacz

    grzywacz on 12/16/2009 6:16 a.m. #

    I'd argue that generators imply more state than using sorted(). ;-)

  4. j_king

    j_king on 12/16/2009 10:05 a.m. #

    There are "assignments" in the sense of lexically bound and immutable "bindings."

    ghci> let pi = 3.14159265
    ghci> pi * 5^2

    It's simply a name bound to some value (or future value). It exists solely within its scope and is immutable. They don't take the name variable since they never vary.

    I think functional Python code can have assignments. You just have to be extra careful than in languages such as Haskell. Be mindful of implicit refferents and watch out for expressions leaking names outside of their scope (I'm looking at you, list comprehension!). As long as your code never re-assigns a value to a declared name in a given scope, what's the difference? :)

    I think your example is pretty good.

  5. Luke Plant

    Luke Plant on 12/16/2009 10:06 a.m. #

    You seem to be avoiding 'sorted' because *internally* it uses state. Surely that misses the point - as long as a function is pure from the outside, it *is* pure. If you disqualify functions that are impure internally, you disqualify all real computer programs - even in Haskell you can't get around putting things in RAM, which by definition will mutate what was there before, and garbage collection etc.

    To put it another way - how do you know that 'sorted' is impure? Its documentation states:

    "Return a new sorted list from the items in iterable."

    How can you prove that it constructs a list and then sorts it? If you have to look up the source code, and cannot prove it from within your Python program, then it *is* a pure function. http://en.wikipedia.org/wiki/Pure_function

    In fact, 'sorted' is *not* pure by that definition, because if you pass it an iterator, that iterator will be observably different afterwards. But exactly the same is true of qsort.

  6. Michele Simionato

    Michele Simionato on 12/16/2009 10:14 a.m. #

    Functional means no mutation, not no variables! You can use as many bindings as you want, provided you no mutate them

  7. Mike C. Fletcher

    Mike C. Fletcher on 12/16/2009 11:18 a.m. #

    I certainly realize that "real world" functional programming can use variables. We were looking for a "purer" form, one where *everything* is reduced to a function with inputs and outputs only. That should always be possible (continuations being a valid way of implementing any language), but it really is a challenge. This was a co-learning exercise, after all.

    The idea of the challenge was to explore the zen-ish core of functionalism where all is functions and everything boils down to mapping and reducing across inputs to map them into outputs. Eliminating variables (and "sorted") was a short-hand to make the "easy" paths off-limits and force us to think purely in terms of decomposing the tasks into compositions of applied functions.

    That said, we came out of the exercise thinking "wow, I learned a lot about thinking in terms of functional operations", we also came out realizing that this approach (restricting to no variables) is not a particularly practical way to program in Python :) .

    We all had the intuitive sense of what qsort would look like (we described it to each other a number of times in approximately the form you see above), we just couldn't come up with a *straightforward* implementation that was composed entirely of map/reduce style functions without assigning to variables.

    The shorthand rules that made the exercise force us to think "functionally" were too rigorous to make clean implementations easy for programmers steeped int the "normal" way of programming. Good for forcing new thought processes, bad for real-world coding.

  8. John

    John on 12/16/2009 7:28 p.m. #

    Here's a simple naive "functional" (meaning no mutation) implementation of quicksort in python:

    http://pastie.org/746576

    I would've pasted it but the comment box is not code friendly.

Comments are closed.

Pingbacks

Pingbacks are closed.