The Beginning and End.

OCaml! Haskell! Gunzip!

For a couple of days, I told myself that I would study OCaml now that I have a decent grasp of Haskell. But it’s actually been pretty difficult to leave Haskell because I am rather enamoured with the language. ❤

I did however, manage to read through a good chunk of OCaml from the Very Beginning, and work through the exercises that accompany each chapter. I found it to be a very solid introductory text, as the author makes no assumptions about the reader’s background. I plan to finish it up at some point, and move on to Real World OCaml.

A couple of things that jumped out at me as I started learning OCaml:

  • The standard library is pretty sparse. Will be checking out Jane Street’s Core.
  • There are mutable data structures.
  • OCaml supports imperative and functional styles.
  • The introductory text, at least OCaml from the Very Beginning, emphasized understanding tail recursion (which I didn’t really think about when learning Haskell, but maybe I did it wrong)
  • Utop is pretty cool.
  • I got confused by = and == because I wanted to use == but that seems to be reserved for people who really know what they’re doing.

For the past few days, I have been exploring file compression and decompression. I needed a new project and starting looking at what previous HSers had done, and I found this blog post by alumna Julia Evans on implementing gunzip in Julia. So I’ve decided to implement it in Haskell! :D :D :D

I’ve been reading up on the algorithm specifications for inflating and deflating and the format information. I tried looking at the source in C but my head started to hurt, so I have been studying Julia’s implementation, which despite me never having seen Julia code before, has been very helpful.

I spent today working on parsing a gzipped file’s header and metadata information. Tomorrow I will read up more on the algorithm description and dive into implementing inflate.

Recursion Friday

Yesterday at Hacker School was Recursion Friday. Basically, a group of HSers sat down and worked on various recursive problems and brain teasers in the languages of our choice.

Recursion used to make me go like this:

Scared

But somewhere along the way as I was learning Haskell, recursion started to click with my previously recursion-addled brain.

One of the questions yesterday was to find all of the permutations of a string. After some white boarding with Denise and Georgi, talking it through with Alan and Tom, we basically came to the conclusion that the algorithm should be something like this (in pseudocode):

1
2
3
4
5
6
7
8
Permute("abcd") =
    "a" + Permute("bcd") 
    <list concatenation> 
    "b" + Permute("acd") 
    <list concatenation>
    "c" + Permute("bcd") 
    <list concatenation>
    "d" + Permute("abc") 

So, generate all of the permutations of the string without one of the elements of the original string, and concat it with all possible permutations of the smaller string, then concat all of those together to get the final list of permutations.

After speaking with Alan about a Haskell implementation, this is what I ended up with

1
2
3
4
5
6
import Data.List

permute :: Eq a => [a] -> [[a]]
permute [] = [[]]
permute xs = [ y | prefix <- xs, y <- map (prefix:)
                   $ permute $ xs \\ [prefix]]

An obnoxiously short list comprehension.

Pull the prefix from the original string, map a list concatenation of the prefix, (prefix:), to the various permutations of the rest of the string. \\ is this nifty function in the Data.List library that performs list difference, so it was an easy way to exclude the prefix from the original string (thanks to Alan for showing me that).

1
2
3
4
5
6
ghci> permute "abcd"

["abcd","abdc","acbd","acdb","adbc","adcb",
"bacd","badc","bcad","bcda","bdac","bdca",
"cabd","cadb","cbad","cbda","cdab","cdba",
"dabc","dacb","dbac","dbca","dcab","dcba"]

Recursion still blows my mind.

Sometimes, Giving Up Is A-okay.

After spending about a week and a half on my Haskell BitTorrent client, I’ve decided to give it a rest. Am I done? No. Am I close to being done? Depends on what “done” means (but no, probably not).

I won’t lie. I felt a pang of guilt today when I decided to put the project on the backburner.

Finn crying
Actually, I was something like this on the inside.

But what I’ve learned is that when a project stops bringing joy and feels tedious, it is much better to move onto something else so my time can be spent productively. I asked my fellow Hacker Schoolers when they felt it was an appropriate time to move on from a project, and I got a lot of great responses. I was told it was okay to abandon a project if it made me feel sad, but I could write some documentation for it to feel some closure. It also wasn’t apparent to me that because I was getting so hung up on trying to finish this project that I was missing other valuable opportunities, and this was pointed out to me by several alums.

I’m pretty glad that I asked this question, because I’m not sure how much longer I could have dragged on the BitTorrent project. I think I’m ready to move onto OCaml and work through Real World OCaml and start afresh.

Bencoding and Networking

One of the things that I have always been fearful of is networking. Sockets, TCP/IP, and forking, all of that. In a class at university, I had done some networking in C; I remember reading Beej’s guide to network programming, and I found myself returning to it, and appreciating it a bit more this time. It’s funny how when you’re learning something in school, you could care less about it. You just want to get from point A to point B in the shortest way possible. But when you have the time to be in a self-directed environment like Hacker School, you find yourself wanting to revisit things that you didn’t learn properly the first time around, wanting to tell your past self: “You should have paid more attention when you were learning this!”

So, to learn networking in Haskell, I have decided to incrementally work towards building a simple Bittorrent client. Yesterday, I read about bencode, the encoding used by the Bittorrent protocol for torrent files (metadata).

Bencode supports four types of values:

  • Integers
  • (Byte) Strings
  • Lists
  • Dictionaries (Hash Maps)

Integers are encoded as i<integer encoded in base 10 ASCII>e. Integers can be negative, but 0 cannot be -0.

ByteStrings are encoded as <length>:<contents>. So “foo” would be encoded as 3:foo. The length of the content can be 0, but cannot be negative.

Lists are encoded as l<contents>e. Contents are bencoded elements of the list (in order), and are concatenated. Something like “cat31” would be encoded as l3:cati31ee.

Dictionaries are encoded as d<contents>e. The keys must be bytestrings, and the dictionary is ordered lexiographically by key. The encoded key value pair follow each other immediately. {“cat”: “meow”, “dog”: 44} would be encoded as d3:cat4:meow3:dogi44ee.

After reading this, I used the Parsec library and wrote a parser for bencoded values in Haskell. I tested it on a torrent for Ubuntu and got back this:

"Metadata"

We can see things like:

and so on.

The crazy wall of text is mostly due to all of the pieces for this particular torrent.

After this, to get a better sense of networking in Haskell, I read up on TCP/IP and started working towards this tutorial on building a multi-threaded chat server in Haskell, and getting a feel for the Network.Socket library.

At the very basic level: Socket –> Bind –> Listen –> Accept.

My networking journey will continue today…

Hacker School: Day 13

Day 13 of Hacker School went by pretty quickly, mostly due to some of us playing hooky to go watch the Germany vs. USA game. We were initially going to go to a German bar, but it turns out there were a lot of Germans there already (I guess they like their futbol). We ended up back tracking and found a bar called Sweet & Vicious, which was sweet (their jargaritas are great), but not particularly vicious. The game was slow paced, and the atmosphere of the bar made it a nice spot to lounge.

dat jargarita

I guess my main reason for leaving in the middle of the day was that I was in a weird headspace after the goal making workshop. I started to doubt my decisions about the things I wanted to do this summer and that was bumming me out. On top of that, I couldn’t install threadscope to save my life, and spent almost five hours trying to get that to run (yay yak-shaving) before just giving up and deciding to read Simon Marlow’s book without it.

I did have a productive talk with Allison yesterday. My worries were that I was working on Haskell, a language that is rather impractical for the purpose of jobs and I wasn’t sure if I should switch to Python or not. Which turns out it was a rather silly fear to have since I am going back to school and won’t ostensibly be looking for a new job for at least two more years. So I have decided to stick with Haskell, and will work on some side projects in Python once in a while when the mood strikes me.

Hacker School: Days 11 & 12

I’m getting behind on my blogging, meep.

Day 11 recap:

  • Spent half of the day trying to debug my sudoku solver’s board parser, which was pretty difficult as everything type checked, but I was getting the wrong output.
  • Learned about the GHCI interactive debugger, which let me place break points and inspect variables where the execution of the program has stopped.
  • Alan gave me a quick overview of “equational debugging”, which is to test little snippets of your program and make sure that it is returning the right output.
  • I ended up pulling out more complicated bits of my code into smaller functions to make equational debugging more manageable.
  • Spent the rest of the day finishing up the sudoku solver. I debugged the constraint propagation aspect of the program, and then implemented the depth first search function which utilized the constraint propagation logic.

Day 12 (today!), I decided to take (another) break from Haskell. I played around for most of the day with Paper.js, trying to make something beautiful, but eventually found that it didn’t actually interest me too much. Around 3:00 pm, Stacy from Winter 2013 batch came by Hacker School to do a goal making workshop. It did really get me thinking about my time at Hacker School, and how I need to figure out when I can say that I have achieved my goal of being literate in Haskell. I think I will spend maybe a couple more weeks max on Haskell, and then switch over to Python. My main fear is that while Haskell is awesome, it isn’t exactly a language that I will be able to work in once I become a professional software engineer.

After hitting a dead end with Paper.js, I decided to pivot and play around with Python. I implemented a simple echo server in Python, where the client will send back what the server sent to it. Then I got the idea to work on a GIF bot for Zulip because sometimes emoji just aren’t enough :). You can ask gif bot “gif me cats” and an API query is sent to Giphy, and gif bot returns a gif that has been tagged with cats, in this example. It’s basically done at this point, and works locally, but I’m not sure what the protocol is for having a permanent presence from the bot. Do I serve it on AWS/Heroku? I think I will ask Allison tomorrow. I’m also curious if I should demo this tomorrow afternoon.

Here is gif bot in action:

"Gif Bot in action"

Hacker School: Days 9 & 10

I didn’t get a chance to blog about last Friday (Day 9) so I’ll do a quick recap:

  • First day of interview prep.
  • Worked on Manage those Phonebooks in Python.
  • Turns out I had a lot of trouble parsing command line inputs, so I spent a good chunk of time on that before I decided to come back to it at the end.
  • This exercise made me feel a bit uncomfortable about how well I know a language. I’ve been spending all of my time on Haskell and haven’t touched Java or Python in a while.
  • Continued reading about Monads.

Yesterday (Day 10), we received new check-in groups. It seems that they rotate every two weeks, and I’m enjoying that as it gives me the chance to interact with more of my fellow Hacker Schoolers in a smaller setting.

I spent a good chunk of the morning finishing up the last assignment from Brent Yorgey’s Haskell course programming a Risk, the boardgame, simulator. It was cool to play around with Random generators to simulate die rolls, and also to simulate 1000 invasions (repeated calls to battle until there are no defenders remaining, or fewer than two attackers left) of varying defending and attacking army sizes. This was encompassed in a Battlefield data type, where Battlefield is defined as:

1
2
3
4
type Army = Int

data Battlefield = Battlefield
                   { attackers :: Army, defenders :: Army }

After this, I ended up stumbling on Peter Norvig’s Sudoku solver paper and decided to implement it in Haskell. I’ve been struggling through lots of type errors but I was surprised and happy to find that I got so engrossed in my task that I stayed at Hacker School for almost 12 hours yesterday. It’s really neat to be this absorbed in something, and feeling myself gain more confidence through being challenged by my own curiosities. Once I finish this up, I will definitely be needing some code review.

Hacker School: Day 8

End of week two. Crazy!

This morning, I spent a bit of time pairing with Dana and Georgi on Arduino and hardware things. Dana introduced us to Firmata, a protocol for interfacing with the Arduino in different programming languages (instead of using the Arduino language, C/C++, or Assembly). She gave us examples in Node.js using Johnny Five and Python using PyFirmata. I ended up playing around with the Python example and an Arduino that was set up with a potentiometer, an LED, and a servo motor. Depending on the value read from the petentiometer, the LED would blink at a different rate, and the servo would move to a different angle.

After seeing how this worked, and looking through the spare hardware parts at Hacker School, we found a photocell, which senses the amount of light being received. I hooked up the photocell to the breadboard and Arduino using this guide from AdaFruit ❤ and a little help from Dana, and now the LED and servo were reacting to the readings from the photocell.

"PyFirmata + Arduino"

After playing around with this, I got an idea to use a temperature and humidity sensor for garden/plant health. The Arduino would gather data, and if your plants need a little love, it would trigger something that would text you (maybe using Twilio) to tend to them. I need to order the temperature/humidity sensor before I can get started but I think this would be a fun hack that encompasses hardware and web development.

Once the hardware workshop concluded, I went back to Haskell where I started reading about Monads (eeep). Actually, Monads so far haven’t been that scary. Confusing, yes, but not scary. Just another level of complexity up from an Applicative Functor. In fact, all Monads are Applicative Functors. The distinguishing factor is that Monads have the (>>=), or bind, method in their definition.

A Monad is defined as:

1
2
3
4
5
class Monad m where
    (>>=) :: m a -> (a -> m b) -> m b
    (>>) :: m a -> m b -> m b
    return :: a -> m a
    fail :: String -> m a

From what I can understand, (>>=) basically lets you pass some value of a Monadic context to a function that expects a normal value, and outputs a Monadic value.

Something else that struck me was that the definition of (>>) seems exactly like the definition of (*>):

1
2
(>>) :: Monad m => m a -> m b -> m b
(*>) :: Applicative f => f a -> f b -> f b

So cool.

(>>) and (*>) ignore the result of the first monadic value/applicative functor but not their effects.

I ended up reading most of LYaH’s chapter on Monads, and have gotten a better understanding of both applicative functors and monads. I still have lots of reading material left, so I will continue that tomorrow.

As today was Thursday, Hacker Schoolers demo’d projects that they had been working on. Many of the projects were interesting, challenging, and quite frankly, a bit intimidating. They ranged from maze/percolater solvers, to CPUs in Clojure, to letting you edit Jekyll blogs in the browser! I wondered to myself if I would ever be presenting at a demo session this summer. The thought did cross my mind that maybe I should give up on learning Haskell because it is getting incredibly difficult for me to process, and everyone else seemed to be much more productive with their time than I was. I don’t know, I will think on this more.

Mihai’s demo on creating music and outputting the result as MIDI files actually inspired me to search for something that would allow me to do that in Haskell. I was able to come across the Euterpea library, which looks very promising. I think it would be fun to create a Markov Chain that took training data of different songs, and created new compositions. I hope to play around with this some in the coming days.

To conclude the day, Marisa, Alex, Rachel, and I started watching a documentary on the font Helvetica. It was very interesting to see how Helvetica’s usage has evolved over time, and also the connotations that it has to people. Going from a clean, revolutionary font to an overused “nameless” font (as it is now employed by many corporations and government agencies) has given Helvetica an interesting reputation. Some people see it as the font of wars (Vietnam, Iraq war), and some see it as the quintessential modern typeface; a debate between modernists and postmodernists.

Hacker School: Day 7

I can’t believe it has already been a week and a half at Hacker School. Wow. 12 weeks seems like a lot but the time is already flying by.

Today, I decided to resume my studies in Haskell. I worked on the week 11 assignment for Brent Yorgey’s Haskell course. I continued working with the Parser data type, implementing some utilies and several new Parsers.

This is where I was introduced to *> and <* and the concept of lifting in functional programming (which I am still having a hard time fully understanding).

Described in the source for module Control.Applicative as:

1
2
3
-- | Sequence actions, discarding the value of the first argument.
    (*>) :: f a -> f b -> f b
    (*>) = liftA2 (const id)

and

1
2
3
-- | Sequence actions, discarding the value of the second argument.
    (<*) :: f a -> f b -> f a
    (<*) = liftA2 const

Using some examples I worked out by hand:

1
ghci> runParser (spaces *> posInt) " 345"

I understood this computation as first:

1
2
ghci> runParser spaces " 345"
Just (" ", "345")

The " ", the result, is disregarded. We pass on “345” to the next call:

1
2
ghci> runParser posInt "345"
Just (345, "")

Since we are ignoring the result of runParser spaces " 345", the final result is:

1
Just (345, "")

Using the same inputs, but for <* this time:

1
ghci> runParser (spaces <* posInt) " 345"

First we run:

1
2
ghci> runParser spaces " 345"
Just (" ", "345")

This time, we don’t disregard " ".

We pass on “345”:

1
2
ghci> runParser posInt "345"
Just (345, "")

Here we are ignoring 345, the result of runParser posInt "345".

The final result ends up being:

1
Just (" ", "")

Cool, how about one more.

1
ghci> runParser (ident <* posInt) "hello 345"
1
2
ghci> runParser ident "hello 345"
Just ("hello", " 345")

We keep "hello" and pass on " 345".

1
2
ghci> runParser posInt " 345"
Nothing

Uh oh. We got Nothing. There is no state for us to use in the return Maybe tuple, so we end up with Nothing.

  • posInt :: Parser Integer : Checks for positive integers in String input.
  • spaces :: Parser String : Checks for spaces in String input.
  • ident :: Parser String : an identiļ¬er can be any nonempty sequence of letters and digits, except that it may not start with a digit.

I got this far in my understanding of these functions, but I still do not quite understand what is being “lifted” behind the scenes.

Lifting in the case of (a -> b) -> f a -> f b is defined in LYaH as “a function that takes a function and returns a new function that’s just like the old one, only it takes a functor as a parameter and returns a functor as the result.” I will mull this over more tonight, and perhaps ask Alan or some alumni in Zulip tomorrow if I am still having a hard time understanding the concept.

For about 30 minutes, I worked through the first six problems of 99 Haskell Problems. It’s been pretty fun to revist things I learned a while back now that I am getting wrapped up in the intricacies of Haskell. My brain feels like it is exploding all the time.

Learning Haskell has been challenging. It is definitely rewarding, but there are many times when I feel discouraged because I don’t know when I will understand things like Applicative Functors like the back of my hand. I often feel afraid that I won’t ever understand these things, but then I tell myself, what’s the rush?

Hacker School: Day 6

Today, I decided to take a mini break from Haskell (erm, I still did a wee bit of Haskell) and work on a one off project. I got in a bit early, around 9:30 am to finish up the Applicative Functors blog post. I was pretty impressed with how many Hacker Schoolers were already in at that point, and felt eager to start working.

The little bit of Haskell I did today was to implement an Alternative instance for a Parser.

If we remember from my previous blogpost, Parser is defined as such:

1
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

The Applicative Parser instance is useful for making parsers of fixed and simple formats. Because we want to be able to handle the concept of choice, we turn to the Alternative class.

The Alternative class is defined something like this:

1
2
3
class Applicative f => Alternative f where
    empty :: f a
    (<|>) :: f a -> f a -> f a

(<|>) denotes a choice, so something like p <|> q means a choice between p or q.

So, my Alternative instance ended up being:

1
2
3
instance Alternative Parser where
    empty = Parser $ const Nothing
    p <|> q = Parser $ \s -> runParser p s <|> runParser q s

An important hint here is that there is already an Alternative instance for Maybe:

1
2
3
4
instance Alternative Maybe where
    empty = Nothing
    Nothing <|> p = p
    Just x <|> _ = Just x

This means that we can simply call runParser p s and runParser q s, which gives us Maybe tuples. We can then use (<|>) from the Alternative instance of Maybe to choose between those two tuples and pass it to the Parser constructor.

If runParser p s succeeds, return its results, else try runParser q s. If that succeeds, return its results. Otherwise, a Nothing will be returned.

Pretty nifty!

The rest of today, I worked on a very simple URL shortener.

I found a tutorial online that used the md5 hashing algorithm and a base 64 encoding to generate a randomized string. The last five characters of this were taken, sanitized, and then used as the key for a key value store in Redis as well as a slug.

I whipped up a simple Flask app for 301 redirects from the shortened URL to the original URL. A quick lookup in Redis, and then a return redirect(location, 301) call did the trick.

http://blog.dariajung.com results in http://localhost:5000/iJp4fHs. When I go to that URL, I am then redirected to my original URL.

The functionality is all very simple, so I don’t check that the protocol identifier and such are in the URL. That might be for another day.

Overall, I think today was rather productive!