Monadic IO in Standard ML

by Alex Nelson, 29 July 2021

As a Haskeller learning Standard ML, I wondered: could one write exclusively “pure” Standard ML? There are a lot of practical reasons why you would want to avoid this, which I’ll get to later, but right now it seems like we should be able to do something like Monadic input/output in Standard ML.

Puzzle: Impure Currying

For the time being, suppose we can distinguish “pure functions” (written generically as f : a -> b) from “impure functions” (written generically as f : a ~> b). We know we can curry pure functions, that is, we have an isomorphism of types

('a, 'b) -> 'c == 'a -> ('b -> 'c)

Naively, we may hope to curry impure functions similarly

naive: ('a, 'b) ~> 'c == 'a ~> ('b ~> 'c)

But if f : ('a,'b) ~> 'c and curryF is its curried form, then is curryF val_a going to trigger any side effects?

No! It’s not until a second argument is supplied to curryF will impure side effects be triggered. This suggests the correct form should be

('a, 'b) ~> 'c == 'a -> ('b ~> 'c)
                 (*  ^^ pure *)

So far so good, right? Well, what does this have to do with monadic IO?

Deriving an IO Type

We see that ('a, unit) is isomorphic to 'a. (This isomorphism is unique up to unique isomorphism, just using the universal properties of the product object.) Plugging in 'b = unit to the Currying isomorphism for impure functions:

('a, unit) ~> 'c == 'a -> (unit ~> 'c)

Then using the isomorphism of ('a, unit) with 'a on the left-hand side gives us

'a ~> 'c == 'a -> (unit ~> 'c)

which gives us an isomorphism of an impure function type on the left, with a pure function on the right. We just christen the unit ~> 'c type as IO 'c (or, in Standard ML, it would be type 'c IO = IO of unit -> 'c). Well, IO is reserved as a module name, so let’s call it JOB instead.

We want Monads!

Where are the >>= and return functions? Where are the monads? Where are the snowdens of yesteryear?

For simplicity, we’ll just try to wrap around print : string -> unit and TextIO.inputN : TextIO.instream * int -> TextIO.vector. These, uh, don’t look like they match the unit -> 'c signature we were hoping for…so, what do we do?

The trick: think of print : string ~> unit which would have its type be isomorphic to string -> (unit ~> unit) = string -> unit JOB. So we can encode print as a monadic function putStr : string -> unit JOB.

Similarly, inputN : TextIO.instream * int ~> TextIO.vector would have its type be isomorphic to (by Currying) int -> (TextIO.instream ~> TextIO.vector). We see TextIO.instream ~> TextIO.vector = TextIO.instream -> (unit ~> TextIO.vector). Thus inputN' : int -> TextIO.instream -> TextIO.vector JOB is an isomorphic Curried version; fixing the input stream to be stdin gives us a getStr : int -> TextIO.vector JOB function.

Converting these impure functions into monadic counter-parts requires similar “massaging” of type signatures to figure out how to implement them.

In this manner, we can implement monadic IO in a straightforward manner. This is all found in the results of Gordon’s 1994 PhD thesis (table 9.1). Well, we patch his code so it works:

infix 1 >> >>=
abstype 'a Job = JOB of unit -> 'a
with
    (* exec : 'a Job -> 'a *)
    fun exec (JOB f)  = f ()
    (* return : 'a -> 'a Job *)
    fun return x      = JOB (fn _ => x)
    (* (>>=) : 'a Job * ('a -> 'b Job) -> 'b Job *)
    fun (JOB f) >>= q = JOB (fn _ => exec (q (f ())))
    (* getStr : int -> TextIO.vector Job *)
    fun getStr n      = JOB (fn _ => TextIO.inputN(TextIO.stdIn, n))
    (* putStr : string -> unit Job *)
    fun putStr s      = JOB (fn _ => print s)
end

(* (>>) : 'a Job * 'b Job -> 'b Job *)
fun p >> q  =  p >>= (fn _ => q);

(* gettingLine : string -> string Job *)
fun gettingLine s =
    getStr 1 >>= (fn c => if c = "\n"
                          then return(s)
                          else gettingLine (s^c));

(* getLine : string Job *)
val getLine = gettingLine "";

val main : unit Job =
    putStr "First name: " >> getLine >>= (fn first =>
    putStr "Second name: " >> getLine >>= (fn second =>
    putStr ("Hello "^first^" "^second^"\n")));
(* exec main; *)

Exercise 1. Prove the >>= and return functions we just implemented actually satisfies the monad laws. (End of Exercise 1)

Exercise 2. Implement this monadic IO as a struct. (End of Exercise 2)

Conclusion: “More Monads! More Monads!”?

I’m mulling over how to approach monadic coding in Standard ML. One of the reasons I’m doing this in Standard ML is, well, there’s an austere beauty to Standard ML that I find in Lisp: Standard ML is a “bare bones” language which allows you to grow around.

But now that I’ve coded up monadic I/O (or, more precisely, have summarized the work others have done), I’m not sure how profitable it would be to continue investigating monads in Standard ML. It’s a bit of a parlor game: amusing but useless.

From the perspective of Standard ML, the only other impure effect which needs to be wrapped in a monad would be references…or something. I imagine this would be done with of Haskell’s IORef, STRef, and ultimately MutVar primitive.

On the other hand, for monads, I imagine the only interesting monads worth implementing would be State and ST, but I also could be pursuaded that Functor and Applicative would be fun.

Remark (As a Parlor Game…). I do think it is fun trying to implement Haskell concepts in Standard ML, just to better understand the concept. I’m not sure it’s worth doing for production stuff. This “growing a Haskell” is a curious challenge, but one like sudoku: I don’t imagine it ever being useful in real life.

There may be further monads worth playing with, or creating a hierarchy of monad modules. (End of Remark)

Appendix: Should we be Pure in Standard ML?

There’s some folklore suggesting there’s performance hits if we try being pure in a call-by-value language. Pippenger first published this result using a small Lisp.

Bird and friends cleared this up, that lazy languages do not experience similar performance hits.

Together, these papers have been heralded as the alpha and omega on the subject of pure functional programming: it is performant only in lazy languages.

References

Monads in Standard ML