Statically-Typed Functional Languages
Table of Contents
1. Introduction
Statically-typed functional programming languages appear to be variations on a theme with slight differences among syntax. One way to derive statically-typed functional languages is:
- Start with a typed lambda-calculus
- Add algebraic data types
This will give the "floor model" of such languages, then we can add more bells and whistles and syntactic sugar.
1.1. What are "algebraic data types"?
I have not found a rigorous definition anywhere, but a suitable implementation of algebraic data types requires the following:
- Iso-recursive types
- Variant types
- Product ("tuple") types
What's an "iso-recursive type"? Well, recursive types are formalized using a new binder (μ) with a type variable \(X\). This is called the Fixpoint Type Constructor.
We then have two functions
- \(\mathrm{unfold}\colon\mu X.\tau\to\tau[(\mu X.\tau)/X]\) where \(\tau[(\mu X.\tau)/X]\) is obtained by replacing all free instances of \(X\) by \(\mu X.\tau\), and
- \(\mathrm{fold}\colon\tau[(\mu X.\tau)/X]\to\mu X.\tau\)
Such that
- \(\mathrm{fold}\circ\mathrm{unfold}=\mathrm{id}_{\mu X.\tau}\) is the identity function, and
- \(\mathrm{unfold}\circ\mathrm{fold}=\mathrm{id}_{\tau[(\mu X.\tau)/X]}\) is the identity function.
An algebraic data type is then a variant, each branch being a product type (possibly recursive). For example, a list:
- Type definition: \(\tau~\mathrm{list} = \mu X.(\mathbf{1} + (\tau\times X))\)
- Technically, we have it be \(\forall\tau.\mu X.(\mathbf{1} + (\tau\times X))\)
- Haskell swaps \(\mu\) and \(\forall\) binders, confusingly enough
- Constructors:
- \(\mathrm{nil}_{\tau} = \mathrm{fold}_{\tau~\mathrm{list}}(\mathrm{injl~unit})\)
- \(\mathrm{cons}_{\tau} = \lambda x{:}\tau.\lambda\ell{:}\tau~\mathrm{list}.\mathrm{fold}_{\tau~\mathrm{list}}~\mathrm{injr}(x,\ell)\)
- Pattern matching (destructuring, etc.) then uses "unfold", specifically \(\lambda\ell{:}\tau~\mathrm{list}.\mathrm{case}~(\mathrm{unfold}_{\tau~\mathrm{list}}\ell)~\mathrm{of}\dots\)
2. GHC Core
We can write a minimal functional language using System F, which is what Haskell does with its intermediate Core language:
data Expr = Var Var | Lit Literal | App Expr Expr | Lam Var Expr -- Both term and type lambda | Let Bind Expr | Case Expr Var Type [(AltCon, [Var], Expr)] | Type Type -- Used for type application data Var = Id Name Type -- Term variable | TyVar Name Kind -- Type/kind variable type Kind = Type data Type = TyVarTy Var | LitTy TyLit | AppTy Type Type | TyConApp TyCon [Type] | FunTy Type Type -- Not really necy | ForAllTy Var Type
2.1. Prenex Polymorphism
We actually don't need "full" System F. We can restrict attention to rank-1 polymorphism, where the ∀ and big Λ binders appear in "prenex form" (i.e., outermost position). So, we can write:
data MonoType = TyVarTy Var | LitTy TyLit | AppTy MonoType MonoType | TyConApp TyCon [MonoType] | FunTy MonoType MonoType -- Not really necy data Type = TSType MonoType | ForAllTy Var TypeScheme
Or something like that.
This will have decidable type checking, whereas System F does not have decidable type checking (it's known to be undecidable).