
XUnit Framework - SML


1. Overview

I'm going to iteratively construct an xUnit testing framework in Standard ML, just to get acquainted with programming in the language.

We will have one test suite per file, which consists of many test cases (or just "tests"). A test case consists of one or more assertions.

2. First Draft

We will just have an assert function which reports failure by means of raising an AssertionFailure exception. Since Standard ML is not lazy, the condition is evaluated before assert is ever called; assert simply checks whether the given condition was satisfied and, if not, raises an AssertionFailure exception with a user-given message for details.

exception AssertionFailure of string;

fun assert (msg : string) (is_success : bool) : unit =
    if is_success then () else raise AssertionFailure msg;
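
For example, a hypothetical assertion, just to show the calling convention:

assert "2 is even" (2 mod 2 = 0);

(* Evaluates to (); changing the condition to (2 mod 2 = 1) would
   raise AssertionFailure "2 is even". *)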

Now for Test, which is a composite pattern of test cases. A test case has a name encoded as a string, and a function which encodes the various assertions.

A TestSuite is just a list of Test instances, with some name (also encoded as a string).

datatype Test = TestCase of string*(unit -> unit)
              | TestSuite of string*(Test list);
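
To illustrate the composite structure, a suite with a single case could be built directly from the constructors (a throwaway example; nicer wrappers appear below):

val ex : Test =
    TestSuite ("examples",
               [TestCase ("truth", fn () => assert "true is true" true)]);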

Great, now we need to run through the tests.

For test cases, TestCase (name, assertion), we call assertion() and handle any exceptions raised. There are two kinds of exceptions we expect: first, AssertionFailure exceptions, which indicate an assertion failed; second, any other exception which the user did not adequately handle.

For TestSuite instances, we simply call run recursively on its tests.

fun run (TestCase (name, assertion))
    = (assertion()
       handle AssertionFailure msg => print (concat ["Test ",
                                                     name,
                                                     " failed: ",
                                                     msg,
                                                     "\n"])
            | e => print (concat ["Unhandled exception in test ",
                                  name,
                                  ": ",
                                  exnMessage e,
                                  "\n"]))
  | run (TestSuite (_, tests)) = app run tests;

Now we could leave it here, and we have a nifty little testing framework. But I'd like to add two convenience functions to make writing test cases (and test suites) a little easier.

fun test (name:string) (assertion:unit -> unit) : Test =
    TestCase (name, assertion);

fun suite (name:string) (tests : Test list) : Test =
    TestSuite (name, tests);

Now we could run this on some example tests:

val ex_tests = suite "Arithmetic Tests"
                     [test "Arithmetic test 1" (fn () => assert "1+1=2" (1+1=2)),
                      test "Arithmetic test 2" (fn () => assert "1+1=3" (1+1=3))];

run ex_tests;
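
For this suite, run prints something like the following (the passing test produces no output):

(* Prints:

Test Arithmetic test 2 failed: 1+1=3
*)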

3. Refactoring Test Results and Test Reporting

Now, xUnit testing wants to produce an "artifact" from running through the tests which reports the results. We did this by simply having run print a message whenever a test failed or raised an unexpected exception.

A better approach would be to create a TestResult datatype, have run : Test -> TestResult, and introduce a new function report : TestResult -> unit which produces some kind of artifact.

We have TestResult keep track of the TestOutcome and some metadata (like the amount of time it took to execute the test). Like the Test datatype, TestResult is a composite pattern (since we want to keep track of possibly nested test suites).

(* Extend the framework to handle printing test outcomes. *)
datatype TestOutcome = TestSuccess
                     | TestFail of string
                     | TestException of exn;

datatype TestResult = ResultCase of string*Time.time*TestOutcome
                    | ResultSuite of string*Time.time*(TestResult list);

We can now modify our run function to produce a TestResult instance.

fun run_test (test : Test) : TestResult =
let
  val start = Time.now()
in
  case test of
      (TestCase (case_name, assertion)) =>
      ((assertion();
        ResultCase (case_name, (Time.-)(Time.now(), start), TestSuccess))
       handle AssertionFailure msg => ResultCase (case_name,
                                                  (Time.-)(Time.now(), start),
                                                  TestFail msg)
            | e => ResultCase (case_name,
                               (Time.-)(Time.now(), start),
                               TestException e))
    | (TestSuite (suite_name, tests)) =>
      (let
          val results = map run_test tests
          val dt = (Time.-)(Time.now(), start)
      in
          ResultSuite (suite_name, dt, results)
      end)
end;

Now the first stab at reporting these results will be simply to print them to the screen. Superficially, this will appear no different from before. Later, we can abstract reporting away behind a signature REPORTER, and have different formats and reports implemented accordingly.

We will modify our printed output a little to include the number of seconds it took to execute the test (and the suite). For our toy test suite, this will be so fast it will probably print "0.000" seconds. If we want to know exactly how long the test took, we could print the microseconds instead.

(* `Time.fmt n interval` will produce a string representation of the
number of seconds in the interval, to `n` digits after the decimal
point. *)
fun intervalToString (dt : Time.time) : string =
    Time.fmt 3 dt;
    (* or, for microseconds:
       (LargeInt.toString (Time.toMicroseconds dt))^"us"; *)

(* Print the results to the terminal *)
fun report_results (ResultCase (name,dt,outcome)) =
    (case outcome of
         TestSuccess => print (concat ["Test ",
                                       name,
                                       " (",
                                       intervalToString dt,
                                       ")",
                                       ": SUCCESS\n"])
       | (TestFail msg) => print (concat ["Test ",
                                          name,
                                          " (",
                                          intervalToString dt,
                                          "): FAIL -- ",
                                          msg,
                                          "\n"])
       | (TestException e) => print (concat ["Test ",
                                             name,
                                             " (",
                                             intervalToString dt,
                                             "): UNHANDLED EXCEPTION ",
                                             exnMessage e,
                                             "\n"]))
  | report_results (ResultSuite (name,dt,outcomes)) =
    (print (concat ["Suite ", name, " (", intervalToString dt, ")\n"]);
     app report_results outcomes);

Now we could run this on the example tests by doing something like:

val ex_results = run_test ex_tests;

report_results ex_results;

(* Prints:

Suite Arithmetic Tests (0.000)
Test Arithmetic test 1 (0.000): SUCCESS
Test Arithmetic test 2 (0.000): FAIL -- 1+1=3
*)

So far, so good!

4. Summarize results

It's not terribly useful to print out every test case which succeeds, since that clutters up the screen. We will then only print out failures, and produce a one line summary of the number of tests run, the number of failures, and the number of unhandled errors. We will simply use a record type to track the number of successes, failures, and errors.

type ResultSummary = {success : int, fail : int, errors : int};

val no_results = {success=0, fail=0, errors=0};

Now, we could create a function which transforms a TestResult into a ResultSummary, but we need to collapse nested summaries into a single summary instance. Towards this end, we have a merge_summaries function which will be used to fold them all together.

fun merge_summaries (({success=s1, fail=f1, errors=e1},
                      {success=s2, fail=f2, errors=e2}) : ResultSummary*ResultSummary)
    : ResultSummary
    = {success = s1+s2,
       fail = f1 + f2,
       errors = e1 + e2};
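
For instance, merging a lone success into the empty summary:

merge_summaries ({success=1, fail=0, errors=0}, no_results);

(* SML/NJ produces:

val it = {errors=0,fail=0,success=1} : {errors:int, fail:int, success:int}
*)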

Now we summarize each ResultCase by its outcome, and each ResultSuite by summing over the summaries of its constituents.

(* summarize : TestResult -> ResultSummary *)
fun summarize (ResultCase (_, _, outcome)) =
    (case outcome of
         TestSuccess => {success=1, fail=0, errors=0}
       | (TestFail _) => {success=0, fail=1, errors=0}
       | (TestException _) => {success=0, fail=0, errors=1})
  | summarize (ResultSuite (_, _, outcomes)) =
    foldl merge_summaries no_results (map summarize outcomes);

Now we could check the summary of our toy test suite:

summarize ex_results;

(* SML/NJ produces:

val it = {errors=0,fail=1,success=1} : {errors:int, fail:int, success:int}
*)

5. Quality of life helper functions

Right now, it's rather tedious to write a test suite. The conventions I'm loosely following (xUnit, influenced by elements of JUnit) have each test suite contained in its own file. So I would love to write something like:

suite "FooBarTests";

test "BaazTest1" fn () => (* ... *);

test "BaazTest2" fn () => (* ... *);

(* etc. *)

I'd like the library to simply accumulate the test cases, as they are defined, into the current test suite. This requires side effects, specifically ref cells.

val current_suite : (string*(Test list)) ref = ref ("", []);
val all_suites : (Test list) ref = ref [];

The test runner will simply iterate through all_suites to produce a corresponding list of test results, which will be iteratively processed by a reporter.

Now, we revisit our suite and test functions to make them behave as we would like. Specifically, our suite function will append the current_suite's contents to all_suites (after wrapping them in a TestSuite instance), then start a fresh suite under the new name. I'm not sure if I will need suite to do anything else, so I will place this procedure in a helper function:

fun append_suite (name : string) =
    let
        val (prev_name, tests) = !current_suite
    in
        (* Flush the suite accumulated so far, unless it is empty (as
           on the very first call).  The tests were prepended, so
           reverse them back into definition order. *)
        if null tests
        then ()
        else all_suites := (TestSuite (prev_name, rev tests)) :: (!all_suites);
        current_suite := (name, [])
    end;

Now, we have our redefinition for suite:

fun suite (name : string) =
    append_suite(name);

The test function will update current_suite, adding a new TestCase to its list of tests.

fun test (name : string) (assertion : unit -> unit) : unit =
    let
        val (suite_name, tests) = !current_suite
    in
        current_suite := (suite_name, (TestCase (name, assertion))::tests)
    end;
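
With both functions redefined, the runner described above is a short sketch (run_test is the function from Section 3, and run_all_suites is a name of my own choosing). Calling append_suite with a dummy name flushes the final pending suite, since nothing else will, and all_suites is reversed because suites were prepended:

fun run_all_suites () : TestResult list =
    (append_suite "";  (* flush whatever suite is still pending *)
     map run_test (rev (!all_suites)));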

6. Reporter Module

We can now refactor the reporter routines into their own modules. The first example of this will be a terse summary of failed tests and a summary of each test suite's results.

First, we abstract away the signature we'd expect for a reporter. It has a single function, report, which will produce an artifact for a given test result. Sometimes we just print a summary of the results to the screen, in which case the artifact has type unit. Other times, we may produce an XML snippet for each test suite, which are then written to a file.

signature REPORTER = sig
    type t; (* type of the artifact produced *)
    val report : TestResult -> t;
end;

We will begin with a summary of the test results along the lines of JUnit, namely printing a line only when a test case has failed or experienced an error, then printing a one-line summary for the test suite. These could be done in two helper functions, but I'm lazy, so I'm going to write it all at once.

The output for a test suite which has no errors or failures would consist of two lines:

Running <file path>
Tests run: <number>, Failures: <number>, Errors: <number>, Skipped: <number>, Time elapsed: <interval> s - in <file path>

For now, I will simply use the test suite name instead of the path. We package this together in JUnitTt, a structure which writes to the terminal (hence the Tt suffix) a summary imitating JUnit's output.

fun count_tests_run ({success=s1, fail=f1, errors=e1} : ResultSummary)
    = s1 + f1 + e1;

structure JUnitTt : REPORTER = struct
  type t = unit;

  fun report (ResultCase (name, dt, outcome)) =
      (case outcome of
           (TestFail msg) => print (name^" FAIL: "^msg^"\n")
         | (TestException e) => print (name^" ERROR: "^(exnMessage e)^"\n")
         | _ => ())
    | report (r as (ResultSuite (name, dt, results))) =
      let
          val summary = summarize r
      in
          (print ("Running "^name^"\n");
           app report results;
           print (concat ["Tests run: ", Int.toString (count_tests_run summary),
                          ", ",
                          "Failures: ", Int.toString (#fail summary),
                          ", ",
                          "Errors: ", Int.toString (#errors summary),
                          ", ",
                          (* "Skipped: ", #skipped summary, " ", *)
                          "Time elapsed: ", intervalToString dt,
                          " - in ", name,
                          "\n"]))
      end
end;

Now we can run this on our example test results, which will produce something like the following:

- JUnitTt.report ex_results;
Running Arithmetic Tests
Arithmetic test 2 FAIL: 1+1=3
Tests run: 2, Failures: 1, Errors: 0, Time elapsed: 0.000 - in Arithmetic Tests
val it = () : JUnitTt.t

7. XML Output

This is just to have a Jenkins-compatible artifact, so I could have a continuous integration framework test whatever I'm working on (at least, in theory). The best summary of JUnit's schema seems to be found here. Actually, with our work done so far, this amounts to just an exercise in writing some SML code.

Schematically, the XML output from JUnit looks like:

<?xml version="1.0" encoding="UTF-8"?>
<!-- a description of the JUnit XML format and how Jenkins parses it. See also junit.xsd -->

<!-- if only a single testsuite element is present, the testsuites
     element can be omitted. All attributes are optional. -->
<testsuites disabled="" <!-- total number of disabled tests from all testsuites. -->
            errors=""   <!-- total number of tests with error result from all testsuites. -->
            failures="" <!-- total number of failed tests from all testsuites. -->
            name=""
            tests=""    <!-- total number of tests from all testsuites. Some software may expect to only see the number of successful tests from all testsuites though. -->
            time=""     <!-- time in seconds to execute all test suites. -->
            >
  <!-- testsuite can appear multiple times, if contained in a testsuites element.
       It can also be the root element. -->
  <testsuite name=""      <!-- Full (class) name of the test for non-aggregated testsuite documents.
                               Class name without the package for aggregated testsuites documents. Required -->
         tests=""     <!-- The total number of tests in the suite, required. -->
         disabled=""  <!-- the total number of disabled tests in the suite. optional -->
         errors=""    <!-- The total number of tests in the suite that errored. An errored test is one that had an unanticipated problem,
                           for example an unchecked throwable; or a problem with the implementation of the test. optional -->
         failures=""  <!-- The total number of tests in the suite that failed. A failure is a test which the code has explicitly failed
                           by using the mechanisms for that purpose. e.g., via an assertEquals. optional -->
         hostname=""  <!-- Host on which the tests were executed. 'localhost' should be used if the hostname cannot be determined. optional -->
         id=""        <!-- Starts at 0 for the first testsuite and is incremented by 1 for each following testsuite -->
         package=""   <!-- Derived from testsuite/@name in the non-aggregated documents. optional -->
         skipped=""   <!-- The total number of skipped tests. optional -->
         time=""      <!-- Time taken (in seconds) to execute the tests in the suite. optional -->
         timestamp="" <!-- when the test was executed in ISO 8601 format (2014-01-21T16:17:18). Timezone may not be specified. optional -->
         >

    <!-- Properties (e.g., environment settings) set during test
     execution. The properties element can appear 0 or once. -->
    <properties>
      <!-- property can appear multiple times. The name and value attributes are required. -->
      <property name="" value=""/>
    </properties>

    <!-- testcase can appear multiple times, see /testsuites/testsuite@tests -->
    <testcase name=""       <!-- Name of the test method, required. -->
          assertions="" <!-- number of assertions in the test case. optional -->
          classname=""  <!-- Full class name for the class the test method is in. required -->
          status=""
          time=""       <!-- Time taken (in seconds) to execute the test. optional -->
          >

      <!-- If the test was not executed or failed, you can specify one
           of the skipped, error or failure elements. -->

      <!-- skipped can appear 0 or once. optional -->
      <skipped/>

      <!-- Indicates that the test errored. An errored test is one
           that had an unanticipated problem. For example an unchecked
           throwable or a problem with the implementation of the
           test. Contains as a text node relevant data for the error,
           for example a stack trace. optional -->
      <error message="" <!-- The error message. e.g., if a java exception is thrown, the return value of getMessage() -->
         type=""    <!-- The type of error that occured. e.g., if a java execption is thrown the full class name of the exception. -->
         ></error>

      <!-- Indicates that the test failed. A failure is a test which
       the code has explicitly failed by using the mechanisms for
       that purpose. For example via an assertEquals. Contains as
       a text node relevant data for the failure, e.g., a stack
       trace. optional -->
      <failure message="" <!-- The message specified in the assert. -->
           type=""    <!-- The type of the assert. -->
           ></failure>

      <!-- Data that was written to standard out while the test was executed. optional -->
      <system-out></system-out>

      <!-- Data that was written to standard error while the test was executed. optional -->
      <system-err></system-err>
    </testcase>

    <!-- Data that was written to standard out while the test suite was executed. optional -->
    <system-out></system-out>
    <!-- Data that was written to standard error while the test suite was executed. optional -->
    <system-err></system-err>
  </testsuite>
</testsuites>
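
As a first stab, here is a sketch of a reporter producing such XML, under the REPORTER signature from the previous section. The artifact is the XML as a string; writing it to a file, escaping attribute values, and all the optional attributes are left out for brevity, the structure name JUnitXml is my own, and nested suites are simply emitted as nested testsuite elements (a strict consumer of junit.xsd may want them flattened):

structure JUnitXml : REPORTER = struct
  type t = string;

  fun report (ResultCase (name, dt, outcome)) =
      concat ["  <testcase name=\"", name,
              "\" time=\"", intervalToString dt, "\">",
              (case outcome of
                   TestSuccess => ""
                 | (TestFail msg) =>
                   concat ["<failure message=\"", msg, "\"></failure>"]
                 | (TestException e) =>
                   concat ["<error message=\"", exnMessage e, "\"></error>"]),
              "</testcase>\n"]
    | report (r as (ResultSuite (name, dt, results))) =
      let
          val summary = summarize r
      in
          concat (["<testsuite name=\"", name,
                   "\" tests=\"", Int.toString (count_tests_run summary),
                   "\" failures=\"", Int.toString (#fail summary),
                   "\" errors=\"", Int.toString (#errors summary),
                   "\" time=\"", intervalToString dt, "\">\n"]
                  @ map report results
                  @ ["</testsuite>\n"])
      end
end;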

8. Test Discovery

Ideally, my project should look like:

├── src
│   ├── main.sml
│   └── foo.sml
├── test
│   ├── xunit.sml
│   ├── runner.sml
│   ├── main.sml
│   └── foo-test.sml
└── README.md

That is to say, the tests reside in their own directory, separate from the source code being tested. We would like runner.sml to iterate through the files and subdirectories, executing the tests and suites defined there.

(The reason we have separate main.sml files is because of how idiosyncratic each SML compiler is, and we want to support Poly/ML and MLton.)

Writing code for test discovery may be a bit tricky. I think the fact of the matter is that we'll need to add each test suite to the build file, and automatically use the relative path as the test suite's name. This would involve using OS.FileSys and possibly OS.Path, and revising our suite function to take no arguments (since the name would be inferred automatically). A sketch of the traversal appears below.
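
Here is a sketch of that traversal using only the Basis library. The function name find_tests and the -test.sml naming convention are my own assumptions; how the resulting paths get loaded (use in Poly/ML, the build file for MLton) is left open:

(* Recursively collect all files ending in "-test.sml" under dir. *)
fun find_tests (dir : string) : string list =
    let
        val stream = OS.FileSys.openDir dir
        fun loop acc =
            case OS.FileSys.readDir stream of
                NONE => (OS.FileSys.closeDir stream; rev acc)
              | (SOME entry) =>
                let
                    val path = OS.Path.concat (dir, entry)
                in
                    if OS.FileSys.isDir path
                    (* revAppend keeps subdirectory results in order
                       after the final rev *)
                    then loop (List.revAppend (find_tests path, acc))
                    else if String.isSuffix "-test.sml" entry
                    then loop (path :: acc)
                    else loop acc
                end
    in
        loop []
    end;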

We can do test discovery "by hand" using whatever build process we have, together with the functions suite and test. The test runner will then iterate through the test suites (which are loaded into all_suites) and execute them. Thus test discovery boils down to maintaining a list of files to compile in the Makefile (or equivalent) in the project/test/ subdirectory.

Last Updated 2023-09-25 Mon 08:59.