How to test Scalaz type class instances with specs2 and ScalaCheck

A few days ago I found out how easy it is to test Scalaz type class instances with specs2 and ScalaCheck. It was impressive to see that it only took a few additional lines of code in my test class to check that my type not only has a Monoid instance but actually satisfies the monoid laws.

In this post I want to show a minimal monoid example and what is needed for testing the monoid laws. Let me briefly recap the monoid laws and why you want to test them. As Wikipedia will tell you, a monoid is a set, S, together with a binary operation "•" (called append by Scalaz) that satisfies the following three axioms:

Closure: For all a, b in S, the result of the operation a • b is also in S.

Associativity: For all a, b and c in S, the equation (a • b) • c = a • (b • c) holds.

Identity element: There exists an element e (called zero by Scalaz) in S, such that for all elements a in S, the equation e • a = a • e = a holds.

In any implementation of Monoid[S], only closure is guaranteed automatically by the compiler, thanks to the signature of append (def append(a: S, b: => S): S). Whether append is actually associative, or whether zero is actually the identity element under append, is not guaranteed at all. To make sure that your Monoid is actually a monoid, you could either prove the laws for your current implementation (and prove them again whenever you change it) or test whether the laws hold for a number of random values. And this is where Scalaz, specs2, and ScalaCheck play nicely together and make your law testing a lot easier!
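Spelled out by hand, the laws are ordinary boolean properties over values. Here is a quick sketch for the canonical monoid of integers under addition — exactly the kind of check that ScalaCheck will later automate with random inputs:

```scala
// Monoid laws for (Int, +, 0), written as plain predicates.
def associativeLaw(a: Int, b: Int, c: Int): Boolean =
  ((a + b) + c) == (a + (b + c))

def identityLaw(a: Int): Boolean =
  (0 + a) == a && (a + 0) == a

// A few spot checks with fixed values; ScalaCheck generalizes
// this to hundreds of randomly generated inputs.
assert(associativeLaw(1, 2, 3))
assert(identityLaw(42))
```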

Let's look at the example code. We start with the class for which we'll add a Monoid instance:

case class Vec(x: Int, y: Int)

Vec is a simple container of two integers and our Monoid[Vec] provides componentwise addition:

import scalaz._

object Vec {
  implicit object VecMonoid extends Monoid[Vec] {
    def zero: Vec = Vec(0, 0)
    def append(v1: Vec, v2: => Vec): Vec = Vec(v1.x + v2.x, v1.y + v2.y)
  }

  implicit object VecEqual extends Equal[Vec] {
    def equal(v1: Vec, v2: Vec): Boolean = v1 == v2
  }
}

Equal[Vec], which can test two Vecs for equality, is required for the law testing.

The next step is to write the actual test code. ScalaCheck requires an implicit Arbitrary[Vec] in order to generate random Vec values. Once we have that, we ask scalaz-specs2 to check all monoid laws for Vec in one line of code (and while we are at it, we also check the equal laws):

import org.scalacheck.Arbitrary
import org.specs2.scalaz._
import scalaz.scalacheck.ScalazProperties._

class VecSpec extends Spec {
  implicit val arbitraryVec: Arbitrary[Vec] = Arbitrary {
    for {
      (x, y) <- Arbitrary.arbitrary[(Int, Int)]
    } yield Vec(x, y)
  }

  checkAll(monoid.laws[Vec])
  checkAll(equal.laws[Vec])
}

Et voilà! That is all it takes to test the laws for Monoid[Vec]. Simple, isn't it? If we run VecSpec, we'll get the following output:

VecSpec

monoid must satisfy
+ monoid.semigroup.associative
+ monoid.left identity
+ monoid.right identity

equal must satisfy
+ equal.commutativity
+ equal.reflexive
+ equal.transitive
+ equal.naturality

Total for specification VecSpec
Finished in 20 ms
7 examples, 700 expectations, 0 failure, 0 error

For each law, 100 tests with random Vec values were performed, and all of them passed. Now we can be a lot more confident that our Monoid[Vec] is actually a monoid.

The complete example with an sbt build file is on GitHub (scalaz-monoid-example). Just clone it and run sbt test to see the law testing in action. :-)

Abusing a non-deleting shared_ptr for lifetime tracking

In this post I want to demonstrate how C++11 smart pointers can be abused to track the lifetime of an object without taking ownership of it. Due to the resemblance to std::enable_shared_from_this, I call this technique enable_weak_from_this. One word of warning though: this technique is a crude hack and can lead to dangling pointers and undefined behavior. You should NOT use it in production code.

Intent

Allow an object t, whose lifetime is managed by another object (which is not a smart pointer), to create smart references to itself that know whether t still exists or has already been deleted.

Implementation

The core of enable_weak_from_this is a private std::shared_ptr member variable that is constructed with the this pointer and a custom deleter that does not delete the "managed" pointer on destruction. At first this might seem pointless, since the job of shared_ptr is to delete a pointer once no other shared_ptr references it. But what remains is the reference counting mechanism, which can be used to track the lifetime of t: the shared_ptr member of t guarantees that during t's lifetime the reference count is always greater than or equal to one, and after t is deleted the reference count drops to zero. So the reference count of the private shared_ptr member "tracks" the lifetime of t.

To create smart references from t to itself, there is a public method that returns a std::weak_ptr constructed from the shared_ptr member. These non-owning weak references allow access to t (via a prior conversion to a shared_ptr) during its lifetime.

Here is the complete definition of enable_weak_from_this:
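(The definition was embedded as a snippet in the original post; the following is a sketch reconstructed from the description above, so details such as the accessor name weak_from_this are assumptions.)

```cpp
#include <memory>

// Sketch of enable_weak_from_this: a CRTP base class holding a
// shared_ptr to *this whose deleter does nothing, so it tracks the
// object's lifetime without owning it. A crude hack, not for production.
template <typename T>
class enable_weak_from_this {
public:
    // Non-owning smart reference to *this; it expires as soon as
    // self_ is destroyed, i.e. when the object itself is destroyed.
    std::weak_ptr<T> weak_from_this() { return self_; }

protected:
    enable_weak_from_this()
        // no-op deleter: the reference count is used, the deleting is not
        : self_(static_cast<T*>(this), [](T*) {}) {}

private:
    std::shared_ptr<T> self_;
};
```

Note that a real implementation would also have to deal with copying: the compiler-generated copy constructor would copy self_, so weak references created from a copy would track the original object instead of the copy.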

Sample Code

And here is a small example that uses enable_weak_from_this, which I'll explain below:

A inherits from enable_weak_from_this<A>, and therefore objects of type A are able to create those smart references to themselves. A also has a method to create a "child" object of type B, which is constructed with a weak reference to its creator. All usage of the weak reference happens in B::callHello, where weak_ptr::lock() is used to create a temporary shared_ptr<A> to the creator object if it still exists. If the creator has been deleted, lock() returns an empty shared_ptr. Finally, main demonstrates what happens when b->callHello() is called before and after the creator of b is deleted.

Problems

It is possible to create a shared_ptr to t whose lifetime exceeds the lifetime of t by storing the return value of weak_ptr::lock(). Since the deleter does nothing, such a shared_ptr does not keep t alive — it silently dangles once t is deleted. So care must be taken that t is not deleted while the acquired shared_ptr is in use.

Combining predicates in Scala

This post is a follow-up to my Stack Overflow question on combining predicates in Python, but with the focus on Scala. It is also a little warm-up programming exercise for similar techniques I expect to learn from the book DSLs in Action by Debasish Ghosh, which I recently received as a present.

The initial situation is fairly simple. I have a set of predicates that I want to combine logically to create new predicates. A predicate in this context is a function that takes one argument of an arbitrary type and returns a boolean value, so the Scala type of a predicate is A => Boolean for some argument type A. Here are some very simple examples that take integers as input:

val isDivisibleBy13: Int => Boolean = _ % 13 == 0

val isPalindrome: Int => Boolean = x => {
  val s = x.toString
  s == s.reverse
}

Now that I have some predicates to play with, I want to combine them to create a new predicate, e.g. one that tests whether a given integer is a palindrome and divisible by 13. That's fairly easy:

val isPalindromeAndDivisibleBy13: Int => Boolean = x => {
  isPalindrome(x) && isDivisibleBy13(x)
}

So far so good, but there are two things about this new predicate that I don't like. First, I had to specify the type of the function, which means that if the types of the inner predicates change, all compound predicates need to be updated as well. Second, I had to name the argument x at all, even though it is only passed through to the inner predicates. So the question is: can we do better? Can the compound predicate be written in point-free style and without specifying its type? Ideally it would look like this:

val isPalindromeAndDivisibleBy13 = isPalindrome && isDivisibleBy13

Such a definition would be concise, expressive, and robust against changes of the inner predicates, since all unnecessary details are hidden. New predicates could then be composed by just connecting any number of predicates with logical operators (like && and ||).

As it turns out, it is indeed possible to combine predicates in Scala as described above by using the Pimp my Library pattern and a little bit of Scala's syntactic sugar. The key ingredients are a class that is a thin wrapper for predicates and a method that implicitly converts any predicate to an object of that class. Let's start with the wrapper for predicates, which I will call Predicate. Here is its full definition:

class Predicate[-A](val pred: A => Boolean) extends (A => Boolean) {
  def apply(x: A) = pred(x)

  def &&(that: A => Boolean) = new Predicate[A](x => pred(x) && that(x))
  def ||(that: A => Boolean) = new Predicate[A](x => pred(x) || that(x))
  def unary_! = new Predicate[A](x => !pred(x))
}

Let me dissect Predicate:

  1. It wraps a predicate of type A => Boolean.
  2. It extends (A => Boolean) (which is syntactic sugar for Function1[A, Boolean]) so that it is itself a predicate of type A => Boolean. That means that instances of Predicate can be used as arguments for its constructor and its other methods that take predicates.
  3. Instances of Predicate are callable via the apply method which just forwards its argument to the wrapped predicate.
  4. It defines the methods && and || that each take another predicate as an argument and return a new instance of Predicate constructed with an anonymous predicate that combines the wrapped predicate and the passed argument. The unary_! method is similar, but instead of combining two predicates it returns a negated version of the wrapped predicate.

So Predicate is a transparent wrapper for predicates with methods for composing them. The compound predicate can now be written like this:

val isPalindromeAndDivisibleBy13 = new Predicate(isPalindrome) && isDivisibleBy13
//       which is the same as: (new Predicate(isPalindrome)).&&(isDivisibleBy13)

The last ingredient to simplify the composition of predicates, and to make my second definition of isPalindromeAndDivisibleBy13 work, is the implicit conversion of predicates to instances of Predicate which is as simple as:

implicit def toPredicate[A](pred: A => Boolean): Predicate[A] = new Predicate(pred)

That's it! With Predicate and this implicit conversion, you can now easily combine your predicates. The complete code with some examples is available in predicates.scala.
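To see everything in one place, here is a self-contained sketch that repeats the Predicate class and the implicit conversion from above and exercises the combinators (the example values are my own):

```scala
import scala.language.implicitConversions

class Predicate[-A](val pred: A => Boolean) extends (A => Boolean) {
  def apply(x: A) = pred(x)
  def &&(that: A => Boolean) = new Predicate[A](x => pred(x) && that(x))
  def ||(that: A => Boolean) = new Predicate[A](x => pred(x) || that(x))
  def unary_! = new Predicate[A](x => !pred(x))
}

implicit def toPredicate[A](pred: A => Boolean): Predicate[A] =
  new Predicate(pred)

val isDivisibleBy13: Int => Boolean = _ % 13 == 0
val isPalindrome: Int => Boolean = x => {
  val s = x.toString
  s == s.reverse
}

// The implicit conversion turns the left operand into a Predicate,
// so && and || are available directly on plain functions.
val both = isPalindrome && isDivisibleBy13
val either = isPalindrome || isDivisibleBy13

assert(both(494))    // 494 = 13 * 38 and reads the same backwards
assert(!both(131))   // a palindrome, but not divisible by 13
assert(either(26))   // divisible by 13, though not a palindrome
assert((!both)(26))  // negation via unary_!
```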


Vim syntax highlighting for the SLHA

In my diploma thesis I'm working with supersymmetry spectrum calculation programs (primarily SPheno), which use the SUSY Les Houches Accord (SLHA), defined in arXiv:hep-ph/0311123 and arXiv:0801.0045 [hep-ph], for data input and output. To ease editing input files and reading output files with my favourite text editor Vim, I wrote a corresponding syntax file for the Accord. It is really helpful for preventing misspellings of the (sometimes cryptic) block names and for navigating and extracting numerical data.

To enable syntax highlighting in Vim for the SLHA, download slha.vim and scripts.vim. Copy slha.vim into ~/.vim/syntax/ and scripts.vim into ~/.vim/, or, if you already have a scripts.vim file, append the content of my file to it. Vim should now highlight SLHA files. Here is the compulsory screenshot:

b is a power of 2

One of the most interesting books I've read in the past few years is without doubt Gödel, Escher, Bach by Douglas Hofstadter. In it he introduces the formal system TNT (Typographical Number Theory) and asks the reader to translate the statement "b is a power of 2" into TNT. My proposed solution, which I wrote down two years ago, is

$$ \sim\exists a: \, \left\langle \sim\exists c: \, a=\left(c+c\right) \, \wedge \, \forall d: \, b=\left(\mathrm{SS}a \cdot d\right) \right\rangle \ . $$

This roughly translates to English as: "There does not exist an odd number \(a\) such that, for every \(d\), \(b\) equals \(\left(a + 2\right) \cdot d\)." The reason this statement is true only if \(b\) is a power of 2 is based on the following equation:

$$ \left\{u \cdot n \,|\, u=2k+1 \, : \, k, n \in \mathbb{N}\right\} = \mathbb{N} \,\backslash\, \left\{2^n \,|\, n \in \mathbb{N}\right\} \ $$

which means that every natural number except the powers of 2 can be written as a product of an odd natural number \(u\) greater than 1 and a natural number \(n\). This is easy to see: take the prime factorization of an arbitrary natural number. If one of the prime factors is odd, the number can be written as the product of this odd prime factor and the product of the remaining prime factors (which can be even or odd). If there is no odd prime factor, all prime factors must be even, and since 2 is the only even prime number, the original number must be a power of 2.

I'm wondering if my solution is correct and how other people have written this statement in TNT.

Converting git-svn tag branches to real tags

Over the last few days I've converted some Subversion repositories used for Debian packaging to Git. None of these Subversion repositories contained upstream sources, because of Subversion's storage inefficiency. With Git I wanted to change that and also track upstream sources in my repositories. To move a Git repository from a debian-only to a debian+upstream layout, I found zack's recipe very helpful.

The actual conversion to Git was done with git-svn. I cloned the Subversion repositories with git-svn's --prefix=svn-import/ and --stdlayout options, so that the trunk and all branches and tags were imported as remote branches prefixed with svn-import/. Then I created local branches from the former Subversion branches and removed the remote remnants, for example:

git checkout -b debian/backports/etch svn-import/branches/etch-backports
git branch -d -r svn-import/branches/etch-backports

The Subversion tags are imported as svn-import/tags/<version>, where <version> is the Debian version number. I don't need branches for every version of my packages, so I wanted to convert these tags to real Git tags without losing the original commit dates and messages. To achieve this I wrote this small script:

#!/bin/bash

for branch in $(git branch -r); do
    if echo "$branch" | grep -Eq "svn-import/tags/.+$"; then
        version=$(basename "$branch")
        subject=$(git log -1 --pretty=format:"%s" "$branch")
        export GIT_COMMITTER_DATE=$(git log -1 --pretty=format:"%ci" "$branch")

        echo "Tag $version [Y/n]?"
        read yesno
        if [ -z "$yesno" ] || [ "$yesno" = "Y" ]; then
            git tag -s -f -m "$subject" "debian/$version" "$branch^"
            git branch -d -r "$branch"
        fi
    fi
done

For each remote branch whose name contains svn-import/tags/, the script extracts the version number and the commit message of the tag, exports GIT_COMMITTER_DATE (see man git-tag) set to the date on which the version was tagged in Subversion, and then asks whether it should tag the parent of the tag branch with the original commit date and message and remove the then-unnecessary tag branch. The prompt is a safety measure: people sometimes commit directly to Subversion tags (ugh!), and then "$branch^" would not be the commit you want to tag. To decide which Subversion tags are safe to tag with this script, gitk --all can be of great help.