Sunday, August 8, 2010

Hacking closure in Java 7.0 (1): SAM Conversion

I used a lot of functor library from Apache couple years ago and found it very handy to write with minimal boiler-plate code. At that time, I wondered, whether closure (or functor at that time) would be included in the language itself. Now, it is indeed to be included in Java 7.0 however unclear it is. In OpenJDK, through Lambda project, there would very likely be -- but again, still unclear -- a closure support in Java 7. Indeed, in lambda-dev mailing list, a lot of things seem to be more or less finalized, and we may really hope to have closure in Java 7.0 in September.

I spent some times yesterday and today to play around with the closure prototype right from the source. First, I reinstalled the binary jdk 7 to the build 103 one and then get the langtools project from mercury. Finally, I built the project and got my javac that can compile closure. 

The first use of closure is to simplify the implementation of Simple Abstract Method (SAM). So, what is SAM ? It is a method of interface that contains only one method, like run method of Runnable, call method of Callable, compare method of Comparator, and so on.
For example, we can use sort that accepts Comparator as follow:


       List<String> myList = new LinkedList<>();
   myList.addAll(Arrays.<String>asList(
       "AAA", "BB", "CCCC", "D", "FDFFFF"));   
     Collections.sort(
       myList,
       new Comparator<String> () {
         public int compare(String o1, String o2) {
           return o1.length() - o2.length();
         }
      });

(Note also how I use the diamond <> for myList creation, which is also introduced in Java 7).
This is not cool  because the need to write the inner class Comparator makes the code unreadable. And guess what, some companies even don't allow writing those kinds of inner class. Good news is, using closure, we can write an implementation of SAM shorter. Here is the code:

        List<String> myList = new LinkedList<>();
     myList.addAll(Arrays.<String>asList("AAA", "BB", "CCCC", "D", "FDFFFF"));   
     Collections.sort(myList, 
                      #(String x, String y) { x.length() - y.length() });
Much better, isn't it ? We can also write the comparator as follow:
         Comparator<String> lengthComparator = 
      #(String x, String y) { x.length() - y.length() }

Or runnable as follow:
     Runnable r = #() { method1(); method2();};

Now, let's put into practice.

Let's imagine we need to write a sum, multiplication, max, or min of a list. Without closure we'll have to have four different implementations of for loop for each:
public class ListIntUtil {
    public static Integer sum(List<Integer> list) {
        Integer result  = 0;
        for (Integer el : list) {
            result += el;
        }
        return result;
    }

    public static Integer multiply(List<Integer> list) {
        Integer result  = 1;
        for (Integer el : list) {
            result *= el;
        }
        return result;
    }
}
With closure, we can define a method that accepts the initial value and the function in addition to the list itself. So, we want to call like


foldLeft(myList, 0, #(Integer x, Integer y) { x + y })

for sum and 


foldLeft(myList, 1, #(Integer x, Integer y) { x * y })

for multiplication.

The definition of foldLeft is the following:


public static Integer foldLeft(Integer initial, List<Integer> integerList, BinaryFunction fn) {
     Integer result = initial;
     for (Integer in  : integerList) {
        result = f.apply(result, in);
     }
     return result;
}

with BinaryFunction is a SAM defined as follow:
public interface BinaryFunction {
   Integer apply(Integer x, Integer y);
}


Again, there's no need to implement BinaryFunction, it's enough to to write #(Integer x, Integer y) { x + y }. 
Indeed, we can assign the closure directly to BinaryFunction :

BinaryFunction fn = #(Integer x, Integer y) { x + y }.

With foldLeft above, we can easily apply max and min of a list 




Integer maxNumber = foldLeft(myList, Integer.MIN_VALUE, #(Integer x, Integer y) { Math.max(x,y) })
Integer minNumber = foldLeft(myList, Integer.MAX_VALUE, #(Integer x, Integer y) { Math.min(x,y) })

Let's try with something else, let's say a method that maps a list of integer to another list of integer that contains the double of the value of each element of the first list. That is, if we have (1, 3, 3, 4), the function returns (2, 6, 6, 8).

First, we need a SAM definition (note that, we usually don't need to do this, some existing libraries already exist, but here I just want to show that it really works with any interface).

public interface UnaryFunction {



  Integer apply(Integer x);
}


Then, the implementation of double is as follow:
    List<Integer> twice = map(intList, #(Integer x) { x * 2 }); // (2, 6, 6, 8)




That can be modified easily when multiplication by three is expected instead:   
    List<Integer> thrice  = map(intList, #(Integer x) { x * 3 }); // (3, 9, 9, 12)



Finally, the implementation of map is as follow:


public static List<Integer> map(List<Integer> list, UnaryFunction fn) {
       List<Integer> result = new LinkedList<>();
       for (Integer el : list) {
           result.add(fn.apply(el));
       }
       return result;
}



All right. Let's refactor a little bit more. Now, we have seen that BinaryFunction and UnaryFunction that works with Integer type.  Of course, we can use generic to make them more reusable. So, here is the generified version of the two interfaces:





public interface UnaryFunction<T> {



  T apply(T x);
}





public interface BinaryFunction<T> {
  T apply(T x, T y);
}

The adapted version of foldLeft and map is as follow:





public static <T> T foldLeft(T initial, List<T> integerList, BinaryFunction<T> fn) {



     T result = initial;
     for (Integer in  : integerList) {
        result = f.apply(result, in);
     }
     return result;
}





public static <T> List<T> map(List<T> list, UnaryFunction<T> fn) {
     List<T> result = new LinkedList<>();
     for (T el : list) {
        result.add(fn.apply(el));
     }
     return result;
}






With that we can now sum a list of double by using this code:
   foldLeft(0.0d, doubleList, #(Double x, Double y) {x + y})


Or why not 
   foldLeft("", stringList, #(String x, String y) {x + y})



and map:



    map(doubleList, #(Double x) { x * 2 });


The last syntax of closure (version posted by Maurizio Cimadamore on August 4) also allows to use sequence of statement, typically for SAM that does not have a return value. For example:





      foreachRun(myInt, #{method1(); method2();});





With foreachRun is defined as follow: 

void foreachRun(List<Integer> list, Runnable sequence) {
    for (Integer el : list> {
         sequence.run();
    }
}

No comments: