Thursday, November 18, 2010

Notes on Devoxx 3rd day

We entered the conference today, and yes, there were long queues everywhere: transportation, breakfast, the cloakroom, the sessions, everywhere.
But the sessions were very interesting, at least the majority of them.

Here is my summary.

Introduction by Stephan Janssen
Cool introduction by Stephan Janssen, which also covered Parleys. We can get the videos of all presentations through Parleys for only 79 Euros. With that, you can watch Parleys presentations in the toilet, if you have an iPad of course.

Keynotes by Mark Reinhold
Mark Reinhold presented a lot of cool things on Java. I tweeted something that I regretted afterward: I called his presentation disappointing. It was indeed disappointing for me, because I expected some new information on Java's evolution. He didn't present anything new, but it was an excellent presentation on Java's evolution and roadmap. You cannot have a better presentation on the roadmap than his.

He started his presentation with a statement that Oracle wanted Java to exist until 2030. He then explained several axes of improvement: productivity, performance, universality, modularity, integration, serviceability.

For productivity, he cited some of the work in Project Coin, like the infamous <> symbol (the "diamond") and Automatic Resource Management (ARM).
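
To make these two Project Coin items concrete, here is a minimal sketch in the syntax that eventually shipped in Java 7, where ARM became the try-with-resources statement. The class and method names (CoinSketch, readLines) are mine, purely for illustration, and the UncheckedIOException wrapper assumes Java 8 or later.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not taken from the keynote.
public class CoinSketch {

    // Splits the given text into lines, using both Project Coin features.
    static List<String> readLines(String text) {
        // Diamond: the compiler infers <String> from the declared type.
        List<String> lines = new ArrayList<>();
        // try-with-resources (ARM): the reader is closed automatically,
        // whether the block completes normally or throws.
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch's signature simple.
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readLines("first\nsecond"));
    }
}
```

Without these two features, the same method needs the type argument repeated on the right-hand side and an explicit finally block to close the reader.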

For performance, he mentioned some changes to the language coming out of Project Lambda, like lambda expressions, extension methods, and a few other things on the subject. But, as Brian Goetz confirmed later on, these must be seen in the context of performance improvements on multicore hardware.

For universality, he mentioned JVM support for other languages. I saw more or less the same cloud of languages before at SophiaConf, with Scala and Groovy in a relatively small font. I found this part of the presentation very short. There will be a lot of sessions on new JVM languages at Devoxx, but no presentation on the corresponding JSR.

For modularity, he quickly presented Project Jigsaw, with the concept of simplifying classpaths with jmod. Commands such as jmod install and jmod add-repo can be used. A little bit inspired by Ruby gems, I suppose.

For serviceability, he talked a little about a quite sensitive subject: JVM convergence.

He also mentioned two interesting things: the possibility of having reification in Java, and value classes (= Scala case classes?).

Overall, nothing really new in the presentation, but I think that was a comprehensive presentation that Java developers expect to have directly from the source.

The State of the Web by Dion Almaer and Ben Galbraith
You know what? This was the most entertaining presentation so far in the conference. Small problem: it was so entertaining, so well choreographed, that I didn't really get the content.

I didn't even take the time to write notes; the presentation was just too entertaining, with graphics and stuff, really good for your eyes (but maybe not for your brain). So, just a couple of Math.random points that I vaguely retained from the presentation.

  • Application is content.
  • Mobile applications are becoming very important.
  • HTML 5 is great.
  • The web needs an app store model.
Not much really, so I should stop here too.

JPA by Linda DeMichiel

I think I made a mistake in coming to this session after reading the JPA book by Schincariol and Keith back to back at least twice. Everything in Linda's presentation is in the book. Sorry, but I cannot really say much about the presentation: just read the book. It was an extremely dense presentation though.

The State of Hadoop by Tom White
In terms of style, Tom White's presentation on Hadoop was the complete opposite of the Almaer & Galbraith presentation: Tom's style is very monotone and without rhythm. But guess what? I got his content better.

Tom started by explaining the replication mechanisms in HDFS, followed by how reads and writes are done. He gave a quick overview of the read and write algorithms, which optimize bandwidth usage.

Then the explanation moved to failure modes, especially the differences between a data node crash and a name node crash. When a data node crashes, the client reads from other replicas and the name node instructs the data nodes to re-replicate. A name node crash is a much more important issue because it means downtime. There is an ongoing effort on a high-availability name node.

The presentation continued with a map reduce overview: Input => Map => Shuffle => Reduce => Output. Tom used a Unix pipeline to illustrate the concept. Pretty cool.
Task tracker and job tracker failure modes were then explained.
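
The Input => Map => Shuffle => Reduce => Output flow can be sketched in a few lines of plain Java. This is only an in-memory word count written to show the stages; it does not use the Hadoop API, and all the names (MapReduceSketch, wordCount, and so on) are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration only; real Hadoop distributes each stage over a cluster.
public class MapReduceSketch {

    // Map: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle: group all emitted values by key (a TreeMap keeps keys sorted).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce: collapse the values of one key into a single result.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Input => Map => Shuffle => Reduce => Output, wired together.
    static Map<String, Integer> wordCount(List<String> input) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : input) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> output = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(emitted).entrySet()) {
            output.put(group.getKey(), reduce(group.getValue()));
        }
        return output;
    }

    public static void main(String[] args) {
        // Prints {be=2, not=1, or=1, to=2}
        System.out.println(wordCount(Arrays.asList("to be or not", "to be")));
    }
}
```

The Unix analogy maps onto this quite directly: map plays the role of a per-line filter, shuffle the role of sort, and reduce the role of a final aggregating command such as uniq -c.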

He also showed some examples of Hadoop use.

The presentation then moved on to the ecosystem. He showed an overly complicated graph of the relations among the Hadoop projects. Very complex, but then he simplified it (he mapped and reduced, right?).

Fundamental Projects

Some projects are fundamental. They are HDFS (the file system), the map reduce framework, ZooKeeper, and Avro. ZooKeeper is a coordination service for distributed applications, including leader election algorithms and distributed locking. Avro is a data serialization library.
The main challenges for the fundamental projects are the impact of API updates and support for multiple programming languages, typically Python.

Components for Analysis

Some projects in Hadoop are intended for analysis. We have Pig, Hive, Cascading, and Mahout. Pig and Hive are the data retrieval components, using either a SQL-like language (Hive) or a fairly independent query language (Pig). Mahout is a machine learning library that implements some of its algorithms with map/reduce.

The Howl project is intended to share tables among services.

Components for Data Loading

Some projects, like Sqoop and Flume, are intended to facilitate data loading.

Components for Coordination

Oozie and Whirr are examples of components that handle coordination.

I found the presentation extremely fluid and informative. Unfortunately, as in Tom White's other session, the audience was not that responsive. I wonder why.

Lambda Project by Brian Goetz

One thing to always keep in mind: Project Lambda is not intended to make Java more concise, but to make the changes that enable parallelization. With this in mind, not every cool thing that a language like Scala offers will be available in Java.

Nothing really new in Brian's presentation:
  • Use of SAM (single abstract method) types instead of function types.
  • Some starter SAMs are to be included in the JDK.
  • Method references, which allow writing Collections.sort(persons, #Person.getLastName).
  • I did not hear the words "defender methods" anymore; I heard "extension methods" a lot instead.
  • Exception transparency.
  • Interface conflict handling.
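
For readers who want to see these points in code, here is a small sketch written in the syntax that finally shipped in Java 8, which differs from the strawman shown at the conference: the method reference is written Person::getLastName rather than #Person.getLastName, and extension methods are declared with the default keyword. The Person and Greeter types are hypothetical, made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration class, not code from the session.
public class LambdaSketch {

    public static final class Person {
        private final String lastName;
        public Person(String lastName) { this.lastName = lastName; }
        public String getLastName() { return lastName; }
    }

    // A SAM type: an interface with a single abstract method.
    public interface Greeter {
        String greet(Person person);

        // An extension ("default") method: added to the interface with a body,
        // so existing implementations keep compiling.
        default String greetAll(List<Person> persons) {
            List<String> greetings = new ArrayList<>();
            for (Person p : persons) {
                greetings.add(greet(p));
            }
            return String.join("; ", greetings);
        }
    }

    // A lambda is converted to the SAM type expected by the context.
    public static final Greeter HELLO = p -> "Hello, " + p.getLastName();

    // Sorts a copy of the list using a method reference as the sort key.
    public static List<String> sortedLastNames(List<Person> persons) {
        List<Person> copy = new ArrayList<>(persons);
        Collections.sort(copy, Comparator.comparing(Person::getLastName));
        List<String> names = new ArrayList<>();
        for (Person p : copy) {
            names.add(p.getLastName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<Person> speakers = Arrays.asList(
                new Person("Wall"), new Person("Odersky"), new Person("Venners"));
        System.out.println(sortedLastNames(speakers));
        System.out.println(HELLO.greetAll(speakers));
    }
}
```

Comparator itself is a SAM type, which is why Comparator.comparing can hand back a comparator built from a single method reference.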
What's new in Scala 2.8 by Dick Wall and Bill Venners
Dick Wall and Bill Venners presented the new things included in 2.8, using live coding. A very interesting way of presenting things, although some errors made the presentation long. But this duo is simply my favorite in the conference so far. Dick Wall and Bill Venners for President!
  • Tooling. Dick showed IntelliJ's behavior on implicit methods.
  • The presentation started with the REPL. One new thing I learnt was :sh to invoke a shell command. Other cool things in the REPL were also presented.
  • Default values for case classes. Interesting example on copy, which comes together with case classes. I should give this a try.
  • The tailrec annotation, to make sure that a recursion we think is tail recursive is indeed tail recursive. I have already used the annotation in the code on this blog, though. Fibonacci and Factorial were used as examples. Classical ones.
  • Nested packages.
  • @specialized
  • Continuations: scary things, stay away from them unless you know what you're doing; too advanced.
Nothing much more to say, the demo was just great.

Scala Collection and Parallelization by Martin Odersky
That was not the title. The real title is more enigmatic: Future Proofing Collections from Mutable to Persistent to Parallel. But it's actually this: Scala collections and parallelization.

Martin started with the slide that he called "If I have to keep one slide, this is the one". Basically, he explained his concept of a scalable language that can be agile, yet type safe and performant. This looks contradictory, but it can be reached by combining object orientation and functional programming.

The focus of the presentation was on collections, because collections are heavily used in code, and a lot of problems come from them. Collections in Scala, especially in 2.8, are:
  • Object oriented
  • Generic
  • Persistent
  • Higher order
  • Built on the Uniform Return Type principle: a function should return a collection of the same type as its (left-hand side) operand. That is, map on a List should return a List; on a Set, it should return a Set, and so on.

Concurrency is hard. Actors and STM are two good tools to address concurrency, but they are not enough. In a concurrent world, for safety and performance, immutable collections are needed. That's exactly what Scala offers. Scala offers immutable collections, and it will also offer parallel ones (well, not in 2.8 though).

He took an example from a paper by Lutz Prechelt that compared several programming languages using a phone code case study. Martin showed how the phone code solution is implemented in Scala. It was a program of more or less 30 lines of code, excellent. To take advantage of concurrency, the collection used in the example can be switched to a parallel one by calling the par method.

Bit Rot is the Dark Side

So Scala, at least around 18.30 Antwerp time, was good. But there was a dark side: the implementation of the Uniform Return Type principle. It turned out that implementing the principle was a serious challenge. Duplication is needed: the filter function, for example, needs to be reimplemented in every class. What a mess!

We then reached something Martin called Bit Rot: lots of duplicated methods, inconsistencies, and the broken window effect.

Scala does not let you down on this. Martin Odersky presented a solution based on higher-kinded types, which one can encode in Scala thanks to its implicit construct. He showed how this is encoded, especially in the Scala collection code, to implement the Uniform Return Type principle.

The complete solution included in the presentation was very hard to grasp (I might blog my understanding of it some day). I was just wondering why Martin Odersky insisted on presenting this at Devoxx. Wouldn't it have been better for him to go into detail on par and parallel collections instead?

But anyway, the dark side of his presentation gave me homework to do. Hopefully I will be able to finish it on time, right Professor Odersky?

Frites et Mayo
The day ended -- for me; there were still a couple of BOFs running in the evening -- with frites and mayonnaise. Great ones, with very long queues for everybody. Before the conference, I could not have imagined being in the same queue as Mark Reinhold (and yes, he was 4 people behind me, not sure he got the frites though).

Play Framework Meet-up at Axxess
Before going back to the hotel for a webcam call with my children, I took some time to attend the Play Framework Meet-up at Axxess. I met a couple of interesting people there: of course Peter Hilton and Nicolas Leroux, and somebody from Alfresco who will present Activiti tomorrow (18 November). They are nice people.
I played with Play a while ago, and I am thinking of playing with it again after the conference.
