Sunday, October 31, 2010

Problem: Euler Task 6 - Difference between Square of Sums and Sum of Squares

Well, task no 6 is pretty straight forward, so I write the solution quickly. Promise. I will stop coding this evening even when it is still raining  out there.

The problem is the following:

"The sum of the squares of the first ten natural numbers is,
12 + 22 + ... + 102 = 385
The square of the sum of the first ten natural numbers is,
(1 + 2 + ... + 10)2 = 552 = 3025
Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 − 385 = 2640.
Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum."


First let's define square function. Easy:

def square(x : Int) = x * x

We define then the diff method that calculate the difference. The function receives a Range as its input. It is defined as follow:

def diff(r: Range) = 
   square( r.foldLeft(0)(_ + _)) - r.foldLeft(0)(_ + square(_))


The answer to the problem is then diff( 1 to 100).

While the exercise is easy, this is kind of problem where Scala, or functional programming in general, at its best: only three lines to solve this problem (I can even tweet the solution). That's why I find this exercise quite interesting.





Problem: Euler Task 3 - Largest Prime Factor

(Décidement, je suis en forme pour coder ce weekend. Après tout, avec le mauvais temps à Cote d'Azur ...)

Still from euler project. Now for task 3: largest prime factor. The integer factorization is an important subject and indeed it finds its application in cryptography. The fact that integer factorization is hard is used in many cryptography algorithms. But, to answer the question in euler project task 3, brute force approach should be enough since the integer to factorize is relatively small.

So, here we are. Here is the problem (http://projecteuler.net/index.php?section=problems&id=3)
"The prime factors of 13195 are 5, 7, 13 and 29.
What is the largest prime factor of the number 600851475143 ?"

First, the observation. The largest prime factor of 2 is 2. In general, I consider also that the largest prime factor of a prime number is the number itself. 

Then, we will try to check that the number is divided by a number. We check for every number from 2 until the square root of the checked number. Once found, the number is divided by the checked number and the result is again checked with the same algorithms, but with the checked number as the starting point (not 2). 

This yields the following implementation:

 1 
 2 import scala.collection.immutable.NumericRange
 3 
 4 def largestPrimeFactor(x: BigInt) = { 
 5   def largestPrimeFactor(x: BigInt, max: BigInt) : BigInt = {
 6      if (x == 2)  x 
 7      else {
 8         val nextFactor = 
 9           new NumericRange.Inclusive[BigInt](max, csqrt(x), 1).
10              find(x % _ == 0)
11         nextFactor match {
12           case Some(fct) => largestPrimeFactor( x / fct, fct)
13           case None => x
14         } 
15     }
16   }
17   largestPrimeFactor(x, 2)
18 }

Note the use of csqrt function that is  sqrt function implementation that works with BigInt. Then, line 8 - 10 is to get the first number that divides x. Line 11 - 13 acts accordingly, that is, if a factor is found, then it must be a prime number. Then x is divided by the found factor followed by recursive call to the same function.
If the find returns None, then it is the largest prime number.

As mentioned before, the brute force approach above is not practical for real problem with bigger number, especially for a number that is obtained by multiplying two large prime numbers. More sophisticated algorithms are required, and maybe I'll back to some of them one day in this blog. But for now, it's brute force.

Problem: Euler Task 4 - Largest Palindrome

Let's continue with the euler task programming series (http://projecteuler.net/index.php?section=problems ). Now, we will attack the 4th problem. 

OK. Here it is, problem number 4:

"A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 × 99.
Find the largest palindrome made from the product of two 3-digit numbers."

First, we need  a function to check if a String is palindrom. Here is the code:

   1 def isPalindrome(s: String) = {
   2    val incr =  s.length % 2   
   3    s.substring(0, s.length/2) == 
   4        s.substring(incr + s.length/2).reverse
   5 }

Note the use of incr that has value either 0 or 1. It is intended to handle the case where the length of the string is odd.
Then, a little analysis. 
The product of two 3-digit numbers has 5 or 6 digits. 
Assume that we will find the largets palindrome as 6 digit number.. The number then  will be in the form of 'abccba' with the following property: 
The largest palindrom 
100000 a + 10000 b + 1000 c + 100 c + 10 b + a 
= 10 0001 a + 10010 b + 1100 c
which is a multiply of 11. So, one of the 3 digit number will then multiply of 11. This information can be used to reudce the brute force nature of the search. 
The largest multiply of 11 is 11 * 90 = 990. 
The search then starts from 990, followed by 979, and subsequent mutliply 11 until 11 * 9 that is no longer 3 digit number. The multiply 11 number is then multiplied by all 3 digit numbers. 
To optimize further, we take advantage of the iteration nature that goes down from the largest multiply 11 until 99. If  we have a maximum number  xi * y, the product of xii < xi with z can only be larger than xi * y if z > y. This allows us to stop earlier. The code is the following. Note the annotation tailrec that asks the compiler to check if the method is tail recursive.
 1 @tailrec
 2 def getLargestPalindrome(
 3     x : Int, y : Int, 
 4     max: Int, xMax:Int, yMax: Int) : Int  = {
 5   if ( x * 11 < 100) {
 6      max
 7    } else {
 8        val mul = x * 11 * y
 9        
10        if ( mul > max ) {
11           if (isPalindrome(mul.toString))
12              getLargestPalindrome(x - 1, 999, mul, x, y )
13            else 
14              getLargestPalindrome(x, y -1, max, xMax, yMax)
15        } 
16        else {
17            getLargestPalindrome(x - 1, 999, max, xMax, yMax)
18        }
19    }
20 }
When executed, the largest palindrom is indeed 6 digits, so the algorithms above are applied. What's xMax and yMax for ? It's the two 3 digit numbers. 

Saturday, October 30, 2010

Problem : Euler Task 2

Let's continue the effort of answering euler project challenge. Here, we answer the challenge no. 2.


Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:
1, 2, 3, 5, 8, 13, 21, 34, 55, 
89, ...
Find the sum of all the even-valued terms in the sequence which do not exceed four million.

First, let's investigate the implementation of fibonacci. A simple recursive program is the following

1 def fib(x: Int) : Long = {
2    if (x >= 2) {
3       fib(x -1) + fib(x -2)
4    } else {
5       1
6    }
7 }
The problem with the code above is the performance. fib(45) in my computer took 77 s. Fortunately, Scala can optimise tail recursion and indeed fibonacci can be implemented using tail recursive algorithms. The implementation is shown in the following code:
1 def fibiter(index: Int) : Long = {
2  def fibiter(x: Long, acc: Long, count: Int) : Long = {
3      if (count == index) 
4          acc
5      else fibiter(x + acc,x, count + 1)
6   }
7   fibiter(1, 0, 0)
8 }

To implement the algorithms, we need to define an auxiliary method, fibiter/3 that receives the "next" fibonacci number, the "current" fibonacci number, and the index of the count that needed to be compared against the requested index.
Scala optimises the code above so that it is treated just like a loop instead of a recursive call. Executing fibiter(45) does not take even 1ms. 
Having defined fibiter method, we're now ready to attack the problem, that is to sum of all even-valued terms. First, just sum everything, not necessarily the even ones. The code is implemented as follow:
 1 def fibSum(max: Long) : Long = {
 2   def fibSum(x: Long, acc: Long, sum: Long) : Long = {
 3      if (x > max) {
 4          sum
 5      }
 6      else { 
 7        fibSum(x + acc, x, sum + x) 
 8      }
 9  }
10   fibSum(1, 0, 0)
11 }
12 
Like fibiter, we use here an auxilary method, fibSum with 3 parameters. acc is used as the accumulator of fibonacci number, just like in fibiter. In addition, sum is used as the accumulator of the sum of fibonacci number computed so far. The condition at line 3-5 is used to see if the next fibonacci number exceeds the max. If it is the case, the sum so far is returned, otherwise it recursively computes the next fibonacci number.
With the above algorithms in hand, we're now ready to attack the problem, that is, to sum only even fibonacci number:
 1 def fibSum(max: Long) : Long = {
 2   def fibSum(x: Long, acc: Long, sum: Long) : Long = {
 3     if (x > max) {
 4          sum
 5      }
 6      else if (x % 2 == 0) {
 7          fibSum(x + acc, x, sum + x) 
 8      } else {
 9          fibSum(x + acc, x, sum) 
10      }
11   }
12   fibSum(1, 0, 0)
13 }
The lines 6-10 are the key of the solution. The sum is updated only when the current fibonacci number is even. 
Finally, to further generalise the method, the fibSum is modified to accept a function Long => Boolean as follow:
 1 def fibSum(max: Long, filter: Long => Boolean) : Long = {
 2   def fibSum(x: Long, acc: Long, sum: Long) : Long = {
 3     if (x > max) {
 4          sum
 5      }
 6      else if (filter(x)) {
 7          fibSum(x + acc, x, sum + x) 
 8      } else {
 9          fibSum(x + acc, x, sum) 
10      }
11   }
12   fibSum(1, 0, 0)
13 }
To sum all even fibonacci numbers under 4000000, fibSum is called as follow: fibSum(4000000, x => x % 2 ==0).

Friday, October 29, 2010

Problem: Euler task 1


From Euler project, task 1.


"If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000."


Pretty Simple:


(1 to 1000).filter( x => (x %3 == 0) || ( x%5 == 0) ).
            foldLeft(0)(_ + _)

or:

(1 to 1000).foldLeft(0)
   ( (x,y) => x + (if (y % 3 == 0 || y % 5 ==0) y else 0))


The first looks better since the addition is only done when the number is divided by 3 or 5. The second one always executes the addition and the comparison.

However, the first one creates the list of number divided by 3 or 5 first before folding.

So, I think that the second version is better in term of number of created list.

Anyway, I think the following is the best one:

(1 to 1000 / 3).foldLeft(0)( _ + _ * 3) + 
(1 to 1000 / 5).foldLeft(0)( _ + _ * 5) -
(1 to 1000 / 15 ).foldLeft(0)( _ + _ *15)


The first two versions execute 1000 comparisons inside the (1 to ...) , while the second has no comparison. So, I believe the last version should be better than the first two.


In conclusion, my guess of the best performing implementation is the following:
1. Third implementation.
2.  Second implementation.
3.  First implementation.

Maybe the first implementation performs better than the second but the first consumes more memory.

Quick profiling on time of average execution , using 1000 000 as limit instead of 1000 in my machine: 
1. First implementation: 130 ms
2. Second implementation: 65 ms
3. Third implementation: 32 ms.

Finally, I found this one as even more better solution:

(1 to 1000 / 3).foldLeft(0)( _ + _ * 3) + (
 1 to 1000/ 5).foldLeft(0)( 
   (x,y) => x + (if (y % 3 ==0) 0 else (y * 5)))

Any thought ?

Tuesday, October 5, 2010

Happy number 2: The Journey for Happiness Begins

Happy number in my last blog  is pretty straight forward. But there are a lot interesting things from the example. I'm now starting to search the algorithms that

1. find the list of all happy numbers in a given range. For example, give me a list of all happy number between 1 000 000 and 2 000 000.

2.  find any n happy numbers in a given range. For example, give me any 100 happy numbers between 1 000 000 and 2 000 000. This should be slightly different than the first problem, because it is easy to see that 1 111 111 is a happy number with the knowledge that 7 is a happy number. So, the algorithms to find any 100 happy numbers might be much more efficient than finding the first consecutive 100 happy numbers.

I would also like to know the behavior of happy number algorithms when the search is done in several threads. Would it be faster ? How much faster ? What kind of multi threading algorithms will solve the problem better ?

Another interesting question would be how to store the happy or sad number discovered so far. The lst in my previous blog for example stores the numbers being visited in the calculation. Once the decision whether a number is happy or sad, the happiness of  the visited numbers (that is, the lst) can be determined. For example, from our calculation of 7, we can deduce that 97, 130, and 10 are happy numbers. Similarly, from our calculation of 4, we deduce that 16, 37, 58, 89, 145, 42, and 20 are sad. What about the store itself, should it be a bitset so that I can store a lot of number ?

I'm pretty sure that there have been a lot attempts to answer the same questions, and yes, I tell myself, let's start the journey of research ..., the research for happiness.