Archive for April, 2009

Problem 2 ver. 2, 3, 4: logarithmic complexity

April 17th, 2009 by Ivan Lakhturov | Category: Programming

Find the sum of all the even-valued terms in the Fibonacci sequence which do not exceed four million

The last time we had the straightforward O(n) solution: building the sequence, filtering out even values and adding them up. We can improve it a bit by noticing that every third member of the Fibonacci sequence is even. Then we don't have to check for evenness at all, we just jump three terms ahead each time. This version 2 (I don't publish it here) should be several times faster, but is still O(n) in performance.

We can also express a member of the Fibonacci sequence via the third and the sixth members before it, $F_n = 4F_{n-3} + F_{n-6}$, and compute the even-valued terms as a new sequence of their own: $E_k = 4E_{k-1} + E_{k-2}$ with $E_1 = 2$, $E_2 = 8$. This version 3 is essentially the same as the previous one and again, I don't publish it here.

The drastic improvement is obtained using the expression $\sum_{j=1}^{k} F_{3j} = (F_{3k+2} - 1)/2$, i.e. the sum of the even-valued terms up to $F_{3k}$ (I've added it and a proof to the wikipedia article, but they immediately reverted my changes as "unsourced" --- this is pathetic). Now the whole sum is obtained by computing just one Fibonacci member, and that can be done in O(log n).
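One short way to see this identity (my own sketch of an argument, not necessarily the proof mentioned above) uses the standard fact $\sum_{i=1}^{m} F_i = F_{m+2} - 1$ together with $F_{3j-2} + F_{3j-1} = F_{3j}$:

    \[
    F_{3k+2} - 1 \;=\; \sum_{i=1}^{3k} F_i
      \;=\; \sum_{j=1}^{k} \bigl( F_{3j-2} + F_{3j-1} + F_{3j} \bigr)
      \;=\; 2 \sum_{j=1}^{k} F_{3j},
    \]

hence $\sum_{j=1}^{k} F_{3j} = (F_{3k+2} - 1)/2$. A quick check: $2 + 8 + 34 = 44 = (89 - 1)/2 = (F_{11} - 1)/2$.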

Indeed, we can compute a Fibonacci member by exponentiating the appropriate matrix, and this exponentiation, just like the usual one, can be done in O(log n) multiplications. I prefer this solution over the golden-ratio exponentiation formula (again logarithmic complexity), because only integer operations are involved. So, this is version 4 of the solution:

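(A sketch of how version 4 could look; it leans on the 2x2 matrix routines sketched in the next listing, and the names fib, even-fib-sum and mat-expt are mine.)

    ;; ((1 1) (1 0))^n = ((F(n+1) F(n)) (F(n) F(n-1))), so F(n) is the
    ;; second entry of the matrix power; mat-expt is defined below.
    (define (fib n)
      (cadr (mat-expt '(1 1 1 0) n)))

    ;; Sum of the even-valued terms F(3), F(6), ..., F(3k) = (F(3k+2) - 1) / 2.
    (define (even-fib-sum k)
      (quotient (- (fib (+ (* 3 k) 2)) 1) 2))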

I quickly outlined a class for 2D matrices and operations with it:

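(A sketch of those operations; instead of a class, a 2x2 matrix is kept here simply as a flat list (a b c d).)

    ;; Multiply two 2x2 matrices given as flat lists (a b c d).
    (define (mat-mul m n)
      (let ((a (car m)) (b (cadr m)) (c (caddr m)) (d (cadddr m))
            (e (car n)) (f (cadr n)) (g (caddr n)) (h (cadddr n)))
        (list (+ (* a e) (* b g)) (+ (* a f) (* b h))
              (+ (* c e) (* d g)) (+ (* c f) (* d h)))))

    ;; Fast exponentiation by repeated squaring: O(log n) multiplications.
    (define (mat-expt m n)
      (cond ((= n 0) '(1 0 0 1))                        ; identity matrix
            ((even? n) (let ((h (mat-expt m (quotient n 2))))
                         (mat-mul h h)))
            (else (mat-mul m (mat-expt m (- n 1))))))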

The solution is O(1) in memory and O(log n) in performance, where n, of course, denotes the index of a number in the Fibonacci sequence. But the problem asks about a cutoff: we only want the members of the sequence that are less than a certain number. Here an additional function, (closest-fibonacci-index), comes in handy (see the wiki for the explanation):

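(A sketch of what such a function could look like, via the golden-ratio approximation $F_n \approx \varphi^n / \sqrt{5}$; the exact rounding used here is my guess.)

    ;; Index of the largest Fibonacci number that does not exceed x (x >= 1),
    ;; using the golden-ratio approximation F(n) ~ phi^n / sqrt(5).
    (define (closest-fibonacci-index x)
      (let ((phi (/ (+ 1 (sqrt 5)) 2)))
        (inexact->exact
         (floor (/ (log (+ (* x (sqrt 5)) 0.5))
                   (log phi))))))

    ;; For the problem: (closest-fibonacci-index 4000000) => 33, so the last
    ;; even-valued term is F(33) and k = 33 div 3 = 11; with the sketches
    ;; above, (even-fib-sum 11) => 4613732.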

The final touch is to ask ourselves about the complexity of the (log) function. Well, it can be computed fast enough not to spoil the complexity of the main part of the algorithm.

Problem 1 ver. 3: optimization

April 5th, 2009 by Ivan Lakhturov | Category: Programming

Find the sum of all the multiples of 3 or 5 below 1000.

Let us generalize again to a finite set of factors.

There is a formula for the power of the union of two finite sets

$|A \cup B| = |A| + |B| - |A \cap B|,$

which can be generalized to a finite number of finite sets

$|A_1 \cup \dots \cup A_n| = \sum_i |A_i| - \sum_{i<j} |A_i \cap A_j| + \sum_{i<j<l} |A_i \cap A_j \cap A_l| - \dots + (-1)^{n+1} |A_1 \cap \dots \cap A_n|,$

or in a somewhat less understandable, but concise notation

$\mu\Bigl(\bigcup_{i=1}^{n} A_i\Bigr) = \sum_{\varnothing \neq J \subseteq \{1,\dots,n\}} (-1)^{|J|+1}\, \mu\Bigl(\bigcap_{j \in J} A_j\Bigr).$

Here $\mu$ is a measure (i.e. it commutes with the union sign in the sense of the formula above) and can be replaced with $|\cdot|$ --- the power-of-a-set sign --- or, if we are in the natural-numbers space, with the sum-of-elements sign, as in our case. $J$ is not a multi-index, but a subset of the natural numbers cut from $1$ to $n$.

Now by $A_i$ we denote the sets of all the multiples of the factors $d_1, \dots, d_n$ that are less than a certain number N, where $i$ varies from $1$ to $n$ (each $A_i$ is respectively the set of multiples of the factor $d_i$). We use the above-mentioned formula to compute the measure of the union via the measures of all the $A_i$ and the measures of all their finite intersections.

Suppose we have a number $d$, prime or not, and the set of all its multiples below N. The power of this set is of course $m = \lfloor (N-1)/d \rfloor$ (a div operation), and the sum of its members, $d + 2d + \dots + md = d\,m(m+1)/2$, can be calculated by the well-known formula for the sum of an arithmetic progression.

As regards all the intersections, it is clear that we ought to calculate the least common multiple (LCM) of the chosen factors; the set-intersection of their sets of multiples is then just the set of multiples of that LCM. However, the current version of the solution assumes that the factors are primes, so their LCM is just their product. When I calculate a proper LCM in Problem 5 (for now there is a brute-force version there), I will switch this temporary version over to it.

Let's see the solution. New util functions:

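(A sketch of what these could be; sum-list is referenced by the tests below, product-list is a name of my own.)

    ;; Sum of a list of numbers.
    (define (sum-list lst)
      (apply + lst))

    ;; Product of a list of numbers; stands in for the LCM while the
    ;; factors are assumed to be prime.
    (define (product-list lst)
      (apply * lst))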

The function that calculates subsets of a set:

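(A sketch of such a function, written so that the empty set indeed comes out first and the full set last, as the next paragraph relies on.)

    ;; All subsets of a list.  The empty set comes out first and the full
    ;; set comes out last, which the solution below relies on.
    (define (subsets lst)
      (if (null? lst)
          '(())
          (let ((rest (subsets (cdr lst))))
            (append rest
                    (map (lambda (s) (cons (car lst) s)) rest)))))

    ;; (subsets '(3 5))  =>  (() (5) (3) (3 5))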

An important thing about this function is that it returns the empty set as the first element and the full set as the last element of the result list; all the other subsets are in between. The number of subsets of a finite set of $n$ elements is just $2^n$, so the complexity is $O(2^n)$ --- it would be better visible with an imperative-iterative version of this function (I'm not posting it here). As regards memory, the function generates all the subsets as lists which altogether contain $n \cdot 2^{n-1}$ elements (strange, this neat formula isn't on Wikipedia yet, I should add it there), that is, the memory load is $O(n \cdot 2^n)$. Loading everything into memory is not such a good idea, as we could rewrite this function (and the solution function further down in this post) iteratively with O(n) memory complexity, taking advantage of combinadics, but for now I am satisfied with this version.

Using the formula above, the solution is now as simple as:

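(A sketch of how this could look, reusing product-list and subsets from the sketches above; sum-of-one and sum-multiples-less-than are names taken from the surrounding text, though their exact signatures here are my guess.)

    ;; Measure of one intersection: the sum of all multiples (below n) of the
    ;; product of the given factors.
    (define (sum-of-one n factors)
      (let* ((d (product-list factors))       ; the LCM, assuming prime factors
             (m (quotient (- n 1) d)))        ; how many multiples are below n
        (quotient (* d m (+ m 1)) 2)))        ; d + 2d + ... + md

    ;; Inclusion-exclusion over all non-empty subsets of the factors:
    ;; subsets of odd size are added, subsets of even size subtracted.
    (define (sum-multiples-less-than n factors)
      (sum-list
       (map (lambda (s)
              (* (if (odd? (length s)) 1 -1)
                 (sum-of-one n s)))
            (cdr (subsets factors)))))        ; (cdr ...) drops the empty subset

    ;; (sum-multiples-less-than 1000 '(3 5))  =>  233168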

With that (cdr) I cut off the empty subset, whose measure is zero (otherwise the (sum-of-one) function would have to be a bit more complex).

Let's be careful with notation: the n here is not the same n as in the (subsets) function, but the N from above, the upper bound of our multiples-sets. The performance complexity depends both on the number of factors k and on N, but we are only interested in the complexity with respect to N. Let's assume that k is small compared to N, which should be the usual case. Then the complexity is, roughly speaking, O(1): it doesn't depend on N, as we wanted (I remind you that in the previous version we had O(N) complexity).

The final touches are the regression tests:

    ;(assert (=
    ;        (sum-list (multiples-less-than-bruteforce 1000 '(3 5 15)))
    ;        (sum-multiples-less-than 1000 '(3 5 15))))

The last, commented-out one breaks, of course, as 15 is not prime --- the LCM algorithm still has to be updated.