Tag Archive for "scheme" tag

Haskell-like function definitions in Scheme

June 26th, 2013 by Ivan Lakhturov | 0 comments | Category: Programming

Haskell lets you decompose arguments straight in a function signature. While pattern matching is implemented in some Schemes, it is not applied to the signature. Luckily, we have Scheme macros to change that.

Let's say we are starting out with a new language, bootstrapping it with Scheme. In the beginning we have no lexing or parsing, just redefinitions. First of all, we rename the pair functions, as pairs are the cornerstone of Scheme and some other functional languages.

Why not change those, indeed? Say, what is the meaning of 'cdr'? Perhaps "Contents of the Decrement part of Register" meant something earlier, but nowadays it means nothing. The only advantage, perhaps, is that you can quickly write 'cadadr'-like compositions.

So, we redefine the pair functions:

  1.  

Then, suppose we want to write a swap function, which interchanges the former and the latter parts of a pair:

  (define (swap p) (pair (latter p) (former p)))

Now compare this to the Haskell version:

  swap (a, b) = (b, a)

Haskell decomposes a pair into "a" and "b" straight in the function signature. One can argue that it loses the "p" variable name for the whole pair. But for that it has a special syntax:

  1.  

Notice that we could not use the "pair" identifier in the Scheme version, as we would shadow the constructor.

The Haskell variant is more declarative. Also, if the variable "p" had a longer name, we'd suffer, as it's used twice. Let's make use of macros to implement Haskell-like function definitions in Scheme:

  "input did not match the function definition"

The second transformer allows identifier definitions like "(≝ x 10)". The "match" form is defined in PLT Scheme/Racket and some other implementations. Now we can write our swap function like this:

  1.  

or even like this:

  1.  

- if we remember the dot notation of the Scheme lexer (reader).

However, that will not work. The reason is simple: the redefinitions that we wrote for the pair functions are executed in the last phase (run-time), but the "match" macro needs them already during the macro-application phase (the expander phase). So "match" does not know about the new pair constructor.

To mitigate that, there is the clumsy "define-match-expander":

  1.  

The first transformer is used in the "match" context, and the second is for everything else (i.e. it replaces our "cons" pair constructor). The transformers are identical, but I didn't manage to declare one separately and reuse it. A quick search reveals that others have had problems with that as well.

So, now we can imitate the Haskell notation. For one parameter and one pattern only, but that could be extended. Haskell, however, also has a limit of one pattern per definition line: to declare a different pattern one needs to repeat the function name on a new line. That's what I don't like in Haskell. It is really enough to write a function name once. Say, in Nemerle with indentation-based syntax one can write:

  def fibonacci(i)
    | 0 => 0
    | 1 => 1
    | _ => fibonacci(i - 1) + fibonacci(i - 2)

There are case expressions in Haskell, but those oblige you to declare additional identifiers for the function's arguments (in Nemerle those could be replaced with _). And hey! The "case" from Haskell resembles define/match, which is already implemented in PLT Scheme/Racket. And define/match is a good enough substitute for the default Haskell notation, I think. So, instead of "(define (swap p) (pair (latter p) (former p)))" we would write:

  1.  

There are a bit too many parentheses, needed to support multiple patterns in one branch (unlike the normal "match"), but in other respects it is nice.

So, actually I would choose the Haskell-like notation for functions with one pattern, and define/match for many-pattern functions like the Fibonacci one. I guess it's possible to combine these in one ≝-notation. For that we would need to enhance the ≝ macro. But that is already "a topic for future research", as scientists say (when they don't want to (or cannot) continue).

Scientific Literature Browser

April 9th, 2013 by Ivan Lakhturov | 0 comments | Category: Miscellaneous, Programming

Recently I finished a website called Choose Your Textbook. It is a Scientific Literature Browser. One can browse through the tree of science there (the branches are taken from the wiki) and see short descriptions of the sciences. To the right there is an Amazon search box, into which the name of the currently chosen branch is loaded dynamically, so it shows the (most) relevant books on Amazon for a specific discipline.

Technically speaking, this is a mashup, but part of the mashing is done offline, with my tools written in Scheme. It would be possible to do that dynamically, as there exist Schemes embedded in JS, but not for this project. The tooling converts that specific wiki page in a chain HTML -> SXML -> (filtering) -> JSON. The latter is embedded into the website's JS. A nice tree / graph renderer called JIT is used to show the tree. JIT does animation / morphing and supports a few layouts. It seems, though, that it cannot switch between them dynamically. I've also stumbled upon some other restrictions, e.g. I couldn't set up decent automatic node sizes.

Although I designed the website as an Amazon affiliate, I knew it wouldn't be very popular. I tried posting a link on Hacker News and on Reddit, but both attempts were unsuccessful. So, there are almost no visitors at all. Nevertheless, my point was to try a few things: make my first mashup, write some tooling in Scheme, including DOM manipulation, try out some web graph renderer, and last but not least, skim through the tree of science with one eye on the list of existing relevant literature. And those goals are accomplished.

HQ9+, H9+, KL esoteric languages and the beer song

April 4th, 2011 by Ivan Lakhturov | 0 comments | Category: Programming

Let's first sing a beer song (in R6RS Scheme):

  "No more""no more"" bottle""" "s")
                " of beer"" on the wall""Take one down and pass it around, "
                   "Go to the store and buy some more, ""\n"", "".\n"".\n"""))))

Then let's make an extra library:

  ""

... and we are ready to write yet-another HQ9+ interpreter:

  ; HQ9+ interpreter v0.1 (Ivan Lakhturov)
  ; http://esolangs.org/wiki/HQ9
  "Hello, World!""") ; (number->string i))
  "HQ9+"))

HQ9+ is a joke language featuring a "Hello, world" command, a quine command, a beer-song command and a counter increment (the counter cannot be accessed or printed out). The quine implementation here is the classical "quine cheating", where a program has access to its source. To make the quine more 'honest', somebody designed H9+. It is the same as HQ9+ but without the "Q" command; additionally, all input characters are ignored except for H, 9 and +. Then, obviously, the "Hello, World!" program is a quine. Let's implement H9+:

  ; H9+ interpreter v0.1 (Ivan Lakhturov)
  ; http://esolangs.org/wiki/H9
  "Hello, World!""") ; (number->string i))
  """Hello, World!"))
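To make the H9+ semantics described above concrete, here is a minimal sketch in Python rather than Scheme. It is not the interpreter from the listing: the names `beer_song` and `run_h9plus` are made up for this illustration, and the exact wording of the song verses is an assumption based on the usual lyrics.

```python
def beer_song(start=99):
    """Return the lyrics of the beer song as one string (wording assumed)."""
    def bottles(n, capital=False):
        if n == 0:
            count = "No more" if capital else "no more"
        else:
            count = str(n)
        return f"{count} bottle{'' if n == 1 else 's'} of beer"

    verses = []
    for n in range(start, -1, -1):
        first = f"{bottles(n, capital=True)} on the wall, {bottles(n)}.\n"
        if n > 0:
            second = f"Take one down and pass it around, {bottles(n - 1)} on the wall.\n"
        else:
            second = f"Go to the store and buy some more, {bottles(start)} on the wall.\n"
        verses.append(first + second)
    return "\n".join(verses)


def run_h9plus(source):
    """Interpret H9+: react to 'H', '9' and '+', ignore every other character."""
    output = []
    counter = 0  # incremented by '+', but the program can never read it
    for ch in source:
        if ch == "H":
            output.append("Hello, World!")
        elif ch == "9":
            output.append(beer_song())
        elif ch == "+":
            counter += 1
    return "".join(output)


program = "Hello, World!"
print(run_h9plus(program) == program)  # the program reproduces its own source
```

Only the capital "H" in the program triggers a command, so the output is exactly the source text, which is the 'honest' quine mentioned above.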

And let's also implement a variation on this theme, the esoteric language KL:

  ; KL interpreter v0.1 (Ivan Lakhturov)
  ; http://ivanguide.ru/kl/
  "Привет, мир!""Я узнал, что у меня
  Есть огромная семья –
  И тропинка, и лесок,
  В поле каждый колосок!

  Речка, небо голубое –
  Это все мое, родное!
  Это Родина моя!
  Всех люблю на свете я!""\n""+/-/*/extras"))

The semantics are as follows: + prints "Hello, world!" in Russian, - prints the program's source, / prints a newline, and * prints out a poem from the Russian movie "Brother".

To complete the picture, we can mention other joke languages closely related to HQ9+: HQ9++, CHIQRSX9+, HQ9+B, HQ9+2D. HQ9++ is 'an object-oriented extension of HQ9+'; not interesting. CHIQRSX9+ adds eval, ROT-13 and sorting of input lines. ROT-13 (a Caesar cipher) is a nice exercise to implement, but let's leave it for later. HQ9+B adds Brainfuck: this is definitely a thing to implement, but I will deal with Brainfuck later. HQ9+2D is not properly specified (even for a joke language), but the commands it adds remind me of a 2D Turing machine, the so-called Langton's ant. I want to implement and play with different Turing machines, but later.

Later I will also look through the list of joke languages. For example, the first one there is a 'language' called 99, which just prints out the '99 bottles of beer' song. Anyway, I hope there could be something exciting in the list.

Problem 4 ver. 4: optimization

December 18th, 2009 by Ivan Lakhturov | 0 comments | Category: Programming

Find the largest palindrome made from the product of two 3-digit numbers.

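And the last scratch for now. It is possible to prove that 11 divides a palindromic number with an even number of digits. Indeed, a palindrome with 2k digits can be written as

$$P = \sum_{i=0}^{k-1} d_i \left(10^{i} + 10^{2k-1-i}\right),$$

and here each $10^{i} + 10^{2k-1-i}$ is a multiple of 11 (divisibility by 11 criterion).

The factor 11 can belong to a, and in this case we step by just 1 in b. But if 11 doesn't divide a, then b must be a multiple of 11, so we can step b by 11 each time.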

  1.  

This speeds up the previous result around ten times, leaving the asymptotic behavior the same. The memory use is the same, O(1).
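As an illustration of the stepping rule, here is a sketch in Python (not the Scheme listing of this post). The function names are invented for the example, and it assumes k >= 2 and that the sought palindrome has an even number of digits, which is the case for all the results listed in this post.

```python
def is_palindrome(n):
    digits = str(n)
    return digits == digits[::-1]


def largest_palindrome_product(k):
    """Largest palindrome that is a product of two k-digit numbers (k >= 2)."""
    high = 10 ** k - 1       # largest k-digit number
    low = 10 ** (k - 1)      # smallest k-digit number
    best = 0
    for a in range(high, low - 1, -1):
        if a * high <= best:
            break            # no smaller a can beat the current best
        if a % 11 == 0:
            b_start, step = high, 1                 # a carries the factor 11
        else:
            b_start, step = high - high % 11, 11    # b must carry the factor 11
        for b in range(b_start, a - 1, -step):      # only pairs with b >= a
            product = a * b
            if product <= best:
                break        # products only get smaller further down
            if is_palindrome(product):
                best = product
                break
    return best


print(largest_palindrome_product(3))
```

When 11 does not divide a, the inner loop visits roughly one eleventh of the candidates, which matches the "around ten times" speed-up.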

Let's look at results:
k = 2 => N = 9009
k = 3 => N = 906609
k = 4 => N = 99000099
k = 5 => N = 9966006699
k = 6 => N = 999000000999
k = 7 => N = 99956644665999
k = 8 => N = 9999000000009999
...
We could improve our algo drastically, if proven that the sought-for palindrome is less or equal (and mirrored). I have the feeling that for even k it is equal. But I don't know how to prove it. (I calculated for k = 10 and this does not hold, N = 99999834000043899999).

Problem 4 ver. 3: optimization

December 17th, 2009 by Ivan Lakhturov | 0 comments | Category: Programming

Find the largest palindrome made from the product of two 3-digit numbers.

An author, however, advises a simpler approach. As we are looking for a palindrome a*b, let's iterate a and b in a top-down direction. After finding some palindrome, we impose it as a lower bound for further candidates; that is, iterating in the inner loop over b, we stop when a*b cannot be larger than the bound anymore. If we find a new palindrome, it replaces the bound. The stop condition is finishing the outer loop in a, i.e. when it drops to a 2-digit number ((k-1)-digit, generally speaking).

  1.  
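A small Python sketch of the bounded top-down search described above; it is only an illustration under my own naming (`largest_palindrome_bounded`), not the Scheme code of the post.

```python
def is_palindrome(n):
    digits = str(n)
    return digits == digits[::-1]


def largest_palindrome_bounded(k):
    """Top-down search over a and b with the best palindrome as a moving bound."""
    high = 10 ** k - 1
    low = 10 ** (k - 1)
    best = 0                                  # the current bound
    for a in range(high, low - 1, -1):        # outer loop over all k-digit a
        for b in range(high, a - 1, -1):      # only b >= a, to skip mirrored pairs
            product = a * b
            if product <= best:
                break                         # a*b cannot exceed the bound anymore
            if is_palindrome(product):
                best = product                # a new palindrome replaces the bound
    return best


print(largest_palindrome_bounded(3))
```

Only the bound is stored between iterations, which is where the O(1) memory comes from.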

Complexity in memory now is just O(1). Performance complexity by my impression is better than in the previous variant. The outer loop has n - n/10 steps, so it cannot be less than O(n). Assuming that a desired palindrome (left half of it, actually) lies close to (which should be proved, strictly speaking), we make no more than operations until find it, and no more than the same afterwards.

This is the worst case, however, and I hope that we find some worse-than-ideal palindrome quick enough. Suppose, we can use the estimate ab origin, i.e. the inequality holds, where f = n - a, g = n - b. Then we can calculate an estimate of operations as area under a curve y = n / x:

So, the actual algo performance is between and .

Problem 4 ver. 2: optimization

November 29th, 2009 by Ivan Lakhturov | 0 comments | Category: Programming

Find the largest palindrome made from the product of two 3-digit numbers.

Last time we had a straightforward algo with $O(n^2)$ complexity and at least O(n) memory use. Now let's enhance that. Instead of iterating over multipliers, it's reasonable to iterate over palindromes, starting from the largest, i.e. over the sequence 999999, 998899, 997799, and so on.

Remark. The largest product of two 3-digit numbers is 999 * 999 = 998001. So, in principle, we could start from the palindrome 997799. But this saves just 2 iterations.

Having a palindrome m, we factorize it and look at all the subsets of the factorization. Assume we have one subset already, and let's call the product of those factors p. If this number p has k digits (k = 3 for now) and the number m/p has k digits, then we have found the palindrome which is a product of two k-digit numbers.

In Scheme that will be written as:

  ;(display (list n '= (car factors) '* (/ n (car factors)))))))

Here I used a few new util functions:

  1.  

which make numbers out of their base-k representation.
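The idea of iterating over palindromes can be sketched in Python as follows. Note the swap: instead of enumerating subsets of the prime factorization, this sketch tests k-digit divisors by plain trial division, which is simpler to show but not the approach of the listing above. All names are hypothetical, and only palindromes with 2k digits are generated.

```python
def palindromes_descending(k):
    """Yield all 2k-digit palindromes, largest first, by mirroring the left half."""
    for left in range(10 ** k - 1, 10 ** (k - 1) - 1, -1):
        yield int(str(left) + str(left)[::-1])


def has_k_digit_split(m, k):
    """True if m = p * q with both p and q having exactly k digits."""
    low, high = 10 ** (k - 1), 10 ** k - 1
    for p in range(high, low - 1, -1):
        if p * p < m:
            break                      # the cofactor would exceed p: already covered
        if m % p == 0 and low <= m // p <= high:
            return True
    return False


def largest_palindrome_by_palindromes(k):
    """First palindrome, scanning downwards, that splits into two k-digit factors."""
    for m in palindromes_descending(k):
        if has_k_digit_split(m, k):
            return m
    return None


print(largest_palindrome_by_palindromes(3))
```

For k = 3 the scan starts at 999999 and stops at the first palindrome that has a suitable split.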

Complexity now is hard to calculate. The worst case scenario gives quite a bad upper boundary. However, the worst case will never be realized.

Looking at what it gives out (9009, 906609, 99000099, 9966006699, 999000000999, ...), I could guess that the required palindrome is found after roughly iterations. So, in total I hope for less than complexity.

The memory use depends on the factorizations: we store one whenever a palindrome is taken and discard it when we proceed to the next palindrome.

All subsets of a set

July 2nd, 2009 by Ivan Lakhturov | 3 comments | Category: Programming

As I already posted in Scheme, a function computing all subsets of a set would be:

  1.  

The same in Haskell:

  s = "abcd"" Set: "" Subsets: "

For imperative languages I'd prefer the bitwise approach. Here it is in C#:

  "abcd"
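Both approaches can be sketched side by side in Python (the names are mine, not those of the original listings). The recursive version builds the subsets of the tail and then adds the head to each of them; the bitwise version treats every number from 0 to 2^n - 1 as a mask selecting elements.

```python
def subsets_recursive(items):
    """All subsets of a list; the empty subset comes first, the full set last."""
    if not items:
        return [[]]
    rest = subsets_recursive(items[1:])
    return rest + [[items[0]] + subset for subset in rest]


def subsets_bitwise(items):
    """All subsets of a list, one per bit mask from 0 to 2**n - 1."""
    n = len(items)
    return [[items[i] for i in range(n) if (mask >> i) & 1]
            for mask in range(1 << n)]


s = list("abcd")
print(" Set: ", s)
print(" Subsets: ", subsets_bitwise(s))
```

The two functions return the same subsets, only in a different order.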

Problem 2 ver. 2, 3, 4: logarithmic complexity

April 17th, 2009 by Ivan Lakhturov | 0 comments | Category: Programming

Find the sum of all the even-valued terms in the Fibonacci sequence which do not exceed four million.

Last time we had the straightforward O(n) solution: building a sequence, filtering out even values and adding them. We can improve a bit by noticing that every third member of the Fibonacci sequence is even. Then we don't check for evenness, but just jump over three members each time. This version 2 (I don't publish it here) should be several times faster, but is still O(n) in performance.

We can also express a member of the Fibonacci sequence via the third and sixth members behind it, $F_n = 4F_{n-3} + F_{n-6}$, and compute the even values as the values of a new sequence, $E_k = 4E_{k-1} + E_{k-2}$ with $E_k = F_{3k}$. This version 3 is essentially the same as the previous one and again, I don't publish it here.

The drastic improvement is obtained using the expression $\sum_{i=1}^{k} F_{3i} = (F_{3k+2} - 1)/2$ (I added it and a proof to the Wikipedia article, but they immediately reverted my changes as "unsourced", which is pathetic). Now the sum is obtained by computing just one Fibonacci member, and this can be done in O(log n).

Indeed, we can compute a Fibonacci member by exponentiating the appropriate matrix, and this exponentiation, just like the usual one, can be done in O(log n). I prefer this solution over using the golden ratio exponentiation formula (again logarithmic complexity), because only integer operations are involved. So, this is version 4 of the solution.
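Since versions 2 and 3 are not published, here is a rough Python sketch of what version 3 could look like. It relies on the recurrence E_k = 4 E_(k-1) + E_(k-2) for the even members E_k = F_(3k), starting from E_0 = 0 and E_1 = 2; the function name is invented for the example.

```python
def even_fibonacci_sum(limit):
    """Sum of the even Fibonacci numbers not exceeding limit, still O(n)."""
    total = 0
    prev, curr = 0, 2              # E_0 = F_0 = 0, E_1 = F_3 = 2
    while curr <= limit:
        total += curr
        prev, curr = curr, 4 * curr + prev   # jump straight to the next even member
    return total


print(even_fibonacci_sum(4_000_000))
```

No evenness check is needed, because the loop only ever visits 2, 8, 34, 144 and so on.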

  1.  

I quickly outlined a class for 2D matrices and operations with it:

  1.  

The solution is O(1) in memory and O(log n) in performance, where n, of course, denotes the index of a number in the Fibonacci sequence. But the problem asks about the members of the sequence that do not exceed a certain number. Here an additional function (closest-fibonacci-index) comes in handy (see the wiki for explanation):

  1.  

The final touch is asking ourselves about complexity of the (log) function. Well, it can be computed fast enough not to spoil complexity of the algo's main part.
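To tie version 4 together, here is a compact Python sketch under a few assumptions of mine: matrices are plain nested tuples instead of a matrix class, the closed form used for the sum of even members is (F_(3k+2) - 1) / 2, and the index is first estimated with a floating-point logarithm and then corrected with exact integer comparisons. The Python names only mirror the spirit of the Scheme ones.

```python
import math


def mat_mul(x, y):
    """Product of two 2x2 integer matrices given as nested tuples."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def fibonacci(n):
    """F_n by binary exponentiation of [[1, 1], [1, 0]]; F_0 = 0, F_1 = 1."""
    result = ((1, 0), (0, 1))      # identity matrix
    base = ((1, 1), (1, 0))
    while n > 0:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result[0][1]


def closest_fibonacci_index(limit):
    """Largest n with F_n <= limit, for limit >= 1."""
    phi = (1 + math.sqrt(5)) / 2
    n = int(math.log(limit * math.sqrt(5) + 0.5, phi))   # floating-point estimate
    while fibonacci(n + 1) <= limit:                      # exact correction upwards
        n += 1
    while fibonacci(n) > limit:                           # exact correction downwards
        n -= 1
    return n


def even_fibonacci_sum_log(limit):
    """Sum of even Fibonacci numbers <= limit via a single Fibonacci member."""
    k = closest_fibonacci_index(limit) // 3   # F_3, F_6, ..., F_3k are the even ones
    return (fibonacci(3 * k + 2) - 1) // 2


print(even_fibonacci_sum_log(4_000_000))
```

The correction loops are there only because the logarithm is computed in floating point; they run a constant number of times, so the logarithmic complexity is kept.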

Problem 1 ver. 3: optimization

April 5th, 2009 by Ivan Lakhturov | 0 comments | Category: Programming

Find the sum of all the multiples of 3 or 5 below 1000.

Let us generalize again to a finite set of factors.

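There is a formula for the power of a union of two finite sets

$$|A \cup B| = |A| + |B| - |A \cap B|,$$

which can be generalized to a finite number of finite sets

$$\left|\bigcup_{i=1}^{n} A_i\right| = \sum_{i} |A_i| - \sum_{i<j} |A_i \cap A_j| + \dots + (-1)^{n-1} |A_1 \cap \dots \cap A_n|,$$

or in a somewhat less understandable, but concise notation

$$\mu\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{\emptyset \ne J \subseteq \{1, \dots, n\}} (-1)^{|J|-1} \mu\left(\bigcap_{j \in J} A_j\right).$$

Here $\mu$ is a measure (i.e. it commutes with the disjoint union sign) and can be replaced with $|\cdot|$, the power of a set sign, or, if we are in the natural numbers space, with the sum of elements sign, as in our case. $J$ is not a multiindex, but a subset of the natural numbers cut from $1$ to $n$.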

Now by $A$ we denote the set of all the multiples of the factors $a_i$ less than a certain number $N$, where $i$ varies from 1 to $n$ (each $A_i$ is respectively the set of multiples of a factor $a_i$). We use the above-mentioned formula to compute the measure of the union via the measures of all $A_i$ and the measures of all their finite intersections.

Suppose we have a number $a$, prime or not, and the set of all its multiples (they include only numbers less than $N$). The power of this set is of course $m = (N - 1) \operatorname{div} a$ (div operation), and the sum of its members can be calculated by the well-known formula for the sum of an arithmetic progression, $a \cdot m(m+1)/2$.

As regards the intersections, we ought to calculate the least common multiple (LCM) of the taken factors, and the intersection of their sets of multiples is just the set of multiples of that LCM. However, the current version of the solution assumes that we take primes as factors; then their LCM is just their product. When I calculate a proper LCM in Problem 5 (up to now there is a bruteforce version), I will switch this temporary version to it.

Let's see the solution. New util functions:

  1.  

The function that calculates subsets of a set:

  1.  

An important thing about this function is that it returns the empty set as the first element and the full set as the last element of the result list; all other subsets are in between. The number of subsets of a finite set of n elements is just $2^n$, so the complexity is $O(2^n)$; it would be better visible with an imperative-iterative version of this function (I'm not posting it here). As regards memory, the function generates all the subsets as lists, which in total contain $n \cdot 2^{n-1}$ elements (strange, this neat formula isn't on Wikipedia yet, I should add it there), that is, the memory load is $O(n \cdot 2^n)$. It is not such a good idea to load everything into memory, as we can rewrite this function (and the function further down in the post) iteratively with O(n) memory complexity, taking advantage of combinadics, but for now I am satisfied enough with this version.

Using the formula above, the solution is now as simple as

  1.  

With that (cdr) I cut off the empty subset, whose measure is zero (otherwise the (sum-of-one) function has to be a bit more complex).
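For comparison, here is a Python sketch of the same inclusion-exclusion computation. It departs from the version described in this post in one respect: it uses a proper LCM instead of the product of the factors, so non-prime factors work too. The function names are Python spellings I chose for the illustration, and the subsets are enumerated with itertools.combinations rather than with the (subsets) function.

```python
from itertools import combinations
from math import gcd


def lcm(a, b):
    return a // gcd(a, b) * b


def sum_of_multiples_below(limit, factor):
    """Sum of the multiples of factor that are less than limit."""
    count = (limit - 1) // factor              # how many multiples there are
    return factor * count * (count + 1) // 2   # arithmetic progression


def sum_multiples_less_than(limit, factors):
    """Sum of all numbers below limit divisible by at least one factor."""
    total = 0
    for size in range(1, len(factors) + 1):    # the empty subset is skipped
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(factors, size):
            common = 1
            for f in subset:
                common = lcm(common, f)        # multiples shared by the subset
            total += sign * sum_of_multiples_below(limit, common)
    return total


print(sum_multiples_less_than(1000, [3, 5]))
```

The amount of work depends only on the number of factors, not on the limit.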

Let's be careful with notation: n here is actually not the same n as in the (subsets) function, but the number N up there, the upper limit of our sets of multiples. The performance complexity depends on k and N, but we are interested only in how it depends on N. Let's assume that k is small compared to N, which should be the usual case. Then the complexity is, roughly speaking, O(1), i.e. it doesn't depend on N, as we wanted (I remind you that in the previous version we had O(N) complexity).

The final touches are the regression tests:

  ;(assert (=
  ;        (sum-list (multiples-less-than-bruteforce 1000 '(3 5 15)))
  ;       (sum-multiples-less-than 1000 '(3 5 15))))

The last commented one breaks, of course, as 15 is not prime - the LCM algo still has to be updated.

Learned a bit about PLT-Scheme

March 20th, 2009 by Ivan Lakhturov | 0 comments | Category: Programming

I've looked through the "Quick: An Introduction to PLT Scheme with Pictures" document. And what I've learned is:

  • There is a library (#lang slideshow) embedded into PLT Scheme, which provides some easy-to-use graphic primitives, and a GUI library (scheme/gui/base). The first can be used for drawing on the GUI's canvases.
  • There is an OOP library (scheme/class). I should look into what its backbone is later.
  • PLT has a distribution system for libraries. The first eval of (require (planet something)) downloads 'something' from the PLaneT server and caches it locally.

The last is quite nice; I should run through that server and see which libraries are actually available.