BlogKontaktTagcloud

Memcache with quicklz

Memcache is a fast, distributed key-value-storage. Yes, this day you would say a non-permanent NoSQL storage, but hey memcache is around since 2003 (initially written by Brad Fitzpatrick) long before the NoSQL-Hype. Memcache has a very broad user base which includes quite a few top 100 sites. You can connect to the Memcache Server (called memcached) via an ASCII or a Binary protocol, but for the most programing language there are already client API’s that makes it easy to use memcache. For PHP there are even two libraries PECL memcached, which use libmemcached, and PECL memcache. I will use memcache for this article. But hey, that’s the easy stuff now it's time to go on.


Compressing the cache

Ok, so it would be great to compress this cache to have even more cache available. It does also make sense to compress the data in the client library; therefore you not only save time in the cache but you also get a speedup on the transfer to the cache.

And guess what, PECL memcache has already build in this feature. If you add the MEMCACHE_COMPRESSED flag to your store operation the library will compress your date by using zlib. You can even define a threshold with which every entry above a certain size will get compressed automatically.  PHP does even set this threshold to a certain value initially, but in my point of view this is a bug.

Compression is always an architectural tradeoff. By compression you gain speed on the transmission and space in the storage, but you lose time on compressing and decompressing it on the client. In nearly all cases, or in least the cases your caching is designed well, you gain much more by compression.


Compressing the cache even better

Ever since Urban’s talk about quicklz at the webtuesday I wanted to implement the memcache compression with this algorithm. Due to the high workload at eth, the project did get delayed over and over again, and I also got stuck at benchmarking memcache correctly (also because of the above mentioned bug).


So I did add quicklz to PECL memcache 3.0.5 which was not too hard. The compression happen in the file memcache_pool.c in the functions mmc_compress and the decompression in mmc_uncompress  accordingly. I added the files quicklz.c and quicklz.h to the folder and changed the functions accordingly (see the patch here). As I’m a bit lazy I ignored most of the error handling. (Compression shouldn’t fail, right!?) Don’t forget to add quicklz.c to the config9.m4 that it will be compiled as well. As PECL build & install did not work proper for me I did the installation “manual”. “phpize; configure; make; make install”, in the main source folder does the job. Thanks a lot to Alvaro Videla for the help, seems like this twitter-thingy is useful for something after all.


Benchmarking it

It was a bit hard to get a good benchmark. But I used the abstract dump from Wikipedia which is a big xml document. I splited it into chunks of 512k (“split -b 512k -a 3 enwiki-latest-abstract1.xml wiki_”, unix is awesome). I uploaded then all this files into memcached with a small PHP-script and downloaded all of them in a random order 3 times in a row. I repeated these steps over 10 iteration. I started memcached on the local machine with 10 gigs off storage and the debugging option “-vv” on an extra-large amazon ec2 instance.


Uncompressed they did need 642 MB of cache storage; the average write time is 3.89 seconds whereas the read time 5.26 seconds (keep in mind that the read time is always over three iteration).  Compressed with zlib the storage space sink to 87 MB but at the same time the read and write time increases significant. The average write time is now 27.66 seconds and the read time goes up to 19.55 seconds. But now the result we are really interested in. I used quicklz level one and did not get a big difference for level two, for level three I did get crashes so I could not do any measurements. The result for quicklz: The read values are with 6.68 seconds pretty good as this comes nearly to the uncompressed, but what is really impressive is the write per write performance which is with 3.54 seconds even better then uncompressed. Unfortunately quicklz uses with 118 MB way more storage then I did expect.


Keep in mind that this measurement are done on a unusual setting, the machine was way faster than a usual web box, so the compression did of course benefit from this. On the other hand usually the memcached (or at least not all of them) are not on the same box and the network speed matters. So if you have a slow network the size of the cache content matters a lot. And at the end you will not compress a so much data like I did, so compression speed should not make a huge difference in your webapp. Also if you have a smart caching strategy, caching will be always faster than calculating the data, no matter how fast compression is.


Conclusion

Well at the end I’m not really happy with the size of the data quicklz produces even if the speed of the algorithm is impressive. But at the end of the day for most web app the cache size matters much more than the speed, because you get a lot of speed improvement out of the cache. That’s the reason why I did not improve the patch the way one could use it in production.

Ähnliche Beiträge:
Mobile App Hackathon
Types inference FMFP
Third Week FMFP
Solution Second Week FMFP
Third Week FMFP
Comments (0)  Permalink

Mobile App Hackathon

My friend Jonas is now working for the early stage startup betacular. They organize a hackathon in the night from the 30. April to the 1. of May 2011 in Zurich where you have the possibility to use their brand new API. You not only get free food and beverages, but you might also win a 1000 CHF cash prize. Register now at http://socialapphackathon.eventbrite.com/ and let the world see that you participate at techup.ch.

Ähnliche Beiträge:
Make it human (or how to crack a CAPTCHA)
Named parameters in Java (bgl-style)
Jira status
Mailstatus in Skype
PHP Quine
Comments (0)  Permalink

Types inference FMFP

Here you have some functions were you can train your type inference skills. You might look up functions like map, +, <, ==, head, filter, elem, etc. if you don't know them by heart (but you should, you will be faster on the exam if you do). I will not publish solution for that. Use ":type" followed by a space and the function in ghci to get a result. I wish you good luck with the exam.

map ((+) 3) :: ?
(\x y -> x < y) :: ?
(\x -> x (<) + 1) :: ?
(\x z -> z (x z)) :: ?
(\x y -> x (\z -> y z)) :: ?
(\x -> x (==), 1) :: ?
filter (\x -> x elem) :: ?
map map :: ?
map.map :: ?
[\x a -> (a 1) x] :: ?
head.fst.head :: ?
map (+) [1,2,3,4] :: ?

Ähnliche Beiträge:
Third Week FMFP
Third Week FMFP
Proof by induction example
Solution First Week FMFP
First Week FMFP
Comments (0)  Permalink

Third Week FMFP

A bit late, but here you get the solutions from week three.

Usage of List

Multiply all numbers from 1 to 100 by using a prelude function (and no recursion)

foldr (*) 1 [1..100]

Sum all odd numbers bellow 50. Use a prelude function and the haskell list in a clever way.

sum [1,3..49]

Generate the string "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz" (remember a string in haskell is just a list of chars)

['a'..'z'] ++ ['a'..'z']

List functions

sum :: (Num a) => [a] -> a
sum [] = 0
sum (x:xs) = x + (sum xs)

merge :: [a] -> [a] -> [a]    -- merge two list (in haskell (++))
merge [] ys = ys
merge (x:xs) ys = x : (merge xs ys)

nthElement :: [a] -> Int -> a -- give the n-th element of a list (in haskell (!!))
nthElement (x:xs) 0 = x       -- errors ignored
nthElement (x:xs) n+1 = nthElement xs n

List comprehension

Find all numbers bellow 1000 that are devidable by 3 or 5 with a list comprehension. 

[x | x <- [1..999], (mod x 3)==0 || (mod x 5)==0]

A Proof for the end

You can also do recursiv proofs over lists. Given the following programm:

map f [] = []                    (m.1)
map f (x:xs) = f x : map f xs    (m.2)
times2 x = 2*x                   (ti)
twice [] = []                    (tw.1)
twice (x:xs) = (2* x):(twice xs) (tw.2)

Prove "FORALL xs::[int]. twice xs = map times2 xs" with recursion.

FORALL xs::[int]. twice xs = map times2 xs
Let P(xs) = twice xs = map times2 xs. 
Show FORALL xs::[int]. P(xs) holds by induction over xs

Base case: P([])
twice [] = []               (tw.1)
         = map times2 []    (m1.rev)

Step case: Show FORALL x::int, xs::[int]. P(xs) => P(x:xs) (IH = P(xs))
Twice (x:xs) = (2 * x):(twice xs)         (tw.2)
             = (times2 x):(twice xs)      (ti.rev)
             = (times2 x):(map times2 xs) (IH)
             = map times2 (x:xs)          (tw.2.rev)

Ähnliche Beiträge:
Third Week FMFP
Solution Second Week FMFP
Proof by induction example
Types inference FMFP
Solution First Week FMFP
Comments (0)  Permalink

Solution Second Week FMFP

Here is the solution for the second week extra excercise. Download it form: leo.freeflux.net/files/fmfp/CA%20Solution.pdf

Ähnliche Beiträge:
Third Week FMFP
Third Week FMFP
Proof by induction example
Memcache with quicklz
Mobile App Hackathon
Comments (0)  Permalink

Third Week FMFP

The "Listtatzelwurm" or "Listmonster" you might already know from the tutorial. It's from the site "learn you a haskell for greater good", if you need more help about list, the chapter with the Listmonster might be a help for you.

Let me give you some extra exercises for the current chapter. The solution will be online in around a week.

Usage of List

Multiply all numbers from 1 to 100 by using a prelude function (and no recursion)

Sum all odd numbers bellow 50. Use a prelude function and the haskell list in a clever way.

Generate the string "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz" (remember a string in haskell is just a ... of ...)

List functions

TA's usually just let you implement prelude functions to train list functions. So then, implement this functions recursiv:

sum :: (Num a) => [a] -> a
merge :: [a] -> [a] -> [a]    -- merge two list (in haskell (++))
nthElement :: [a] -> Int -> a -- give the n-th element of a list (in haskell (!!))

List comprehension

You already talked about list comprehension in the lecture. If you want to repeat it a bit, read the first chapter here http://www.haskell.org/haskellwiki/List_comprehension. Then find all numbers bellow 1000 that are devidable by 3 or 5 with a list comprehension. (When you sum this up you have already solved the first problem of project euler.)

A Proof for the end

You can also do recursiv proofs over lists. Given the following programm:

map f [] = []                    (m.1)
map f (x:xs) = f x : map f xs    (m.2)
times2 x = 2*x                   (ti)
twice [] = []                    (tw.1)
twice (x:xs) = (2* x):(twice xs) (tw.2)

Prove "FORALL xs::[int]. twice xs = map times2 xs" with recursion.

Ähnliche Beiträge:
Third Week FMFP
Proof by induction example
Types inference FMFP
Solution Second Week FMFP
Solution First Week FMFP
Comments (0)  Permalink

Proof by induction example

I just want to write down the proof that we did as an example in the last exercise session as you might want to use it as a reference for the current exercise sheet.

If we have the following haskell programm:

sum 0 = 0              --sum.1

sum n = n+sum (n-1)  --sum.2

And you want to prove forall n element of Nat. sum n = n*(n+1)/2 then we write the following proof:

Lemma: forall n element of Nat. sum n = n*(n+1)/2
Proof: We show forall n element of Nat. sum n = n*(n+1)/2 by induction. Let P(n) = sum n = n*(n+1)/2.

Base Case: Show P(0).

sum 0
= 0 -- sum.1
= 0*(0+1)/2 –- arith


Step Case: Let n element of Nat be arbitrary and assume that P(n) holds. Show P(n+1).

sum (n+1)
= (n+1) + sum ((n+1)-1) -- sum.2
= (n+1) + sum (n) -- arith
= (n+1) + n*(n+1)/2 -- P(n)
= 2*(n+1)/2 + n*(n+1)/2 --arith
= (n+2)*(n+1)/2 --arith
= (n+1)*((n+1)+1)/2 --arith

qed.

(The italic text of the proof might be used as a template for all recursiv proofs.)

Ähnliche Beiträge:
Third Week FMFP
Third Week FMFP
Types inference FMFP
Solution Second Week FMFP
Solution First Week FMFP
Comments (0)  Permalink

Second Week FMFP

For the second week I will give you an excercise sheet created by former FMFP TA Martin Schaub. You find the sheet here. I will put the solution for this sheet online during this week.

The Solution for the official excercise is already online at https://www1.ethz.ch/infsec/education/ss11/fmfp/fp_material_secured.

Ähnliche Beiträge:
Welcome to FMFP'11
Memcache with quicklz
Mobile App Hackathon
Types inference FMFP
Third Week FMFP
Comments (0)  Permalink

Solution First Week FMFP

The master solution of the official excercise are online since quite some time on https://www1.ethz.ch/infsec/education/ss11/fmfp/fp_material_secured. Now I also put the solution for the first weeks extra exercises online.

The calculation in GHCi could look like:

Prelude> let x=31
Prelude> let y=11
Prelude> x+y
42
Prelude> x-y
20
Prelude> x*y
341
Prelude> div x y
2
Prelude> mod x y
9
Prelude> x^y
25408476896404831
Prelude> sqrt (fromIntegral x)
5.5677643628300215
Prelude> (fromIntegral x) ** 1/3
10.333333333333334

There are different methods to implement the n-th Factorial. Let me show you some examples. First with if, which is often considered as bad style.

fac n = if n==0 then 1 else n*(fac (n-1))

It's possible to implement it with pattern matching.

fac' 0 = 1
fac' n = n * (fac' (n-1)) 

Or with guards.

fac'' n = 
	| n==0 = 1
	| otherwise = n * (fac'' (n-1))

If this is all to easy for you. You might implement a endless list of factorials (this is beyond the course scope).

fac''' = 1:zipWith (*) fac''' [2..]

The proof for "kn(<k1,k2>), kn({s}k1) |- kn(s)" in ASCII-Art:

G := kn(<k1,k2>), kn({s}k1)

                    ------------------Ax
                     G |- kn(<k1,k2>)
----------------Ax  ------------------ Pair-EL
 G |- kn({s}k1)        G |- kn(k1)
------------------------------------ Enc-E
 G |- kn(s)

Ähnliche Beiträge:
First Week FMFP
Types inference FMFP
Third Week FMFP
Third Week FMFP
Proof by induction example
Comments (0)  Permalink

First Week FMFP

For the first week I got you some simple extra exercises. Most of them were already covered in the lecture or in the tutorial. But probably it might be helpful to repeat it by your own.

Start GHCi and define y and x by typing "let x= 31" and "let y=11". So now do some math with this directly in the GHCi-Console.

  • Add x and y
  • Subtract y from x
  • Multiply x and y
  • Divide 3 by 5
  • Divide x by y in such a way that you get the next lower Integer
  • Find the modulo of x divided by y
  • Get x to the power of y
  • Get the square root of x
  • Get the cube root of 27 (Hint: If you need help, look how the square root is defined in prelude)


Define the function for the n-th Factorial. You might try the different methods in haskell to get such a function.

Prove kn(<k1,k2>), kn({s}k1) |- kn(s) with the knowledge derivation rules in http://people.inf.ethz.ch/meiersi/teaching/fmfp10/tut-sheet1.pdf (that should be now really easy for you). Solve the knowledge proofs in this file without looking at the solution.

I will put up some solution for this some when next week.

Ähnliche Beiträge:
Solution First Week FMFP
Types inference FMFP
Third Week FMFP
Third Week FMFP
Proof by induction example
Comments (0)  Permalink
Next1-10/437