Thursday, January 10, 2013

Genetic Algorithms. Lame Example - Solving Quadratic Equation

Source code to this article may be found here.

There are numerous resources on the Internet, that provide description of the theory of Genetic Algorithms and theoretical explanation thereof. I, however, have found a bit more then none giving a real example (I may have not searched that good, though). Therefore, I decided to try and implement the theory into a live example. While there are thousands of areas where GA may be applied, I decided to choose trivial quadratic equation solution process for the sake of simplicity of the implementation. The equation is y = 13x^2 - 5x - 12. It has two roots (points where its graph crosses the X axis) at x = -0.7875184 and at x = 1.17213378. Although, this particular GA implementation has no application in real life (you can solve that equation on paper in several seconds), it demonstrates GA at its finest.

Data Encoding
Classic implementation of GA implies encoding of data as bit strings (chromosomes where each gene is represented by one or more bits), however, I decided to change it a bit (which is not prohibited) and use real numbers instead of the bit strings. Each chromosome includes two genes, two possible solutions, one for each root of the equation.

typedef struct _CHROMO_
{
   double   value1, value2;
   double   fitness;
}chromo_t;

The third member of the chromosome is fitness and is used only to store the fitness of the solution represented by this chromosome. The genes (values) are initiated to random values when the initial population is created:

#define POPULATION 2000

chromot_t population[POPULATION];

for(int i = 0; i < POPULATION; i++)
{
   population[i].value1 = (double)(rand() % 10) + 5.0;
   population[i].value2 = -(double)(rand() % 10) - 5.0;
}

Fitness Function
Fitness function is used to estimate the suitability of the current solution. In this particular case, the fitness function is the the sum of values produced by the equation when fed the values from the chromosome. This means that the lower the value returned by the fitness function the closed we get to the proper solution.

double fitness(chromo_t* ch)
{
   return fabs(13.0 * pow(ch->value1, 2.0) - 5.0 * ch->value1 - 12.0) + fabs(13.0 * pow(ch->value2, 2.0) - 5.0 * ch->value2 - 12.0);
}

While you iterate through the population calculating fitness for each phenotype, you should save the indices of two solutions (the amount may actually vary from case to case) as those would be used to produce the next generation.

Crossover
Crossover operator is the crucial part of any GA implementation. It is a function that takes two (or more) best phenotypes and uses them to produce a new generation. Suggestion is to crossover that pair to produce one (or more - depends on the implementation) child and then crossover the best phenotype with the rest of the population, thus, entirely replacing the old generation with the new one. In this case, the crossover function is rather trivial:

void cross(chromo_t* chMom, chromo_t* chDad, chromo_t* chChild)
{
   static int  mutation = 0; // Which operation to use for crossover
   switch(mutation)
   {
      case 0:
         chChild->value1 = chMom->value1 + chDad->value1 * MUTATION_RATE;
         chChild->value2 = chMom->value2 + chDad->value2 * MUTATION_RATE;
         break;

      case 1:
         chChild->value1 = chMom->value1 + chDad->value1 * MUTATION_RATE;
         chChild->value2 = chMom->value2 - chDad->value2 * MUTATION_RATE;
         break;

      case 3:
         chChild->value1 = chMom->value1 - chDad->value1 * MUTATION_RATE;
         chChild->value2 = chMom->value2 + chDad->value2 * MUTATION_RATE;
         break;

      case 4:
         chChild->value1 = chMom->value1 - chDad->value1 * MUTATION_RATE;
         chChild->value2 = chMom->value2 - chDad->value2 * MUTATION_RATE;
         break;

      case 5:
         chChild->value1 = chDad->value1 - chMom->value1 * MUTATION_RATE;
         chChild->value2 = chDad->value2 - chMom->value2 * MUTATION_RATE;
         break;

      case 6:
         chChild->value1 = chDad->value1 + chMom->value1 * MUTATION_RATE;
         chChild->value2 = chDad->value2 - chMom->value2 * MUTATION_RATE;
         break;

      case 7:
         chChild->value1 = chDad->value1 - chMom->value1 * MUTATION_RATE;
         chChild->value2 = chDad->value2 + chMom->value2 * MUTATION_RATE;
         break;
   }
   mutation++;
   if(mutation > 7)
      mutation = 0;
}

where the MUTATION_RATE is the speed at which solutions evolve (in this example set to 0.0001). The higher the MUTATION_RATE, the faster you may get to the proper solution, however, it may be less accurate. The best would be to reduce it while approaching the solution.

 As you may have noticed, crossover and mutation are combined in a single function, while most of the time you would want to separate them and use mutation only in case of stagnation (inability of the phenotypes to evolve into a proper solution).

Stop Condition
Not much can be said here.  You should stop the execution once you reach desired precision or once you realize that there is no solution as it is totally possible that certain quadratic equation has no roots, instead, it has its minimum or maximum. It is important to mention, that the precision and speed of certain GA implementation depends on the mutation rate and amount of phenotypes in population. In fact, you may implement another GA in order to find optimal values for the current one :).

Testing the Implemented Algorithm
This specific implementation approaches the roots of the aforementioned quadratic equation in a bit less then 23 seconds. Generated solutions are precise enough for most needs.

Let us take a look at the evolution of the solutions generated by the described implementation of GA.

This graph shows how solutions change over the execution time. 

The fitness curve looks quite similar:

Fitness curve
As you may see, GA is capable of finding the right direction in a very fast manner.


Conclusion
While this specific implementation is trivial and, to be honest, quite lame, I hope it shows that GAs are a very powerful tool. It is sage to say, that solving a quadratic equation is one of the simplest tasks where GA may be successfully applied. Genetic Algorithms of different kinds may be used for selection of Artificial Neural Network topology, different algorithm optimizations, etc. The success of certain GA only depends on the correctness of the chromosome and crossover implementations. Let me reiterate - it is always possible to implement another GA in order to get proper implementation of the current one.


Hope this post was helpful. See you at the next.



Monday, January 7, 2013

Anti Piracy? The insider's view


NOTE: This post has been modified (censored) in order to hide everything that may cause an NDA violation or harm my previous employer. Piracy will always be... Primarily, because no one is allowed to talk publicly (for normal legal reasons) and talking to vendors directly brings no result at all... The following is about as much as one can tell about piracy without a risk...

Achtung! This article may contain unpleasant language. Read it at your own risk.

There is a lot of noise about piracy (both software and media) in the past few years. The noise is at all levels, starting with the security community and ending up with governments. In my (very) humble opinion, most of this noise is a bit less then bullshit. File sharing services are a good example. I mean no one being sane would attack car vendors because people are getting killed in accidents. No one would blame hummer vendors for hummers being used as a murder weapon in certain cases. Same applies to file sharing services. Those are tools. Not more than that. All this sounds quite stupid to me. If I have an option to download a fresh release of my favorite Linux distro in a couple of minutes using torrent, instead of spending about an hour downloading it from the official site, then, damned, I will use torrent and I don't give a fuck about torrent being used by pirates. Fighting legitimate tools instead of enforcing right coding policies is a good evidence of lack of intelligence on the side of software vendors combined with ignorance of media vendors and politicians. That pisses me off big time. And I am sure I am not alone.

Enough swearing (for now), let's take a look at how things are in reality on the example of BD+. I am not going to describe BD+ internals here or give any information in addition to what is publicly available. Besides, I do think that it could be a powerful media protection (…… …………… …………… ……………… …………………… …………… …………………… ………………… ……………… …  …… … ……… …… …….. ………………… …………………. …………… …………….. …………..), unless...

Studios
Studios invest huge amounts of money into the fight against piracy. I have to admit, that unlike governments, they are investing in protection like AACS (which has been cracked long time ago and is all about hiding the key these days) and BD+ (which is good, but is fully …….. ………. ……. ……..). Honestly, almost all that money goes in vain mostly thanks to software "developers". For some (unknown) reason, studios do not want to mess with that crap. Which is quite pity due to the fact that BD media is ……… …………….. ……………. ………………… …………………. …………………… ………………… ……. …………………. ………………….. ………………….. ………………… ………………….. ………….. …………. …………….. ……………….. …………………. .

BD+
If you have not heard about BD+ yet, then this article covers it quite well. Basically saying, BD+ is a virtual machine implemented in BD players and an executable code and data supplied on BD media. ………. ………….. …………… ………….. ………………… ……………… ……………… ……………… ………… …………….. ……………… ……………… ……………… ……………. ………… ………….. ……… ……….. ……….. …………….. ………………. …………………… …………….. …………… …………… …….. …………. …………………….. …………. .

Software BD Players
There are four software BD players - PowerDVD by CyberLink, WinDVD by Corel, TotalMedia Theatre by ArcSoft and Blu-Ray Player by Nero. Just four. As vulnerable as a kitten on a busy hwy. Certain people have been praising SlySoft team for reverse engineering software players and their implementation of the BD+ VM. I am not trying to say that it is not a piece of work. Just the amount of code that has to be reversed. But is it that hard to reverse the code that is barely protected? Even a n00b malware researcher can do it, especially given the fact that modern malware, sometimes, has heavier protection t………. ……… ………  …….

Software players vendors may claim that they utilize the most recent versions of protection software like Themida and others... Well, they may keep claiming (in fact, I wrote to all four of them three months ago and got a response from only one of them this far). The facts tell us that they do not even know how to …….. …………… …………… …………….. …………… ………………….. ……………… …………… …….. …………… …………… …………….. …………… ………………….. ……………… …………… …….. …………… …………… …………….. …………… ………………….. ……………… ……………

Right now, about a year since BD players vendors …….. …….. …….. …….. …….. …….. …….. …….. in the way they "protect" their products. Everything one has to do in order to get unpacked code is …….. ……. ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… Protection you say?..

AnyDVD HD
AnyDVD HD is one of the two most popular BD rippers. ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… .

Hardware BD Players
"Our hardware BD player is impossible to hack!" C'mon! Do you really think so? That's the most stupid saying I've ever heard. Or are you trying to make a laugh of yourself? In this case it works perfectly.

It is right to say, that it is not always easy to get into a hardware BD player  ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… ……… your levels shifter to. This was the biggest problem in my case (……………………………………… ...) but not in case of Doom9. They've done that in a good way. Which is about the only extraordinary thing they did while hacking hardware implementations of BD players. In fact, there is no such thing as Hardware BD Player. There are devices that run Linux inside (or, may be another embedded OS) and have a software player that plays the media. Praising Doom9 for reversing hardware BD player is not that smart. Oh, may be just the part of getting the AACS keys. All the rest is …….. ……..…….. …….. …….. …….. …….. …….. …….. …….. …….. …….. …….. ……...

Code Protection? It seems like embedded "software developers" have not heard about it at all. ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… is simply wide opened for hacking. The possibility to login ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… should NEVER be used in production. Quite silly, if you ask me.

The best evidence of H/W BD players vendors' ignorance is the ……. ……… ……………. ……… ……………. ……… ……………. ……… ……………. ……… ……………. ……… ……………. ……… ……………. ……… ……………. ……… ……………. ……… ……….

Conclusion
Some of you may say that the article supposed to be about piracy, not about the protection of BD players. Well, it is about piracy. From my experience, BD player vendors, hardware and software as one, support piracy by constantly refusing to protect their products better. ……. ……… ……………. ……… ……………. ……… ……………. ……… ……………. ……… ……………. ……… ………. I tried to personally contact BD player vendors several months ago offering them assistance in armoring their code. Guess what was their reply? There was none... Well, there was one vendor that did reply, but they are not ready to change a thing. All the rest keep silence as if everything is good. One may say that BD+ is a pain in the ass for vendors. Well, it is. But it's vendors who make it painful instead of writing good secure code.

If I were one of the Studios, I would probably do my best to revoke all four. Especially those  ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… ……. ……… ……… The fact that BD rippers are better protected then BD players simply pisses me off big time.

The bottom line is, it would probably make more sense to require vendors to actually protect sensitive code instead of openly supporting piracy, rather then messing up with those downloading pirated content or using file sharing services. 

Stop being dumb, start acting.