Saturday, 29 January 2011

So, it works.

After another little while pottering away this afternoon, I've managed to get the code for the big GA into some kind of shape. All the essentials are there, as it were:
- chromosomes and genes
- a basic fitness function
- a basic parent-selection method
- a basic crossover system
- a basic mutation system

I've knocked together a rough initial dataset, and it's generating the sort of results that I'd expect; trending from worse to better. This is good :-)

Now we move onto the actual tricky part, which is working in the various other bits of the requirements, like the quality of service restrictions and so on. First up will be fixing the fitness function to be a bit more standardised.

Step 1: types of QoS attributes
First thing to consider is whether we should have different methods of combining attributes over the service. My initial reaction is "yes", and here's why:
For something like "cost", all we need to do to work out the total cost of a particular set of services is to add up the cost of each of them. Easy. But for something like "Availability", expressed in a percentage, we can't really just add up all the availability scores, can we? That'd produce an oddly skewed result at the end.
eg: we have three services, each of which has a 90% chance of being available. If we add those up, we get an availability score of 270 for the operation. We could divide that by the number of services to get back to the average availability, but that still only takes us to 90% -- in fact, the aggregate availability (the chance that all three services will be available during the invocation) is only 73%.
So, which will serve the algorithm better - the average availability of the services, or the actual computed aggregate availability for the whole set? I'm not sure. Possibly I should test both :-)

Step 2: weighting factors
Also, we should take into account a couple of other factors when calculating the fitness of a chromosome: partly a user-defined weighting (eg: if they would rather the algorithm brought the cost of the service composition down, rather than the execution time), and partly a dynamic weighting to try and meet any particular requirements (eg: if the maximum cost of the service composition is set to £10, and none of the current solutions match that, then the weighting factor assigned to the cost of the service composition should be increased).
What this means is that I'll have to take the fitness calculation of a chromosome out of the chromosome itself, and into the main population object; that way I can store a series of weighting factors that can be changed quickly and easily throughout the execution time.

No comments:

Post a Comment