A brief update on current progress
Currently evolution@home is in a transition period. Before 2011 there was no full-time staff and resources were extremely tight in many respects - despite the many things that needed to be done. Thus progress in general was very slow.
Things changed when Laurence Loewe, the founder and lead developer of evolution@home, moved to the University of Wisconsin-Madison at the end of 2010. For the first time there was an opportunity to expand the operation and develop evolution@home in a more systematic way.
2011 was characterized by starting up the research group that will continue to develop evolution@home. Currently we are working through a large number of details that are important for developing the science, code base and back-end infrastructure needed to analyze the results computed by evolution@home. The lack of a scalable way of analyzing incoming results is the biggest obstacle slowing down the current expansion of evolution@home.
On the scientific front, work continues on theoretical models of evolution and on methods for estimating parameters that are important for such models. Recent work also explores how systems biology can contribute towards answering some of the most difficult questions in evolutionary biology. Related work on formal modelling methods shows how some of the simulation approaches used in systems biology can also greatly simplify simulation work in evolution and ecology. These foundational insights are important for developing the biology and computing technology behind future simulators. Currently, the implications of this work for evolution@home development are being worked out.
Modernizing the code base of Simulator005
Porting. The currently published Simulator005 was built from a code base using the very dated CodeWarrior compiler. Thanks to the efforts of Matt Myers, this code base has been ported to a more modern platform. Some further testing will be necessary before a newer version of Simulator005 is released. This work is important for developing the back-end infrastructure as well as for advancing our understanding of how mutations accumulate in non-recombining populations.
Fully automated global computing. Many of you have repeatedly requested basic global computing features such as full automation of task distribution, results submission and high scores updates. While the first BOINC-evolution@home prototype developed by Rechenkraft.net's "yoyo" project has now been running for some time, a number of tasks remain.
- Data storage. The large number of new results demands a new and more scalable way of handling and storing the data. Dealing with the complexity of the data management problems is particularly challenging. Evolution@home is one of those global computing efforts that are more of the 'ant-hill' type than of the 'ruby-in-the-rubbish' type. The latter just need to write results to archives and keep at hand those few records that exceed a specific score of interest. Ant-hill-type problems, in contrast, need to keep almost all results at hand, because they are typically used in combination to understand trends in the big picture. Thus they require special data warehouses that need considerable infrastructure to operate and to grow with the needs. Such an infrastructure is under development at the moment.
- High scores improvements. Currently all BOINC results are listed only as one entry, without the possibility of more detailed rankings as provided by the current high scores. Also, these high scores are still computed semi-automatically. This will change as soon as the new back-end infrastructure becomes operational.
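The difference between the two kinds of problem described in the data storage item above can be illustrated with a minimal Python sketch. It is not the evolution@home back-end; the `Result` record, its fields and both function names are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Result:
    params: tuple   # hypothetical parameter combination of one simulation run
    score: float    # hypothetical summary value reported by that run


def keep_rubies(results, threshold):
    """'Ruby-in-the-rubbish': keep only the few records above a score of interest."""
    return [r for r in results if r.score > threshold]


def summarize_ant_hill(results):
    """'Ant-hill': every record is needed, because trends only emerge
    when all runs for a parameter combination are combined."""
    groups = defaultdict(list)
    for r in results:
        groups[r.params].append(r.score)
    return {params: mean(scores) for params, scores in groups.items()}
```

In the first case storage needs stay small no matter how many results arrive; in the second the whole collection must remain queryable, which is why a dedicated data warehouse becomes necessary.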
A major emphasis of our current work is the development of a new simulator that allows much greater flexibility in scheduling simulations, because the simulation model is specified in a biologist-friendly model description language. This will break the "one model = one executable" paradigm that facilitated an easy start for evolution@home, but also placed substantial constraints on long-term usability. Current work focuses on:
- Model description language. This language should be as elegant, concise and expressive as possible to inspire biologists to build more quantitative models.
- Simulation algorithms. The language can only be as good as the simulation algorithms behind it. Thus we are currently investing heavily in understanding, implementing and improving the best-of-breed algorithms.
- Storing data appropriately. The way data is stored in the simulator needs to be compatible with the back-end interfaces. We are currently looking into the most efficient ways of storing data using the HDF5 data format.
- Checkpoints. Due to the absence of checkpoints in the current Simulator005, the length of BOINC simulations is seriously limited. We are looking into ways to solve this problem for good.
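To illustrate why checkpoints remove the limit on simulation length, here is a minimal Python sketch of one common approach: periodically write the full simulation state, including the random number generator state, to a temporary file and atomically swap it into place. It is only a sketch under stated assumptions, not the mechanism planned for the new simulator. The toy random walk, the function names and the `stop_after` parameter (which merely simulates an interruption) are all hypothetical.

```python
import os
import pickle
import random
import tempfile


def save_checkpoint(path, state):
    """Write to a temporary file first, then atomically replace the old
    checkpoint, so an interruption never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as handle:
        return pickle.load(handle)


def run(path, total_steps, checkpoint_every=100, seed=1, stop_after=None):
    """Toy random walk standing in for a simulation; resumes from `path`."""
    rng = random.Random(seed)
    state = load_checkpoint(path)
    if state is None:
        state = {"step": 0, "value": 0.0}
    else:
        rng.setstate(state["rng"])  # continue the same random sequence
    while state["step"] < total_steps:
        state["value"] += rng.gauss(0.0, 1.0)
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            state["rng"] = rng.getstate()
            save_checkpoint(path, state)
        if stop_after is not None and state["step"] >= stop_after:
            break  # simulated interruption
    return state
```

Because the generator state is saved together with the model state, a run that is interrupted and resumed produces the same result as an uninterrupted one; only the work done since the last checkpoint is repeated.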
Computing and analysis progress of evolution@home
Computing progress. A big thank you to all who contributed over 600 years of CPU time to what is already the world's largest database of Muller's ratchet simulation results. We have completed our first major computing projects (P1+P2) and are now in the process of crunching through the challenging part of the next project (P3). A new project is in preparation as well.
Biological analysis. Biological analysis continues. We will move analysis of Simulator005 results to the new back-end infrastructure as it becomes available. To find situations where Muller's ratchet is of biological importance, we investigate which parameter combinations apply to real organisms. To do this, we compile enough details about these organisms to obtain reasonable upper and lower estimates for the parameters in question (such as population size and mutation rate). We are also analyzing some theoretical questions that help us better understand how Muller's ratchet contributes to mutation accumulation in asexual populations. The results of these analyses are then written up and reviewed by other scientists for accuracy. Only after that will we publish them here.
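For readers unfamiliar with Muller's ratchet, the following toy Python sketch shows the basic effect in an asexual population without back mutations. It is a textbook-style Wright-Fisher illustration and not the algorithm used in Simulator005. The function names, the multiplicative fitness form (1 - s)^k and all parameter values in the usage note are assumptions chosen for illustration.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson-distributed integer (Knuth's method, fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_ratchet(pop_size, mutation_rate, selection, generations, seed=0):
    """Return the smallest mutation count in the population after each generation.

    Each individual carries k deleterious mutations and has relative fitness
    (1 - selection) ** k, with 0 <= selection < 1. Offspring inherit the
    parent's mutations (no recombination, no back mutation) plus a Poisson
    number of new ones.
    """
    rng = random.Random(seed)
    loads = [0] * pop_size
    least_loaded = []
    for _ in range(generations):
        weights = [(1.0 - selection) ** k for k in loads]
        parents = rng.choices(loads, weights=weights, k=pop_size)
        loads = [k + poisson(rng, mutation_rate) for k in parents]
        least_loaded.append(min(loads))
    return least_loaded
```

Calling, for example, `simulate_ratchet(50, 0.5, 0.01, 100)` returns a sequence that never decreases: once the least-loaded class is lost by chance, it cannot be recreated, which is one "click" of the ratchet. How quickly the clicks occur depends on parameters such as population size and mutation rate, which is why realistic upper and lower estimates for them matter.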
We are in the process of reorganizing our web presence to make it easier to learn about evolution@home. More in due time.