Sunday, November 21, 2010

An observation

I am currently experimenting with a neuroevolutionary algorithm based on two of my previous algorithms: the one from my master's and doctoral dissertations, and the one built on ideas from gene regulatory networks. Since the work is in progress, I think it's too early to describe the algorithm in detail; for now it's enough to say that it evolves both the structure and the weights of the neural network, with an emphasis on growing networks but with the ability to reduce network size if necessary.

One of the features I'd like to see in my algorithm is self-organization, because I believe it is a key feature for building complex systems that can adapt to a wide variety of changes.

Here is a very first result (which, hopefully, is free of program bugs). One of my standard tasks for checking an algorithm's functioning is the well-known XOR problem: the ANN should be trained to implement the logical eXclusive OR operation. Though this problem is rather easy, it requires an ANN able to perform some intermediate computation, due to the nature of the XOR output. Thus the NE algorithm has to find an ANN with hidden units and a proper structure and weights for this problem. To track the algorithm's performance I've plotted the dynamics of the minimal fitness, the average number of hidden nodes, and the average number of connections. The result (without averaging over multiple runs, be careful!) is:


The generation number is plotted along the horizontal axis, and there are two vertical axes: the left one for the minimal fitness and the right one for the average numbers of hidden nodes and connections.
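As a concrete illustration of what the NE search is optimizing here, below is a minimal sketch of XOR fitness evaluation. This is my own illustration, not the actual code of the algorithm: the fixed 2-input/n-hidden/1-output shape, the sigmoid activation, and the sum-of-squared-errors fitness are all assumptions.

```python
import math

# XOR truth table: the four input patterns and their targets.
XOR_CASES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_hidden, w_out):
    """Evaluate a 2-input / n-hidden / 1-output feedforward ANN.

    w_hidden: one (w1, w2, bias) triple per hidden node;
    w_out: (weight per hidden node..., bias) for the output node.
    """
    hidden = [sigmoid(w1 * inputs[0] + w2 * inputs[1] + b)
              for (w1, w2, b) in w_hidden]
    s = sum(w * h for w, h in zip(w_out[:-1], hidden)) + w_out[-1]
    return sigmoid(s)

def fitness(w_hidden, w_out):
    """Sum of squared errors over the XOR cases; 0 is a perfect network."""
    return sum((forward(x, w_hidden, w_out) - t) ** 2 for x, t in XOR_CASES)

# A hand-picked solution with two hidden nodes (an OR-like and a NAND-like unit).
w_hidden = [(10.0, 10.0, -5.0), (-10.0, -10.0, 15.0)]
w_out = (10.0, 10.0, -15.0)
print(fitness(w_hidden, w_out))  # close to 0
```

A network like the hand-picked one above, with two hidden units, is exactly the kind of structure the NE search has to discover on its own.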

The most interesting thing is that there is a kind of plateau in the average number of hidden nodes. Each node can be added to or deleted from the ANN under certain probabilistic conditions, which use information about the current ANN structure. It looks as if the algorithm found the optimal number of hidden nodes by itself, and the further search is conducted over different connection schemes and combinations of weights. Since the results are (very) preliminary it is too early to judge the algorithm's properties, but I believe there is hope of constructing an NE algorithm which shows similar behaviour on many other problems.

One possible improvement here is to temporarily disable node addition/removal, which may speed up the search because this way the search becomes more 'concentrated' (more 'local', if one could say so). I just need to find a good condition to toggle this regime on.
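Purely as a sketch of what such a toggle condition might look like (the plateau test on the recent history of average hidden-node counts is my guess, not a rule the algorithm actually uses):

```python
def structural_mutation_enabled(node_history, window=10, tol=0.05):
    """Decide whether node addition/removal should stay enabled.

    node_history: average hidden-node counts per generation (most recent last).
    Returns False (freeze structure, search only weights and connections) once
    the node count has plateaued: its spread over the last `window` generations
    is within a fraction `tol` of its mean.
    """
    if len(node_history) < window:
        return True  # not enough history yet, keep exploring structure
    recent = node_history[-window:]
    mean = sum(recent) / window
    if mean == 0:
        return True
    spread = max(recent) - min(recent)
    return spread / mean > tol

# A plateaued history -> structural mutations get frozen.
print(structural_mutation_enabled([5.0, 5.1, 5.0, 4.9, 5.0] * 2))  # False
# Node count still growing -> keep them on.
print(structural_mutation_enabled([2, 3, 4, 5, 6, 7, 8, 9, 10, 11]))  # True
```

The same test run in reverse (spread grows again, e.g. after the environment changes) would re-enable structural search, which keeps the regime switch adaptive rather than one-shot.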

Sunday, October 17, 2010

How can study of complex adaptive systems contribute to the neuroevolution?

The complex adaptive systems (CAS) domain studies the properties of large systems consisting of heterogeneous elements. Research interest is focused on system-level phenomena like self-organization, adaptation, emergent properties, etc. One of the main organizations dealing with CASs is the well-known Santa Fe Institute.

It's not surprising that CAS research also concerns neural networks. Quite a number of publications deal with multi-agent systems in which each agent is driven by its own neural network, which in most cases evolves during the experiment; examples include publications by members of evolutionary-robotics groups. Thus one can say that CAS research utilizes neuroevolutionary algorithms to study system-level behaviour. This is possible because evolving ANNs tend to create more complex agent behaviours, which leads to more complex interactions. A good example is an evolving predator-prey system where both the predator's and the prey's movement was driven by their evolved ANNs. As time passed, several strategies were tried by both sides, and each new strategy was 'invented' to compete with the current opposing strategy. Here is an illustrative picture from Floreano D, Keller L (2010) Evolution of Adaptive Behaviour in Robots by Means of Darwinian Selection. PLoS Biol 8(1): e1000292. doi:10.1371/journal.pbio.1000292


But this is about how NE can help CAS. Is there any 'feedback'? How can CAS contribute to neuroevolution? I think that, since an ANN is a complex adaptive system itself, CAS research can help answer the following questions:
  • how to evolve irregular-structured ANNs which possess some desired system-level properties *and* are robust and reliable.
  • how neural modules, consisting of multiple nodes, can be connected and arranged to perform some specified task.
  • what the ways and principles are to dynamically change the ANN structure and the interconnections between modules and nodes, to provide adaptivity in a changing environment.
These may not be all the questions but, as I believe, they are of great importance for creating reliable and adaptive neuroevolutionary algorithms which perform well in off-line and on-line learning.

Monday, July 12, 2010

Lost in motivation

Motivation is a really tricky thing. Sometimes it seems that it's motivated people who keep the Earth rolling (not Ralf Ringer shoes :)). Recently I've been thinking about a motivation system for under- and post-graduate students, in order to improve their results and gently push them towards self-study.

To tell the truth, I was inspired by Steve McConnell's book "Professional Software Development", where he describes the professional development programme used at the Construx company. It gives quite a clear picture of professional growth based upon improvement of a SW developer's experience (side remark: funny, I've just noticed that SW stands for both 'Software' and 'Star Wars' :)), and it was created to motivate self-study, professionalism in general, communication, etc.

So here is a sketch of what came to my mind. But first, some axioms which I've tried to follow:
- personal relations come first.
- benefit for the group.
- reasonable formalism -- there should be clear rules, but these rules should not constrain one's scientific growth and trajectory.
- the motivation system should not lead to envy, misjudging, grievance, etc. (hopefully)

...a-a-and here it comes:
==============================================
A Student of the Month title is established, accompanied by a humble reward. The winner is determined by the following measures (how the final decision is calculated is yet to be thought through):
- Defense of the thesis.
- Passed exam.
- Personal grant/scholarship obtained.
- Participation in winning a group grant or contract (helping with the application, doing research for the scientific reserve, writing reviews, etc.).
- Awards in competitions.
- Receiving professional certificates.
- Documents, which acknowledge usage of scientific results in real systems.
- Published papers in journals, conference proceedings.
- Oral and poster presentations given at conferences.
- Reports on inner group seminars.
- Scientific service: organization of conferences and workshops, reviewing for journals.
- Source code of algorithms and modules, which can be used by other members of the group (with a short how-to documentation/manual). Open-source is highly encouraged.
- Scientific web-publication (posts in blogs, personal web-site etc.)
- Attracting new members to the group.
- Various proposals to improve group's organization and working.
- Transition to the next Level (see below).
- Marriage, birth of the child and so on.

Attributes (to think further):
- Challenge cup
- Hat/Crown/Helmet of the Winner.
- (something else).

Remarks:
- The winner is defined at the beginning of each month.
- There might be no winner at all.
- If an uncertain situation arises, the group members decide the winner themselves (via open or secret voting or something like that).
- Think about Student of the Year (with extended reward) and Student of the Half-Year.
==============================================

Levels - designed to show one's progress and scientific growth.

Level 1. Entry level. Very little experience in scientific research, writing, programming, etc.

Level 2. Small experience in programming in the major domain. Knowledge of essential books (2-3) and some publications related to thesis. Short survey on the thesis theme.

Level 3. Average programming experience in the major domain. Quite good knowledge of books (5-6) and publications related to the thesis. Knowledge of essential books (2-3) on neighboring themes. Good knowledge of publications by leading specialists during the last 3 years. The emergence of a second major. 1-2 written thesis chapters or 2-3 good journal papers. Scientific consulting/joint research with another student.

Level 4. Programming experience in the major domain significantly above average. Very good knowledge of books (9-10) and publications related to the thesis. Good knowledge of books (5-6) on neighboring majors. Good knowledge of publications by leading specialists during the last 5 years. The emergence of a second major. Thesis draft, or 4-5 good journal papers, or participation in writing a joint monograph.

Level 5. Ready for independent scientific work. Defended PhD thesis. Great knowledge of books (14-15) and publications related to the thesis. Good knowledge of books (6-7) on neighboring majors. Very good knowledge of publications by leading specialists in the major during the last 5 years, and good knowledge of publications by leading specialists in the neighboring majors during the last 3 years. Reasonably good command of the second specialization. Own published monograph.

Notes:
- levels are necessary for writing applications for grants and projects. A student can recruit other group students only if he or she has level 3 or higher. Besides, only students with a lower or equal level can be taken as collaborators. Only students at level 5 can invite PhDs as collaborators. I hope this rule will push students to advance to higher levels in order to run their own projects.
- transition to each level can be made even if some desirable achievements are missing (for example, level 5 can be reached without a defended PhD).
- to get credit for reading a book, the student should give a report at the group seminar stating what information from the book can be used, and in what way, to reach the dissertation goals. Some analysis of the approach should be made, and it is quite desirable to have a discussion on the theme. Short excuse-phrases like "I'd like to use method X, I hope it'll work" should not score at all; there should be some reasoning behind it.
- a student who only reads books and does nothing else should not be promoted to the next level. Knowledge is good only when combined with practice.
- a book does not have to be scientific in the first place. It's enough that the book contains useful ideas or approaches which can be used for the dissertation (good examples: biographies of famous people, books which develop imagination, like 'Alice in Wonderland', etc.).

Wednesday, July 7, 2010

Reservoir neural network (a concept)

Intro
Sometimes rather interesting ideas come into my mind; I promise myself to implement them, then put them aside, so that they just pile up in the back of my memory. Some details disappear with time, hence I think it's worth writing these ideas down somewhere so that later they could be useful for me or somebody else.

Description

Goal: General-purpose learning with self-adaptation of the artificial neural network’s (ANN) structure and parameters.

Idea: The idea is inspired by the influence of biochemical reactions and spatial relations on the brain's functioning, which is not considered in most known ANN models. To implement this, the ANN is placed inside an expandable virtual 2D or 3D reservoir which can have different zones affecting signal transmission, node activation, learning rates, etc.

The network’s structure and reservoir parameters change over time in the following ways:
- New nodes and connections can appear.
- Some nodes and connections can be removed dynamically (to implement forgetting and/or to give other nodes more space to function).
- Nodes can change their location, moving towards "coherent" nodes, so that nodes with correlated outputs tend to be located closer together, forming structures.
- Connections can change their weight and length (the latter should have some impact on signal transmission).
- The size and form of different zones in the reservoir change to affect the ANN's functioning without training or corrections.
- The reservoir can have regulators defining the number and parameters of zones, which can be changed either externally (by the user or some control program; this models eating, physical actions, and psychology), or by some law (modelling the change of daytime, biorhythms, etc.), or from the current state of the nodes within the reservoir (modelling self-control and self-regulation). A combination of all these can also be used.
- The reservoir's size can change to house as many nodes as required, or to shrink if there is too much free space. This is required to implement evolution of the ANN.
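As a minimal sketch of the third point above, moving nodes towards "coherent" neighbors, here is one possible update step. The Pearson-correlation measure and the linear position update are my assumptions, not part of the concept itself.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length output traces."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = math.sqrt(sum((x - ma) ** 2 for x in a))
    vb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (va * vb) if va and vb else 0.0

def move_nodes(positions, traces, step=0.1):
    """One relocation step: each node drifts towards nodes whose recent
    output traces correlate positively with its own (and away from
    anti-correlated ones), scaled by the correlation strength."""
    new_positions = []
    for i, (x, y) in enumerate(positions):
        dx = dy = 0.0
        for j, (ox, oy) in enumerate(positions):
            if i == j:
                continue
            c = pearson(traces[i], traces[j])
            dx += c * (ox - x)
            dy += c * (oy - y)
        new_positions.append((x + step * dx, y + step * dy))
    return new_positions

positions = [(0.0, 0.0), (4.0, 0.0), (0.0, 5.0)]
traces = [[0.1, 0.9, 0.2, 0.8],   # node 0
          [0.2, 0.8, 0.1, 0.9],   # node 1: correlated with node 0
          [0.9, 0.1, 0.8, 0.2]]   # node 2: anti-correlated with both
positions = move_nodes(positions, traces)
```

After one step, the two correlated nodes end up closer to each other, while the anti-correlated node drifts away, which is exactly the clustering behaviour the bullet describes.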

Extension 1
The scheme above gives a general outline for dynamical creation and training of an ANN in a complex environment. This extension provides the idea for building a hierarchy within the network. There are several variants (which can be used simultaneously):

1. Embedded reservoirs. When a new node appears, it is placed in its own reservoir if it is located rather far from all other nodes. This reservoir can have its own zones and sub-reservoirs which are (partially) independent from the parent reservoir. Each reservoir can have only one parent, while each parent can have multiple child reservoirs. The decision whether a new reservoir should be created can be made by checking the minimal distance between the new node and the existing ones: if this distance is greater than some dist_critical, a new reservoir is created. For embedded reservoirs the critical distance can be reduced logarithmically with nesting depth, and can optionally depend on the reservoir's size.
2. United nodes. Since nodes with correlated outputs move towards each other, when the maximal distance between such nodes falls below some threshold, these nodes can be separated out by creating a new reservoir at their current location and placing them inside it. Again, for an embedded reservoir the threshold for uniting its nodes can be reduced on a logarithmic scale.
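The decision rule of the first variant might be sketched like this (dist_critical comes from the text above; the exact form of the logarithmic reduction with nesting depth is an assumption):

```python
import math

def critical_distance(base_dist, depth):
    """Critical distance at a given nesting depth: reduced logarithmically,
    so deeply embedded reservoirs split off sub-reservoirs more readily."""
    return base_dist / (1.0 + math.log(1 + depth))

def needs_new_reservoir(new_pos, node_positions, base_dist=10.0, depth=0):
    """A new node gets its own embedded reservoir when it lands farther
    than the critical distance from every existing node."""
    if not node_positions:
        return False  # the first node just settles in the current reservoir
    d_min = min(math.dist(new_pos, p) for p in node_positions)
    return d_min > critical_distance(base_dist, depth)

nodes = [(0.0, 0.0), (2.0, 1.0), (1.5, -0.5)]
print(needs_new_reservoir((3.0, 1.0), nodes))    # nearby: stays in place
print(needs_new_reservoir((40.0, 40.0), nodes))  # remote: new reservoir
```

With the logarithmic reduction, a reservoir nested two levels deep already has roughly half the base critical distance, so hierarchy formation accelerates with depth.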

The reservoirs described above can be used further as independent units: they can be treated as single nodes, copied/deleted, form their own reservoirs via the 2nd variant, etc.

Problem with meta-knowledge
I position this concept as an approach to data-independent learning which deals with different types of input data and can solve learning, inference, recognition, and prediction problems. But it is unknown how to make such a system decide which kind of data or task it faces. I.e., the system should somehow get to know about this variety of problems and data. In other words, the system should be able to form and extract meta-knowledge, and implementing this might be a big problem. I believe that checking this will require at least some experiments.

Monday, July 5, 2010

Some thoughts on recognizing systems

Here are some thoughts, just to organize my ideas for possible future research and to clarify some problems and questions. The general problem is building a classifier (recognizing) system which could evolve and learn to acquire new knowledge, hopefully from various sources. This is a well-known and very challenging problem, which has not been solved yet, though a lot of people have proposed their own solutions (like CEC, Phaeco, Adaptive Resonance Theory, ontologies, large-scale knowledge-based systems, biology-inspired approaches, etc.). They are all good to some extent, but they all also miss something, which makes them non-universal. I do not claim that I can propose my own solution (and quite possibly I'm not the person who will do this), but here are some thoughts and classification sketches which are useful to think about.

===========================================
All learning systems can be divided into 2 major classes:
1. Supervised - there is a "teacher" which tells the System what output it should yield on a given input.
2. Unsupervised - the System itself decides what to do with the input data, looking for possible regularities. After training, such Systems can be used either "as is" (like Kohonen's maps) or their output can be processed further, e.g. mapped into a class space, as in Echo-State Networks or Hawkins networks.

===========================================
Input data in general falls into (a combination of) the following types:
1. Permutation-invariant: if we swap some values in the object's description, the object's class remains the same. This is typical for image-analysis problems, where rotation or translation of the object in the image should not change the recognition result. This corresponds to the case of non-ordered feature vectors. Sometimes this type of data doesn't demand full-scale permutation invariance; for example, when recognizing hand-written text, its rotation or panning should not affect recognition, but swapping pixels at random can distort the input critically.
2. Not permutation-invariant. Just the opposite case, when the order of feature-vector components is significant, which is common in the traditional setting of the classification problem, speech recognition, time-series processing, etc.
3. Scale-invariant. This is a somewhat tricky thing. The input vector size remains the same, but its spatial and/or temporal resolution changes. Examples are recognition of letters and numbers from their images, or recognition of sped-up or slowed-down speech. A System able to deal with this type of data should somehow adapt to the characteristic scale of the input. In image processing this is sometimes achieved via the local maxima of the Laplacian-of-Gaussian or Difference-of-Gaussians.
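To make the scale-selection remark concrete, here is a crude 1-D sketch in the spirit of Difference-of-Gaussians scale selection (the discrete kernels, the border clamping, and the chosen scale grid are all my assumptions):

```python
import math

def gaussian_kernel(sigma):
    """Normalized discrete Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(3 * sigma))
    k = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def smooth(signal, sigma):
    """Convolve with a Gaussian, clamping indices at the borders."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(k):
            idx = min(max(i + j - r, 0), len(signal) - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out

def characteristic_scale(signal, sigmas):
    """Scale whose difference-of-Gaussians response is strongest at the
    signal's centre; a crude 1-D stand-in for LoG/DoG scale selection."""
    centre = len(signal) // 2
    best_sigma, best_resp = None, -1.0
    for s1, s2 in zip(sigmas, sigmas[1:]):
        dog = smooth(signal, s2)[centre] - smooth(signal, s1)[centre]
        if abs(dog) > best_resp:
            best_sigma, best_resp = s1, abs(dog)
    return best_sigma

# A blob of width ~8 centred in the signal.
signal = [1.0 if 28 <= i < 36 else 0.0 for i in range(64)]
print(characteristic_scale(signal, [1, 2, 4, 8, 16]))  # a middle scale, near the blob's half-width
```

The selected scale tracks the blob's size: doubling the blob's width would shift the strongest DoG response to a correspondingly larger sigma, which is the adaptation to the input's characteristic scale mentioned above.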

Note that input data can be either of types (1 and/or 3) or of type 2, which means that types (1+3) and 2 are mutually exclusive, and their simultaneous processing demands either different subsystems or a kind of switching (with adaptive recognition of what type of data is being presented at the input).

===========================================
Most modern recognizing systems are mere numerical windmills: they process numerical data using ad hoc and experimental assumptions like "if there is a significant change of variance, then the object under consideration changes its state" or "it's enough to use information about line orientations to recognize the image category". But (almost?) no system has a "semantic layer" which could consider interconnections between different classes and notions, organize them into sentences and statements, etc., and which could affect the result of recognition. I believe all researchers feel that we've got to use semantic information as an important element of the recognition process, but no good solution has been proposed yet. The problem here is twofold:
- it is extremely difficult to build such a system, because we do not have enough knowledge of what cognitive features are and how they can be modeled at full scale.
- most successful systems rely on mathematics --> we will (almost?) definitely use numbers. Yes, there are approaches which take the fuzziness of the data into consideration, but such systems are still strictly governed by numbers, which define their vital parameters.

The big question is whether we can numerically approximate human's brain (or whatever is used for thinking, storing memories, setting goals, wishes, creating emotions and so on). There've been a lot of arguments on this theme, but the answer is unknown.

===========================================
I believe the importance of multi-layered recognizing systems has been acknowledged over the past several years. There are "deep-learning" neural network architectures (like those used by Yann LeCun and Geoffrey Hinton), Hawkins networks; many good systems have a multi-component organization with separate stages of processing and recognition. However, such systems are still non-universal in terms of input data types (see the text above) and problems solved, and the really important questions are still unanswered:
- How many layers should be used for different problems? And how many different problems exist, from the point of view of the number of layers?
- How these layers should be formed?
- What components they should include?
- Is there a set of universal components (like nucleotides in DNA or subatomic particles)? And how finely are they "granulated" (how small are they)?
- How should we connect different layers?
- How to organize "natural" hierarchy so that a multi-layered architecture could be considered as a single layer?
- In the brain, spatial relations and bio-chemical and physiological processes play a very important role. How can we utilize our knowledge about them?
- How can we make such multi-layered systems in a fully automatic unified manner?

Since there is no system or approach which solves these problems automatically, we all have "hands to kiss and babies to shakes" ((c) by LeChuck), let's work :)

Sunday, April 4, 2010

Brain-drain: Is there a problem?

There is growing worry in Russia concerning brain-drain. After the iron curtain fell, a lot of scientists from Russia and the former Soviet republics went abroad to work and live. This weakened Russian science a lot and, as a result, has had a great impact on the current state and future of scientific research in Russia. There has been a lot of talk about brain-drain and how to avoid it; a drastic measure was even suggested, to oblige all scientists going abroad to pay a penalty to the government if fewer than 10 years had passed since their graduation, justified by the fact that education in Russia is free and thus someone who obtained it should work for the Russian government. Last year there were serious discussions about how to bring scientists back, which led to a rather contradictory and shocking letter from those scientists demanding privileged conditions and salaries compared with "local" scientists. For now, a programme for collaboration between "local" and "drained" scientists has been launched, involving governmental grant support and some special conditions.

But is there a real problem? On the one hand, yes, there is, because people go away, and among them many renowned scientists. And there is a lot of criticism of the government for the poor conditions and financing of science in Russia, the very high level of corruption, the low prestige of the scientific profession, etc.

But what I have recently been thinking about is somewhat different. There have been a lot of talks and actions during the last 10-15 years about integration into the world scientific society and its processes. This means that researchers travel around the world, working in different laboratories and institutions to share experience and knowledge through postdoc and fellowship programmes, and one of the countries involved in this should be Russia. And here we come to the interesting point. I haven't heard of foreign scientists with PhDs coming to a Russian institution for a long period as postdocs or fellows. It's like one-sided diffusion: scientists from Russia go abroad, but no one comes in their place from a foreign country. And this is the real problem.

Just imagine a football or hockey team which loses its players to other teams but does not buy other players, simply trying to grow new ones from the juniors. This is silly. In fact, many promising hockey and football teams are sponsored by city and regional administrations and by business, and are able to sign a good contract with a solid foreign player, to the joy of fans and club management. This is good for sports and ... there are no complaints about sportsmen drain. And nothing like that is known about science in Russia. This is very interesting...

So why don't foreign scientists come to Russia for a long-term period (and maybe for their entire life)? I think the main problems are inappropriate salary, bureaucracy, and corruption, and until scientists in Russia have good salaries and work conditions (just like many Russian sportsmen do), no one will come from abroad to substitute a "drained" researcher (just like many foreign sportsmen do). We hear a lot about making innovations and building innovative business in Russia; many SMEs are being started on the basis of scientific results and achievements. Many of them will perish, but hopefully some will stay and continue scientific research to demonstrate to government and business that investing real money in science, and especially in fundamental science, is not only noble but also profitable (just like sports). Unless greed and corruption spoil everything...

Wednesday, March 24, 2010

Old Rutherford was right...

Today I went to the post office to fax an agreement for funding my research by the Human Capital Foundation (I happened to win a grant for this year, here (in Russian)). The woman who was faxing the agreement saw either the grant sum or just that it was a funding agreement, and said (very quietly, but I could hear it!) something about "raskulachit'" me. This is a word from the October 1917 revolution; it means that I am a bad and mean person because I earn too much money while other people live hard, and thus some good and honest people should take my money away and beat me. But I'm a scientist in Russia; I cannot be rich, it's kind of unnatural here.

At first I wanted to tell her everything I thought about this garbage of hers: that it was honest money and I had worked hard for it, that she thinks so mostly because I'm Korean (nationalism in Russia has spread widely during the last two decades, and this is really annoying), that even Robin Hood would not agree with her, etc. But then it came into my mind that it wasn't really her fault from the very start (although in no way do I approve of her); it's those brainwashing politicians who:
 (1) Keep saying that Russia has enemies everywhere and that the whole world wants to bring Russia to its knees. It's a rather popular tool (or should I say weapon?): when some politician needs something, he says that this something will make Russia stronger, and when he doesn't, he simply says that our enemies want this something. It's ridiculous, but it works and is applied very often. And the typical enemies for the casual citizen are either some non-Russian bastard who is more prosperous than he is, or a mean immigrant from a 3rd-world country. Communism's legacy enforced by snobbery. Somehow it reminds me of Jehovah's Witnesses, who think that this world is hostile towards them because it belongs to Satan.
 (2) Try to make people unlearn thinking on their own (it's enough to watch any newscast or analytical weekend program on the major channels to understand this). The motivation is simple: someone who thinks independently is not that easy to control. And here is a devilish contradiction with (1): a strong country consists of strong people, not of a bunch of zombies.
And that woman just thought so because she was "forced" to.

Anyway, I'm still angry with that faxing woman... (oops, that was on the verge of foul ;)). At first I wanted to start with a story about Ernest Rutherford, who had the following policy: when a new assistant applied for a job in Rutherford's lab, Rutherford gave him a research task. If, having completed the task, the assistant came to ask for the next one, Rutherford fired him, because to become a good scientist one should start analyzing, thinking, and understanding on one's own what should be done next. This is essential for the process of scientific maturing. And I planned to make some sort of a good story about not being a bad assistant. But that faxing woman...

Tuesday, March 23, 2010

Instinct science

One of the last scenes in "Indiana Jones and the Last Crusade" shows Indiana Jones hanging over a huge crack in the earth and trying to reach the Holy Grail. His father is trying to stop him, but Indy keeps saying "I can reach it", hypnotized by the artifact. Fortunately he is stopped, and everybody gets out of the temple and rides into the sunset.

I think every person has felt something like this at least once. It's like a hunter's instinct: you see prey and you know that you can make it; all you need is to give it a try. The same thing often happens in science. It came into my mind this morning, when I woke up early and went directly to the computer to continue the research which had kept me up until 1 am, because some good results had started to appear. I believe that without this hunter's instinct, when you commit yourself to your goal and then try hard to reach it, it is impossible to do good science (nor is it possible to achieve almost anything in other activities). Was this instinct produced by evolution, or is it a social phenomenon? To me the former seems more likely. Thanks, evolution!

Friday, March 19, 2010

Why I have doubts about neuroevolution and possible way out

Recent thoughts and readings have made me doubt the promises of neuroevolution. I still believe that it is a really good research domain and a genuinely convenient approach for many problems which are complicated to solve using traditional methods (board-game playing, adaptive behavior, robotic control in complex situations, artificial music and art, etc.). But the basis of my doubts is the randomness of evolutionary search, which yields solutions with poorly predictable structure and features, without any "physically" clear explanation of why they work, and thus not very reliable when it comes to practice (nobody wants to risk much, you know).

Consider the classification problem in supervised learning. A feasible approach is to extract regularities from the training data and use problem-dependent priors to create a structure for the classifier (here I mean what parts this classifier has and how they are tuned and interconnected). This approach can be probabilistic, since priors often come from statistics and from that somewhat tricky thing called "experience", and hence the solution can vary somewhat when the approach is restarted. One can even say that the result is in some way smart. But anyway, the classifier's structure is guided by knowledge.

Now look at the NE approach. We have (preprocessed) training data, and an NE algorithm is run to minimize the error function. The resulting structure of the neural network will indeed have some reasons to be what it is (otherwise it would be unable to perform well). But I bet that in most cases it still has some unnecessary elements, while some other useful elements and parts are missing. And there can also be some room for optimization of the connection weights.

This can be illustrated by building a house for the purpose of living in it (yes, just this general). A traditional architect makes a plan, does all the calculations, etc., and builds the house, and the house's structure is guided by knowledge of how to build properly. An evolutionary architect just builds the house up through trial and error until something suitable is obtained. Now imagine that our evolutionary architect is able to build any house (not only choosing, say, the number of floors and the roof type). There is a rather high probability that this house will be … um, very original and non-standard, not to say ridiculous (for example, having lots of oddly shaped windows, which may be good for design but bad for convenience). So the evolved structure of the house is random. One can say that we can guide evolution through additional restrictions and penalties. But these restrictions cannot be very strict, since the purpose is very general (just like "minimize the error"); moreover, it is almost impossible to account for all of them (because that can lead to a rugged objective function). Thus the space of possible houses is still ve-e-ery large and mostly consists of such "designer" houses, which are still good enough for living in. Well, at least for now I'm pretty sure that things are like this, and that the same holds for evolving neural networks.

Is this bad? Yeah. But… I believe that it is possible to involve self-organization principles, like those which lead to the emergence of scale-free networks, together with some rules of thumb, to make NE algorithms behave better and produce more feasible solutions. And this is what I'm going to do for the next several years.