Motivation is a really tricky thing. Sometimes it seems that it's motivated people who keep the Earth rolling (not Ralf Ringer shoes :)). Recently I've been thinking about a motivation system for under- and post-graduate students, in order to improve their results and gently push them towards self-study.
To tell the truth, I was inspired by Steve McConnell's book "Professional Software Development", where he describes the professional-growth programme used at the Construx company. It gives a quite clear picture of professional growth based upon the accumulating experience of a SW developer (side remark: funny, I've just noticed that SW stands for both 'Software' and 'Star Wars' :)), and it was created to motivate self-study, professionalism in general, communication etc.
So here is a sketch of what came to my mind. But in the first place, some axioms which I've tried to follow:
- personal relations come first.
- the system should benefit the group as a whole.
- reasonable formalism -- there should be clear rules, but these rules should not constrain one's scientific growth and trajectory.
- the motivation system should not lead to envy, misjudging, grievances etc. (hopefully).
...a-a-and here it comes:
==============================================
A Student of the Month title is established, accompanied by a humble reward. The winner is determined by the following measures (how the final decision is calculated is yet to be thought through; a rough scoring sketch follows the remarks below):
- Defense of the thesis.
- Passed exam.
- Personal grant/scholarship obtained.
- Participation in the process of winning a group grant or contract (helping with the application, doing research for a scientific reserve/groundwork, reviews etc.).
- Awards in competitions.
- Receiving professional certificates.
- Documents which acknowledge the usage of scientific results in real systems.
- Papers published in journals and conference proceedings.
- Oral and poster reports given at conferences.
- Reports at internal group seminars.
- Scientific service: organizing conferences and workshops, reviewing for journals.
- Source code of algorithms and modules which can be used by other members of the group (with short how-to documentation/a manual). Open source is highly encouraged.
- Scientific web publications (blog posts, a personal website etc.).
- Attracting new members to the group.
- Various proposals to improve the group's organization and workflow.
- Transition to the next Level (see below).
- Marriage, the birth of a child and so on.
Attributes (to be thought through further):
- Challenge cup
- Hat/Crown/Helmet of the Winner.
- (something else).
Remarks:
- The winner is defined at the beginning of each month.
- There might be no winner at all.
- If an uncertain situation arises, the group members decide the winner themselves (via open or secret voting or something like that).
- Think about Student of the Year (with extended reward) and Student of the Half-Year.
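Here is a minimal sketch of how the final decision could be computed. All point values and achievement names below are hypothetical placeholders, not decided weights:

# Hypothetical point values per achievement type; the real weights are yet to be thought about.
POINTS = {
    "thesis_defense": 50,
    "level_transition": 25,
    "journal_paper": 20,
    "grant_participation": 15,
    "conference_talk": 10,
    "seminar_report": 5,
}

def student_of_the_month(records):
    """records maps a student's name to a list of achievement kinds for the month.
    Returns the winner's name, or None (there might be no winner at all)."""
    scores = {name: sum(POINTS.get(kind, 0) for kind in kinds)
              for name, kinds in records.items()}
    best = max(scores.values(), default=0)
    winners = [name for name, s in scores.items() if s == best and s > 0]
    # A single clear leader wins; a tie or an empty month falls back to a group vote.
    return winners[0] if len(winners) == 1 else None

# Example:
# student_of_the_month({"Alice": ["journal_paper", "seminar_report"],
#                       "Bob": ["conference_talk"]})   # -> "Alice"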
==============================================
Levels are designed to show one's progress and scientific growth.
Level 1. Entry level. Very little experience in scientific research, writing, programming etc.
Level 2. Little programming experience in the major domain. Knowledge of the essential books (2-3) and some publications related to the thesis. A short survey on the thesis topic.
Level 3. Average programming experience in the major domain. Quite good knowledge of the books (5-6) and publications related to the thesis. Knowledge of the essential books (2-3) on neighboring themes. Good knowledge of publications made by the leading specialists during the last 3 years. The emergence of a second major. 1-2 written thesis chapters or 2-3 good journal papers. Scientific consulting/joint research with a junior student.
Level 4. Programming experience in the major domain significantly above average. Very good knowledge of the books (9-10) and publications related to the thesis. Good knowledge of the books (5-6) on neighboring majors. Good knowledge of publications made by the leading specialists during the last 5 years. The emergence of a second major. A thesis draft, or 4-5 good journal papers, or participation in writing a joint monograph.
Level 5. Ready for independent scientific work. Defended PhD thesis. Great knowledge of the books (14-15) and publications related to the thesis. Good knowledge of the books (6-7) on neighboring majors. Very good knowledge of publications made by the leading specialists in the major during the last 5 years, and good knowledge of publications by the leading specialists in the neighboring majors during the last 3 years. Reasonably good command of the second specialization. An own published monograph.
Notes:
- levels are necessary for writing applications for grants and projects. A student can attract other group students only if he or she has level 3 or higher. Besides, only students with a lower or equal level can be considered as collaborators. Only students at level 5 can invite PhDs as collaborators. I hope that this rule will push students to advance to higher levels in order to run their own projects. (A small sketch of this rule follows the notes below.)
- transition to a level can be made even if some desirable achievement is missing (for example, level 5 can be reached without defending a PhD).
- to get credit for reading a book, the student should give a report at the group seminar which states what information from the book, and in what way, can be used to reach the dissertation goals. Some analysis of the approach should be made, and it's quite desirable to have a discussion on the theme. Short excuse-phrases like "I'd like to use method X, I hope it'll work" should not score in any way; there should be some reasoning behind it.
- a student who only reads books and doesn't do anything else should not be promoted to the next level. Knowledge is good only when combined with practice.
- a book doesn't have to be scientific in the first place. It's enough that the book contains useful ideas or approaches which can be used for the dissertation (good examples: biographies of well-known people, books which develop imagination, like 'Alice in Wonderland', etc.).
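As an illustration, the collaboration rule from the first note could be checked mechanically. A minimal sketch, where the function name and the is_phd flag are my own assumptions:

def can_invite(inviter_level, invitee_level, invitee_is_phd=False):
    """Check the collaboration rules sketched in the notes above.
    - Only a student at level 3 or higher can attract other group students.
    - A collaborator must be at a lower or equal level.
    - Only a level-5 student can invite PhDs."""
    if invitee_is_phd:
        return inviter_level == 5
    return inviter_level >= 3 and invitee_level <= inviter_level

# Example: can_invite(3, 2) -> True; can_invite(2, 1) -> False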
Wednesday, July 7, 2010
Reservoir neural network (a concept)
Intro
Sometimes rather interesting ideas come into my mind; I give myself a promise to implement them and then put them aside, so that they just pile up in the back of my memory. Some details can disappear with time, hence I think it's worth writing these ideas down somewhere, so that later they could be useful for me or somebody else.
Description
Goal: General-purpose learning with self-adaptation of the artificial neural network’s (ANN) structure and parameters.
Idea: The idea is inspired by the influence of biochemical reactions and spatial relations on the brain's functioning, which is not considered in most known models of ANNs. To implement this, the ANN is placed inside an expandable virtual 2D or 3D reservoir which can have different zones affecting signal transmission, the activation of nodes, learning rates etc.
The network's structure and the reservoir parameters change over time in the following ways (a minimal code sketch of these dynamics follows the list):
- New nodes and connections can appear.
- Some nodes and connections can be removed dynamically (to implement forgetting and/or to give other nodes more space to function).
- Nodes can change their location, moving towards "coherent" nodes, so that nodes with correlated outputs would tend to be located closer together, organizing structures.
- Connections can change their weight and length (the latter should have some impact on signal transmission).
- The size and form of different zones in the reservoir change, affecting the ANN's functioning without training or corrections.
- The reservoir can have regulators defining the number and parameters of the zones. These can be changed externally (by a user or some control program; this models eating, physical actions and psychological influences), by some law (this models the day cycle, biorhythms etc.), or from the current state of the nodes within the reservoir (this models self-control and self-regulation), or a combination of all of these can be used.
- The reservoir's size can change to house as many nodes as required, or shrink if there's too much free space. This is required to implement the evolution of the ANN.
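A minimal sketch of the node-movement rule above. The class names, the output-correlation measure of "coherence" and the step size are all my own assumptions, not a definitive implementation:

import numpy as np

class Node:
    def __init__(self, position):
        self.position = np.asarray(position, dtype=float)  # 2D or 3D coordinates
        self.outputs = []                                  # recent output history

def coherence(a, b):
    """Correlation of recent outputs; positively correlated nodes attract."""
    n = min(len(a.outputs), len(b.outputs))
    if n < 2:
        return 0.0
    c = np.corrcoef(a.outputs[-n:], b.outputs[-n:])[0, 1]
    return 0.0 if np.isnan(c) else float(c)

def move_nodes(nodes, step=0.05):
    """Shift each node towards the nodes it is 'coherent' with."""
    shifts = []
    for a in nodes:
        shift = np.zeros_like(a.position)
        for b in nodes:
            if b is not a:
                c = coherence(a, b)
                if c > 0:
                    shift += c * (b.position - a.position)
        shifts.append(shift)
    # Apply all shifts at once, so the update is synchronous across nodes.
    for node, shift in zip(nodes, shifts):
        node.position += step * shift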
Extension 1
The scheme above gives a general outline for the dynamic creation and training of an ANN in a complex environment. This extension provides an idea for building a hierarchy within the network. There are several variants (which can be used simultaneously):
1. Embedded reservoirs. When a new node appears, it is placed in its own reservoir if it is located rather far from all other nodes. This reservoir can have its own zones and sub-reservoirs, which are (partially) independent from the parent reservoir. Each reservoir can have only one parent, while each parent can have multiple child reservoirs. The decision whether a new reservoir should be created can be made by judging the minimal distance between the new node and the existing ones: if this distance is greater than some dist_critical, a new reservoir is created. For embedded reservoirs the critical distance can be reduced logarithmically with the nesting depth, and can optionally depend upon the reservoir's size. (A sketch of this rule follows below.)
2. United nodes. As nodes with correlated outputs move towards each other, once the maximal distance between such nodes falls below some threshold, these nodes can be separated out by creating a new reservoir at their current location and placing them inside it. Again, for an embedded reservoir the threshold for uniting its nodes can be reduced on a logarithmic scale.
The reservoirs described above can be used further as independent units, i.e. they can be treated as single nodes, be copied or deleted, form their own reservoirs via the 2nd variant, etc.
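A sketch of the decision rule in variant 1. dist_critical comes from the text; the exact form of the logarithmic reduction with nesting depth is only one plausible reading:

import math
import numpy as np

DIST_CRITICAL = 1.0  # base critical distance at the top-level reservoir (assumed)

def critical_distance(depth):
    """Reduce the critical distance logarithmically with nesting depth.
    Depth 0 gives DIST_CRITICAL; deeper reservoirs get smaller thresholds."""
    return DIST_CRITICAL / math.log(depth + math.e)

def needs_new_reservoir(new_pos, existing_positions, depth=0):
    """A node far from all existing nodes is wrapped in its own child reservoir."""
    if not existing_positions:
        return False
    new_pos = np.asarray(new_pos, dtype=float)
    min_dist = min(np.linalg.norm(new_pos - np.asarray(p, dtype=float))
                   for p in existing_positions)
    return min_dist > critical_distance(depth)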
Problem with meta-knowledge
I position this concept as an approach to implement data-independent learning, which deals with different types of input data and can solve learning, inference, recognition and prediction problems. But it's unknown how to make such a system decide which kind of data or task it faces, i.e. the system should somehow get to know about this variety of problems and data. In other words, the system should be able to form and extract meta-knowledge, and the implementation of this might be a big problem. I believe that checking this would require at least some experimentation.
Monday, July 5, 2010
Some thoughts on recognizing systems
Here are some thoughts, just to organize my ideas for possible future research and to clarify some problems and questions. The general problem is building a classifying (recognizing) system which could evolve and learn to gain new knowledge, hopefully from various sources. This is a well-known and very challenging problem which is not solved yet, though a lot of people have proposed their own solutions (like CEC, Phaeco, Adaptive Resonance Theory, ontologies, large-scale knowledge-based systems, biology-inspired approaches etc.). They are all good to some extent, but they also all miss something, which makes them non-universal. I do not claim that I can propose my own solution (and quite possibly I'm not the person who will), but here are some thoughts and classification sketches which are useful to think about.
===========================================
All learning systems can be divided into 2 major classes:
1. Supervised - there is a "teacher" which tells the System what output it should yield on a given input.
2. Unsupervised - the System itself decides what to do with the input data, looking for possible regularities. After training, such Systems can be used either "as is" (like Kohonen's maps) or their output can be further processed, e.g. mapped into a class space, as in Echo State Networks or Hawkins' networks.
===========================================
Input data in general falls into (a combination of) the following types:
1. Permutation-invariant, i.e. if we swap some values in an object's description, the object's class remains the same. This is typical for image-analysis problems, where rotating or moving the object in the image should not change the recognition result. This corresponds to the case of non-ordered feature vectors. Sometimes this type of data doesn't demand full-scale permutation invariance: for example, when recognizing hand-written text, its rotation or panning should not affect the recognition, but swapping pixels at random can distort the input critically.
2. Not permutation-invariant. Just the opposite case, when the order of the feature-vector components is significant, which is often met in the traditional setting of the classification problem, speech recognition, time-series processing etc.
3. Scale-invariant. This is a somewhat tricky thing. The input vector size remains the same, but its spatial and/or temporal resolution changes. Examples are the recognition of letters and numbers from their images, or the recognition of sped-up or slowed-down speech. A System able to deal with this type of data should somehow adapt to the characteristic scale of the input. In image processing this is sometimes achieved via the local maxima of the Laplacian-of-Gaussian or Difference-of-Gaussians responses (a small sketch follows this list).
Note that input data can be of the (1st and/or 3rd) type or of the 2nd type, which means that types (1+3) and 2 are exclusive, and their simultaneous processing demands either different subsystems or a kind of switching (with adaptive recognition of what type of data is being fed to the input).
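A toy sketch of the Difference-of-Gaussians idea mentioned in point 3: with a constant ratio between successive sigmas, the sigma at which the DoG response peaks estimates the characteristic scale. The sigma grid and the centre-pixel probe are arbitrary assumptions of mine:

import numpy as np
from scipy.ndimage import gaussian_filter

def characteristic_scale(image, sigmas=(1, 2, 4, 8, 16)):
    """Estimate the characteristic scale of an image via DoG maxima.
    Blurs the image at several sigmas (constant ratio, so the DoG roughly
    approximates the scale-normalized Laplacian-of-Gaussian), takes the
    differences of successive blurs and returns the sigma whose response
    at the image centre is strongest."""
    blurred = [gaussian_filter(image.astype(float), s) for s in sigmas]
    centre = tuple(d // 2 for d in image.shape)
    responses = [abs((blurred[i + 1] - blurred[i])[centre])
                 for i in range(len(sigmas) - 1)]
    return sigmas[int(np.argmax(responses))]

# Example: a Gaussian blob of width 5 yields a scale of the same order.
# img = np.exp(-((np.indices((64, 64)) - 32) ** 2).sum(0) / (2 * 5.0 ** 2))
# characteristic_scale(img)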
===========================================
Most modern recognizing systems are mere numerical windmills: they process numerical data using ad hoc and experimental assumptions like "if there is a significant change of variance, then the object under consideration has changed its state", or "it's enough to utilize information about line orientations to recognize the image category". But (almost?) no system has a "semantic layer" which could consider the interconnections between different classes and notions, organize them into sentences and statements etc., and which could affect the result of recognition. I believe that all researchers feel that we've got to use semantic information as an important element of the recognition process, but no good solution has been proposed yet. The problem here is twofold:
- it's extremely difficult to build such a system, because we do not have enough knowledge of what cognitive features are and how they can be modeled at full scale.
- most successful systems rely on mathematics, so we will (almost?) definitely use numbers. Yes, there are approaches which take the fuzziness of the data into consideration, but such systems are still strictly governed by the numbers which define their vital parameters.
The big question is whether we can numerically approximate the human brain (or whatever is used for thinking, storing memories, setting goals and wishes, creating emotions and so on). There have been a lot of arguments on this theme, but the answer is unknown.
===========================================
I believe that the importance of multi-layered recognizing systems has been acknowledged over the past several years. There are "deep-learning" neural-network architectures (like those used by Yann LeCun and Geoffrey Hinton) and Hawkins' networks, and many good systems have a multi-component organization with separate stages of processing and recognition. However, such systems are still non-universal in terms of input data types (see the text above) and the problems solved, and the really important questions are still unanswered:
- How many layers should be used for different problems? And how many different problems exist, from the point of view of the number of layers?
- How should these layers be formed?
- What components should they include?
- Is there a set of universal components (like nucleotides in DNA or subatomic particles)? And how finely are they "granulated" (how small are they)?
- How should we connect different layers?
- How can we organize a "natural" hierarchy, so that a multi-layered architecture could be considered as a single layer?
- In the brain, spatial relations and biochemical and physiological processes play a very important role. How can we utilize our knowledge about them?
- How can we make such multi-layered systems in a fully automatic, unified manner?
Since there is no system or approach which solves these problems automatically, we all have "hands to kiss and babies to shake" ((c) LeChuck), so let's work :)