Reinforcement Learning Research Call

«Reinforcement Learning» is not only a fashionable trend (see, for example, https://www.ias.informatik.tu-darmstadt.de/Main/OpenPositions). It is also one of the main ways of learning in everyday life («learning from one's own experience», the «trial and error method»). Here the input data have a complicated structure and consist of both known and unknown components (a semi-parametric setting).
«Reinforcement Learning», like «Machine Learning» in general, implicitly relies on a number of assumptions that are accepted without proof, that is, axioms.
Axiom 1 (unity): The laws of nature are the same for all living and non-living objects.
Axiom 2 (meaningfulness): All information is semantic, that is, filled with meaning.
Axiom 3 (complementarity): Learning takes place as the result of combining internal guesses (insight) with external cues (reinforcement).
Axiom 4 (prototyping): There is a common universal language underlying all types of data. The only real contender for the role of such a language is mathematics.
Axiom 5 (evolution): All learning proceeds in an evolutionary way, from simple to complex.
Stating the above axioms explicitly allows us to formulate the following problems:
Problem A. «Creating a consistent semantic information theory that satisfies Axioms 1 and 2 (unity and meaningfulness)».
A possible solution to this problem is the work of Artemy Kolchinsky and David Wolpert, «Semantic Information, Autonomous Agency and Non-Equilibrium Statistical Physics», published by the Royal Society on October 19, 2018, which reviewers have described as revolutionary.
Problem B. «Searching for a general algorithm for the systemic interaction of insight with reinforcement that satisfies Axiom 3 (complementarity)».
Problem C. «Discovering learning patterns that correspond to the natural sequence of stages in the development of mathematical thinking (satisfying Axioms 4 and 5, prototyping and evolution)».
Possible research directions.
To problem A: «Checking the results of the work by Artemy Kolchinsky and David Wolpert using the methods of experimental mathematics (www.thefullwiki.org/Experimental_mathematics)».
To problem B: «Mathematical modelling using the techniques of coaching (http://en.wikipedia.org/wiki/Coaching) and of inventive problem solving (https://en.wikipedia.org/wiki/TRIZ)».
To problem C: «Forming an evolutionary chain of learning techniques that follows the sequence of mathematical thinking "Topological–Ordinal–Metric–Algebraic–Projective"».
The choice of a specific research direction as the priority will be made after discussion.

Dr. Igor Skryagin,
Cognitive Ethology Expert

E-mail: iskryagin@yandex.ru

