Science and Engineering of Machine Intelligence

Intelligence is an extremely complicated concept, and understanding it with a view to creating artificial intelligence is one of the most important and interesting problems in science and technology today. Approaching this problem requires the interaction of a wide spectrum of disciplines, spanning all the way from neuroscience and cognitive science to mathematics and engineering. It is probable that progress on understanding and modelling learning will enhance our understanding of intelligence, both natural and artificial, and lead to better learning strategies, and thus to more intelligent machines.

In recent years (after about 2006), a machine learning technique called “Deep Learning”, based on multilayer neural networks, has driven ground-breaking empirical successes in many Artificial Intelligence (AI) applications: important examples are image categorization, face identification, speech recognition, language processing, and game-playing. It has become a principal tool in machine learning, outperforming “classical techniques” one after another. Many areas of research in Deep Learning are nonetheless still in their infancy, often relying more or less entirely on ad-hoc experimentation. A sound theoretical foundation for understanding the effectiveness of these techniques is desirable, in particular because it would probably lead to substantial performance gains. Such a foundation will necessarily emerge from studying successful empirical applications from a mathematical point of view. Collaboration between engineers and mathematicians is therefore a natural and powerful alliance, driving an interplay between research in Deep Learning from the foundational, deductive, mathematical viewpoint and the engineering viewpoint emphasizing application performance.

Classical machine learning (before 2006) used hand-crafted features, a so-called “data representation”. Such features are hard to build, as they require comprehensive domain knowledge. Deep Learning ultimately aims to build features automatically in a hierarchical way, providing hierarchical data representations. This is a huge step forward for data-driven representation, though the interpretation and understanding of these automatically generated hierarchical features remain unclear.

Learning algorithms have two extreme cases: supervised learning and unsupervised learning. In supervised learning the learner has target labels/annotations for all inputs in the data-set it is learning from, and hence knows what the response should be. Based on its current internal state, the learner responds to a given input from the data-set and adjusts its internal state to improve its response to further input. In unsupervised learning, the learner receives no feedback. Data are grouped based on similarities, so the data representation and the similarity measure are of profound importance. At present the most successful AI applications use supervised learning, and their performance is far superior to that of unsupervised techniques. Supervised learning needs huge quantities of labelled data to be effective; the more data the better. These huge datasets are, however, very costly to produce because of the large amount of human labour needed in the labelling process. On the other hand, it is well known that humans can learn from just a few examples, indicating that biological learning does not need huge amounts of labelled data. Thus, it appears that there is both a need for, and a path towards, the use of smaller amounts of labelled data. Within the last two years some very promising techniques have emerged supporting this point of view.
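The contrast between the two extremes can be made concrete with a toy numpy sketch (purely illustrative, not one of the methods proposed here): with labels, a classifier's class centroids can be computed directly; without labels, k-means must discover the same grouping from a similarity measure (Euclidean distance) alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two well-separated Gaussian blobs in 2-D.
a = rng.normal(loc=(0.0, 0.0), scale=0.5, size=(50, 2))
b = rng.normal(loc=(3.0, 3.0), scale=0.5, size=(50, 2))
X = np.vstack([a, b])
y = np.array([0] * 50 + [1] * 50)   # labels, used only by the supervised learner

# Supervised: with labels we can compute one centroid per class directly.
centroids_sup = np.array([X[y == k].mean(axis=0) for k in (0, 1)])

# Unsupervised: without labels, k-means must recover the grouping purely
# from distances between points (the similarity measure).
centers = X[rng.choice(len(X), size=2, replace=False)]
for _ in range(20):
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    assign = d.argmin(axis=1)
    centers = np.array([X[assign == k].mean(axis=0) for k in (0, 1)])

print("supervised centroids:\n", centroids_sup)
print("k-means centers:\n", centers)
```

On this easy toy problem the unlabelled method recovers essentially the same two centroids; the point of the research described below is that on hard real-world problems this gap is still large, and labels remain expensive.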

The research to be undertaken in this research activity will take up the challenge of moving empirical state-of-the-art Deep Learning beyond the massive use of labelled data, without compromising learning performance. The research will be an interplay between research in Deep Learning from the point of view of application performance (engineering) and research from the viewpoint of foundations and deduction (mathematics).

The research undertaken in this research activity will focus on four areas: Deep Reinforcement Learning, Generative Adversarial Networks, Manifold Learning, and Data Augmentation. All of these techniques operate without labelled data and without comprehensive domain knowledge.

  1. In Deep Reinforcement Learning (DRL), the learner iteratively chooses an action and thereby moves from one state to a new state. Each action has an associated reward, and the overall goal for the learner is to maximize the cumulative reward, without the need for target labels. In 2016, reinforcement learning was combined with deep networks, resulting in DRL with super-human performance on very complex problems. DRL will be used on problems such as automated driving and autonomous robotics, and probably many others.
  2. Generative Adversarial Networks (GANs) are systems consisting of one network that generates new (fake) data and another that tries to discriminate between fake and real data. The two networks are trained together without labelled target data. Once trained, the discriminator is a classifier obtained in an unsupervised manner. In the WP, GANs will be used to gain a deeper understanding of Deep Learning’s hierarchical features and will be applied to problems such as recovering 3D structure from a 2D image and generalizing from small labelled datasets, and probably many others.
  3. Deep Learning provides hierarchical data representations, embedding data points in a high-dimensional space. It is conjectured that real-world data actually cluster around a submanifold (subset) of much lower dimension than the ambient high-dimensional space. Manifold Learning comprises the methods for estimating geometric and topological properties of this submanifold.
  4. With access to a data source, more examples of similar data can be constructed to use as input for further analysis. This method is called Data Augmentation. Mathematically, this can be regarded as interpolation. In mathematics in general, and numerical analysis in particular, interpolation has been an important tool for many years, and there is a large body of knowledge, often relating to the “geometry” of the data. We wish to investigate how it may be put to new use for the purpose of data augmentation, to move Deep Learning beyond the massive use of labelled data.
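The reward-maximization idea behind item 1 can be illustrated in its simplest, tabular form: Q-learning on a tiny chain environment. (The environment, parameters, and episode count below are invented for illustration only; the “deep” variant replaces the table `Q` with a neural network.)

```python
import numpy as np

rng = np.random.default_rng(1)

# A tiny chain environment: states 0..4, actions 0 (left) / 1 (right).
# Reaching state 4 yields reward 1 and ends the episode; all other steps give 0.
N_STATES, N_ACTIONS, GOAL = 5, 2, 4

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, float(s2 == GOAL), s2 == GOAL    # next state, reward, done

Q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma, eps = 0.5, 0.9, 0.2               # learning rate, discount, exploration

for _ in range(500):                            # episodes, each starting at state 0
    s, done = 0, False
    while not done:
        # epsilon-greedy: mostly exploit the current table, sometimes explore
        a = rng.integers(N_ACTIONS) if rng.random() < eps else int(Q[s].argmax())
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s,a) toward reward plus discounted best future value
        Q[s, a] += alpha * (r + gamma * Q[s2].max() * (not done) - Q[s, a])
        s = s2

# The learned greedy policy: the chosen action in each non-goal state.
print([int(Q[s].argmax()) for s in range(GOAL)])
```

Note that no target labels appear anywhere: the only training signal is the scalar reward, and the learned policy (walking right toward the goal) emerges from maximizing its discounted sum.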
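The submanifold conjecture of item 3 can be illustrated in the linear special case using PCA on synthetic data: points lying near a 2-D plane embedded in 10-D space, where the eigenvalue spectrum of the covariance reveals the intrinsic dimension. (A toy sketch with invented data; practical manifold-learning methods must additionally handle curved submanifolds.)

```python
import numpy as np

rng = np.random.default_rng(2)

# 500 points that live on a 2-D plane embedded in 10-D space, plus small noise:
# a toy stand-in for real data concentrating near a low-dimensional submanifold.
latent = rng.normal(size=(500, 2))              # intrinsic 2-D coordinates
embed = rng.normal(size=(2, 10))                # random linear embedding into 10-D
X = latent @ embed + 0.01 * rng.normal(size=(500, 10))

# PCA: eigenvalues of the covariance show how many directions carry variance.
Xc = X - X.mean(axis=0)
eigvals = np.linalg.eigvalsh(Xc.T @ Xc / len(X))[::-1]   # descending order
explained = eigvals / eigvals.sum()

# Nearly all variance sits in the first two components: intrinsic dimension 2.
print(np.round(explained[:4], 3))
```

Estimating such geometric properties (here, dimension; more generally curvature and topology) without labels is exactly what makes Manifold Learning relevant to this research activity.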
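Item 4’s view of augmentation as interpolation can be sketched with mixup-style convex combinations of pairs of examples and their labels, a technique known from the literature; the code below is an illustrative toy with invented data, not the WP’s method.

```python
import numpy as np

rng = np.random.default_rng(3)

def mixup(X, y, n_new, alpha=0.4):
    """Create n_new synthetic examples as convex combinations (interpolations)
    of random pairs of existing examples and of their labels (mixup-style)."""
    i = rng.integers(len(X), size=n_new)
    j = rng.integers(len(X), size=n_new)
    lam = rng.beta(alpha, alpha, size=n_new)[:, None]    # interpolation weights
    X_new = lam * X[i] + (1 - lam) * X[j]
    y_new = lam * y[i] + (1 - lam) * y[j]                # soft labels
    return X_new, y_new

# A handful of labelled examples ...
X = rng.normal(size=(10, 4))
y = np.eye(2)[rng.integers(2, size=10)]                  # one-hot labels

# ... interpolated into a much larger training set.
X_aug, y_aug = mixup(X, y, n_new=100)
print(X_aug.shape, y_aug.shape)
```

Linear interpolation is only the simplest choice; the body of knowledge on interpolation mentioned above suggests geometry-aware schemes as a natural direction for moving beyond it.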

The research activity will focus on developing new techniques to reduce the amount of labelled data traditionally needed in industrial supervised learning. Cases (domain knowledge and data sets) will be drawn from different application domains in cooperation with external industrial partners. Research in the WP will also develop methods for learning from experience (DRL). This will reduce the need for domain knowledge and automate existing industrial processes. Cases will again come from external industrial partners. The academic outcome will be new insight into how machines can learn from their own experience, reducing human supervision. The industrial outcome will be new methods for monitoring and automating industrial processes.

In general, one can say that this research activity is, in its initial phase, primarily focused on optimising “smarter products” so that they act as intelligent machines. It is expected that the science developed in this research activity will ultimately be applied in new technological solutions enabling autonomous behaviour of different elements. We are already seeing this happening in the automotive industry, and it will soon become reality in many other fields. All three initial departments involved with the Centre have researchers who can contribute to this research activity.


Henrik Karstoft

Professor (Docent)
H bldg. 5125
P +4541893270