PhDinfo Blog

Deep Learning and Computer Vision for Autonomous Systems: Focus on drone vision, imaging, surveillance and cinematography.

18-19 November 2020,
Aristotle University of Thessaloniki (AUTH),
Live Web Course

Technological and scientific development in the fields of Unmanned Aerial Vehicles (UAVs), Computer Vision and Deep Learning has led to the use of drones in applications that were previously unthinkable. In media production (cinema movies, documentaries, news coverage and sports events), drones are used for aerial shots instead of helicopters. The advantages are greater flexibility of movement, easier coordination, lower overall cost of the operation, and the ability to take both close and distant shots (Fig. 1). Managing multiple drones is fundamental for shooting from several angles at the same time.

Fig. 1:

Drones have also become increasingly important in infrastructure surveillance and inspection, complementing human work. To inspect bridges, roads or electrical installations, drones can safely carry out both long-range and close-up examinations (Fig. 2).

Fig. 2:

During the course, both the theoretical aspects of computer vision and deep learning and the aspects related to communication, planning and control of a system of drones were addressed.
Specifically, the navigation and operation of multi-drone systems were covered, focusing on issues such as drone-to-ground communication and multi-source video streaming.
Choosing the most suitable communication protocol for signal transmission among the drones and with the ground station is fundamental for synchronization, Quality of Service (QoS), security and video streaming. In this case a combination of LTE and Wi-Fi technologies is used.
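To make the drone-to-ground link concrete, here is a minimal sketch of a telemetry exchange over UDP, the kind of lightweight transport often layered under such links. This is an illustrative toy, not the course's actual protocol stack: the message fields (`id`, `lat`, `lon`, `alt`) and the localhost addresses are assumptions for the example.

```python
import json
import socket

def send_telemetry(sock, addr, drone_id, lat, lon, alt):
    """Serialize one telemetry sample and send it as a single UDP datagram."""
    payload = json.dumps({"id": drone_id, "lat": lat, "lon": lon, "alt": alt})
    sock.sendto(payload.encode(), addr)

def receive_telemetry(sock):
    """Block until one datagram arrives and decode it back into a dict."""
    data, _ = sock.recvfrom(1024)
    return json.loads(data.decode())

# Ground station side: listen on a local UDP port (the OS picks a free one).
ground = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
ground.bind(("127.0.0.1", 0))
ground_addr = ground.getsockname()

# Drone side: send one position/altitude sample to the ground station.
drone = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_telemetry(drone, ground_addr, drone_id=1, lat=40.63, lon=22.95, alt=50.0)

sample = receive_telemetry(ground)
drone.close()
ground.close()
```

In a real deployment the datagrams would travel over the LTE or Wi-Fi link, and QoS, security and stream synchronization would be handled by the protocol layers above and below this exchange.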
Then, the theory of computer vision was covered in full, starting from image acquisition and camera geometry, passing through stereo vision (stereoscopic imaging, 3D scene reconstruction from multiple images), and concluding with advanced techniques for localization, mapping and object tracking. Among them, SLAM, the Kalman filter and the Kernelized Correlation Filter (KCF) tracker were discussed in depth.
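As a small illustration of the tracking side, the following is a sketch of a constant-velocity Kalman filter in one dimension, the kind of motion model commonly paired with a visual tracker to smooth noisy position measurements. The state, noise covariances and target motion below are toy values chosen for the example, not parameters from the course.

```python
import numpy as np

# Constant-velocity Kalman filter for 1-D target tracking.
# State x = [position, velocity]; measurement z = noisy position.
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition
H = np.array([[1.0, 0.0]])              # we measure position only
Q = 0.01 * np.eye(2)                    # process noise covariance
R = np.array([[0.25]])                  # measurement noise covariance

def kalman_step(x, P, z):
    """One predict/update cycle; x is the state estimate, P its covariance."""
    # Predict the state forward one time step.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Correct the prediction with the new measurement z.
    y = z - H @ x_pred                        # innovation
    S = H @ P_pred @ H.T + R                  # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)       # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(2) - K @ H) @ P_pred
    return x_new, P_new

# Track a target moving at 1 unit/step from noisy position measurements.
rng = np.random.default_rng(0)
x, P = np.zeros(2), np.eye(2)
for t in range(1, 31):
    z = np.array([t * 1.0 + rng.normal(0.0, 0.5)])
    x, P = kalman_step(x, P, z)
```

After a few dozen steps the estimated position and velocity settle close to the true values, which is exactly the smoothing behaviour that makes the filter useful as a motion prior for a visual tracker such as KCF.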

Next, neural networks were introduced, covering the basic concepts, the workings of backpropagation, the basic architectures (Multilayer Perceptron, fully connected layers, Convolutional Neural Networks) and their use in computer vision problems such as object detection and semantic image segmentation. Recent state-of-the-art architectures were discussed: R-CNN, Faster R-CNN, YOLO, SSD.
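To show what backpropagation actually computes, here is a minimal numpy sketch of one gradient step on a tiny two-layer perceptron (2 inputs, 4 tanh hidden units, 1 output) fitted to toy regression data. The network sizes and data are assumptions for the illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny MLP: 2 inputs -> 4 hidden units (tanh) -> 1 output.
W1 = rng.normal(0.0, 0.5, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0.0, 0.5, (4, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)        # hidden activations
    return h, h @ W2 + b2           # hidden state and network output

def mse(pred, y):
    return np.mean((pred - y) ** 2)

# Toy regression data: target is the mean of the two inputs.
X = rng.normal(size=(32, 2))
y = (X[:, :1] + X[:, 1:]) * 0.5

# Backpropagation: chain the loss gradient backwards through each layer.
h, out = forward(X)
g_out = 2.0 * (out - y) / len(X)        # dLoss/d(output)
gW2 = h.T @ g_out                       # gradient for the output layer
gb2 = g_out.sum(axis=0)
g_h = (g_out @ W2.T) * (1.0 - h ** 2)   # through the tanh nonlinearity
gW1 = X.T @ g_h                         # gradient for the hidden layer
gb1 = g_h.sum(axis=0)

# One gradient-descent step should reduce the training loss.
loss_before = mse(out, y)
lr = 0.05
W1 -= lr * gW1; b1 -= lr * gb1
W2 -= lr * gW2; b2 -= lr * gb2
loss_after = mse(forward(X)[1], y)
```

The detection architectures mentioned above (R-CNN, YOLO, SSD) are built from the same ingredients, scaled up with convolutional layers and trained with exactly this kind of gradient computation.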
Moreover, the standard frameworks for developing deep learning models (Keras, TensorFlow, PyTorch) were examined, along with their differences.

After that, we analysed the software tools used to simulate and test the planning and behaviour of a single drone or a group of drones. The best-known tools are Gazebo and AirSim.
Finally, we looked at a real infrastructure inspection scenario (Fig. 3).
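Before moving to a full physics simulator such as Gazebo or AirSim, the planning logic itself can be prototyped with a much simpler kinematic model. The sketch below is a toy stand-in, not an AirSim or Gazebo API call: a point-mass "drone" flies through a list of waypoints at constant speed, which is enough to test a mission plan's geometry.

```python
import numpy as np

def fly_waypoints(start, waypoints, speed=1.0, tol=0.1, max_steps=1000):
    """Toy kinematic drone: step at constant speed toward each waypoint in turn.

    Returns the flown path as an array of positions, one row per time step.
    """
    pos = np.asarray(start, dtype=float)
    path = [pos.copy()]
    for wp in waypoints:
        wp = np.asarray(wp, dtype=float)
        for _ in range(max_steps):
            delta = wp - pos
            dist = np.linalg.norm(delta)
            if dist < tol:
                break                      # waypoint reached, go to the next
            pos = pos + delta / dist * min(speed, dist)
            path.append(pos.copy())
    return np.array(path)

# Fly an L-shaped mission and inspect the resulting path.
path = fly_waypoints(start=(0.0, 0.0), waypoints=[(5.0, 0.0), (5.0, 5.0)])
```

A simulator like AirSim replaces this point-mass model with full flight dynamics, sensor simulation and rendered camera views, but the planning code being tested can stay the same.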

Fig. 3:

The lessons were given by Ioannis Pitas, professor at the Department of Informatics of the Aristotle University of Thessaloniki, Greece.
The course slides for some of the topics are available for download at this link.