MobiDev shares expertise in building real-time video processing products, together with creating pipelines for applying Machine Learning and Deep Learning.
Online PR News – 18-August-2021 – Atlanta – MobiDev, a US-based software engineering company with development centers in Ukraine, has shared its expertise in building real-time video processing products and in creating pipelines for applying Machine Learning and Deep Learning models. Overall, hours watched in the live streaming industry have grown by up to 99% since last year, according to daily sports.gg statistics. This growth is expected to change fan experience, gaming, telemedicine, and other areas. Moreover, Grand View Research reports that the video streaming market will be worth USD 184.27 billion by 2027.
In visual inspection technology, integrating deep learning methods allows a computerized system to differentiate parts, anomalies, and characters, imitating human visual inspection. The technical problem MobiDev solved was to use Artificial Intelligence to blur the faces of video subjects quickly and accurately during live streaming, without quality loss. The final solution had to be flexible in terms of input, output, and configuration.
There are several ways to make processing faster while keeping accuracy at a reasonable level:
1) run operations in parallel;
2) speed up the algorithms.
There are two basic approaches to parallelizing the processing: file splitting and pipeline architecture. With file splitting, the video is split into parts that are processed in parallel, which makes it possible to keep using slower yet accurate models. The splitting is virtual: no real sub-files are generated. However, this approach is not well suited to real-time processing, because it can be difficult to pause, resume, or move the processing to a different position in time. With pipeline architecture, instead of splitting the video, the operations performed during processing are split and parallelized, and the algorithms or their parts can be accelerated with no significant loss of accuracy. For this reason, the pipeline approach is more flexible.
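The virtual splitting described above can be sketched as follows. This is a minimal illustration, not MobiDev's implementation: the function names `split_frame_ranges`, `process_range`, and `process_in_parallel` are hypothetical, and the per-range work is a placeholder that only counts frames where a real system would decode and run a model.

```python
from concurrent.futures import ThreadPoolExecutor


def split_frame_ranges(total_frames, parts):
    """Divide a video into contiguous (start, end) frame ranges, end exclusive.

    No sub-files are written: each range is only a pair of indices that a
    worker can seek to in the original file.
    """
    if parts <= 0:
        raise ValueError("parts must be positive")
    base, extra = divmod(total_frames, parts)
    ranges = []
    start = 0
    for i in range(parts):
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # more parts than frames: skip empty ranges
        ranges.append((start, start + length))
        start += length
    return ranges


def process_range(frame_range):
    """Placeholder worker (assumption): returns the number of frames handled."""
    start, end = frame_range
    return end - start


def process_in_parallel(total_frames, parts):
    """Process every virtual part concurrently and keep results in order."""
    ranges = split_frame_ranges(total_frames, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        return list(pool.map(process_range, ranges))
```

Because every range must be known up front, this scheme fits finished files better than live streams, which is the limitation noted above.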
In one of its projects, MobiDev’s AI engineers had to process video in real time using AI algorithms. The pipeline was composed of decoding, face detection, face blurring, and encoding stages. Flexibility was essential in this case because the system had to process not only video files but also different formats of live video streams. The system achieved 30-60 FPS, depending on the configuration. Implementing the pipeline approach involved several aspects, such as interpolation with tracking, shared memory, and multiple workers or multiprocessing.
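A pipeline of this shape can be sketched with one worker per stage, connected by bounded queues. The sketch below is an assumption-laden illustration rather than the project's code: it uses threads instead of processes or shared memory, and the four stage functions (`decode`, `detect_faces`, `blur_faces`, `encode`) are hypothetical placeholders that only tag a dictionary instead of touching real pixels.

```python
import queue
import threading

_STOP = object()  # sentinel that tells each stage the stream has ended


def run_stage(func, inbox, outbox):
    """Apply func to every item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            break
        outbox.put(func(item))


def run_pipeline(frames, stages, queue_size=8):
    """Run the stages concurrently; bounded queues limit frames in flight."""
    queues = [queue.Queue(maxsize=queue_size) for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=run_stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stages)
    ]

    def feed():
        for frame in frames:
            queues[0].put(frame)
        queues[0].put(_STOP)

    workers.append(threading.Thread(target=feed))
    for worker in workers:
        worker.start()

    results = []
    while True:
        item = queues[-1].get()
        if item is _STOP:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results


# Placeholder stages (assumptions): real ones would call a codec and a detector.
def decode(index):
    return {"index": index, "pixels": f"frame-{index}"}


def detect_faces(frame):
    return {**frame, "faces": [(0, 0, 10, 10)]}


def blur_faces(frame):
    return {**frame, "blurred": bool(frame["faces"])}


def encode(frame):
    return (frame["index"], frame["blurred"])
```

Calling `run_pipeline(range(5), [decode, detect_faces, blur_faces, encode])` passes five frames through all four stages. Since each stage is a single worker reading from a FIFO queue, frame order is preserved, and while one frame is being encoded the next can already be in detection.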
As for the complexity of applying AI to live video streams, implementation consists of several stages:
Adjusting a pre-trained neural network (or training one) to perform the required tasks
Setting up a cloud infrastructure that can process the video and scale up to a certain point
Building a software layer to package the process and implement user scenarios (mobile applications, web and admin panels, etc.)
Building an MVP of a product like this with a pre-trained neural network and some simple application layers takes 3-4 months. However, the details are crucial, and each product is unique in terms of scope and timeline.
More detailed information about the research on AI-driven Live Video Processing can be found at: