The video semantic content analysis technologies based on activity and behaviour recognition developed by Vision Semantics Ltd (VSL) are recognised by a wide range of governmental and commercial users in the video analytics market and the critical infrastructure security sector. VSL emphasises a self-learning approach to human activity and behaviour recognition in video data, avoiding the need for either camera calibration or top-down, manually defined ontology-based rules. The company has developed novel video-based behavioural recognition technologies based on statistical machine learning, evolving from supervised and unsupervised, through weakly-supervised, to active learning. We are focussed on commercially exploiting our extensive research in video analytics and dynamic image understanding.
Conventional CCTV systems rely heavily on human operators to monitor activities and determine incidents, e.g. tracking a suspicious target from one camera to another across a large area of distributed spaces with disjoint views. However, relying solely on human operators has inherent limitations, because operators often lack a priori knowledge of what to look for. Work undertaken by VSL has developed novel techniques for video-based object space-time association and behaviour interpretation across a distributed network of CCTV cameras, enhancing global situational awareness over a wide area. Specifically, novel mathematical models and scalable computer algorithms have been developed for:

(a) robust detection and association of people over wide areas across different physical sites captured by a distributed network of cameras, known as solving the re-identification problem, e.g. monitoring the activities of a person travelling through different locations in a public transport infrastructure environment such as an airport or an underground station;

(b) enhancement of global situational awareness by learning to correlate human/vehicle behaviours localised at each camera viewpoint across a network of disjoint cameras at different physical sites, and detection of abnormal behaviours in public spaces across camera views (locally 'normal' behaviour may be globally abnormal);

(c) learning visual context directly from limited observations without manual labelling by an operator, providing a mechanism for coping with changes in behavioural context and in the definition of anomaly;

(d) incorporating minimal human feedback to enhance behaviour model learning under limited or incomplete visual observations, e.g. modelling rare behaviours and discovering new classes of significant but previously unknown behaviours in public spaces.
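To make the re-identification problem concrete, the following is a minimal sketch, not VSL's actual method: it assumes each detected person has already been reduced to an appearance feature vector (the names, descriptors, and threshold are illustrative), and matches a query descriptor from one camera against a gallery of descriptors from another by cosine distance.

```python
import numpy as np

def cosine_distance(a, b):
    """Cosine distance between two appearance feature vectors."""
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def re_identify(query, gallery, threshold=0.3):
    """Match a query descriptor from one camera against a gallery of
    descriptors from other cameras; return the best match, or None if
    no gallery entry is close enough.  Illustrative sketch only."""
    best_id, best_dist = None, float("inf")
    for person_id, feat in gallery.items():
        d = cosine_distance(query, feat)
        if d < best_dist:
            best_id, best_dist = person_id, d
    return best_id if best_dist <= threshold else None

# Toy gallery of appearance embeddings captured by camera B
gallery = {
    "person_1": np.array([0.9, 0.1, 0.2]),
    "person_2": np.array([0.1, 0.8, 0.5]),
}
# Query descriptor extracted from camera A
query = np.array([0.85, 0.15, 0.25])
print(re_identify(query, gallery))  # → person_1
```

In practice the descriptors would come from a learned embedding robust to viewpoint and illumination changes between cameras; the nearest-neighbour matching step, however, has this basic shape.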
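The idea in (b), that a locally normal observation can be globally abnormal, can be illustrated with a deliberately simple model (an assumption for exposition, not the deployed system): learn camera-to-camera transition frequencies from observed tracks, then flag transitions whose estimated probability is unusually low.

```python
from collections import Counter

class TransitionModel:
    """Learn camera-to-camera transition frequencies from observed
    tracks, then flag low-probability transitions as globally abnormal,
    even when each local observation looks normal.  Toy sketch."""

    def __init__(self):
        self.counts = Counter()   # (src, dst) -> number of observed transitions
        self.totals = Counter()   # src -> total departures observed

    def observe(self, src_cam, dst_cam):
        self.counts[(src_cam, dst_cam)] += 1
        self.totals[src_cam] += 1

    def probability(self, src_cam, dst_cam):
        if self.totals[src_cam] == 0:
            return 0.0
        return self.counts[(src_cam, dst_cam)] / self.totals[src_cam]

    def is_abnormal(self, src_cam, dst_cam, threshold=0.05):
        return self.probability(src_cam, dst_cam) < threshold

model = TransitionModel()
# Training: most people move from the ticket hall (camera A) to the platform (camera B)
for _ in range(96):
    model.observe("A", "B")
for _ in range(4):
    model.observe("A", "C")   # a rare route, e.g. a staff-only corridor

print(model.is_abnormal("A", "B"))  # → False (frequent route)
print(model.is_abnormal("A", "C"))  # → True  (rare route)
```

A person seen walking normally under camera A and then under camera C triggers no alert at either camera in isolation; only the learned cross-camera correlation reveals the route as unusual.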
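Point (d), incorporating minimal human feedback, is the core of active learning. One standard strategy (uncertainty sampling, shown here as a generic sketch with hypothetical clip names, not VSL's specific algorithm) asks the operator to label only the video clips about which the current behaviour model is least certain.

```python
import numpy as np

def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def select_for_labelling(predictions, k=1):
    """Pick the k clips whose predicted behaviour distribution has the
    highest entropy; these are the ones worth an operator's feedback."""
    ranked = sorted(predictions.items(), key=lambda kv: entropy(kv[1]), reverse=True)
    return [clip_id for clip_id, _ in ranked[:k]]

# Hypothetical model outputs over three behaviour classes for unlabelled clips
predictions = {
    "clip_001": np.array([0.98, 0.01, 0.01]),  # confident: not worth asking
    "clip_002": np.array([0.40, 0.35, 0.25]),  # uncertain: ask the operator
    "clip_003": np.array([0.70, 0.20, 0.10]),
}
print(select_for_labelling(predictions, k=1))  # → ['clip_002']
```

Because operator time is the scarce resource, spending each label on the most ambiguous clip is what lets rare or previously unknown behaviour classes be modelled from very few annotations.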