In traditional image analytics, each frame, whether a photo or a frame from a video, is analyzed in a single pass. If more analysis is needed, if incorrect classifications were made, or if important details were missed, so be it. Not so in the revolutionary new computer vision system that Cisco, together with a number of academic and industry collaborators, plans to release as open-source software in the coming days.
One of these collaborators is IIIM, whose research on novel AI approaches has been funded in part by Cisco Systems for the past three years. The system, called Ethosight, uses reasoning to enhance the ability of traditional ANN-based large language models to dissect and classify objects and events in images and video in real time. Like a human looking for more clues about what is happening in a particular scenario, the system can improve the quality and depth of its analysis the longer it looks at a scene, collecting more information about what may be going on. It is possibly the first system of its kind to demonstrate what has been called cumulative learning, that is, the ability to autonomously improve its knowledge about a particular thing over time. A preprint of a paper describing Ethosight has been published on the arXiv repository.
The scenarios Ethosight can address include, for instance, a variety of social situations, such as a child playing near a hot stove or opening a closet where chemicals are stored. According to a blog post by Hugo Latapie, Principal Engineer at Cisco and first author of the paper, Ethosight breaks away from the traditional limitations of AI systems and is positioned “…not just as a real-time video analysis tool but as a vanguard in the continuous learning paradigm…”.