There is a lot of buzz about video analytics or video content analysis these days. There is also a lot of hype about ease of set-up, auto-calibration and 'learning'.
Every video analytics algorithm is based on two or three core concepts, so I can generalize my discussion around them. These algorithms can be built to work with a set of default parameters, which can be programmed to change automatically with the circumstances. However, there are many real-world situations where doing so is very difficult, if not downright impossible. For example, consider headlights from a passing vehicle shining on a pole that sits in the analytics' area of interest. The typical video analytics algorithm will detect this 'shining pole' as a human. And why not? The moving lights reflected off the pole create the illusion of a moving object, the height and aspect ratio may fall within the human range, and so on. "But, Prem, poles do not have arms, a head or legs. So detect these appendages, or the lack thereof, on the pole and you have just cut down on false alarms." Well, the problem is that there is so much variability between pixels that, when you see a bunch of appendage-looking structures, you can be sure they are arms and legs only if you already know that there is a person within the boundaries of those structures. This is the classic computer vision chicken-and-egg problem.
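To make the 'shining pole' failure concrete, here is a minimal sketch of the kind of size-and-shape test described above. It is not taken from any particular product: the `Blob` type, the `looks_human` function and every threshold value are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Blob:
    """Bounding box of a region flagged as moving, measured in pixels."""
    width: int
    height: int
    moving_fraction: float  # share of pixels in the box that changed between frames


# Hypothetical default parameters (assumptions for this sketch only).
MIN_ASPECT = 1.5            # minimum height / width ratio
MAX_ASPECT = 4.0            # maximum height / width ratio
MIN_HEIGHT_PX = 40          # ignore very small blobs
MIN_MOVING_FRACTION = 0.3   # enough of the box must appear to be moving


def looks_human(blob: Blob) -> bool:
    """Naive test: tall, narrow, and apparently moving counts as a person."""
    if blob.width <= 0:
        return False
    aspect = blob.height / blob.width
    return (
        MIN_ASPECT <= aspect <= MAX_ASPECT
        and blob.height >= MIN_HEIGHT_PX
        and blob.moving_fraction >= MIN_MOVING_FRACTION
    )


# A pedestrian and a pole swept by headlights can produce very similar boxes.
pedestrian = Blob(width=30, height=90, moving_fraction=0.6)
lit_pole = Blob(width=25, height=95, moving_fraction=0.5)

print(looks_human(pedestrian), looks_human(lit_pole))
```

Both blobs pass the test, because nothing in the size, shape or motion figures separates a person from a pole with light sweeping across it.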
Computer vision is full of these chicken-and-egg problems. Human vision is really good at breaking these chicken-and-egg cycles with a lot of context and a priori information. Nobody completely understands how our visual system is so good at scene processing, but the prime suspects have been a different computational paradigm and millions of years of training. In practical terms, fine-tuning is basically a way of artificially providing this training. This does not mean, however, that each camera or installation requires new custom software development; it should only involve changing a few exposed software variables so that the system works as well as it can.
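As a rough sketch of what 'changing a few exposed software variables' could look like, the snippet below keeps one set of defaults and overrides only a couple of values for a given camera. The parameter names, the default values and the `tune` helper are all assumptions made for illustration, not the settings of any real system.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AnalyticsParams:
    """Hypothetical tunable variables exposed to the installer."""
    min_aspect: float = 1.5           # minimum height / width of a 'person'
    max_aspect: float = 4.0           # maximum height / width of a 'person'
    min_height_px: int = 40           # smallest blob worth classifying
    min_persistence_frames: int = 5   # frames a detection must last before alarming


DEFAULTS = AnalyticsParams()


def tune(base: AnalyticsParams, **overrides) -> AnalyticsParams:
    """Return a copy of the base parameters with selected values changed."""
    return replace(base, **overrides)


# A camera facing a road: require detections to persist longer so that a
# brief sweep of headlights across a pole is less likely to raise an alarm.
roadside_camera = tune(DEFAULTS, min_persistence_frames=15, max_aspect=3.5)

print(roadside_camera)
```

The software itself is unchanged from one installation to the next; only the handful of values passed to `tune` differ per camera.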
Fine-tuning also has to do with understanding the end user's concept of operations. In my experience, most of the fine-tuning time, if not all of it, is spent on minimizing false alarms. Reducing even a few false alarms a day can have a major impact on the end user's operations and logistics. This is why I believe that systems advertised as completely plug-and-play are more like plug-and-pray.