When is a person not a person?

Object detection – the ability to identify specific objects in a picture – has made all the difference to my ZoneMinder camera monitoring. Raising alerts only when a person is in the picture has almost entirely eliminated false positives.

Occasionally it still makes a mistake:

Clearly the algorithm didn’t have a lot of confidence that the picture did actually contain a person – it determined that there was a 32.23% chance that it was seeing a person. That’s not unreasonable – as a silhouette, it sort of, if you kind of squint, could be someone jumping. That is, if you ignore the context of the placement in the picture – a person that close to the camera would be much larger, and not floating in mid-air. The algorithm is just working on a bunch of pixels – it doesn’t have that context.

From my observations so far, images that do actually contain people are usually in the 90%+ confidence range, so the answer was clear – I could stop rogue spiders from triggering alerts by alerting only when the confidence was above a given threshold.

It was time to extend my object detection setup

I considered making the change upstream in zmEventNotification, the tool that I’m using to generate the alerts from ZoneMinder. That approach had a downside – if I found I was missing events because I’d set the threshold too high, I’d need to go back and change the code or configuration files.

I decided it would be better to make the decision at the Home Assistant end. I could then set up an input variable to control the threshold through the front end, in the same way that I’m throttling the notification frequency.

I first added the input control:

    input_number:
      zm_notify_throttle:
        min: 30
        max: 3600
        step: 1
        name: Zoneminder Notification Throttle
        mode: box
      zm_notify_threshold:
        min: 0
        max: 100
        step: 1
        name: Zoneminder Notification Threshold
        mode: box

I could then make that control visible through the front end and refer to it in code.

The next step was to change my AppDaemon code to use the new value. First of all, I made it visible as an argument to the AppDaemon script in the configuration file:

  module: zoneminder
  class: ZoneMinder
  zmuser: !secret ZMUSER
  zmpass: !secret ZMPASS
  zmhost: !secret ZMHOST
  throttle: input_number.zm_notify_throttle
  threshold: input_number.zm_notify_threshold
      notify: input_boolean.back_garden_notify
      notify: input_boolean.side_gate_notify
      notify: input_boolean.front_garden_notify

When an event occurs, the state value will contain JSON data that looks something like this:

    "detection": [
            "confidence": "99.47%",
            "label": "person",
            "type": "object",
            "box": [
    "eventid": "67231",
    "name": "Side Gate:(67231) [a] detected:person:99% Motion All",
    "eventtype": "event_start",
    "hookvalue": "0",
    "monitor": "2",
    "state": "alarm"

Note that the detection value is an array. That’s because you can look for multiple object types – “person”, “car” etc – and if more than one is detected, the array will contain multiple items.
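
To make the array structure concrete, here is a minimal sketch that parses a state string containing two detections. The field names follow the sample above, but the second item and all of its values are invented for illustration, and `list_detections` is a hypothetical helper rather than part of any library.

```python
import json

# Hypothetical payload: field names follow the sample above, but the
# "car" entry and its values are made up for illustration.
sample_state = json.dumps({
    "detection": [
        {"confidence": "99.47%", "label": "person", "type": "object"},
        {"confidence": "87.10%", "label": "car", "type": "object"},
    ],
    "eventid": "67231",
    "monitor": "2",
    "state": "alarm",
})


def list_detections(raw_state):
    """Return (label, confidence) pairs from a JSON state string."""
    state = json.loads(raw_state)
    return [
        # Confidence arrives as text such as "99.47%", so strip the sign
        (item["label"], float(item["confidence"].rstrip("%")))
        for item in state.get("detection", [])
    ]


detections = list_detections(sample_state)
for label, confidence in detections:
    print(f"{label}: {confidence}")
```

Each detected object gets its own entry with its own confidence, so any filtering has to be done per item rather than per event.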

I changed my code to loop over the array, looking for any items that had a label of person with a confidence level above my threshold and simply do nothing if none were found:

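import json

def state_change(self, entity, attribute, old, new, kwargs):

    # The new state is a JSON string describing the event
    state = json.loads(new)

    # Count the "person" detections that exceed the threshold
    found = 0
    threshold = float(self.get_state(self.args['threshold']))
    for item in state['detection']:
        if item['label'] == 'person':
            confidence = float(item['confidence'].replace('%',''))
            if confidence > threshold:
                found = found + 1

    # No sufficiently confident person detection, so do nothing
    if found < 1:
        return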
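
One edge case worth guarding against: Home Assistant states are strings, and an entity can report "unknown" or "unavailable", in which case a bare `float()` call raises an exception. The sketch below is one possible way to handle that. `parse_threshold` is a hypothetical helper, and the fallback of 75 is an assumed value, not one taken from the real setup.

```python
def parse_threshold(raw, default=75.0):
    """Convert a Home Assistant state string to a float.

    `default` is an assumed fallback used when the state is missing
    or not numeric (for example "unknown" or "unavailable").
    """
    try:
        return float(raw)
    except (TypeError, ValueError):
        # TypeError covers None, ValueError covers non-numeric strings
        return default


print(parse_threshold("80.0"))
print(parse_threshold("unavailable"))
```

Inside the callback it could be used as `threshold = parse_threshold(self.get_state(self.args['threshold']))`.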

I could also have pulled the value from the “name” field using a regular expression to extract the percentage, but if there’s a structured field I think it’s better to use it than to parse a free text field.
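
For comparison, a rough sketch of the regular expression route. The pattern is inferred from the single sample name shown earlier, so it is an assumption about the format rather than a documented one, and `confidence_from_name` is a hypothetical helper.

```python
import re

# Name string copied from the sample event data above
name = "Side Gate:(67231) [a] detected:person:99% Motion All"

# Pattern inferred from that one sample; other name formats may differ
PERSON_CONFIDENCE = re.compile(r"detected:person:(\d+(?:\.\d+)?)%")


def confidence_from_name(text):
    """Return the person confidence found in the name, or None."""
    match = PERSON_CONFIDENCE.search(text)
    return float(match.group(1)) if match else None


from_name = confidence_from_name(name)
from_field = float("99.47%".rstrip("%"))
print(from_name, from_field)
```

In the sample, the name carries a whole-number percentage while the structured field keeps two decimal places, which is another reason to prefer the structured field.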

If I were making it truly flexible, I wouldn’t hard-code the label value – instead of checking only for “person” I could set it up with multiple labels, each with its own confidence threshold. I could also set up separate thresholds for each camera if I found the need.
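
As a sketch of what that more flexible version might look like, the per-label thresholds could live in a dictionary. Everything here is hypothetical: the `LABEL_THRESHOLDS` values, the `matching_labels` helper, and the sample payload are invented for illustration.

```python
import json

# Hypothetical thresholds; labels and values are illustrative only
LABEL_THRESHOLDS = {"person": 75.0, "car": 90.0}


def matching_labels(raw_state, thresholds):
    """Return the set of labels detected above their own threshold."""
    state = json.loads(raw_state)
    matches = set()
    for item in state.get("detection", []):
        minimum = thresholds.get(item["label"])
        if minimum is None:
            # Label has no configured threshold, so ignore it
            continue
        confidence = float(item["confidence"].rstrip("%"))
        if confidence > minimum:
            matches.add(item["label"])
    return matches


# Invented payload: a low-confidence "person" and a confident "car"
sample = json.dumps({"detection": [
    {"confidence": "32.23%", "label": "person"},
    {"confidence": "95.50%", "label": "car"},
]})
result = matching_labels(sample, LABEL_THRESHOLDS)
print(result)
```

Per-camera thresholds could follow the same pattern, with the dictionary keyed on the monitor ID as well as the label.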

I find this sort of configuration decision to be an iterative process. I’m pretty sure the only label I’m going to use is “person”, and I’m also pretty sure that the same threshold will be fine for all of the cameras. Hardcoding those decisions made sense as a starting point, and it’s something I can revisit if needed.

On the other hand, I was only guessing at the threshold value – I might still get false positives at 75%, or miss genuine events at, say, 90%. For that parameter, then, a value configurable through the front end made sense.

So far so good, and I haven’t had any more false alarms. I still need to go out every few days and clear the webs from around the cameras to keep the pictures clear though!
