A lesson in poor API design


Now that I’m into home automation, one of my requirements when buying a new device for the house is how well it will fit into my setup. Unfortunately I’ve got some older stuff that doesn’t work… even though it really should…

Take, for example, the Sony SRS-X88 speaker I had in my bedroom. I bought it because it has built-in Chromecast as well as Bluetooth and line input. It sounds pretty good, and I’ve been glad of the Bluetooth since I got into Echo devices. Since there’s a Sony web app that can control it, there has to be some sort of API.

There is, but it’s not terribly well documented. The SRS-X88 sort of works with the Audio Control API. It’s a truly terrible API.

Think about what the API should do. First of all you want to see the current status – power, volume, selected input source and whether it’s muted. Then you want to be able to power it on, select a source, change the volume and mute it. In other words, you want to replicate the remote control. The SRS-X88 has five inputs on the remote:

  • Network
  • Bluetooth
  • Audio In
  • USB-A
  • USB-C

The API has a getCurrentExternalTerminalsStatus function. To quote the documentation, it “Gets information about the current status of all external input and output terminal sources of the device”.

This is what it does on the SRS-X88:

curl -s -d '{"method": "getCurrentExternalTerminalsStatus","id": 1,"params": [],"version": "1.0"}' http://192.168.1.18:54480/sony/avContent | json_pp
{
   "id" : 1,
   "result" : [
      [
         {
            "connection" : "unknown",
            "title" : "USB DAC",
            "uri" : "extInput:usbDac",
            "meta" : "meta:usbdac"
         },
         {
            "connection" : "unknown",
            "title" : "Bluetooth Audio",
            "uri" : "extInput:btAudio",
            "meta" : "meta:btaudio"
         },
         {
            "connection" : "unknown",
            "title" : "Audio in",
            "uri" : "extInput:line?port=1",
            "meta" : "meta:line"
         }
      ]
   ]
}

Ah. Only three of the five inputs selectable on the remote are shown. It turns out that Sony use a concept they call “schemes” to categorise inputs. There’s a getSchemeList function you can use to list them:

curl -s -d '{"method": "getSchemeList","id": 1,"params": [],"version": "1.0"}' http://192.168.1.18:54480/sony/avContent | json_pp
{
   "id" : 1,
   "result" : [
      [
         {
            "scheme" : "storage"
         },
         {
            "scheme" : "extInput"
         },
         {
            "scheme" : "dlna"
         }
      ]
   ]
}

There’s another function, “getSourceList” that will get you the source list for a scheme. Calling that for the extInput scheme gives the sources:

curl -s -d '{"method": "getSourceList","id": 1,"params": [ {"scheme": "extInput"}],"version": "1.1"}' http://192.168.1.18:54480/sony/avContent | json_pp
{
   "id" : 1,
   "result" : [
      [
         {
            "source" : "extInput:usbDac",
            "playAction" : "startPlay",
            "isPlayable" : true,
            "isBrowsable" : false,
            "title" : "USB DAC",
            "meta" : "meta:usbdac"
         },
         {
            "source" : "extInput:btAudio",
            "playAction" : "startPlay",
            "isPlayable" : true,
            "isBrowsable" : false,
            "title" : "Bluetooth Audio",
            "meta" : "meta:btaudio"
         },
         {
            "source" : "extInput:line",
            "playAction" : "startPlay",
            "isPlayable" : false,
            "isBrowsable" : true,
            "title" : "Audio in",
            "meta" : "meta:line"
         }
      ]
   ]
}

That’s the three inputs returned by getCurrentExternalTerminalsStatus with one minor detail missing – the audio in source doesn’t have the port parameter. I’ll come back to that. Running the same getSourceList function for the storage scheme returns:

curl -s -d '{"method": "getSourceList","id": 1,"params": [ {"scheme": "storage"}],"version": "1.1"}' http://192.168.1.18:54480/sony/avContent | json_pp
{
   "id" : 1,
   "result" : [
      [
         {
            "source" : "storage:usb1",
            "playAction" : "changeSource",
            "isPlayable" : true,
            "isBrowsable" : true,
            "title" : "USB",
            "meta" : "meta:storage:usb"
         }
      ]
   ]
}

So one USB source is known as extInput:usbDac and is part of the extInput scheme while the other is known as storage:usb1 and is part of the storage scheme. I can’t fathom the logic.

Finally there’s the DLNA scheme:

curl -s -d '{"method": "getSourceList","id": 1,"params": [ {"scheme": "dlna"}],"version": "1.1"}' http://192.168.1.18:54480/sony/avContent | json_pp
{
   "id" : 1,
   "result" : [
      [
         {
            "source" : "dlna:music",
            "playAction" : "startPlay",
            "isPlayable" : true,
            "isBrowsable" : false,
            "title" : "Home Network",
            "meta" : "meta:dlna:music"
         }
      ]
   ]
}
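Getting the full input list therefore takes one getSchemeList call plus one getSourceList call per scheme. Here is a minimal Python sketch of that loop. The function names (http_call, list_all_sources) are my own and not part of Sony’s API, and the address is simply the one my speaker happens to use. The transport is passed in so the loop can be tried without a speaker on the network.

```python
import json
import urllib.request
from typing import Any, Callable

# Hypothetical transport signature: (method, params, version) -> decoded reply.
Call = Callable[[str, list, str], dict]


def http_call(method: str, params: list, version: str,
              url: str = "http://192.168.1.18:54480/sony/avContent") -> dict:
    """POST one request, mirroring the curl -d commands used in this post."""
    body = json.dumps(
        {"method": method, "id": 1, "params": params, "version": version}
    ).encode()
    # Passing data= makes urllib send a POST, just as curl -d does.
    request = urllib.request.Request(url, data=body)
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


def list_all_sources(call: Call) -> list[dict[str, Any]]:
    """Return the sources of every scheme, in the order the device reports them."""
    schemes = call("getSchemeList", [], "1.0")["result"][0]
    sources: list[dict[str, Any]] = []
    for entry in schemes:
        reply = call("getSourceList", [{"scheme": entry["scheme"]}], "1.1")
        sources.extend(reply["result"][0])
    return sources
```

Against a real speaker this would be called as list_all_sources(http_call). The versions are the ones used in the curl commands above.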

In theory that’s the five inputs and you can use the getPlayingContentInfo function to see which is selected:

curl -s -d '{"method": "getPlayingContentInfo","id": 1,"params": [{"output": ""}],"version": "1.2"}' http://192.168.1.18:54480/sony/avContent | json_pp
{
   "id" : 1,
   "result" : [
      [
         {
            "stateInfo" : {
               "state" : ""
            },
            "source" : "extInput:btAudio",
            "parentUri" : "",
            "contentKind" : "input",
            "uri" : "extInput:btAudio"
         }
      ]
   ]
}

So far so good. OK, so instead of one call to get the list of inputs you need to make a call to get the list of schemes and then iterate over each scheme to get the full list. It’s inefficient, but workable. Let’s try setting one:

curl -s -d '{"method": "setPlayContent","id": 1,"params": [{"uri": "storage:usb1"}],"version": "1.2"}' http://192.168.1.18:54480/sony/avContent | json_pp
{
   "id" : 1,
   "result" : []
}

curl -s -d '{"method": "getPlayingContentInfo","id": 1,"params": [{"output": ""}],"version": "1.2"}' http://192.168.1.18:54480/sony/avContent | json_pp
{
   "id" : 1,
   "result" : [
      [
         {
            "stateInfo" : {
               "state" : ""
            },
            "source" : "storage:usb1",
            "audioInfo" : [
               {}
            ],
            "parentUri" : "",
            "contentKind" : "music",
            "uri" : "",
            "albumName" : "",
            "artist" : "",
            "title" : ""
         }
      ]
   ]
}

It works! Let’s try another one:

curl -s -d '{"method": "setPlayContent","id": 1,"params": [{"uri": "extInput:line"}],"version": "1.2"}' http://192.168.1.18:54480/sony/avContent | json_pp
{
   "error" : [
      3,
      "illegal argument"
   ],
   "id" : 1
}

Ah. Nope. Remember that missing parameter for audio in that getCurrentExternalTerminalsStatus shows but getSourceList doesn’t? Yeah, you need that:

curl -s -d '{"method": "setPlayContent","id": 1,"params": [{"uri": "extInput:line?port=1"}],"version": "1.2"}' http://192.168.1.18:54480/sony/avContent | json_pp
{
   "id" : 1,
   "result" : []
}

curl -s -d '{"method": "getPlayingContentInfo","id": 1,"params": [{"output": ""}],"version": "1.2"}' http://192.168.1.18:54480/sony/avContent | json_pp
{
   "id" : 1,
   "result" : [
      [
         {
            "stateInfo" : {
               "supplement" : "",
               "state" : "STOPPED"
            },
            "source" : "extInput:line",
            "parentUri" : "",
            "contentKind" : "input",
            "uri" : "extInput:line?port=1"
         }
      ]
   ]
}

So now it turns out that as well as iterating over the schemes, we also need to correlate the meta parameter with the output of getCurrentExternalTerminalsStatus to get the right value to pass in to setPlayContent.
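That correlation can be sketched in a few lines of Python. The helper name selectable_uris is mine, and the fallback rule (use the plain source name when no terminal shares its meta value) is an assumption based only on what my SRS-X88 returned, not something the documentation promises.

```python
def selectable_uris(sources: list[dict], terminals: list[dict]) -> dict[str, str]:
    """Map each getSourceList source to the URI to hand to setPlayContent.

    sources   -- entries from getSourceList (each has "source" and "meta")
    terminals -- entries from getCurrentExternalTerminalsStatus
                 (each has "uri" and "meta")
    """
    # Index the terminal URIs by their meta value, e.g. "meta:line".
    uri_by_meta = {terminal["meta"]: terminal["uri"] for terminal in terminals}
    # Prefer the terminal URI (it carries extras such as ?port=1);
    # otherwise assume the source name itself is accepted.
    return {
        source["source"]: uri_by_meta.get(source["meta"], source["source"])
        for source in sources
    }
```

With the replies shown earlier, extInput:line maps to extInput:line?port=1, while storage:usb1 has no matching terminal and maps to itself.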

Those hoops are really piling up, aren’t they? OK, let’s try setting the network input. From the source list, you’d guess that you need to set the source to dlna:music:

curl -s -d '{"method": "setPlayContent","id": 1,"params": [{"output": "dlna:music"}],"version": "1.2"}' http://192.168.1.18:54480/sony/avContent | json_pp
{
   "id" : 1,
   "result" : []
}

It appears to work, but it doesn’t. The getPlayingContentInfo function still shows the last selected input and the source indicator on top of the speaker itself doesn’t change.
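Since an empty result doesn’t prove that the input changed, any client has to read the state back after setting it. Below is a rough Python sketch of that check. SonyApiError and set_source_checked are names I made up, the call argument is any function that sends (method, params, version) and returns the decoded reply, and the sketch assumes the speaker has already applied the change by the time the second call returns.

```python
class SonyApiError(Exception):
    """Raised when the reply carries an "error" member instead of "result"."""


def set_source_checked(call, uri: str) -> bool:
    """Ask for an input, then confirm it via getPlayingContentInfo.

    Returns True only if the reported source matches the requested URI.
    """
    reply = call("setPlayContent", [{"uri": uri}], "1.2")
    if "error" in reply:
        code, message = reply["error"][:2]
        raise SonyApiError(f"setPlayContent failed: {code} {message}")

    info = call("getPlayingContentInfo", [{"output": ""}], "1.2")["result"][0][0]
    # The reported "source" omits query parameters such as ?port=1,
    # so compare against the URI with its query string removed.
    return info["source"] == uri.split("?", 1)[0]
```

A False return covers the silent failure described above, and the exception covers the “illegal argument” case.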

So what happens if you select “Network” via the remote and then call getPlayingContentInfo?

curl -s -d '{"method": "getPlayingContentInfo","id": 1,"params": [{"output": ""}],"version": "1.2"}' http://192.168.1.18:54480/sony/avContent | json_pp
{
   "id" : 1,
   "result" : [
      [
         {
            "stateInfo" : {
               "supplement" : "",
               "state" : "STOPPED"
            },
            "source" : "netService:audio",
            "fileNo" : "255",
            "parentUri" : "",
            "contentKind" : "service",
            "service" : "netService:audio?service=spotify",
            "uri" : "netService:audio?service=spotify&contentId=-1"
         }
      ]
   ]
}

Ah. Something completely different. I’ve never used Spotify with this device, so it must be some kind of default.

It turns out that the network sources are, effectively, read-only. If you use the Sony app to play from a local network server using DLNA, the source will be dlna. If you stream to it using a Chromecast-compatible app, it will be something like:

curl -s -d '{"method": "getPlayingContentInfo","id": 1,"params": [{"output": ""}],"version": "1.2"}' http://192.168.1.18:54480/sony/avContent | json_pp
{
   "id" : 1,
   "result" : [
      [
         {
            "stateInfo" : {
               "supplement" : "",
               "state" : "PLAYING"
            },
            "source" : "cast:audio",
            "fileNo" : "255",
            "parentUri" : "",
            "uri" : "cast:audio?applicationId=-1"
         }
      ]
   ]
}

These sources can’t be set using setPlayContent alone. For actual operation that doesn’t matter – if you cast to the speaker it will switch to the network source regardless of the current source. It does mean that you can’t replicate the remote control.

It also means that it’s impossible to have a dropdown source selection that shows which of the sources is active – there’s no way to select the network source or show it if network has been selected on the remote. Creating a Home Assistant platform for it would be a major challenge.

Maybe there are workarounds. The dropdown could show that network is selected if none of the other sources are set, but that still doesn’t help when the user selects network. I guess you could send a dummy cast when network is selected – perhaps a short text-to-speech message. That would do the job but it’s very messy.
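The first half of that workaround is easy enough to sketch in Python. The names below are hypothetical, and the table only lists the four inputs I could actually set through the API, with the titles the speaker reported for them.

```python
NETWORK_LABEL = "Network"

# Inputs that setPlayContent accepted on my speaker, keyed by the "source"
# value that getPlayingContentInfo reports for them.
SETTABLE_TITLES = {
    "storage:usb1": "USB",
    "extInput:usbDac": "USB DAC",
    "extInput:btAudio": "Bluetooth Audio",
    "extInput:line": "Audio in",
}


def dropdown_selection(current_source: str) -> str:
    """Label to show in the dropdown for the reported current source.

    Anything that is not a known settable input (cast:audio,
    netService:audio, dlna:music, ...) is lumped together as "Network".
    """
    return SETTABLE_TITLES.get(current_source, NETWORK_LABEL)
```

This only covers displaying the state. Choosing “Network” from the dropdown would still need something like the dummy cast.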

The mess doesn’t end there. You’ll note that every API call requires a version parameter, and these aren’t the same for every call. You need to do discovery to find them before you can use the calls. There is a discovery function, but it still seems like an unnecessary hoop to jump through.
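One way to sidestep discovery is to pin the versions that are known to work, which is effectively what the curl commands in this post do. A small Python sketch, assuming the versions below stay valid on my firmware (I have no idea whether they hold for other models), with OBSERVED_VERSIONS and build_request being my own names:

```python
import json

# Versions copied from the calls that worked on my SRS-X88.
OBSERVED_VERSIONS = {
    "getCurrentExternalTerminalsStatus": "1.0",
    "getSchemeList": "1.0",
    "getSourceList": "1.1",
    "getPlayingContentInfo": "1.2",
    "setPlayContent": "1.2",
}


def build_request(method: str, params: list, request_id: int = 1) -> str:
    """Return the JSON body for a call, filling in the pinned version."""
    try:
        version = OBSERVED_VERSIONS[method]
    except KeyError:
        raise ValueError(f"no known version for {method!r}") from None
    return json.dumps(
        {"method": method, "id": request_id, "params": params, "version": version}
    )
```

The trade-off is obvious: it saves a round trip, but any method missing from the table fails immediately rather than being discovered.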

I’m sure the API makes sense to the people who designed it. It just makes no sense to me. It falls at the first hurdle – you can’t replicate the functionality of the remote control. Even for the things it can do it seems over-engineered and more complex than it needs to be. I know it supports a wide range of devices, so maybe there is some logic to it. I just think there should be a single call that returns the key information – power, volume, mute status and current source – and a single call that lists the sources in a way that can be used in another call to set them.

So what have I done about it? I gave up and demoted it to a room where I currently don’t have any need for automation. Then I bought a Sonos for the bedroom, but that’s a story for another post…
