On Friday 13th March our iOS app went down. Although funds remained safe and it was business as usual behind the scenes on our servers, Chip savers were unable to view or interact with their Chip accounts on iPhone.
Android users were happily saving away, but for iOS users it was our most significant downtime to date. In the interest of transparency, we'd like to apologise again and communicate what went wrong, and what we're doing to ensure that it doesn't happen again.
In a nutshell, the problem was down to increased security infrastructure being over-protective. Our security measures went into lockdown mode and blocked our iOS app and servers from talking to each other.
Chip's client applications communicate with our servers to send and retrieve information to work with and display (such as your Chip balance, or your savings goals). When apps exchange information like this, they typically use a set of cryptographic protocols known as Transport Layer Security (TLS) to provide secure communications. The main goal of TLS is to allow us to transmit data over a network without exposing that data to untrusted third parties (naughty people).
When our app requests some data, such as the Chip balance, the server first responds with a set of digital certificates. A certificate is essentially a file that holds information about the server that owns it - think of it as a passport for servers. The app then verifies that those certificates are valid and authentic, and that they don't come from someone trying to gain access to sensitive information.
On iOS we usually delegate setting up and maintaining TLS sessions to the operating system (the app relies entirely on the certificates that the iOS Trust Store provides). This method has a security weakness, however: if a hacker can get a device to trust a certificate they control (for example, a self-signed certificate installed on the device), they can perform what's known as a 'man-in-the-middle attack' and sniff data moving to and from an app - not ideal in Chip's world of sensitive data.
To overcome this security weakness, we implement a process known as SSL/TLS Certificate Pinning, or SSL/TLS Pinning for short. We configure our app to reject all but one predefined certificate. Whenever the app connects to a server, it compares the server certificate with the pinned certificate. If and only if they match, the app trusts the server and establishes the connection.
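To make the comparison step concrete, here is a minimal sketch of the idea in Python. It is an illustration only, not Chip's actual iOS code: the function names, the placeholder certificate bytes and the choice to compare SHA-256 fingerprints rather than whole certificates are all assumptions made for the example.

```python
import hashlib
import hmac


def fingerprint(cert_der: bytes) -> str:
    """Return the SHA-256 fingerprint (hex) of a certificate's raw bytes."""
    return hashlib.sha256(cert_der).hexdigest()


def is_trusted(server_cert_der: bytes, pinned_fingerprint: str) -> bool:
    """Trust the server only if its certificate matches the pinned one."""
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(fingerprint(server_cert_der), pinned_fingerprint)


# Placeholder bytes standing in for the certificate bundled with the app
# (hypothetical; a real app would ship the server's actual certificate).
bundled_cert = b"example-certificate-bytes"
PINNED = fingerprint(bundled_cert)

matching = is_trusted(bundled_cert, PINNED)                 # same certificate
mismatched = is_trusted(b"some-other-certificate", PINNED)  # different certificate
print(matching, mismatched)
```

In this sketch a connection would only go ahead when `is_trusted` returns `True`; anything else is rejected, however valid the certificate might otherwise look to the operating system.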
And this is where our implementation tripped over...
Our servers had previously been configured to rotate their certificate on a bi-annual basis, and they did so on Friday the 13th. The rotation was triggered by a legacy configuration rule which had not been set up correctly, resulting in the renewal of our production certificate for iOS.
This meant that the app's locally stored certificate no longer matched the one sent by the server, so ALL communications were deemed untrusted and rejected on iOS.
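The failure mode can be reproduced with a small Python sketch. Everything here is hypothetical: the certificate bytes are placeholders and the names are invented for the example. The last part shows one common mitigation, pinning a backup certificate ahead of a rotation; it is included purely as an illustration and is not a description of the guard-rails Chip put in place.

```python
import hashlib


def fingerprint(cert_der: bytes) -> str:
    """Return the SHA-256 fingerprint (hex) of a certificate's raw bytes."""
    return hashlib.sha256(cert_der).hexdigest()


def connection_allowed(server_cert_der: bytes, pins: set) -> bool:
    """Allow the connection only if the server certificate is in the pin set."""
    return fingerprint(server_cert_der) in pins


# Placeholder certificates (hypothetical bytes, not real certificates).
old_cert = b"certificate-shipped-with-the-app"
renewed_cert = b"certificate-after-rotation"

# The app pins exactly one certificate, as described above.
single_pin = {fingerprint(old_cert)}

before_rotation = connection_allowed(old_cert, single_pin)     # accepted
after_rotation = connection_allowed(renewed_cert, single_pin)  # rejected

# Assumed mitigation: the app also pins the next certificate in advance.
pins_with_backup = single_pin | {fingerprint(renewed_cert)}
after_rotation_with_backup = connection_allowed(renewed_cert, pins_with_backup)

print(before_rotation, after_rotation, after_rotation_with_backup)
```

With a single pin, the renewed certificate fails the check even though it is perfectly genuine, which is the situation the iOS app found itself in.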
This was an unforeseen error, and the Chip team have already put guard-rails in place to ensure that this (or something similar) doesn’t happen again.
Finally, we'd like to extend a massive thank you to the whole Chip community for their patience and support as we resolved this issue, especially given the timing of the outage and the current climate. As ever, you're the best.