By Angus Williams, ASI Fellow, May 2017

Customer services can be frustrating. Whether we are navigating menus over the phone, unsure which option fits our query, or reading long lists of FAQs online, we can end up bored and irritated. Fortunately, forward-thinking businesses that understand the power of technology are starting to provide better alternatives.

Octopus Energy are one such company. They have a single e-mail address, to which customers can send any query they like, with the guarantee that they will receive a response from a real person. Simplifying the experience for customers has huge value, and you only need to look at Octopus Energy’s ratings on TrustPilot to see that it works. After welcoming their first customers around a year ago, they now provide electricity and gas to roughly 100,000 homes across the UK. As they continue to grow, it is important that their customer services model can scale efficiently.

Because Octopus do not ask customers to specify the nature of their problem, members of the customer services team read new messages, and decide who should deal with them based on the content. This classification step, which is currently a bottleneck in the process, is ripe for replacement by an AI system. My ASI fellowship project was to build such a system.

I was given access to all of Octopus Energy’s customer messages in order to solve this problem. Using SherlockML, I processed and cleaned the data into a format that was ready for some Machine Learning analysis. I then built a pipeline to classify messages, and trained it on the dataset that I constructed.

The first stage of the pipeline converts the text from an e-mail into a long list of numbers that can be understood by Machine Learning algorithms. The main component of this step is a ‘tf-idf’ (term frequency, inverse document frequency) transformation. Tf-idf assigns a number to each word in an e-mail that grows with how often the word appears in that e-mail, and shrinks with the number of other e-mails it appears in. Intuitively, this process produces a large score when a word is very common in one e-mail but rare across the rest. This allows the computer to identify which words might be important in distinguishing different types of message.
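To make the idea concrete, here is a minimal sketch of a tf-idf transformation. It assumes scikit-learn's TfidfVectorizer, which the post does not name, and the three e-mails are made up for illustration. scikit-learn uses a smoothed, logarithmic version of the inverse document frequency, so the scores follow the intuition above rather than a strict proportionality.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up example messages (assumption: not real customer data).
emails = [
    "my meter reading is 12345",
    "please change my direct debit",
    "meter reading for this month attached",
]

# Learn the vocabulary and turn each e-mail into a row of tf-idf scores.
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(emails)  # sparse matrix, one row per e-mail

# Look up the scores of two words in the first e-mail.
vocab = vectorizer.vocabulary_
first_email = tfidf[0].toarray().ravel()
score_rare = first_email[vocab["12345"]]    # appears in one e-mail only
score_common = first_email[vocab["meter"]]  # appears in two e-mails

print(f"'12345': {score_rare:.3f}, 'meter': {score_common:.3f}")
```

Both words appear once in the first e-mail, but "12345" receives the higher score because it appears in no other e-mail.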

The next step in the pipeline is to classify the message. For this, I trained a Support Vector Machine (SVM) classifier on past messages from customers. Once the e-mails have been converted into numbers, they can be imagined as points in a high-dimensional space. SVMs attempt to find surfaces in this space that best separate the different types of messages. For example, messages from customers submitting their meter readings will occupy a different region of this abstract space from messages sent by customers who want to change their direct debit details.
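The two stages described so far can be sketched as a single pipeline. This is an illustration only: it assumes scikit-learn, a linear kernel, and two made-up categories ("meter_reading" and "direct_debit") with invented training messages. The real system's categories, kernel, and settings are not given in this post.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Made-up training messages and labels (assumption: illustrative only).
train_texts = [
    "my meter reading is 12345",
    "here is the meter reading for this month",
    "submitting gas meter reading 678",
    "please change my direct debit details",
    "I want to update my direct debit date",
    "can you change the bank account for my direct debit",
]
train_labels = ["meter_reading"] * 3 + ["direct_debit"] * 3

# Stage 1 turns text into tf-idf vectors; stage 2 separates the classes
# with a linear SVM (the kernel choice here is an assumption).
classifier = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))
classifier.fit(train_texts, train_labels)

# Classify two new, unseen messages.
predictions = classifier.predict([
    "new meter reading attached",
    "change direct debit",
])
print(list(predictions))
```

Wrapping both stages in one pipeline means a raw e-mail can be passed straight to predict, and the same vocabulary learned during training is reused for every new message.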

Using SherlockML’s ability to spin up large servers in the cloud, I trained a large number of different pipelines, adjusting the many tunable parameters, and chose the one that performed best on held-out data. The pipeline worked well, and I worked with Octopus Energy’s tech team to connect it to their existing infrastructure. Furthermore, the classifier can continue to improve over time through a process called ‘online learning’, where it learns from new messages and from past mistakes.
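Trying many pipeline configurations and keeping the best one can be sketched as a grid search. The example below assumes scikit-learn's GridSearchCV and uses cross-validation as a stand-in for the held-out data mentioned above. The parameter grid, the categories, and the messages are all invented for illustration, and the real search covered far more settings on far more data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Made-up messages and labels (assumption: illustrative only).
texts = [
    "my meter reading is 12345",
    "here is the meter reading for this month",
    "submitting gas meter reading 678",
    "electricity meter reading attached",
    "please change my direct debit details",
    "I want to update my direct debit date",
    "can you change the bank account for my direct debit",
    "my direct debit amount looks wrong",
]
labels = ["meter_reading"] * 4 + ["direct_debit"] * 4

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("svm", SVC(kernel="linear")),
])

# Hypothetical grid: single words vs. word pairs, and three values of
# the SVM regularisation strength C.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "svm__C": [0.1, 1.0, 10.0],
}

# Each combination is trained on part of the data and scored on the
# part that was left out; the best-scoring combination is kept.
search = GridSearchCV(pipeline, param_grid, cv=2)
search.fit(texts, labels)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, best_score)
```

Because each combination is trained and scored independently, a search like this parallelises well, which is where large cloud servers help.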

With this new system in place, Octopus Energy can continue to provide their amazing customer services at scale.

Angus Williams took part in the ASI Fellowship May 2017. Prior to the Fellowship he completed a PhD in Astronomy at the University of Cambridge.