Speech recognition software is certainly nothing new. Siri has been around since 2011, while Google Now, Cortana and Amazon Echo show that all the major players are investing massively in conversational interfaces (CI). Steady advances in speech recognition, especially when combined with artificial intelligence (AI) and machine learning, have significantly improved the user experience over the last few years. In China, 15% of all searches on the country’s biggest search engine, Baidu, are now spoken rather than typed, which likely has a lot to do with the fact that entering Chinese text on a keyboard is quite tricky. In the West, Amazon’s major advertising campaigns for its own voice-controlled device, Echo, have played their part in bringing voice control back into public consciousness.
This development is laying the foundations for a growing trend: graphical user interfaces (GUIs) are increasingly being replaced by conversational user interfaces (CUIs). Some experts are even talking about the next big digital disruption. Whatever happens, there will be yet another paradigm shift. In the mid-nineties, the age of PCs and software applications gave way to the internet era, in which websites increasingly moved applications onto the internet and later into the cloud. The mobile era that began a good ten years ago then saw applications move over to mobile platforms in the form of apps.
In 2020, when the mobile era reaches its peak with a global smartphone penetration of 80%, we will see fundamental new changes. Even today, users can hardly keep up with the huge number of apps installed on their phones, and 75% of all installed apps are no longer used after three months. Instead of opening the apps themselves, users are kept informed by notifications – and are therefore drowning in a sea of messages.
So it’s not surprising that some major players in the app market have decided to move forward with a “super-app” that bundles several individual app services together. The ones with the highest chances of success are messaging providers. China’s WeChat shows how the future might look: if you want to book a hotel room, for example, you can find the hotel, check its availability, make a booking and pay for it, all within a single app. Western providers such as Facebook, to name just one example, are working towards this with services like Facebook M, where searching via speech recognition and AI again plays a key role.
When, in the future, a user says to her virtual assistant, “Book a table at Restaurant Brenner with Thomas and Paul,” the assistant will carry out all the following tasks in the background:
- Reserving a table with the restaurant
- Entering it in the calendar
- Letting friends know
- Providing directions to the restaurant, if required
- Making an automatic payment, if required
Instead of juggling different apps like OpenTable, Calendar, WhatsApp and Google Maps, the user needs just one.
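To make the restaurant example a little more concrete, here is a minimal sketch in Python of how a single spoken request might be turned into the list of background tasks described above. Everything in it is an assumption made for illustration: the names `BookingIntent`, `parse_booking` and `plan_actions` are hypothetical, the regular expression stands in for the far more capable language understanding a real assistant would use, and the "actions" are plain text descriptions rather than calls to any real booking, calendar or messaging service.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BookingIntent:
    """What the assistant understood from the request (hypothetical structure)."""
    restaurant: str
    guests: List[str] = field(default_factory=list)


def parse_booking(utterance: str) -> Optional[BookingIntent]:
    """Naive pattern match standing in for real speech and language understanding."""
    text = utterance.strip().rstrip(".")
    match = re.match(
        r"book a table at (?P<place>.+?)(?: with (?P<guests>.+))?$",
        text,
        re.IGNORECASE,
    )
    if match is None:
        return None  # not a booking request
    guests_raw = match.group("guests") or ""
    # Split "Thomas and Paul" or "Thomas, Paul" into individual names.
    guests = [g.strip() for g in re.split(r",|\band\b", guests_raw) if g.strip()]
    return BookingIntent(restaurant=match.group("place").strip(), guests=guests)


def plan_actions(
    intent: BookingIntent, directions: bool = False, payment: bool = False
) -> List[str]:
    """List the background tasks for one request; the last two are optional."""
    party_size = len(intent.guests) + 1  # the guests plus the user
    actions = [
        f"Reserve a table for {party_size} at {intent.restaurant}",
        f"Add the booking at {intent.restaurant} to the calendar",
        f"Notify {', '.join(intent.guests) or 'nobody'}",
    ]
    if directions:
        actions.append(f"Provide directions to {intent.restaurant}")
    if payment:
        actions.append(f"Arrange payment at {intent.restaurant}")
    return actions


if __name__ == "__main__":
    request = "Book a table at Restaurant Brenner with Thomas and Paul"
    intent = parse_booking(request)
    if intent is not None:
        for step in plan_actions(intent, directions=True):
            print(step)
```

The point of the sketch is the shape of the interaction rather than the details: one sentence goes in, and the assistant, not the user, decides which services need to be involved.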
The reason for the trend towards CUI with speech input is simple: it’s human nature. CUI allows its users to ask questions, receive answers and solve even complex tasks in the digital and real world through real dialogue. Natural speech has been our primary means of communication for hundreds of thousands of years. It helps us to share knowledge and emotions and to organise ourselves. So why shouldn’t it do the same in the digital world?
Many tasks – such as checking your bank account, planning a meeting, making a reservation or finding out travel information – no longer require a conventional on-screen user interface (UI). The UI may even disappear completely in some contexts. This is known as Zero UI. In this scenario, the user interface is no longer limited to a two-dimensional display; instead, actions are carried out haptically and automatically. Of course, this doesn’t mean that graphical interfaces will no longer be used. Tasks such as selecting from drop-down menus, reading documents or looking at a map are much more efficient with GUIs.
Nevertheless, users, businesses and agencies are going to have to start thinking differently. Businesses need to start looking at the possibilities and limitations of CUI and Zero UI now, and the cost of engaging with this new technology is not as high as might be expected: many of the building blocks are already available in libraries that help to build simple CUIs in the form of chatbots. Agencies, too, will have to start thinking outside the box and developing new forms of expertise, with employees who understand user interactions beyond graphical interfaces and who can develop and implement speech interfaces to meet this need. The ones who will benefit are the users, who will be offered an alternative and (hopefully) simpler way of accessing functions and interactions.
Illustration: Christine Roesch