Sue Couto, Senior Vice President, APAC Sales, TiVo
For many years, effective voice-based search has eluded businesses trying to bring next-generation input methods to customers. Command-based speech systems have been perceived as ineffective and hard for viewers to use. However, the widespread adoption of smartphones and tablets, with their minimised keyboards, has renewed interest in this kind of technology. Apple’s Siri and Amazon’s Alexa, for example, have progressed beyond basic menu navigation. In fact, any device with a microphone has the potential to accept spoken commands, and can become an intelligent discovery system that uses a sophisticated entertainment brain to understand what customers want.
This technology is important but under-explored by the TV industry, which often appears to have been left behind when it comes to intuitive discovery. For content providers, voice-based search and recommendation should be a core part of customer service, giving customers easy access to their favourite shows and genres.
Speaking the viewer’s language
With so much content available today, consumers have distinct preferences across cast, plot and genre. Conversational interfaces simulate natural communication and remove the need to conform to hierarchical menu structures. Most importantly, the technology must understand when a user is drilling into a particular genre in detail, and when they have lost interest and switched topics completely.
To be successful, natural language search needs to address three areas, each of them crucial:
- Disambiguation: Natural language technology must understand and interpret the user’s intent. For example, the phonetic sound “Kroos” could refer to either Tom Cruise or Penelope Cruz, and the system should work out which one the user means from the context of the original query.
- Statefulness: During a dialogue with a user, the system should be able to maintain context, and understand that people change their minds quickly. For example, the user could say that they are “in the mood for thrillers,” then jump to “Bond” and then to “old ones”. Ideally, the system should understand these requests, and serve up a series of older James Bond films for the viewer to select from.
- Personalisation: Conversational systems need to understand their users on an individual basis. For example, the system should learn that a user based in New Zealand who asks “when is the game tonight” wants to know about their local team, and that “when is the Blacks game” refers to the All Blacks rugby team.
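The statefulness example above can be sketched in a few lines of Python. This is a toy illustration rather than a description of any production system: the catalogue, the phrase table, the `DialogueState` class and the 1980 cutoff for “old ones” are all hypothetical, and a real system would derive filters from speech and language understanding rather than a lookup table.

```python
from dataclasses import dataclass, field

# Hypothetical toy catalogue; the genre and franchise tags are illustrative.
CATALOGUE = [
    {"title": "Goldfinger", "year": 1964, "genre": "thriller", "franchise": "bond"},
    {"title": "Dr. No", "year": 1962, "genre": "thriller", "franchise": "bond"},
    {"title": "Skyfall", "year": 2012, "genre": "thriller", "franchise": "bond"},
    {"title": "Se7en", "year": 1995, "genre": "thriller", "franchise": None},
    {"title": "Paddington", "year": 2014, "genre": "family", "franchise": None},
]

# Hypothetical mapping from a recognised phrase to a (slot, value) filter.
PHRASES = {
    "thrillers": ("genre", "thriller"),
    "bond": ("franchise", "bond"),
    "old ones": ("max_year", 1980),  # assumed cutoff for "old"
    "family films": ("genre", "family"),
}


@dataclass
class DialogueState:
    """Keeps the filters gathered so far in one conversation."""
    filters: dict = field(default_factory=dict)

    def hear(self, utterance: str) -> list[str]:
        slot, value = PHRASES[utterance.lower()]
        # A different value for a slot that is already filled is treated
        # as a topic switch, so the earlier context is dropped.
        if slot in self.filters and self.filters[slot] != value:
            self.filters = {}
        self.filters[slot] = value
        return self.results()

    def results(self) -> list[str]:
        def matches(item: dict) -> bool:
            for slot, value in self.filters.items():
                if slot == "max_year":
                    if item["year"] > value:
                        return False
                elif item[slot] != value:
                    return False
            return True

        return [item["title"] for item in CATALOGUE if matches(item)]


state = DialogueState()
for phrase in ["thrillers", "Bond", "old ones", "family films"]:
    print(phrase, "->", state.hear(phrase))
```

Each utterance narrows the previous results until “family films” contradicts the genre already in play, at which point the state resets instead of returning an empty list.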
Taking understanding to the next level
Behind successful natural language technology lie excellent search capabilities. Newer technologies such as the knowledge graph have brought high-quality, relevant search results to consumers everywhere, setting a benchmark across industries. Unlike a traditional database, a graph is much more scalable and flexible, because it allows all sorts of information to be connected to a record without relying on “tables.”
In the context of TV, most consumers have viewing patterns that can be mapped to provide highly personalised search results. This is more accurate than profiles that users create themselves or ‘thumbs up/down’ ratings, both of which are error-prone and do not automatically take into account users’ changing tastes and preferences over time. The ability to make personalisation precise and extremely relevant, which the industry now calls hyper-personalisation, depends on the knowledge graph’s semantic capabilities.
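To make the contrast with tables concrete, here is a minimal sketch of a graph held as labelled connections between records. The `Graph` class, the relation names and the `co_stars` helper are assumptions made for illustration, not any vendor’s schema.

```python
from collections import defaultdict


class Graph:
    """A tiny labelled graph: node -> list of (relation, neighbour) pairs."""

    def __init__(self):
        self.edges = defaultdict(list)

    def connect(self, source, relation, target):
        # Store the link in both directions so it can be followed from either end.
        self.edges[source].append((relation, target))
        self.edges[target].append((relation, source))

    def neighbours(self, node, relation=None):
        return [n for r, n in self.edges[node] if relation is None or r == relation]


graph = Graph()
graph.connect("Tom Cruise", "acted_in", "Top Gun")
graph.connect("Tom Cruise", "acted_in", "Vanilla Sky")
graph.connect("Penelope Cruz", "acted_in", "Vanilla Sky")
graph.connect("Penelope Cruz", "acted_in", "Volver")
# A new kind of fact needs no new table or column, only a new relation label.
graph.connect("Top Gun", "has_genre", "action")


def co_stars(graph, person):
    """People who share at least one title with `person`."""
    found = set()
    for title in graph.neighbours(person, "acted_in"):
        found.update(graph.neighbours(title, "acted_in"))
    found.discard(person)
    return found


print(co_stars(graph, "Tom Cruise"))
```

Adding the genre link required no change to the existing records, which is the flexibility the paragraph above describes.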
At its core, a quality conversational search engine should include the following aspects:
- Knowledge graph: This graph maps search results to intent and prioritises those results based on the weight of their connection. It should be able to:
  - Look at named entities in media, entertainment and geography, and extract, de-duplicate and disambiguate those entities across sources
  - Recognise similarities and build relationships between entities
  - Identify a multidimensional view of popularity and how audience interest in the entities shifts over time
  - Generate a large vocabulary, such as keywords and sub-genres, to help search systems identify relevant content
- Personal graph: Crucial to true conversational systems, the personal graph tunes the system to individuals, enabling natural conversations around the user’s preferences and context. The personal graph is:
  - Based on statistical machine learning
  - Able to learn individual behavioural patterns and interests
  - Able to learn how time and device affect recommendations
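One way to picture how the two graphs might work together is to multiply a global connection weight by a per-user affinity that depends on time and device. The sketch below does this for the “Kroos” example. The weights, the `PersonalGraph` class and the `rank` function are hypothetical, and simple smoothed view counts stand in for the statistical machine learning a real personal graph would use.

```python
from collections import Counter

# Hypothetical knowledge-graph weights for entities matching the sound "Kroos".
KNOWLEDGE_WEIGHTS = {"Tom Cruise": 0.6, "Penelope Cruz": 0.4}


class PersonalGraph:
    """Counts what one user watched, by time of day and device."""

    def __init__(self):
        self.views = Counter()  # (entity, daypart, device) -> view count

    def record_view(self, entity, daypart, device):
        self.views[(entity, daypart, device)] += 1

    def affinity(self, entity, daypart, device, n_candidates):
        # Add-one smoothing, so a new user starts with equal affinity for all.
        in_context = sum(
            count
            for (_, seen_daypart, seen_device), count in self.views.items()
            if seen_daypart == daypart and seen_device == device
        )
        hits = self.views[(entity, daypart, device)]
        return (hits + 1) / (in_context + n_candidates)


def rank(candidates, personal, daypart, device):
    """Order candidates by global weight multiplied by personal affinity."""
    n = len(candidates)
    scores = {
        entity: weight * personal.affinity(entity, daypart, device, n)
        for entity, weight in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


viewer = PersonalGraph()
print(rank(KNOWLEDGE_WEIGHTS, viewer, "evening", "tv"))  # global weights decide

for _ in range(3):
    viewer.record_view("Penelope Cruz", "evening", "tv")
print(rank(KNOWLEDGE_WEIGHTS, viewer, "evening", "tv"))  # history now decides
```

With no history the globally stronger match comes first. After three evening viewings on the TV, the same query in the same context puts Penelope Cruz first, while a query from a different time or device would still fall back to the global order.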
At the front end of the system, a conversational query engine is required to bind all of these aspects together. It combines key algorithms to map and learn linguistic features and to provide content discovery features to customers.
Intuitive search and recommendation
Natural language technology backed by knowledge graphs can revolutionise TV search and recommendation. Based on excellent metadata that covers actors and actresses, synopses and even famous quotations from films, TV providers can create a second-to-none entertainment brain that offers customers speedy and accurate access to their favourite shows, along with similar content that they might enjoy. Voice-based discovery built around knowledge graphs is no gimmick. It is set to change the way that people interact with their TV sets, as long as service providers make it personalised, intuitive and natural.