Retailers are having a hard time at the moment, especially those with a ‘bricks and mortar’ presence on the high street.
Although this is primarily due to the Covid-19 pandemic, the convenience of competing e-commerce outfits was already eating significantly into the market share of the traditional retailer.
How bad is it? Well – iconic brand retail outfits we’ve known for years are going into administration.
Their stock is being sold off cheaply, a ‘fire-sale’ in all but name, with the brand name itself a significant reason for the takeover.
One technology that could help traditional retail outlets is Computer Vision (CV). To see it showcased, I suggest you check out a memorable scene from Steven Spielberg’s 2002 film ‘Minority Report’ (yes, that’s nearly 20 years ago).
The film at one point shows how retail shopping malls look in 2054, with Tom Cruise entering the mall while trying to hide from the authorities. In-store cameras identify Cruise through a retinal scan and then, having evidently trawled through his past spending profile, show him real-time adverts using holographic media. You get the picture. CV is the technology that could make scenarios like that a reality.
What is computer vision?
Computer Vision (CV) is a collection of technologies and techniques that, in effect, help computers ‘see’. By ‘see’, we mean that they can detect and classify the content of digital images, which could be photographs, videos or even real-time feeds from retail stores. Endowing systems with CV means they can recognize and track objects or people in real time.
Use cases for CV are already all around us. Automatic Number Plate Recognition (ANPR) is a basic form of CV. Driverless cars are a more up-to-date example; these marry CV to location-based services and have been under development since the 1980s. Another, more successful, example is facial recognition, which has been used by law enforcement authorities, amongst others, for years.
What is the current state-of-play in computer vision implementation?
There are two areas where CV is showing the most signs of growth: neural networks and edge computing. In the first, large, complex artificial neural nets are being trained to process images for previously intractable problems, such as segmentation, detection of objects in 3D space and temporal activity recognition (TAR). These neural nets are responsible for a large improvement in the accuracy of facial recognition; the US National Institute of Standards and Technology (NIST) has said they have led to an industrial revolution in facial recognition.
The field most relevant to the retail sector could be TAR. It takes little imagination to see the technology being used to identify an item being picked up in a store, and by whom. If the item is being stolen, the computer vision system should be able to detect that and take appropriate action. But if it’s put into a shopping trolley, and facial recognition has identified the person as a frequent shopper at that store, advanced analytics could kick in and a whole host of scenarios become possible. For example, if it’s the first time the shopper has bought this particular item, they could be charged a reduced price at the checkout to encourage further purchases. You get the idea.
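The first-purchase scenario above could be sketched roughly as follows. This is only an illustration under stated assumptions: the Shopper record, the checkout_price function, the 10% discount and the five-visit threshold for a "regular" are all hypothetical, and the CV system is assumed to have already identified the shopper and the item.

```python
from dataclasses import dataclass, field


@dataclass
class Shopper:
    """Hypothetical profile for a shopper already identified by the CV system."""
    shopper_id: str
    visits: int
    purchased_skus: set = field(default_factory=set)


def checkout_price(shopper, sku, list_price, discount=0.10, min_visits=5):
    """Price one item, discounting a regular shopper's first purchase of it.

    The discount rate and visit threshold are illustrative assumptions.
    """
    is_regular = shopper.visits >= min_visits
    is_first_purchase = sku not in shopper.purchased_skus
    if is_regular and is_first_purchase:
        price = list_price * (1 - discount)
    else:
        price = list_price
    # Record the purchase so the same item is full price next time.
    shopper.purchased_skus.add(sku)
    return round(price, 2)
```

A regular shopper who has never bought coffee would pay the reduced price the first time and the list price on every later visit.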
In the Edge Computing field, attention is concentrated on optimizing and deploying computer vision algorithms on edge devices, such as drones, mobile phones or platforms with sensors that can be triggered when, for example, movement is detected or some other physical parameter changes, like an increase (or reduction) in temperature.
Implementation of CV in retail is no longer sci-fi, as shown by Amazon opening its first checkout-free store outside of the US. The new store in Ealing, West London, will be till-less, with customers able to simply walk out without stopping to scan at a checkout point. CV plays the main part in the story, identifying both customers and the products they take off shelves, and equally spotting when products are placed back.
What are the benefits for retailers?
Computer vision can deliver a much better targeted and personalized experience for the shopper. We’ve already mentioned a few use case scenarios, but there are others, for example virtual products. As we increasingly move online, high street retailers can use computer vision across their product range to allow customers to interact more closely with a product. Imagine a customer selecting a pair of shoes and having the image of those shoes pushed to their mobile device. When they hover the camera above their feet, the device could display how the shoes look on them, even suggesting the correct shoe size. This could be an extremely valuable feature, especially now, when we can’t try clothing items on in stores.
CV could also be used to physically identify products. Store assistants are often asked 'do you have x?' or 'have you got y in stock?'. Typically the response involves the assistant traipsing around the store, or an 'according to our systems we have z in stock, would you like to try that instead?'. What the assistant sees next is the back of the customer exiting the store. Using computer vision, retailers could quickly locate products in stores and warehouses, allowing store assistants to provide assured responses that satisfy and delight the customer. On the back of this use case, CV could also help maintain stock levels: by monitoring shelves and applying live in-store analytics on the edge, stock can be observed, and stock levels calculated and replenished, to ensure key products are continually in store.
What kind of retailer would benefit from CV the most?
All retailers would benefit from CV, from traditional ‘bricks and mortar’ to online retail, but the clincher is the level of adoption (or willingness to adopt) of digital techniques to augment their offerings. The retailers prepared to embed data-driven decision making in their processes, and who are agile in CV application, will get not only the largest, but also the fastest returns.
How can CV be combined with other technology, like data analytics and intelligent automation?
Computer vision is one component of a Data and Analytics ecosystem and is a source of data for analytics platforms. A good example would be product identification for inventory management in warehouses or stores. In addition to being a source of independent analytical insight, it enables entire stock processes to be re-imagined. CV on Internet-of-Things (IoT) Edge devices can monitor stock levels and feed the data into customer demand analytics models, triggering orders to be placed automatically with suppliers, as well as further refining the accuracy of customer prediction models.
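To make the stock process above a little more concrete, here is a minimal sketch of how a shelf count from an edge device might trigger an order and refine a demand forecast. Everything in it is an assumption for illustration: the ShelfReading record, the function names, the reorder-point rule, the order-up-to target and the smoothing factor are hypothetical stand-ins, not a description of any particular product.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class ShelfReading:
    """One stock count reported by a (hypothetical) CV edge device."""
    sku: str
    units_on_shelf: int


def reorder_quantity(reading, daily_demand, lead_time_days, safety_stock):
    """Return how many units to order, or 0 if stock is sufficient.

    Reorder-point rule: order once the observed count is at or below the
    expected demand over the supplier lead time plus a safety buffer.
    """
    reorder_point = ceil(daily_demand * lead_time_days) + safety_stock
    if reading.units_on_shelf > reorder_point:
        return 0
    # Top up to twice the reorder point (an arbitrary illustrative target).
    return 2 * reorder_point - reading.units_on_shelf


def update_demand_forecast(previous_forecast, units_sold_today, alpha=0.3):
    """Refine the daily demand forecast with simple exponential smoothing."""
    return alpha * units_sold_today + (1 - alpha) * previous_forecast
```

For instance, with a forecast of 4 units a day, a 3-day lead time and a safety stock of 5, a shelf count of 8 would trigger an order, while a count of 30 would not. In practice the demand model would be far richer than a single smoothed average, but the loop of observe, forecast and order is the same.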
The most common challenge retailers face is their own mindset. They need to accept that going beyond human capabilities is necessary to create trailblazing capabilities. Retailers are looking to computer vision teams to create cutting-edge operations and proofs-of-concept that lead to new processes or products, which are then guaranteed ongoing support to scale the solution to market.
Beyond the mindset challenge, other industry debates are going on. In the post-pandemic world of contactless payments, there are clearly merits to examples such as the Amazon Fresh store. However, the implementation of CV comes with the shadow of facial recognition. Whilst good old CCTV has been around for a while, CV feels more personalized ('big brother is watching you') and might make customers uncomfortable. That said, it will be down to retailers to communicate how they use the images and data in play, as customers see a big difference between security and being targeted with marketing offers off the back of CV recognition. As we are also seeing a number of facial recognition projects paused in the UK due to race-related concerns, we can expect CV development to focus further on object recognition, at least for a while.
What does 2021 hold for computer vision?
2021 is set to be another year of growth for computer vision, with capabilities like action and behavior recognition, and image and video synthesis, becoming mainstream. At the same time, we will see growth in combined modalities, with computer vision being paired with language capabilities and biometrics.
Jai Gandhi is VP Consulting at Ciklum, an international software development and IT outsourcing company