Photo by L.A. Cicero: Using a machine learning technique to analyze product descriptions and sales data, Stanford linguist Dan Jurafsky and colleagues found that specific words in descriptions can predict sales.

Stanford News - September 29th, 2017 - by Alex Shashkevich

Research from Stanford scholars revealed that specific words in product descriptions can predict sales.

They found that polite language that invokes culture or authority helps products sell. The work was carried out on online products in Japan, but the authors’ method could reveal top-selling words in English, Chinese and other languages.

Computer science graduate student Reid Pryzant and Stanford linguist Dan Jurafsky applied a machine learning technique to analyze more than 90,000 food and health-related product descriptions and their sales data on the Japanese e-commerce marketplace Rakuten.

The more of these sales-linked keywords a description contained, the better the product sold, according to the results, which recently appeared in a paper presented at the SIGIR Workshop on eCommerce in Tokyo, Japan.

“Product descriptions are fundamentally a kind of social discourse, one whose linguistic contents have real control over consumer purchasing behavior,” the researchers wrote. “Business owners employ narratives to portray their products, and consumers react accordingly.”

Challenges of language analysis

Online vendors have long struggled to figure out why the exact same product offered on different websites has varying sales figures.

Previous research focused on online consumers’ reactions to product reviews and word-of-mouth recommendations. But product descriptions haven’t received as much attention because studying the effects of language on consumer habits is a difficult task, according to the researchers.

The problem is that many words are associated with high sales simply because they signal the product’s brand or pricing strategy, the researchers said. For example, a product whose description includes brand names like “Nike” or phrases like “free shipping” will sell better than a product whose description doesn’t. But these are words that advertisers can’t change.

“We’re more interested in framing,” Jurafsky said. “How do advertisers frame the text to appeal to people independent of those other obvious sales factors?”

To address that challenge, Pryzant suggested applying adversarial machine learning, a new statistical method in which predictive models are pitted against each other. In this case, the results identified words that were associated with high sales but whose effect was not explained by price or brand.
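The idea of separating a word's effect from the brand's effect can be illustrated with a toy example. The sketch below is not the researchers' adversarial model. It swaps in a much simpler stand-in: subtract each brand's average sales from every product, then compare the leftover sales of descriptions that contain a word with those that don't. The data, the brand names and the function name are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (brand, words in the description, units sold).
PRODUCTS = [
    ("brand_a", {"traditional", "tea"}, 120),
    ("brand_a", {"tea"}, 100),
    ("brand_b", {"traditional", "snack"}, 30),
    ("brand_b", {"snack"}, 10),
]


def brand_adjusted_word_scores(products):
    """Score words by sales left over after removing each brand's average.

    This is a simple residual comparison, assumed here for illustration only;
    it is not adversarial learning.
    """
    # Average sales per brand.
    sales_by_brand = defaultdict(list)
    for brand, _, sales in products:
        sales_by_brand[brand].append(sales)
    brand_mean = {b: mean(v) for b, v in sales_by_brand.items()}

    # Residual = how much a product sold above or below its brand's average.
    residuals = [(words, sales - brand_mean[brand])
                 for brand, words, sales in products]

    vocabulary = set().union(*(words for words, _ in residuals))
    scores = {}
    for word in vocabulary:
        with_word = [r for words, r in residuals if word in words]
        without_word = [r for words, r in residuals if word not in words]
        # A word needs examples on both sides to be comparable.
        if with_word and without_word:
            scores[word] = mean(with_word) - mean(without_word)
    return scores


scores = brand_adjusted_word_scores(PRODUCTS)
```

In this toy data, “tea” appears only in descriptions from the better-selling brand, so a raw comparison would flag it as a strong sales word. After the brand average is removed its score is 0, while “traditional” keeps a positive score of 20 because it lifts sales within both brands.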

“The idea came quickly, but fitting the technique to our needs was hard and took time,” Pryzant said. “But the model was good at predicting sales on the first try, which was a gratifying result.”

That the model worked surprised the researchers. The technique has been widely used in software for image analysis but rarely on language.

“Adversarial learning is a really hot topic right now,” Jurafsky said. “But it’s been challenging to get it to work for language. So this is really exciting technically and suggests other potential applications.”

Language of politeness and tradition

The researchers found that product descriptions associated with higher sales were more polite, using Japanese words and suffixes that indicate respect for the customer. These descriptions were also more informative, with lists of features or properties.

The descriptions of products that sold better also invoked framings of tradition or authority, with words like “long-standing shop” or “staff,” or mentioned the cultural function of the item, using words such as “Christmas,” “year-end gift” and “souvenir.”

Those results echoed some of Jurafsky’s previous findings on the language of restaurant menus and food advertising, described in his 2014 book The Language of Food: A Linguist Reads the Menu.

“Using words that appeal to tradition – we also saw that on American menus and even on the back of potato chip bags,” Jurafsky said. “Talking about authenticity and tradition is just a really useful framing device.”

Because this research was done only on Japanese descriptions, Jurafsky and Pryzant are eager to expand the study to English and other languages.

“It’ll be really interesting to see how the language differs when we start looking at English and Chinese,” Jurafsky said. “After all, different cultures appeal to tradition and display their attitudes toward customers in different ways.”

Manipulation and language

While this kind of research on language and sales might help businesses sell more products, Jurafsky said it’s important to think about whether these results could also make it easier for some businesses to manipulate their customers.

“There is definitely an ethical question here,” Jurafsky said. “Framing is a tool for persuading people – we see it every day in politics. Linguists worry about this a lot.

“From my perspective as a linguist, I think the more we know about how people are using language to influence us, the better. If we as consumers know that people are using certain kinds of framings, that has to help us spot when we’re being manipulated.”

Young-joo Chung, a lead scientist at Rakuten and a former visiting scholar at Stanford, also co-authored the research.

Originally published at Stanford News