
Data: The Unexploited Scale Effect

By Rob Thomas, VP-Product Development, Big Data and Analytics, IBM



Scale Effects

In the 20th century, scale effects in business were largely driven by breadth and distribution. A company with manufacturing operations around the world had an inherent cost and distribution advantage, leading to more competitive products. A retailer with a global base of stores had a distribution advantage that could not be matched by a smaller company. These scale effects drove competitive advantage for decades.

The Internet changed all of that.

In the modern era, there are three predominant scale effects:

- Network: lock-in that is driven by a loyal network (Facebook, Twitter, Etsy, etc.)

- Economies of Scale: lower unit cost, driven by volume (Apple, TSMC, etc.)

- Data: superior machine learning and insight, driven from a dynamic corpus of data.

Data is an unexploited scale effect in institutions around the world. Spark will change all of that.

Re-inventing Retail in the Data Era

Retail, by definition, is mass market, and it has been in every era. While subtle changes in approach have occurred, very few retailers have captured the intimacy of the original corner store. The corner store’s owner knew the customers personally; he understood what was happening in their lives, and the store became an extension of the community. In the data era, mass marketing can reclaim the corner-store experience.

Stitch Fix is a data-era retailer focused on personalizing the shopping experience for women. While many women love clothes shopping, Stitch Fix realized that it is an inefficient experience today: it requires visiting many stores, selecting items to try on, and repeating the process. In fact, a successful shopping trip requires a nearly perfect set of variables to align:

- Location: A store must be near the shopper.

- Store: The store itself must interest the shopper and draw them in.

- Clothing: The clothing in the store must be of interest to the shopper.

- Circumstance: The clothing must match the circumstance for which the shopper needs clothes (dinner party, wedding, outing, etc.).

- Size: Even if all the preceding elements are present, the store must have the right size clothing in stock.

- Price: Even if all the preceding elements exist, the shopper must be able to afford the clothing.

Stitch Fix is disrupting fashion and retail, targeting professional women shoppers who want all the variables to align. These women have neither the time nor, perhaps, the inclination to search for that alignment. As Katrina Lake, the CEO and cofounder, states, “We’ve created a way to provide scalable curation. We combine data analytics and retail in the same system.”

When a person signs up for the service, she provides a profile of her preferences: style, size, profession, budget, etc. The data from that profile become attributes in Stitch Fix’s systems, which promptly schedule the dates to receive the clothes, assign a stylist based on best fit, and enable the stylist to see the person’s profile (meaning her likes and dislikes). The customer also specifies when and how often she wants to receive a ‘fix’, which is a customized selection of five clothing items. Stitch Fix maintains the data on preferences so that, over time, it becomes a giant analytics platform, where recommendations can be tailored to each unique shopper. Not since the corner store has such intimacy been available, and it’s all because of the data.
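The sign-up flow above can be pictured as a small data model. The following is only a sketch under stated assumptions: the field names, the sample stylists, and the overlap-based scoring rule are all hypothetical, chosen for illustration, and say nothing about how Stitch Fix actually matches customers to stylists.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Profile:
    """Preferences supplied at sign-up (field names are illustrative)."""
    name: str
    styles: set[str]
    size: str
    budget: int          # assumed: maximum spend per item
    fix_every_days: int  # how often the customer wants a fix


@dataclass
class Stylist:
    name: str
    specialties: set[str]
    assigned: list[str] = field(default_factory=list)


def fit_score(profile: Profile, stylist: Stylist) -> float:
    """Jaccard overlap between customer styles and stylist specialties."""
    union = profile.styles | stylist.specialties
    if not union:
        return 0.0
    return len(profile.styles & stylist.specialties) / len(union)


def assign_stylist(profile: Profile, stylists: list[Stylist]) -> Stylist:
    """Pick the highest-scoring stylist; ties go to the lighter workload."""
    best = max(stylists, key=lambda s: (fit_score(profile, s), -len(s.assigned)))
    best.assigned.append(profile.name)
    return best


# Hypothetical sample data.
stylists = [
    Stylist("Ana", {"classic", "business"}),
    Stylist("Bea", {"boho", "casual"}),
]
customer = Profile("Dana", {"business", "classic", "casual"}, "M", 120, 30)
chosen = assign_stylist(customer, stylists)
print(chosen.name)
```

A real system would presumably weigh far more attributes than style overlap; the point is only that profile answers become structured attributes that downstream steps can act on.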

Most of the data Stitch Fix leverages is what Lake calls ‘explicit data’: direct feedback from clients on every fix. The buyers at Stitch Fix, who are responsible for stocking inventory according to new trends and feedback, love this data because it tells them what to buy and where to focus. As Lake says, “What customers buy and why, and what they don’t buy and why not, is very powerful.”

Stitch Fix has analyzed over 500 million individual data points. While the company has shipped over 100,000 fixes, no two have ever been the same. That’s personalization. With the power of data, a retailer can redefine core business processes and in many cases, invent new ways of interacting with customers.

Transforming Industries with Spark

All industries will be transformed in the data era, in the same vein that Stitch Fix is transforming the retail experience. Continuous insight and leveraging all data assets are the keys to unlocking the transformation in a company. Luckily, an open source project that emerged from Berkeley a few years ago will play a fundamental role in the coming data revolution: Apache Spark.

Spark is the analytics operating system of the future, unifying all of the data in an organization. Simply put, it is an application framework for doing highly iterative analysis that scales to large volumes of data. Spark provides a platform to bring application developers, data scientists, and data engineers together in a unified environment that is not resource-intensive and is easy to use. This is what enterprises have been clamoring for.

Today, business professionals have analytics in their hands in the form of visual dashboards that inform them what is happening. Think of this as descriptive analytics. Now, with Apache Spark, these can be complemented with analytics smarts built into applications that learn from their surroundings and specify actions in the moment. Think of it as prescriptive analytics. This means that, with Spark, enterprises can deploy insights into applications at the front lines of their business exponentially faster than ever before.
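The difference between the two kinds of analytics can be shown in a few lines. This is a minimal sketch in plain Python, not Spark code: the feedback log, the function names, and the simple keep-rate rule are assumptions made for illustration, and the rule stands in for what would in practice be a learned model.

```python
from collections import defaultdict

# Hypothetical feedback log: (customer, style, kept) for each shipped item.
feedback = [
    ("dana", "business", True),
    ("dana", "boho", False),
    ("dana", "business", True),
    ("erin", "boho", True),
    ("erin", "casual", False),
]


def keep_rates(events):
    """Descriptive: what share of shipped items in each style was kept?"""
    shipped = defaultdict(int)
    kept = defaultdict(int)
    for _customer, style, was_kept in events:
        shipped[style] += 1
        kept[style] += was_kept  # True counts as 1, False as 0
    return {style: kept[style] / shipped[style] for style in shipped}


def next_style(events, customer, candidates):
    """Prescriptive: which candidate style should this customer get next?

    Uses the customer's own keep rate where history exists, falls back to
    the overall keep rate, and assumes 0.5 for a style never shipped.
    """
    overall = keep_rates(events)
    personal = keep_rates([e for e in events if e[0] == customer])

    def expected_keep(style):
        return personal.get(style, overall.get(style, 0.5))

    return max(candidates, key=expected_keep)


print(keep_rates(feedback))
print(next_style(feedback, "dana", ["boho", "casual", "business"]))
```

The first function is what a dashboard reports; the second turns the same data into an action for one customer at the moment it is needed.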

While there are many dimensions to the Spark ecosystem, I am most excited by machine learning. Applications with machine learning at their core can drive insight in the moment. They get smarter and more customized through interactions with data, devices and people, and as they learn, they open up previously untapped opportunities.

Machine Learning and Scale Effects

Superior machine learning and insight, driven from a dynamic corpus of data, will be the essence of competitive advantage in the data era. Over the next five years, machine learning applications will lead to new breakthroughs that will assist us in making good choices, look out for us, and help us navigate our world in ways never before dreamed possible.

Companies like Stitch Fix are a step ahead in exploiting data to upend the traditional players in retail. But this opportunity exists in every industry. And with powerful technology like Spark, the opportunity to disrupt is available to anyone with the determination and insight.

This post has been adapted in part from the book, Big Data Revolution: What farmers, doctors, and insurance agents teach us about discovering big data patterns, Wiley, 2015.
