Predicting product-market compatibility

Completely unique, groundbreaking products are exceedingly rare. Most of us deal with products and services that are slight spin-offs of existing ones. While I’m sure there is something to be said about creativity here, we’re going to head over to the business side of things.

On this side of the fence, the similarity between products is beneficial. There’s reason to believe that if a particular product has successfully entered the market, there is a place for its spin-off. Previously, we could only judge that through business acumen and intuition.

Entering a market

EY have defined the decision to enter a market as a set of hypotheses, mostly related to possible profitability. The issue, as they succinctly note, is that the decision is mostly based on experience and instinct rather than tested against data.

Creating a data-driven ROI estimation of a product or service that doesn’t yet exist has always been tricky. C-level executives, board members, and investors like accurate predictions of profitability. Yet statistical models can only go so far if the company doesn’t have a previous, related product.

Internal data sources had to be used, but they have always come with the same problem. Any data that is not directly related to some process or product has limited use. While savvy data analysts can extract insights about many different phenomena out of one set, there are always diminishing returns. In other words, predictions become less reliable if only tangential data is used.

As such, predictions for products the company doesn’t own are always a little wishy-washy. They can serve as a basis for convincing someone that trying to enter a market with such-and-such product is a good idea. But whether they truly reflect future performance is doubtful.

Heuristic models such as the “Value Disc” and its various alterations have been used to predict the future success of products as well. I have some doubts about their effectiveness. They might serve as a great way to frame your own perception of something. But I don’t think they hold a lot of actual predictive power.

Getting predictive data

The most valuable data in such circumstances would be data directly related to a competitor’s product that is similar to the one about to be launched. Competitors have already entered the market, collected numerous data points, and made changes. In other words, they have reaped all the benefits of the first-mover advantage.

Competitors won’t give such valuable data away, especially if they have an inkling a business might be trying to enter the market with a similar product. It’s what gives them the advantage, after all.

If we, as we should, reject illegal or illegitimate ways to acquire such data, we seem to be left at a dead end. Predictions can only be garnered from barely relevant information or heuristics that have little predictive power. Yet technological development has afforded us a unique opportunity: web scraping and external data.

Web scraping

Scraping and crawling have been around since the dawn of the internet. However, they used to be prohibitively expensive for individuals and most companies. Only companies that base their entire business model around crawling (e.g., Google) could truly harness their power.

Things have changed since the 90s. Scraping has become much more accessible, even to a single individual. While scaling these processes still takes an enormous amount of resources and technical expertise, solution providers have emerged as a way to overcome the barrier to entry.

Such companies take care of the entire tech stack and development process to provide scraping solutions to others. Users simply have to send requests to some endpoint, which accepts the job and sends back the data.
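As a rough illustration of that request-and-response flow, here is a minimal Python sketch. The endpoint URL, the `render_js` field, and the bearer-token header are all hypothetical placeholders; every provider documents its own URL, parameters, and authentication scheme.

```python
import json
from urllib import request

# Hypothetical endpoint; a real provider documents its own URL and fields.
API_URL = "https://scraper.example.com/v1/jobs"


def build_job(target_url: str, render_js: bool = False) -> bytes:
    """Serialize a scraping job as a JSON request body (field names are assumed)."""
    payload = {"url": target_url, "render_js": render_js}
    return json.dumps(payload).encode("utf-8")


def submit_job(target_url: str, api_key: str, timeout: float = 30.0) -> dict:
    """POST the job to the endpoint and return the decoded JSON response."""
    req = request.Request(
        API_URL,
        data=build_job(target_url),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `submit_job("https://shop.example.com/product/1", api_key)` would send the job and hand back whatever JSON the provider returns. The point is only that the user's side of the exchange can be a few lines of code.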

That isn’t where the value chain ends, though. As most businesses need the data itself, not the ability to scrape it, companies further up the chain have emerged. Data-as-a-service is a business model where the product is large sets of information that can be used for analysis.

Eventually, businesses that provide only insights should emerge. These, however, are still quite rare in our industry. Certain businesses have amassed large swaths of data by combining internal sources with external data and sell the results as insights. While certainly interesting, their value is still somewhat undecided.

Use of external data

Most, if not all, information collected through web scraping falls under the heading of external data. It is usually defined in contrast to traditional sources such as official financial reports and government documents. Since web scraping mostly collects data from various websites, it sits squarely outside those traditional sources.

Getting back to our ability to outline ROI, web scraping affords us the opportunity to collect data that is more relevant to predictions. While it will never be as accurate and as valuable as information from someone already selling to the market, web scraping can get us pretty close.

Since it automatically collects data from any public source, starting from regular websites and ending with ecommerce pages, there’s a lot that can be acquired without significant effort. If any competitor has a similar product to the one being launched, ROI will be much easier to predict.

After all, if it already has been in the market for some time, there have been reviews, price and description changes, and many other interesting data points. All of it can be scraped and collected in a database.
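One way to picture such a database is a table of dated snapshots per competitor product. The sketch below uses Python's built-in `sqlite3` module; the table name, columns, and helper functions are illustrative assumptions, not a prescribed schema.

```python
import sqlite3

# Illustrative schema: one row per product per scrape date.
SCHEMA = """
CREATE TABLE IF NOT EXISTS product_snapshots (
    product_id   TEXT NOT NULL,
    captured_on  TEXT NOT NULL,
    price        REAL,
    description  TEXT,
    review_count INTEGER,
    PRIMARY KEY (product_id, captured_on)
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the snapshot database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_snapshot(conn, product_id, captured_on, price, description, review_count):
    """Store one scraped observation; re-scraping the same day overwrites it."""
    conn.execute(
        "INSERT OR REPLACE INTO product_snapshots VALUES (?, ?, ?, ?, ?)",
        (product_id, captured_on, price, description, review_count),
    )
    conn.commit()


def price_history(conn, product_id):
    """Return (date, price) pairs for a product, oldest first."""
    rows = conn.execute(
        "SELECT captured_on, price FROM product_snapshots "
        "WHERE product_id = ? ORDER BY captured_on",
        (product_id,),
    )
    return rows.fetchall()
```

Storing dates as ISO strings (`YYYY-MM-DD`) keeps them sortable as plain text, which is why the `ORDER BY` works without any date parsing.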

Enriched with some internal sources, data from web scraping can become immensely useful. Insights about the product, such as its pricing strategy, common customer complaints (or praise), and changes in marketing materials, can be extracted. Clearly, such data might be beneficial.

I’m not the only one to think this. Our market research has indicated that about 57 percent of UK ecommerce companies are already using web scraping. Unsurprisingly, most of the external data is dedicated to market research and consumer trend predictions.

Additionally, every first mover is at a disadvantage. They have to figure out how the market will react to the product (i.e., figure out marketing, pricing, and many other strategies), all of which costs time and money. Web scraping lets second and later movers skip ahead.

If historical data is available, the pricing and marketing strategy can be reverse engineered. The resulting insights can then be applied to a product launch before it even begins. That way, a business doesn’t have to spend the same amount of resources figuring out the market.
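A very simple first step toward reverse engineering a pricing strategy is to list when a competitor changed its price and by how much. The Python sketch below assumes a hypothetical history of `(date, price)` pairs that is already sorted by date and contains only positive prices; real analysis would need far more context than this.

```python
def price_changes(history):
    """Return (date, old_price, new_price, pct_change) for every price move.

    `history` is assumed to be a list of (iso_date, price) pairs,
    sorted by date, with strictly positive prices.
    """
    changes = []
    # Pair each observation with the one that follows it.
    for (_, old), (day, new) in zip(history, history[1:]):
        if new != old:
            pct = round((new - old) / old * 100, 1)
            changes.append((day, old, new, pct))
    return changes


# Made-up example data: a temporary discount that is later reversed.
history = [
    ("2024-01-01", 20.0),
    ("2024-01-08", 20.0),
    ("2024-01-15", 16.0),
    ("2024-02-01", 20.0),
]
for change in price_changes(history):
    print(change)
```

With the made-up data above, the output shows a 20 percent cut on 2024-01-15 followed by a return to the original price on 2024-02-01, the kind of pattern that might hint at a time-limited promotion.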


Web scraping provides two distinct advantages for those who want to launch products without prior knowledge. First, it grants the opportunity to collect relevant data from competitors. It can then be used to predict ROI and market compatibility.

Additionally, web scraping can allow businesses to avoid the discovery phase. Competitors will have already optimized marketing and pricing strategies. With enough monitoring and historical data, these strategies can be reverse engineered and implemented without all the wasted resources.

Image credit: Wavebreakmedia/

Gediminas Rickevičius is Vice President of Global Partnerships at His mission is to empower businesses with state-of-the-art solutions for web data extraction.

Author: Martha Meyer