Traps to avoid when building a sales forecasting model

January 12th, 2011

Over the past year, we have encountered several companies that have been unable to build effective sales forecasting models. Indeed, several businesspeople appear quite cynical about the possibility of analyzing their retail/distribution network and ultimately predicting with reasonable accuracy whether a new site will be profitable for their company and, if so, to what extent.

After discussions with many corporate managers, several of whom work in the American market, I have been able to identify certain common elements regarding how these modeling exercises were conducted. It turns out that the same mistakes are often repeated from one company to the next. These methodological or conceptual errors alone account for the failure of these models to work properly.

I have thus drawn up a list of the most common errors committed in the creation of forecasting models used to evaluate the viability of future business sites. I have summarized these reasons below. Over the coming months, I will more thoroughly examine a few of these factors that merit a more systematic explanation.

Sales forecasting models often do not work because:

1. The sample is too small: Parametric statistics rest on the assumption of a normal distribution, the notorious bell curve. As a common rule of thumb, if the sample of sites/stores used for the modeling exercise contains fewer than 30 observations, the model will not be representative of reality and cannot be used to forecast the sales of future sites with any kind of accuracy.

2. Assumed linearity: The majority of the models we have encountered are based upon a simple linear regression equation. When it is applied without verifying that the phenomena being studied are indeed linear, this statistical technique is misused or overused. In practice, the sales of a network of sites seldom relate linearly to the variables used to explain them. Besides linear models, log-linear, exponential and logarithmic models can also be used.

3. Collinearity is ignored: One often sees models that use strongly collinear variables, a major error that inserts “noise” into the model and prevents other potentially explanatory variables from taking their proper place. Collinearity exists when two or more variables explain the same thing. For example, income, age and type of job are three variables that tend to point in the same direction. It is thus advisable to perform a factor analysis prior to the modeling exercise in order to combine the effect of collinear variables and thereby reduce their weight in the model.

4. Equality of variance is ignored: A test for the equality of variance (homoscedasticity) exists for most statistical techniques used in linear modeling. This test is extremely important with regard to the model’s capacity to forecast the sales of a new site. When the variances in a sample are not equal, the model is only valid for explaining that particular sample and thus cannot be extrapolated to another sample or to the reference population.

5. Commercial zones that are too big (the aggregated variables are not sufficiently discriminant): Sales forecasting models must be built exclusively upon easily accessible socio-demographic or market data, because this is the only data that will ultimately be available for the analysis of any future site. When building sales models, one has to aggregate the socio-demographic or market variables that are available around existing sites in order to create the query file. Often, the commercial zone being used is too big and overlaps heavily with neighbouring commercial zones, with the result that one ends up considering the same geographic areas for different sites, thereby limiting the discriminant potential of the variables being used.

6. Weighting of total sales: As a direct result of the size of the commercial zone used to aggregate socio-demographic data, it is never possible, unless one is analyzing a network of points of sale that are very far apart, to define a commercial zone that is large enough to include 100% of the sales without any overlap at all. Total sales per store must thus be weighted by the percentage of sales included in the primary commercial zone. Accordingly, if 34% of the sales come from the primary zone (the transit effect), then one must try to model only 34% of total sales.

7. Segmentation of the sites: It is a generally accepted principle that not all customers are created equal, hence the fad for segmentation systems (Mosaic, Prismz, Focus). Commercial sites are not all created equal either. They differ in size, pulling power, age, the products available and the markets they serve. It is therefore wise to segment the sites before the modeling exercise and thus to build more than one model.

8. Accuracy of the model: Before using a forecasting model to evaluate the viability of a new site, it’s worthwhile to validate its ability to estimate the sales of current sites. To do so, one builds the model on one half of the sample of sites and then tests it on the other half. In this manner, one can confirm the accuracy of the model, whose success rate should be over 85%.
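To make points 6 and 8 more concrete, here is a minimal Python sketch of weighting sales by the primary-zone share and then checking a model on a held-out half of the sites. Everything in it is an assumption made for illustration: the function names, the synthetic data, the plain linear fit, and above all the definition of the "success rate" as 1 minus the mean absolute percentage error, which is only one plausible way to read the 85% threshold.

```python
import numpy as np


def weighted_sales(total_sales, primary_zone_share):
    """Portion of total sales attributed to the primary commercial zone."""
    return np.asarray(total_sales, dtype=float) * np.asarray(primary_zone_share, dtype=float)


def split_half(n_sites, seed=0):
    """Randomly split site indices into a fitting half and a testing half."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_sites)
    half = n_sites // 2
    return order[:half], order[half:]


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    design = np.column_stack([np.ones(len(X)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs


def predict(coefs, X):
    """Apply fitted coefficients (intercept first) to new sites."""
    design = np.column_stack([np.ones(len(X)), X])
    return design @ coefs


def holdout_accuracy(X, y, seed=0):
    """Fit on one half of the sites and score on the other half.

    Assumption: accuracy = 1 - mean absolute percentage error
    on the held-out sites.
    """
    fit_idx, test_idx = split_half(len(y), seed)
    coefs = fit_linear(X[fit_idx], y[fit_idx])
    predicted = predict(coefs, X[test_idx])
    mape = np.mean(np.abs(predicted - y[test_idx]) / np.abs(y[test_idx]))
    return 1.0 - mape


if __name__ == "__main__":
    # Synthetic network of 60 sites with two made-up zone variables.
    rng = np.random.default_rng(42)
    n_sites = 60
    X = rng.uniform(10_000, 80_000, size=(n_sites, 2))
    total = 200_000 + 12 * X[:, 0] + 5 * X[:, 1] + rng.normal(0, 20_000, n_sites)
    share = rng.uniform(0.30, 0.60, n_sites)  # share of sales from the primary zone
    y = weighted_sales(total, share)          # model only the primary-zone sales
    print(f"hold-out accuracy: {holdout_accuracy(X, y):.1%}")
```

A single random split can flatter or penalize a model by chance, so in practice one might repeat the split with several seeds and look at the spread of the resulting accuracies before comparing against the 85% bar.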

As I write these few lines, four additional errors that distort the accuracy and reliability of certain sales forecasting models come to mind: a random blend of independent corporate and socio-demographic variables, sloppy analyses (univariate, followed by bivariate, and then multivariate), the use of incomplete or presumably complete information, and the use of variables created at different aggregation levels. I will discuss these elements in a future article.

All too often, building a sales forecasting model is taken too lightly. In addition, people tend to think that statistical analysis software can do all the work. This is simply not the case. Statistical modeling is a discipline where logical thinking and an intimate familiarity with the phenomena under consideration categorically trump technology.

Geomarketing in Russia!

October 1st, 2010

The Center for Spatial Research announced that it is now open for business to Western companies seeking to enter the Russian market.

Please follow the link below to read the full story on Stas Shuskov’s blog.

Rare positive feedback in geomarketing

September 17th, 2010

I was having lunch earlier this week with a client, a manufacturer of household goods sold through Wal-Mart, The Bay, Zellers, etc. A few months ago, we were asked to analyze all the store locations of one of their clients against their customer profile and national penetration. We ended up delivering an analysis that estimated the potential sales of their new line by store. My client’s President walked into the office of this major retailer’s President confident, holding our 200-page report filled with charts, maps and our initial purchase recommendation and allocation (we are talking millions of dollars here). The retailer’s President was impressed with the presentation, which included our binder/findings, and decided on the spot to double the initial order.

Rarely do we get that kind of feedback. I do not want by any means to take anything away from my client’s ability to close a sale. My client’s President is a seasoned executive with powerful skills. He would have closed this deal without Indicia’s study in hand! In fact, he had been closing deals all his life before we started to work together. But it was gratifying to hear the impact our binder/study/findings had on the meeting.

Any similar experiences out there? Your comments are welcome.

How to define a budget allocation for analysis in direct marketing

September 17th, 2010

I will be giving a training course in quantitative analysis and modeling in October to marketing executives in Montréal. Participants will ask me what portion of their direct marketing budget they should allocate to analysis. In other words, if I have a yearly budget of $500,000 for creative, printing, postage and lists, and want to start analyzing my data, what is a “fair” allocation of my overall budget for analysis/modeling? A mere 2% provides $10,000, barely enough for a thorough initial analysis… Five percent (5%) allocates $25,000 of the budget to analysis, while 10% comes to $50,000; maybe too much? What do you think?
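For anyone who wants to try the same arithmetic with their own numbers, here is a small Python sketch. The function name is made up for illustration, and the three percentages are simply the ones discussed above, not a recommendation.

```python
def analysis_allocation(total_budget, percent):
    """Dollar amount set aside for analysis, given a whole-number percentage."""
    return total_budget * percent // 100


yearly_budget = 500_000  # yearly direct marketing budget quoted in the post
for percent in (2, 5, 10):
    amount = analysis_allocation(yearly_budget, percent)
    print(f"{percent}% of ${yearly_budget:,} -> ${amount:,} for analysis")
```

Swapping in a different `yearly_budget` shows quickly how small the analysis envelope becomes for modest campaigns at the lower percentages.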

The Linked Québec group

August 20th, 2010

I joined the Linked Québec group just this week. Nearly 7,000 members. Founded by Simon Hénault, the group brings together Québec and foreign entrepreneurs who favour a networking approach to moving forward.

I am impressed by the quality of the group’s members and by the energy that drives them. I encourage everyone who wants to get to know their neighbours and get out of the office to join the group.

All it takes is having a profile on and creating a complete profile of one’s activities.

Happy networking!

Science or Marketing?

August 5th, 2010

Isn’t it hard to differentiate what is science from marketing in some cases?

So many tricky questions out there!

The case of caviar and vodka

Is mixing caviar and vodka science or marketing?

The mix has its roots in Russian history:

…but is the mix at least molecular, or just a good marketing concept to boost caviar and vodka’s market shares and hence sales? I haven’t found anything on the Web yet stating that the two are bound by science…?

Plus, wasn’t caviar associated with Champagne once?


Montréal – Québec: the art of losing sales

July 28th, 2010

For those of us who are familiar with the regional differences that characterize the province of Québec, it comes as no great surprise that there are fairly significant disparities in consumer trends between Québec City and Montréal.

What do I gain from an optimization analysis?

July 27th, 2010

When developing your business, what is your main concern in terms of site location and selection?

Competition, real estate, available market $$$, local socio-economic trends?

Growing demand for geomarketing services

July 19th, 2010

The industry is currently shaping up to be more receptive to non-conventional ways of analysing its massive quantity of data. You can see this through a growing demand for external geomarketing services by retailers and some manufacturers. It’s not only a matter of getting closer to a client-base, it’s a matter of repeating success while avoiding mistakes.

Please help me define Geomarketing

March 20th, 2010

While working on our new website, I found an old document I wrote for a conference in Australia in 2006. I was trying to summarize all the fields in which I have applied geomarketing over my career.

Does it make sense to you?