INTRODUCTION

The diffusion of innovations and, by extension, the diffusion of online shopping are generally thought to follow an S-shaped curve (Rogers, 2003). Researchers have employed various market growth models, in particular the Bass, Logistic and Gompertz models, to describe the diffusion curve. However, results have so far been inconclusive, and comparative studies offer no strong evidence that any one diffusion model systematically outperforms the others. For example, in an early study, Young and Ord (1989) developed a model selection algorithm for the Logistic and Gompertz curves and tested it on four technological time series, such as the percentage of US households with telephones during the period 1920–1979. Their results indicate that no one type of model could be consistently selected as best. Young (1993) fitted nine growth curves, including the Logistic, Gompertz and Bass models, to 50 growth time-series data sets. The performance of the models was assessed over the last three data points. The results suggested that there is no clear outperforming model, though the Bass model appeared to outperform other models when the upper limit of the market was unknown. Two more recent studies that examined the diffusion of mobile phones reported conflicting findings. Wu and Chu (2010) fitted the Bass, Logistic and Gompertz models to a mobile phone diffusion pattern and found that the Logistic model was superior for describing it, whereas Michalakelis et al (2008) found that the Gompertz model outperformed the Bass and Logistic models. Meade and Islam (1995) studied the development of telecommunication markets across 15 countries using 17 growth models and found that the Logistic and Gompertz models outperformed the Bass model.

Given the mixed results reported in the literature, the objective of the current study is to compare the forecasting performance of three technological growth curve models. In this study, the Bass, Logistic and Gompertz models are fitted to time series data on online shopping in Australia (1998–2009), and comparative model performance is evaluated via a series of measures. This article also contributes to the literature by exploring the market growth models’ ability to forecast market growth from a limited range of early diffusion data points. In their comprehensive review of the literature, Meade and Islam (2006) identified forecasting new product diffusion with little or no data as a key area for further research because ‘a priority among practitioners is to establish market potential as early as possible in the diffusion process’ (p. 539). However, past research has largely focused on a meta-analysis approach (see, for example, Sultan et al, 1990; Van den Bulte and Stremersch, 2004), and little attention has been given to the task of forecasting diffusion patterns from limited data.

The remainder of this article is organized as follows. The following section reviews the three most widely used growth curve models. Data and methodology are then described, followed by the results and discussion sections. Finally, future avenues of research are presented.

TECHNOLOGICAL MARKET GROWTH MODELS

Rogers (2003) hypothesized that the diffusion of an innovation follows an S-shaped curve, arguing that populations are heterogeneous in their propensity to innovate. According to Rogers, the innovators, who account for 2.5 per cent of a population, have a lower threshold for adoption and, as such, are the first adopters. The early adopters (13.5 per cent), the early majority (34 per cent), the late majority (34 per cent) and the laggards (16 per cent) are lower in innovativeness and thus adopt in later stages of the diffusion process as the innovation becomes more widely accepted and social pressure builds. In other words, the heterogeneous distribution of innovativeness results in a heterogeneous propensity to adopt (Peres et al, 2010). Income heterogeneity is also linked to the S-shaped diffusion curve. Advocates of this school of thought argue that, provided individual incomes are log-normally distributed, the diffusion curve will take on an S shape (Golder and Tellis, 1997). It is suggested that, as the price of an innovation falls, more people can afford to adopt, hence creating an S-shaped diffusion curve (Meade and Islam, 2006). Van den Bulte and Stremersch (2004) found that the imitation/innovation (q/p) ratio, which controls the shape of the diffusion curve in the Bass model, is positively correlated with the Gini coefficient of income in 77 countries, confirming the impact of income heterogeneity on the shape of the diffusion curve. Thus, based on past experience with a large number of innovations, it is likely that online shopping will also follow an S-shaped diffusion curve, as potential adopters are heterogeneous in terms of, for example, appetite for online risk and shopping orientation.

Numerous growth models have been developed to model the S-shaped curve of the diffusion of innovations. Meade and Islam (2006) identified at least eight different basic models, of which six have already been applied in modelling the diffusion curve. These models can be classified as internal, external and mixed influence models. Internal influence models (Mansfield, 1961) assume that the diffusion of innovation is a function of imitation and, as a result, the force of diffusion is determined by the number of previous adopters and word-of-mouth communication. The internal influence model, which in essence is a classic Logistic growth curve (Mansfield, 1961), can be expressed as follows:

$$\frac{dN(t)}{dt} = b\,N(t)\,[N_u - N(t)] \qquad (1)$$

where dN/dt is the rate of diffusion; N(t) is the cumulative adoption at time t; Nu is the upper limit of the market; and b is the coefficient of diffusion. In this equation, the effect of word-of-mouth is a function of N(t), the number of previous adopters, and increases with time as the number of previous adopters grows (Wright et al, 1997). In an early study, Griliches (1957) used the Logistic curve to model the adoption of hybrid corn in the United States (cf. Wu and Chu, 2010).

In the external influence model (Fourt and Woodlock, 1960), the speed of diffusion is influenced by external factors such as mass advertising and is given by the following equation:

$$\frac{dN(t)}{dt} = a\,[N_u - N(t)] \qquad (2)$$

where dN/dt is the rate of diffusion; N(t) is the cumulative adoption at time t; Nu is the upper limit of the market; and a is the coefficient of diffusion, which is primarily a function of consumers’ innovativeness.

The mixed influence model, first introduced by Bass (1969), combines the two basic models into one generalized model capturing the impacts of both innovation and imitation on the adoption of an innovation. The Bass model is essentially a theory of the timing of initial purchase (or ‘adoption’) of products that are distinctively new and purchased infrequently (Bass, 1969). It provides a framework to analyse long-run sales behaviour. The model is distinguished from other growth models by explicitly incorporating some key behavioural assumptions from Rogers’ theory of diffusion of innovation (Bass, 1969; Dodds, 1973; Teng et al, 2002). Bass (1969) divides the social system into innovators and imitators. Innovators are assumed to be independent and not affected by other members of the social system in making their adoption decisions. When a new product is introduced, innovators tend to acquire information about the product from the mass media and other formal channels of communications. Unlike innovators, imitators are influenced by the members of their social systems and obtain product information from informal sources such as interpersonal channels of communication and direct observation. The Bass model assumes that the effect of mass media is greater at the early stage of the product launch, whereas the impact of interpersonal communication will be greater at the later stages of the diffusion process (Rogers, 2003). The Bass model is commonly expressed as the following equation:

$$S(t) = \left[p + \frac{q}{m}\,Y(t)\right]\left[m - Y(t)\right] \qquad (3)$$

where S(t) represents the number of new adopters at time t; p and q are the coefficients of innovation and imitation, respectively; m is the potential market; and Y(t) denotes the total number of adopters up to time t. In this equation, [p + (q/m)Y(t)] measures the sales growth rate and [m − Y(t)] is the number of potential adopters who are yet to adopt the new product. Although the coefficient p (or power of innovation) remains constant over the period of diffusion, the number of innovators constantly decreases as sales grow, because the number of remaining potential adopters decreases. On the other hand, the imitation force, (q/m)Y(t), gets stronger as the number of previous adopters increases. This is in line with the implicit behavioural assumption of the model that word-of-mouth plays a dominant role in the later stages of the diffusion process. In the Bass model, if p>0 and q=0, diffusion follows the external influence model; when p=0 and q>0, diffusion reverts to the internal influence model.
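The interplay of the two forces can be seen by iterating the Bass equation period by period. The following is a minimal sketch rather than the estimation procedure of this study: it assumes annual steps and that no one has adopted before the first period, and the function name simulate_bass is illustrative. The parameter values are those reported later in the article (m in millions).

```python
def simulate_bass(p, q, m, periods):
    """Iterate S(t) = [p + (q/m) * Y] * [m - Y], where Y is cumulative adoption so far."""
    cumulative = 0.0          # Y: adopters before the current period (assumed 0 at launch)
    new_adopters = []         # S(t) for each period
    totals = []               # cumulative adopters at the end of each period
    for _ in range(periods):
        s = (p + q / m * cumulative) * (m - cumulative)
        new_adopters.append(s)
        cumulative += s
        totals.append(cumulative)
    return new_adopters, totals


# p, q and m as reported for the full-data Bass model (m in millions of shoppers).
new, total = simulate_bass(p=0.027, q=0.371, m=9.18, periods=25)
peak = new.index(max(new))
print(f"new adopters peak in period {peak}; cumulative after 25 periods: {total[-1]:.2f} million")
```

In the early periods the innovation term p dominates; as the cumulative count grows, the imitation term (q/m)Y takes over until the shrinking pool of potential adopters pulls new adoptions back down.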

From its introduction, the utility of the Bass model has been extensively investigated. Bass (1969) accurately described the sales growth behaviour for colour television in 1967 using this model. In subsequent years, application of the Bass model has been very fruitful in forecasting diffusion of durable goods (Sultan et al, 1990). For example, Dowling (1980) accurately estimated the magnitude of peak demand for colour televisions in Australia. The Bass model has also found widespread use in the area of forecasting the adoption of technological products and services (Dodds, 1973). Fornerion (2003) used the Bass model to analyse the diffusion of the Internet in France. Firth et al (2006) used the Bass model to forecast the number of people that join the Internet community. Further, the Bass model has been applied to analyse the diffusion pattern of mobile phones (Wu and Chu, 2010; Michalakelis et al, 2008) and mobile Internet (Chu and Pan, 2008) with results confirming the general utility of the Bass model in predicting the diffusion pattern of technological products. However, modelling the diffusion pattern of online shopping has received little attention from researchers. There is no published application of this model in the context of online shopping in Australia of which the authors are aware.

There have been numerous published studies that have attempted to extend the original Bass model. These later efforts have largely centred on incorporating flexible families of curves (Easingwood et al, 1983), modelling diffusion of successive generations of technology (Norton and Bass, 1987) and incorporating marketing variables such as pricing and advertising (Robinson and Lakhani, 1975) into the original model in an attempt to overcome some of its limitations. However, the original model remains a robust method for modelling and forecasting the diffusion of new products and services. Bass et al (1994) argue that the original Bass model can predict the diffusion of new products without marketing variables. Similarly, Bottomley and Fildes (1998) found little evidence that price information increased forecasting accuracy for electrical and electronic (for example, CD player) products. As a result, the original Bass model is employed for the purpose of this study.

METHODOLOGY

Data

Data utilized in this study are drawn from ‘Household Use of Information Technology’, published by the Australian Bureau of Statistics (ABS, 2003a, 2009). The report contains the results of a series of nationwide surveys carried out by the ABS in 1999–2002, 2005, 2007, 2008 and 2009, with an average sample size of approximately 15 000 and a very high response rate (85 per cent or above). No data were available for 2010. The respondents were asked whether they had purchased any goods or services online for private purposes in the last 12 months. The reported measures are in a cumulative format consisting of both new online shoppers and those who adopted Internet shopping in the preceding years. The Bass model defines the adoption unit as a first purchaser, which in this case corresponds to new online shoppers. Thus, to be consistent with the model, the data have been disaggregated and the results are depicted in Table 1. It is assumed that, once a person adopts online shopping, he or she will continue to shop online in the following year(s). Violating this assumption will result in underestimating the actual number of new online shoppers.

Table 1 Online shoppers (1998–2009)

No information regarding the number of online shoppers was collected by ABS for 2003, 2004 and 2006. For 2006, mid-point estimations were used to impute the missing data as follows:

$$N(2006) = \frac{N(2005) + N(2007)}{2} \qquad (4)$$

where N(t) is the cumulative number of online shoppers. For 2003 and 2004, a straight line was fitted to the number of online shoppers between 2002 and 2005 to impute the missing data. This approach is justified on the grounds that, in the absence of any dramatic event that could significantly distort the diffusion of online shopping from its underlying pattern, a linear approximation should adequately represent the diffusion pattern over short intervals of time.
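The two imputation steps can be sketched as follows. This is an illustration only: the cumulative counts are hypothetical round numbers, not the ABS figures, and the helper names midpoint and linear_fill are assumptions introduced here.

```python
def midpoint(before, after):
    """Mid-point estimate for a single missing year between two observed years."""
    return (before + after) / 2.0


def linear_fill(start_year, start_value, end_year, end_value):
    """Values on the straight line joining two observations, for the years strictly between them."""
    slope = (end_value - start_value) / (end_year - start_year)
    return {year: start_value + slope * (year - start_year)
            for year in range(start_year + 1, end_year)}


# Hypothetical cumulative online shoppers (thousands); NOT the survey data.
cumulative = {2002: 2000.0, 2005: 5000.0, 2007: 7000.0}
cumulative[2006] = midpoint(cumulative[2005], cumulative[2007])
cumulative.update(linear_fill(2002, cumulative[2002], 2005, cumulative[2005]))
print(sorted(cumulative.items()))
```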

Some researchers have argued that the parameters p and q in the Bass model are sensitive to the number of data points and tend to be unstable in the early stages of the diffusion process, when generally limited historical data are available (Sultan et al, 1990). Thus, to further increase the number of points for which data are available, the cumulative number of online shoppers for 1998 was estimated by applying ordinary least squares regression to estimate the percentage of the population aged 18+ who purchased online in 1998 (2.15 per cent) from the percentage of the same population who had access to the Internet at home. The two variables show a good association (adjusted R2=0.88, P-value=0.042) and the estimated number of online shoppers is approximately 294 000 (that is, 2.15% × 13.682 million). It is assumed that there was no significant number of online shoppers before 1998; thus, the estimated figure represents the new online shoppers for 1998. This extends the available data to 12 observations (that is, 1998–2009), which were then used for modelling. Overall, the estimated number of online shoppers in 1998 appears to be plausible. The relatively sharp drop in the number of new online shoppers in 2000 probably reflects the general climate surrounding the information technology industry at the beginning of the new millennium.

Procedure

The following equation was used to predict the number of online shoppers for the Bass model:

$$S(t) = pm + (q - p)\,Y(t) - \frac{q}{m}\,Y(t)^2 \qquad (5)$$

The following discrete analogue of equation (5), which is an OLS multiple regression, was then used to estimate the parameters p, q and m (Bass, 1969; Dodds, 1973):

$$S(t) = a + b\,Y(t-1) + c\,Y(t-1)^2 \qquad (6)$$

where $m = \dfrac{-b - \sqrt{b^2 - 4ac}}{2c}$, $p = a/m$ and $q = -mc$. In contrast to the Bass model shown above, the following are the non-linear statistical equations for the Logistic and Gompertz models (Prajneshu, 2011):

where Y(t) is the cumulative number of online shoppers. The parameters b1, b2 and b3 of both models were estimated by the Levenberg-Marquardt iterative procedure (Prajneshu, 2011) using SPSS 19.
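The OLS route to the Bass parameters (regress new adopters on the lagged cumulative count and its square, then recover m, p and q from the regression coefficients a, b and c) can be sketched as below. This is a minimal illustration under stated assumptions: the regressor is taken to be the cumulative count up to the preceding period, as in Bass (1969); the series is synthetic and generated from hypothetical parameters; and the helper names solve_linear and fit_bass_ols are introduced here for illustration.

```python
import math


def solve_linear(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_bass_ols(new_adopters):
    """Regress S(t) on 1, Y(t-1), Y(t-1)^2 and convert (a, b, c) into (p, q, m)."""
    lagged, running = [], 0.0
    for s in new_adopters:
        lagged.append(running)      # cumulative adopters before this period
        running += s
    rows = [[1.0, y, y * y] for y in lagged]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * s for r, s in zip(rows, new_adopters)) for i in range(3)]
    a, b, c = solve_linear(xtx, xty)            # normal equations of the OLS fit
    m = (-b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * c)
    return a / m, -m * c, m                      # p, q, m


# Synthetic 12-period series from hypothetical parameters (m in millions).
true_p, true_q, true_m = 0.03, 0.38, 9.0
series, total = [], 0.0
for _ in range(12):
    s = (true_p + true_q / true_m * total) * (true_m - total)
    series.append(s)
    total += s

p_hat, q_hat, m_hat = fit_bass_ols(series)
print(f"p={p_hat:.4f}, q={q_hat:.4f}, m={m_hat:.3f}")
```

Because the synthetic series is noise-free, the regression returns the generating parameters; with survey data the estimates would carry sampling error, which is one reason short series give unstable values of p and q.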

The performance of the models was assessed via the fit (R2 and adjusted R2) of the OLS estimation models as well as fit (R2) of the models’ predictions to the actual time series data (Australian Bureau of Statistics, 2003b). Further, following Kvalseth (cf. Prajneshu, 2011), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and Root Mean Squared Error (RMSE) are employed to analyse the difference between the actual and predicted number of online shoppers.
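For reference, the three error measures can be computed as in the sketch below. The definitions are the standard ones (MAPE is expressed in per cent of the actual value); the numbers are made up purely to exercise the functions and are not taken from Table 2.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average size of the errors, ignoring their sign."""
    return sum(abs(a - f) for a, f in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in per cent of the actual values."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: penalizes large errors more heavily than MAE."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, predicted)) / len(actual))


# Made-up values for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
print(mae(actual, predicted), mape(actual, predicted), rmse(actual, predicted))
```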

ANALYSIS AND RESULTS

Model evaluation

All three models showed a very strong fit to the data (Table 2). The estimation models’ R2 values are high, or very high in the cases of the Logistic and Gompertz models. Wright et al (1997) studied the diffusion of four products in New Zealand using the original Bass model and reported an average R2 of 0.67. Jeuland (cf. Australian Bureau of Statistics, 2003b) reported an average R2 of about 0.90 for the Bass OLS estimation model across 32 data sets. The fit of the models’ predictions to the actual time series also resulted in high R2 values, which range from 0.799 for the Bass model to 0.714 for the Gompertz model. The models’ parameters also look plausible. In the case of the Bass model, both p and q are positive, and m, which is the market potential or the total number of expected online shoppers (9.18 million), seems to be closer to the lower end of a reasonable market size. The parameters of the Logistic and Gompertz models (that is, b1, b2 and b3) have small standard errors of estimation relative to the estimated parameters. Parameter b1 is equivalent to m in the Bass model. As can be seen from Table 2, the potential market sizes estimated by the Bass and Logistic models are very similar (approximately 9 million), whereas the Gompertz model predicts that the total number of online shoppers in Australia will reach 12.6 million. This seems to be towards the upper boundary of the online market size in Australia given the projected population size (Australian Bureau of Statistics, 2003b). Finally, statistical tests were conducted to examine the normality of the residuals. The results of both the Shapiro–Wilk and Kolmogorov–Smirnov tests were insignificant, showing that the residuals of the Bass, Logistic and Gompertz models do not deviate significantly from a normal distribution, which further supports the adequacy of the fit.

Table 2 Model parameters and evaluation criteria

Overall, the models’ fit indices strongly support the structural soundness of all three growth models in modelling the diffusion of online shopping in Australia. However, closer examination shows that the Bass model outperforms, though marginally, the Logistic and Gompertz models. The fit between the actual and forecasted numbers is stronger for the Bass model (adjusted R2=0.779) than for the Logistic model (adjusted R2=0.718) or the Gompertz model (adjusted R2=0.686). This finding is supported by the MAE, MAPE and RMSE. The MAE shows the models' average error regardless of the direction of the errors. The Bass model predicted the number of online shoppers with an error of +/−104 000 adopters, compared with 119 000 adopters for the other two models. Further, the MAPE indicates that each prediction of the Bass model deviated from the actual number by approximately 16.6 per cent on average, compared with 20.6 per cent and 19.1 per cent for the Logistic and Gompertz models, respectively. The models’ RMSE values suggest a similar pattern.

Dynamics of the models and diffusion pattern of online shopping

The adoption curves of online shopping for the three growth models are illustrated in Figures 1 and 2. Both the Bass and Logistic models are symmetrical around the inflection point, which occurs when about 50 per cent of the potential market (m) has adopted online shopping. As can be seen from these figures, the diffusion patterns predicted by the Bass and Logistic models are very similar. Both models predicted that the ultimate number of online shoppers will be slightly more than 9 million and that market saturation will occur around 2018. However, the two models differ somewhat in describing the pre-inflection pattern of diffusion. Essentially, the number of online shoppers in the early phase of diffusion grows slightly faster (exponentially) in the Logistic version, and its inflection point is marginally higher than that of the Bass model. Nonetheless, the overall pattern of diffusion depicted by the Bass and Logistic models is, to a great extent, identical. This resemblance is no surprise, as both models rely heavily on internal forces in describing the diffusion pattern of online shopping. As shown in Table 2, the imitation coefficient (q) of the Bass model is 0.371, which is much larger than the coefficient of innovation (p=0.027) and indeed very close to the comparable coefficient (b3=0.433) of the Logistic model. The Gompertz model is, unlike the Bass and Logistic models, asymmetrical: its inflection point occurs at approximately 37 per cent of the potential market (that is, in 2005). The model predicts that the ultimate number of online shoppers will reach 12.6 million and that market saturation will not happen until at least 2020. This caused the Gompertz model to deviate dramatically from the diffusion patterns described by the Bass and Logistic models. The potential number of online shoppers predicted by the Gompertz model, however, seems to be plausible given that in 2009 approximately 74 per cent (12.5 million) of people aged 15 years and over had accessed the Internet (Australian Bureau of Statistics, 2009).
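The inflection shares quoted above can be checked numerically. The sketch below is illustrative and rests on assumptions: it takes the commonly used three-parameter forms Y(t) = b1/(1 + b2·exp(−b3·t)) for the Logistic curve and Y(t) = b1·exp(−b2·exp(−b3·t)) for the Gompertz curve, which may differ in parameterization from the equations used in the study, and it uses the closed-form results of the continuous-time Bass model. The b1, b2 and b3 values are hypothetical; p and q are the Table 2 estimates.

```python
import math


def logistic(t, b1, b2, b3):
    """Assumed form: Y(t) = b1 / (1 + b2 * exp(-b3 * t))."""
    return b1 / (1.0 + b2 * math.exp(-b3 * t))


def gompertz(t, b1, b2, b3):
    """Assumed form: Y(t) = b1 * exp(-b2 * exp(-b3 * t))."""
    return b1 * math.exp(-b2 * math.exp(-b3 * t))


def bass_inflection(p, q):
    """Continuous-time Bass model: time of peak adoption and share of m adopted by then."""
    t_star = math.log(q / p) / (p + q)
    share = (q - p) / (2.0 * q)
    return t_star, share


# For both assumed curves the inflection falls where b2 * exp(-b3 * t) equals 1.
b1, b2, b3 = 10.0, 20.0, 0.4          # hypothetical values, not the fitted ones
t_inflect = math.log(b2) / b3
logistic_share = logistic(t_inflect, b1, b2, b3) / b1
gompertz_share = gompertz(t_inflect, b1, b2, b3) / b1

bass_t, bass_share = bass_inflection(p=0.027, q=0.371)
print(f"Logistic share at inflection: {logistic_share:.3f}")
print(f"Gompertz share at inflection: {gompertz_share:.3f}")
print(f"Bass share at inflection: {bass_share:.3f}, years after launch: {bass_t:.1f}")
```

Under these forms the Logistic curve inflects at exactly half of b1 and the Gompertz curve at 1/e of b1, while the Bass curve inflects slightly below half of m whenever p is positive, which is consistent with the 'about 50 per cent' and 'approximately 37 per cent' figures in the text.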

Figure 1 Fit of the growth models to the number of online shoppers.

Figure 2 Fit of the growth models to the cumulative number of online shoppers.

FORECASTING ABILITY OF THE MODELS WITH LIMITED DATA

This section examines the ability of the Bass, adjusted Bass, Logistic and Gompertz models to forecast the number of online shoppers from limited early diffusion data. To this end, data from 1998 to 2002 were utilized for model fitting and the resulting models were employed to predict the number of online shoppers from 2003 to 2020. Finally, the actual number of online shoppers from 2003 to 2009 was used to evaluate the models’ forecasting accuracy.

Table 3 presents the results of model fitting. The R2 and adjusted R2 values indicate that all the models fit the data reasonably well, though, in the case of the Bass model, the fit is not statistically significant, largely because of the small sample size (n=5). The results of the Shapiro–Wilk and Kolmogorov–Smirnov tests also show no significant departure of the models’ residuals from normality, further supporting the models’ fit adequacy.

Table 3 Assessment of models forecasting accuracy

Despite acceptable fit to the data, the Bass, Logistic and Gompertz models, which were developed with limited early diffusion data (that is, 1998–2002), performed poorly in forecasting the number of online shoppers (that is, 2003–2009). All models appear to significantly underestimate the potential market for online shopping. The Bass model suggests that the ultimate number of online shoppers (m) will be around 4.37 million. The Logistic model estimates the potential market (b1) even lower, at 3.42 million. The Gompertz model’s estimate of the potential market (b1) was 7.10 million, which, although better than those of the Bass and Logistic models, still seems to be a considerable underestimate (see Figures 3 and 4). As a result, the models’ predictions differ radically from the actual numbers. For example, the R2 values between the predicted and actual numbers of online shoppers are very low and statistically insignificant, clearly showing that the two are not associated. Also, the MAE, MAPE and RMSE are all very high, confirming poor forecasting performance. It is plausible that a slower adoption rate in the early stages, combined with more rapid growth in later stages, caused the models to underestimate the potential market (m).

Figure 3 Fit of the models based on limited data to number of online shoppers (1998–2002).

Figure 4 Fit of the models based on limited data to cumulative number of online shoppers (1998–2002).

In the case of the Bass model, the literature suggests that an intuitive approach could provide more relevant estimation of the potential market (Dodds, 1973; Fornerion, 2003) and hence better forecasting. For instance, in a study of the global diffusion of mobile phones, Dekimpe et al (1998) defined the potential market as the proportion of the well-read population who live in urban zones and who have an income that allows them to subscribe to a basic telephone service. Similarly, Fornerion (2003) used the population aged 18 and over, and with an education level equal or superior to high school, as the potential market for the Internet. Using a similar methodology and using data from 1998 to 2002 only, it is estimated that m could be as high as 9.6 million. Table 3 shows the results of the Bass model adjusted for m. As can be seen, the model’s forecasting performance is remarkably improved and indeed is very close to the Bass model developed with the full data set (1998–2009). For example, the adjusted model forecasts the number of online shoppers with an average error (MAPE) of 16.3 per cent, which is similar to the forecasting error (16.6 per cent) of the full-data Bass model.
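One way to estimate p and q once m has been fixed from external information is sketched below. The article does not state how the adjusted model was estimated, so this is an assumption for illustration only: dividing the Bass equation by the remaining market gives S(t)/[m − Y(t−1)] = p + (q/m)Y(t−1), a straight line in the lagged cumulative count, which simple least squares can fit. The function name fit_bass_fixed_m and the five-period series are hypothetical.

```python
def fit_bass_fixed_m(new_adopters, m):
    """Given an externally supplied m, fit p and q by simple linear regression."""
    lagged, running = [], 0.0
    for s in new_adopters:
        lagged.append(running)          # cumulative adopters before this period
        running += s
    # Adoption rate among those who have not yet adopted.
    hazard = [s / (m - y) for s, y in zip(new_adopters, lagged)]
    n = len(hazard)
    mean_y = sum(lagged) / n
    mean_h = sum(hazard) / n
    sxx = sum((y - mean_y) ** 2 for y in lagged)
    sxy = sum((y - mean_y) * (h - mean_h) for y, h in zip(lagged, hazard))
    slope = sxy / sxx                   # estimates q / m
    p = mean_h - slope * mean_y         # intercept estimates p
    return p, slope * m


# Synthetic five-period series from hypothetical parameters (m in millions).
true_p, true_q, fixed_m = 0.03, 0.35, 9.6
short_series, total = [], 0.0
for _ in range(5):
    s = (true_p + true_q / fixed_m * total) * (fixed_m - total)
    short_series.append(s)
    total += s

p_fixed, q_fixed = fit_bass_fixed_m(short_series, fixed_m)
print(f"p={p_fixed:.4f}, q={q_fixed:.4f}")
```

Fixing m removes the parameter that short series identify least well, leaving only two coefficients to estimate from the few available points.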

DISCUSSION, IMPLICATIONS AND FURTHER RESEARCH

This study presents an empirical analysis of the forecasting performance of three technological growth models. The models were first fitted to the number of online shoppers in Australia during 1998–2009 and the models’ performances were evaluated. Our results show that, overall, the Bass model outperforms the Logistic and the Gompertz models, although the incremental performance improvement is marginal. This finding corroborates Young’s (1993) results. The results of model fitting with limited early diffusion data (1998–2002) clearly show that none of the models was able to adequately represent the online shopping diffusion curve for the period 2003–2009. However, when the upper limit of the market was estimated from external data, a common approach in the literature (Dodds, 1973; Fornerion, 2003; Dekimpe et al, 1998), the Bass model was capable of generating forecasts that were as accurate as those of the Bass, Logistic and Gompertz models developed using the full time series data.

Several key managerial implications can be identified from the results of this study. The declining number of new Internet shoppers after the inflection point in 2005 suggests stiffer competition after that year. More specifically, the model results from this study predict that by about 2017, expansion of the online market will cease, at least as far as the number of new shoppers is concerned. This finding provides impetus for online firms to shift their focus from customer acquisition to customer retention by 2017 at the latest. Firms may also use the forecasts to make strategic decisions regarding the future direction and timing of further investments, such as developing new Websites or introducing new products to meet potential online demand. In many practical situations, it is desirable to understand the diffusion process long before product launch or at the very early stage of diffusion. Online vendors can utilize the coefficients of innovation and imitation derived in this study to prepare pre-launch forecasts for comparable products or for products that are still at the early stages of their diffusion, such as mobile online shopping and the super-fast mobile Internet. Bass et al (2001) used a similar method to forecast subscriptions to satellite TV before its launch over a 5-year horizon, with forecast accuracy described as fairly good. Overall, past studies indicate that the coefficients p and q are relatively constant within a given industry (Norton and Bass, 1987; Pae and Lehmann, 2003). Also noteworthy is the fact that adoption of online shopping is largely driven by internal influences (that is, word-of-mouth). This suggests that online vendors may be able to successfully employ social networking Websites such as Facebook to attract new or current online buyers.

The results of this study also suggest two key avenues for further research. The first avenue is modelling diffusion across two or more generations of technology. Norton and Bass (1987) developed a framework for this purpose, and the Norton-Bass model has been employed by several authors. For example, Kim et al (2000) modelled the diffusion of several generations of mobile telephony (that is, pager, analogue mobile phone, digital mobile phone and CT2). In the case of online shopping, examples of new generations of technology that may ultimately replace conventional online shopping include online shopping via mobile telephone, mobile broadband Internet and iPad. This analysis should help marketing practitioners, for example, in assessing the feasibility and timing of developing Websites that better suit smaller telephone screens. Another fruitful avenue of research is modelling the diffusion of sub-categories of online shopping. At the time of this study, reliable data at sub-category level were not available. Examples of online shopping sub-categories on which future studies may focus are demographics (male versus female), product categories (search versus experience) and geographical area (metropolitan versus regional). It is expected that such sub-category diffusion modelling will help online vendors to identify less developed and niche markets.

In conclusion, this study provides further endorsement of the utility and validity of the original Bass model. Its ability to predict satisfactorily the adoption of a whole new generation of Internet shopping testifies to the contribution and robustness of the pioneering work of Bass and his colleagues. It also demonstrates the usefulness of diffusion models in forecasting adoption patterns using very limited aggregated market data. Although technological change is inexorable, on the basis of the evidence presented here, there are good grounds for confidence in the continuing validity and utility of established forecast modelling methodologies.