TWO-STAGE DEA ESTIMATION OF TECHNICAL EFFICIENCY: COMPARISON OF DIFFERENT ESTIMATORS

Research background: The challenge of resource limitations requires that farmers make judicious use of resources to maximize output and profit levels. This can be achieved through assessment of resource-use efficiency of farmers by estimating the level of technical efficiency and the determining factors. Purpose of the article: This paper compared the results of alternate DEA methodologies and applied different estimators to measure the influence of exogenous factors on technical efficiency of groundnut farmers in northern Ghana. Methods: The study used the traditional and double bootstrap DEA approaches to estimate technical efficiency while in the second stage, OLS, Tobit and double bootstrap techniques were used to estimate the influence of exogenous factors on efficiency. Findings & Value added: The double bootstrap DEA approach produced a mean technical efficiency of 51 per cent compared to 70 per cent for the traditional DEA approach. Concerning the determinants of technical efficiency, the DEA with Tobit (DEA+Tobit), DEA with OLS (DEA+OLS), and Simar and Wilson’s double bootstrap DEA (SW-DEA) procedures produced very similar results. The findings shed light on two-stage DEA estimation as well as the modelling of the influence of exogenous factors on the DEA scores.


INTRODUCTION
Technical efficiency (TE) analysis is a major field in empirical economics with wide application in other fields of study. Efficiency estimation in agriculture has gained considerable attention in the economic literature because farmers face resource limitations and must use resources judiciously to maximize output, profit or any other economic objective of the producer. Agricultural production in most developing countries is predominantly a small-scale activity. Smallholder production is typically characterised by dependence on rainfall and little application of productivity-enhancing technologies such as modern seeds, irrigation technology and mechanisation (Diao, 2010; Chamberlin, 2007; ISSER, 2006). Coupled with other constraints such as poor access to agricultural support services and to basic infrastructure like road networks and markets, the productivity of smallholder agriculture has been rather low, which is a concern to policymakers and the research community. Critical to the low productivity of smallholders is the presence of inefficiency in production due to sub-optimal allocation of resources and inadequate management skills. In order to increase the productivity of smallholder producers, measures are required to enhance the TE of production.
Efficiency analysis is typically classified into a parametric approach using stochastic frontier analysis (SFA) and a nonparametric approach using data envelopment analysis (DEA). The appeal of SFA is that it provides both an estimate of efficiency and a measure of its determinants. In other words, SFA directly provides a measure of the sources of inefficiency, which in many empirical studies matter much more to policy-making than the mere estimation of the level of efficiency of individual production units. The DEA methodology, however, measures the input-output combinations that yield maximum output without directly addressing the factors explaining the differences in efficiency between the best performing decision-making units (DMUs) and their less efficient peers.
In the light of this limitation with the nonparametric approach, semi-parametric two-stage DEA approaches that combine regression analysis with the nonparametric DEA efficiency estimation have gained popularity and extensive application in recent years. Typically, researchers rely on either a Tobit model (because of the bounded nature of the DEA estimates) or ordinary least squares (OLS) for the second stage estimation (Hoff, 2007; McDonald, 2009; Simar & Wilson, 2011). These two-stage estimators have been widely used in the efficiency literature mainly for their intuitive appeal. Other methodologies for non/semi-parametric efficiency estimation can be found in the literature. This study, however, focuses on three of the commonly used approaches and compares the efficiency outcomes by applying these estimators to a dataset generated from smallholder producers in Ghana.
While DEA estimation of TE is widespread, the method has not been without criticism, including the absence of a clear data generation process (DGP) and the presence of serial correlation between the estimated DEA scores (see Simar & Wilson, 2007; McDonald, 2009). The latter problem arises mainly because all the DEA scores are derived from a common sample. The estimation of each firm's TE uses information on the whole sample; hence the estimated scores are considered to suffer from serial correlation.
Simar & Wilson (2007) advocated a parametric technique to solve the above-mentioned problems with the two-stage DEA estimation. Instead of a censored regression model, they proposed truncated regression with bootstrapping to provide a data generation process that mimics the true process. In the double bootstrap methodology, bootstrapping is performed on the efficiency scores to account for the unknown serial correlation associated with the initial DEA scores. The stage two analysis uses truncated regression to regress the first-stage bootstrap DEA scores on environmental variables expected to affect efficiency. However, the Simar and Wilson (SW) approach is not without its own criticisms. For example, the SW approach completely ignores random noise, which is an important factor in estimating efficiency. SW's double bootstrap technique corrects twice for bootstrap bias to give an approximation of the true or population DEA score. The supposition that the bootstrap bias is an approximation of the model or DEA bias has been challenged by Tziogkidis (2012).
Banker & Natarajan (2008) prescribed sufficient conditions for the OLS estimator to yield consistent estimates of the influence of contextual (environmental) variables in two-stage DEA analysis. In a recent study, Banker et al. (2019) demonstrated from Monte Carlo simulations that the simple DEA+OLS approach performs better than the more complicated SW approach. Hoff (2007) compared different approaches for two-stage DEA modelling and observed that the Tobit model was sufficient for the second stage. The author further observed that OLS was in many cases a sufficient replacement for the Tobit model in the second stage DEA estimation. Johnson & Kuosmanen (2012) also developed a one-stage DEA approach which they found to outperform the DEA+OLS. The authors showed that the two-stage DEA estimator is statistically consistent under more general conditions, adding that the finite sample bias of DEA in stage one is carried across to the stage two analysis, resulting in biased estimates of the effects of the contextual variables.
The paper compares two-stage DEA estimation using the SW double bootstrap approach (with truncated regression) and the traditional DEA approach with OLS and Tobit regression through a case study in the Ghanaian farm sector. These three approaches (estimators) for estimating the influence of exogenous factors on DEA scores are compared in order to determine whether they yield comparable estimates. The paper's departure from previous studies is that it applies real data to test the results from using these estimators. Even though comparisons of alternative estimators exist in the literature (Banker et al., 2019), studies using a real data set instead of Monte Carlo simulations are rare. Hence, this study attempts to fill that void by providing an analysis based on a real data set.

The study area and sampling procedure
The data for the analysis came from 158 smallholder groundnut cultivators in the Tolon district which is situated in the northern savanna of Ghana. The district has a single rainfall regime with high daily and night temperatures. Groundnut production is an essential income-generating activity in the district, which is agrarian. Farmers were sampled from eight communities in the district which were selected based on groundnut production potential. Twenty farmers were sampled from each community. Data were collected on production, socio-economic and institutional factors through questionnaire administration. After the data entry and cleaning, two respondents were dropped due to incomplete information on their farming activities.

Data envelopment analysis
The DEA model can be formulated as a minimisation problem solved by linear programming. The DEA model compares the efficiency of each DMU to a constructed efficiency frontier. Shortfalls in production from the efficient frontier are reported as inefficiency. DEA is estimated under constant returns to scale (CRS) or variable returns to scale (VRS) assumptions. The CRS model (Charnes et al., 1978) assumes that all the DMUs are operating at an optimum scale, a condition which is relaxed in the VRS model proposed by Banker et al. (1984). DEA estimation also follows either an input or output orientation, depending on which factors farmers have more control over. Smallholders have greater control over factors of production than over outputs, hence an input orientation is generally preferred. For CRS, the DEA procedure is presented as follows (Coelli et al., 2005):

\[
\min_{\theta,\lambda}\ \theta \quad \text{subject to} \quad -q_i + Q\lambda \ge 0, \quad \theta x_i - X\lambda \ge 0, \quad \lambda \ge 0 \tag{1}
\]

where θ is the estimate of efficiency taking values between zero and one, q_i is the output of the i-th farm, Q denotes an output matrix, x_i denotes the inputs of the i-th farm, X denotes an input matrix and λ represents weights. Efficient farms have θ of one while any deviation from this value indicates inefficiency.
Including the convexity constraint, N1′λ = 1, gives the DEA model under VRS:

\[
\min_{\theta,\lambda}\ \theta \quad \text{subject to} \quad -q_i + Q\lambda \ge 0, \quad \theta x_i - X\lambda \ge 0, \quad N1'\lambda = 1, \quad \lambda \ge 0 \tag{2}
\]

where N1 denotes a vector of ones. The value of θ in equation 2 gives an indication of pure technical efficiency while the corresponding value in equation 1 gives total efficiency, which comprises scale efficiency (SE) and pure TE. SE is derived as the ratio of the value of θ under the CRS assumption to that under VRS, that is, SE = θ_CRS / θ_VRS.
The SW double bootstrap approach considers the lack of a coherent DGP in the estimation of DEA as a limitation. The proponents of the double bootstrap approach contend that the DEA scores are estimated in a way that utilizes information on all the individuals in the sample, resulting in efficiency estimates that are serially correlated. The bootstrapping technique seeks to produce a DGP that mimics the true DGP and thereby to correct for the serial correlation associated with the DEA scores. Artificial efficiency scores are computed by simulation, from which bootstrapped coefficients and standard errors are produced. Confidence intervals are generated using the bootstrap results. In the second stage, further bootstrapping is carried out to generate new confidence intervals for the estimation. The SW approach uses truncated regression in the second stage estimation. A complete description of the double bootstrap technique is contained in Simar & Wilson (1998, 2000, 2007) and Nkegbe (2018).
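
The input-oriented linear programme described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used in the study: the function name `dea_input_oriented`, the use of SciPy's `linprog` solver and the four-farm toy data (one input, one output) are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import linprog


def dea_input_oriented(X, Q, vrs=False):
    """Input-oriented DEA scores (hypothetical helper, for illustration).

    X: inputs, shape (n_inputs, n_farms); Q: outputs, shape (n_outputs, n_farms).
    Set vrs=True to add the convexity constraint (sum of weights equals one).
    """
    n = X.shape[1]
    scores = np.empty(n)
    for i in range(n):
        # Decision vector is [theta, lambda_1, ..., lambda_n]; minimise theta.
        c = np.r_[1.0, np.zeros(n)]
        # Output constraint -q_i + Q lambda >= 0, written as -Q lambda <= -q_i.
        A_out = np.hstack([np.zeros((Q.shape[0], 1)), -Q])
        b_out = -Q[:, i]
        # Input constraint theta x_i - X lambda >= 0, written as X lambda - theta x_i <= 0.
        A_in = np.hstack([-X[:, [i]], X])
        b_in = np.zeros(X.shape[0])
        A_eq = np.r_[0.0, np.ones(n)].reshape(1, -1) if vrs else None
        b_eq = [1.0] if vrs else None
        res = linprog(c, A_ub=np.vstack([A_out, A_in]), b_ub=np.r_[b_out, b_in],
                      A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * (n + 1),
                      method="highs")
        scores[i] = res.fun
    return scores


# Toy data (made up): four farms, one input and one output each.
X = np.array([[2.0, 4.0, 6.0, 4.0]])
Q = np.array([[2.0, 6.0, 7.0, 3.0]])

te_crs = dea_input_oriented(X, Q)            # total TE: about [0.667, 1, 0.778, 0.5]
te_vrs = dea_input_oriented(X, Q, vrs=True)  # pure TE: [1, 1, 1, 0.625]
scale_eff = te_crs / te_vrs                  # SE = theta_CRS / theta_VRS
print(te_crs, te_vrs, scale_eff)
```

In the toy data the fourth farm has a CRS score of 0.5 and a VRS score of 0.625, so its scale efficiency is 0.8: part of its shortfall is attributable to operating at a non-optimal scale rather than to pure technical inefficiency.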

Second-stage DEA analysis
The regression equation estimated in the second stage was expressed as:

\[
\hat{\theta}_i = Z_i \beta + \varepsilon_i \tag{3}
\]

where \hat{\theta}_i is the calculated DEA score, Z_i is a vector of regressors, β represents unknown coefficients and ε_i is an error term. The empirical model of the second-stage regression (truncated, OLS and Tobit) was specified as follows:

Exogenous factors included in the model were chosen relying on a priori expectation and the existing literature. Gender influences the TE of smallholders due to differences in access to and ownership of production resources (Anang et al., 2016). Age also influences the TE of production according to the extant literature. Younger farmers may be more adventurous and more likely to take up new innovations, in sync with the observation of [...]. Where participation in off-farm work reduces the liquidity constraints of farmers and hence raises their capability to afford farm inputs, TE is expected to increase. However, if off-farm activity leads to withdrawal of labour from the farm, then TE is expected to decline. In the case of pests and diseases, higher incidence is expected to increase input use while reducing the output level, thereby decreasing the TE of farmers.
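
To make the difference between the three second-stage estimators concrete, the sketch below fits OLS, a Tobit model censored from above at one, and a truncated regression truncated from above at one. It is a minimal illustration under stated assumptions, not the estimation code behind Table 3: the data are simulated, the single regressor and all function names (`ols`, `tobit_nll`, `truncreg_nll`, `fit_mle`) are hypothetical, and the SW bootstrap steps around the truncated regression are left out.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def ols(Z, y):
    """Least-squares coefficients."""
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta


def tobit_nll(params, Z, y, upper=1.0):
    """Negative log-likelihood of a Tobit model censored from above at `upper`."""
    beta, sigma = params[:-1], np.exp(params[-1])
    mu = Z @ beta
    ll = np.where(y >= upper,
                  norm.logsf((upper - mu) / sigma),                 # censored scores
                  norm.logpdf((y - mu) / sigma) - np.log(sigma))    # uncensored scores
    return -ll.sum()


def truncreg_nll(params, Z, y, upper=1.0):
    """Negative log-likelihood of a regression truncated from above at `upper`."""
    beta, sigma = params[:-1], np.exp(params[-1])
    mu = Z @ beta
    ll = (norm.logpdf((y - mu) / sigma) - np.log(sigma)
          - norm.logcdf((upper - mu) / sigma))
    return -ll.sum()


def fit_mle(nll, Z, y):
    """Maximise a likelihood starting from the OLS solution; returns (beta, sigma)."""
    b0 = ols(Z, y)
    start = np.r_[b0, np.log(np.std(y - Z @ b0))]
    res = minimize(nll, start, args=(Z, y), method="BFGS")
    return res.x[:-1], np.exp(res.x[-1])


# Simulated scores (assumption): intercept 0.8, slope 0.15, noise s.d. 0.1.
rng = np.random.default_rng(0)
n = 400
z = rng.normal(size=n)
Z = np.column_stack([np.ones(n), z])
latent = 0.8 + 0.15 * z + 0.1 * rng.normal(size=n)

y_cens = np.minimum(latent, 1.0)   # scores piled up at the bound of one
keep = latent < 1.0                # truncated sample: inefficient units only

beta_ols = ols(Z, y_cens)
beta_tobit, sigma_tobit = fit_mle(tobit_nll, Z, y_cens)
beta_trunc, sigma_trunc = fit_mle(truncreg_nll, Z[keep], latent[keep])
print(beta_ols, beta_tobit, beta_trunc)
```

The Tobit likelihood treats scores equal to one as censored observations, whereas the truncated regression drops them and rescales the density of the remaining scores. OLS ignores the bound altogether.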

Summary statistics of the sample
The majority of the producers were male farmers, with a mean farm size of 1.7 hectares and a mean household size of 13 members (Table 1). In addition, only 11 per cent of the sample participated in a farmer group, which in recent times has gained prominence as a conduit for extension delivery and for access to information and production inputs by smallholders. Half of the respondents experienced pest and disease infestation during the cropping season, implying a likely loss of farm output or the use of additional chemical inputs for crop protection.

Technical efficiency analysis
The results of the traditional and double bootstrap DEA efficiency analyses are indicated in Table 2. The traditional DEA approach produced TE scores ranging between 0.35 and 1, compared to a range of 0.19 to 0.51 for the bootstrap DEA approach. Also, fewer farmers had very low efficiencies (less than 40 per cent) under the traditional DEA analysis, whereas fewer farmers had very high efficiencies (above 90 per cent) under the bootstrap DEA approach. The traditional approach identifies a large proportion of farmers (>33 per cent) as highly efficient (0.81-1.00), whereas the bootstrap approach finds that only 9.5 per cent of farmers are highly efficient. The result is attributed to the sensitivity of the DEA approach to outliers, which tends to push the efficiency estimates towards the maximum (Førsund & Sarafoglou, 2005). The DEA approach, unlike the stochastic frontier approach, does not handle noise and tends to treat noisy data as containing outliers, which pushes the DEA scores towards the maximum. The application of the bootstrapping technique, however, addresses this sensitivity and produces DEA scores that are relatively lower in magnitude.
The implication of the result is that when estimating TE using DEA, researchers need to take into account how the sensitivity of the DEA approach to outliers affects the efficiency scores. Since most real data sets contain some element of noise, the traditional DEA approach is likely to overestimate the DEA scores. The use of the bootstrap technique provides more conservative results that are less affected by this sensitivity.
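
The arithmetic behind the lower bootstrap scores can be illustrated with the standard bootstrap bias correction, in which the estimated bias (the mean of the bootstrap replicates minus the original score) is subtracted from the original score. The sketch below shows only this one step, under the assumption that bootstrap replicates are already available; the helper name `bias_corrected` and the numbers are made up, and the smoothing and resampling steps of the SW algorithm are not reproduced.

```python
import numpy as np


def bias_corrected(theta_hat, theta_boot):
    """Subtract the bootstrap bias estimate from the original DEA scores.

    theta_hat: original scores, shape (n_farms,).
    theta_boot: bootstrap replicates, shape (n_replicates, n_farms).
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    theta_boot = np.asarray(theta_boot, dtype=float)
    bias = theta_boot.mean(axis=0) - theta_hat   # estimated bias per farm
    return theta_hat - bias                      # equals 2 * theta_hat - bootstrap mean


# Made-up scores for two farms and three bootstrap replicates each.
theta_hat = np.array([0.70, 1.00])
theta_boot = np.array([[0.78, 1.10],
                       [0.82, 1.14],
                       [0.80, 1.12]])

corrected = bias_corrected(theta_hat, theta_boot)
print(corrected)  # [0.6  0.88]
```

Because the replicates lie above the original scores in this example, the estimated bias is positive and the corrected scores are lower, so even a farm on the original frontier ends up below one.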

Determinants of technical efficiency: effects of exogenous variables
In many empirical efficiency analyses, the determinants of efficiency assume a higher importance than the estimated efficiency scores due to the policy implications of the sources of inefficiency. Consequently, identifying the factors associated with (in)efficiency has become an integral part of efficiency analysis. The factors determining TE are indicated in Table 3. Farrell's (1957) input-oriented TE measure was used rather than Shephard's (1970) output distance function, a reciprocal of Farrell's approach. Hence, the signs of the coefficients are not reversed as in traditional stochastic frontier analysis.
The core question was whether the three estimators (OLS, Tobit and truncated regression models) provide similar results for the two-stage DEA estimation. The results show that the three estimators provide quite similar results for the second stage regression, although the first stage efficiency scores differ. The signs of the coefficients are quite similar; the estimators differ only in the degree to which some of the variables are significant in their effect on efficiency. The bootstrap DEA approach returned a significant value for credit access (albeit at 10 per cent) while the other estimators posted a non-significant value. Also, the OLS estimator posted a non-significant value for household size while both the bootstrap and Tobit estimators returned a significant value at 10 per cent.
The results indicate that women groundnut producers were more technically efficient than their male counterparts. Usually, women are considered to have lower efficiency because of their multiple roles in the household and imbalance in intra-household resource allocation (Anang et al., 2016; Abdulai et al., 2013), which affect their farm performance. Female farmers, however, have the potential to be technically efficient in production when provided with the required production inputs. Thus, the result of the study reiterates women's potential to be technically efficient in production.
TE increased with household size at the 10 per cent level for the double bootstrap and Tobit models. This means that an increase in household members correlates with higher TE of the household. Larger households are less likely to be labour-constrained and are thus able to carry out farm operations timeously and more effectively to enhance TE. A similar result was obtained by Ahmadu & Alufohai (2012) in an assessment of the TE of rice producers in Nigeria.
The result further portrayed a decrease in TE with cultivated area, implying that producers become more inefficient as their acreage increases. Smallholders typically cultivate small acreages and may lack the skills and managerial abilities to operate larger farms, which may account for the decrease in efficiency as farm size increases. The results however disagree with that of [...].
The results also showed that even though TE initially decreased with farming experience, it subsequently increased, indicating that efficiency rises as farmers become more experienced in production.
Participation in off-farm work was associated with lower TE, implying that off-farm engagement impacts negatively on farm efficiency. This could be due to a labour-loss effect, as agricultural labour is lost to off-farm activities, which could affect critical and timely farm operations. Other authors such as Nkegbe (2018) and Coelli et al. (2002) obtained a similar inverse association between off-farm work and TE in their studies in northern Ghana and Bangladesh, respectively.
Access to agricultural extension had a positive and significant influence on TE, which is in line with expectation. Extension workers play important roles in smallholder agriculture that help to improve efficiency of production. For example, extension agents in Ghana train farmers in modern farming practices, introduce producers to new innovations and assist farmers to form groups and access farm inputs. The result resonates with that of [...].
Access to credit had a negative association with TE at the 10 per cent level, and only in the case of the double bootstrap model. Low access to credit could account for the limited impact of credit on farmers' TE. Anang et al. (2016) observed that the TE of rice farmers in northern Ghana was not different between credit users and non-users. Nkegbe (2018) however showed that credit users had higher TE than non-users in maize cultivation in northern Ghana.
Although the traditional and bootstrap approaches produced different efficiency scores (70 per cent and 51 per cent respectively), the differences in scores do not seem to matter much if the focus is on the second stage results (the statistical significance of the factors explaining TE scores). This is an important result for applied DEA estimation. It implies that while the application of bias-correction (the bootstrapping technique) affects the magnitude of the TE estimates, it has little effect on the relationship between the efficiency scores and the exogenous factors influencing efficiency.

CONCLUSIONS AND POLICY IMPLICATIONS
The study compared three approaches for estimating the influence of exogenous factors on DEA scores using a data set on small-scale farming in northern Ghana. TE estimation using the double bootstrap and traditional DEA approaches produced different mean efficiency estimates: 70 per cent for the traditional DEA and 51 per cent for the SW double bootstrap approach. In particular, the double bootstrap approach adjusted the TE estimates downwards.
The result further revealed that the double bootstrap and traditional DEA approaches yielded practically similar results regarding the influence of exogenous variables on TE within a semi-parametric framework. The results showed that, due to the sensitivity of the DEA approach to outliers as outlined by other authors, the traditional DEA approach overestimated the efficiency scores. Researchers measuring TE using DEA estimation should therefore take into account how this sensitivity affects the efficiency scores. However, despite the differences in efficiency scores between the traditional and bootstrap methods, the influence of exogenous factors on efficiency did not differ materially across the approaches. The paper therefore demonstrated that bootstrapping largely affected the magnitude of the DEA estimates, but had little effect on the relationship between the efficiency scores and the exogenous factors influencing efficiency. Hence, for the purpose of identifying the sources of inefficiency in production, investigators may choose any of the three estimators as they yield comparable estimates. Where investigators choose to apply more than one estimator simultaneously, variables that are statistically significant in the second stage regressions could be identified as potential policy instruments.
With regard to the policy implications of the study's findings, it is recommended that more female farmers be encouraged to venture into groundnut production and that extension services be targeted at producers to improve their TE, in order to promote household food and income security. Groundnut is an important food and cash crop in the study area. Empowering more women to venture into groundnut production is therefore expected to enhance the income of women farmers, thereby improving household food and nutrition security. Extending extension services to smallholder farmers is essential to improve efficiency of resource use and farm performance in general. Extension service provision is also needed to increase the managerial abilities of producers. The results indicated that farmers became less efficient as their acreage increased, which suggests that farmers lacked the managerial and technical abilities to manage larger acreages. Access to extension services is one of the critical factors that have enabled small-scale farmers in developing countries to acquire such managerial and technical skills and to improve their level of production.