# SSB spending in Jamaica and its association with household budget allocation | BMC Public Health

Data for the analyses are from the Jamaican Household Expenditure Survey 2004-2005 (HES 2004-2005), which was collected between June 2004 and March 2005 by the Statistical Institute of Jamaica [20]. The HES 2004-2005 used a two-stage stratified random sampling design: the first stage was the selection of primary sampling units (PSUs), and in the second stage a number of households were selected from each PSU. A total of 12,012 households were selected to be interviewed over a ten-month period, and the response rate was 73.8%, yielding 8,865 households with a completed survey. There is no information on the characteristics of non-responding households. However, when the proportional sizes of certain population groups (e.g. 15+, 65+, etc.) in the survey are compared with United Nations estimates [21], the differences are absent or negligible. This may indicate that the resulting sample is representative of the Jamaican population. A total of 233 households (2.6% of the total sample) do not have complete expenditure information (the sum of expenditures across all categories is less than total household expenditure) and were excluded. Removing these observations from the statistical analyses does not significantly change the results (see Table S2 in Supplementary Material).
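The sample figures and the exclusion rule described above can be checked with a short sketch. The household records, field names and the helper `has_complete_expenditure` are hypothetical and only illustrate the rule "drop households whose category expenditures sum to less than the reported total"; the actual survey file layout is not described in the text.

```python
# Reported figures from the text: selected, completed and excluded households.
SELECTED = 12_012
COMPLETED = 8_865
EXCLUDED = 233

response_rate_pct = round(100 * COMPLETED / SELECTED, 1)
excluded_pct = round(100 * EXCLUDED / COMPLETED, 1)


def has_complete_expenditure(category_spend, reported_total, tol=1e-9):
    """Return True when the category amounts account for the reported total.

    A household is flagged as incomplete when the sum over categories
    falls short of its total expenditure (hypothetical field layout).
    """
    return sum(category_spend) >= reported_total - tol


# Toy records, invented for illustration only.
households = [
    {"id": 1, "categories": [120.0, 30.0, 50.0], "total": 200.0},  # complete
    {"id": 2, "categories": [80.0, 10.0], "total": 150.0},         # incomplete
    {"id": 3, "categories": [60.0, 40.0], "total": 100.0},         # complete
]

kept_ids = [
    h["id"]
    for h in households
    if has_complete_expenditure(h["categories"], h["total"])
]
```

With the reported counts, the computed response rate and exclusion share round to the 73.8% and 2.6% stated in the text.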

The data contain information on all expenditure categories; employment status of all household members; household income; and personal and household characteristics, such as age, gender, area of residence (urban/rural), household size, etc. For analytical purposes, household expenditures were classified into 16 categories: 1) Food (consumed at home); 2) Tea, coffee and cocoa; 3) SSB; 4) Water (not SSB); 5) Alcoholic beverages; 6) Tobacco; 7) Clothing and footwear; 8) Housing, water, electricity, gas and other fuels; 9) Furnishings, household equipment and routine housekeeping; 10) Health care; 11) Transportation; 12) Communications; 13) Leisure and culture; 14) Education; 15) Restaurants and hotels; 16) Other.

All categories except the first four follow the Classification of Individual Consumption by Purpose (COICOP) [22]. To perform the analysis, the first COICOP group was divided into four subgroups: “Food (consumed at home)”, “Tea, coffee and cocoa”, “SSB” (bottled or canned carbonated drinks, bottled or canned nectars and juices of several flavors, etc.) and “Water”. The COICOP groups “Miscellaneous goods and services”, “Taxes” and “Grants” were grouped into the category labelled “Other”. The exact codes considered for each category are listed in Table S1 of Supplementary Material.

Two models are estimated to characterize household decisions related to SSB expenditures. First, a generalized ordered probit model (GOPM) was estimated to examine the association between the decision to spend on SSB and socioeconomic variables. The dependent variable is ordinal and takes four possible values: 0 if the household does not buy SSB; 1 if the household spends a “low amount” on SSB; 2 if it spends an “average amount” on SSB; and 3 if it spends a “high amount” on SSB. The expenditure categories for SSB are *ad hoc* and constructed using the tertiles (each covering about 33% of the distribution) of total household spending on SSB. GOPMs are more parsimonious than probit models when the data are ordered [23], as is the case here. In addition, GOPMs do not have to satisfy the parallel lines assumption that ordered probit models (OPMs) must satisfy [23]. A likelihood ratio test of the parallel lines hypothesis is performed to choose between them [24]. The hypothesis is rejected at the 1% significance level (results not shown but available from the authors). Therefore, the GOPM is preferred over the OPM.
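The four-level dependent variable described above can be sketched as follows. This is a minimal illustration, assuming that the tertile cut points are computed among households with positive SSB spending (the text does not say whether non-buyers enter the tertile calculation) and that survey weights are ignored. The function name `ssb_category` and the toy spending values are hypothetical.

```python
from statistics import quantiles


def ssb_category(spend, cut_low, cut_high):
    """Map SSB spending to the ordinal outcome 0-3.

    0 = no SSB purchase, 1 = low, 2 = average, 3 = high amount.
    """
    if spend <= 0:
        return 0
    if spend <= cut_low:
        return 1
    if spend <= cut_high:
        return 2
    return 3


# Toy SSB spending per household, invented for illustration.
ssb_spend = [0.0, 0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0]

# Assumption: tertiles are taken over purchasing households only.
positive = [s for s in ssb_spend if s > 0]
cut_low, cut_high = quantiles(positive, n=3)  # two cut points -> three groups

codes = [ssb_category(s, cut_low, cut_high) for s in ssb_spend]
```

In a weighted survey analysis the cut points would normally be weighted quantiles, so the unweighted `quantiles` call here is a simplification.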

The functional form of these models has been described in detail elsewhere [24]. In our case, the independent variables include the area of residence of the household (urban or rural); the sex, age and age squared of the head of household; the natural logarithm of household size; the proportion of women in the household; the proportion of children in the household (under 15); a dichotomous variable taking the value one if there is at least one employed member in the household; and the natural logarithm of total household expenditure.
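The covariates listed above could be assembled from a household roster along these lines. This is only a sketch: the record layout, the variable names and the function `household_covariates` are hypothetical, the first roster entry is assumed to be the head of household, and "proportion of women" is assumed to count all female members.

```python
import math


def household_covariates(members, urban, total_expenditure):
    """Build the regressors described in the text for one household.

    members: list of dicts with keys 'age', 'female', 'employed';
             the first entry is assumed to be the household head.
    """
    head = members[0]
    size = len(members)
    return {
        "urban": 1 if urban else 0,
        "head_female": 1 if head["female"] else 0,
        "head_age": head["age"],
        "head_age_sq": head["age"] ** 2,
        "log_hh_size": math.log(size),
        "share_women": sum(1 for m in members if m["female"]) / size,
        "share_children": sum(1 for m in members if m["age"] < 15) / size,
        "any_employed": 1 if any(m["employed"] for m in members) else 0,
        "log_total_exp": math.log(total_expenditure),
    }


# Toy household, invented for illustration.
roster = [
    {"age": 40, "female": True, "employed": True},    # head
    {"age": 12, "female": False, "employed": False},
    {"age": 8, "female": True, "employed": False},
    {"age": 70, "female": False, "employed": False},
]
covariates = household_covariates(roster, urban=True, total_expenditure=1000.0)
```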

The second model estimates the statistical association between the decision to purchase SSB and the budget allocation to other goods and services. This is a reduced form based on a budget-constrained household utility maximization model, which assumes that households first determine what proportion of their budget to allocate to a certain product (e.g. SSB) and then determine the proportion allocated to other budget categories and, subsequently, products. In this case, it is important to determine whether the product considered first in the budget allocation is weakly separable from the consumption of other products. Weak separability would imply that the consumption of such a product generates only an income effect (it decreases the absolute consumption of other products only because spending on it lowers the net budget) and has no substitution effect (consumption of such a product does not modify the marginal rate of substitution between other products) [25, 26]. Concretely, weak separability would imply that households buying SSB would have the same budget allocation as comparable households not buying SSB, if they were faced with similar market conditions.

The weak separability assumption of SSB can be considered by estimating a system of equations in which each individual equation takes the form:

$${w}_{ih}=\alpha +\beta \cdot {SSB}_{h}+\gamma \cdot {X}_{h}+{\varepsilon }_{ih}$$

where \({w}_{ih}\) is the share of total household expenditure allocated to good/service *i* by household *h*, and *i* can be any of the specified categories of goods/services; \({SSB}_{h}\) is a dichotomous variable that takes the value one if household *h* has positive spending on SSB; and \({X}_{h}\) is a set of socio-demographic variables for household *h*. These are the same variables included in the vector *X* for the GOPM. Since the budget shares across categories must add up to one (the adding-up restriction), one category has to be arbitrarily dropped [25]. In this case, we drop the “Other” category and estimate a 14-equation system (the 16 categories defined above, except for “SSB” and “Other”).
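The construction of the budget shares, the SSB purchase indicator and the set of dependent variables can be sketched as follows. The category labels are shortened and the amounts are invented; the function `budget_shares` is a hypothetical helper, not part of the original analysis.

```python
def budget_shares(spend_by_category):
    """Return each category's share of total household expenditure."""
    total = sum(spend_by_category.values())
    if total <= 0:
        raise ValueError("total household expenditure must be positive")
    return {name: amount / total for name, amount in spend_by_category.items()}


# Toy household with four of the sixteen categories, for illustration only.
household = {"Food": 500.0, "SSB": 50.0, "Housing": 300.0, "Other": 150.0}

shares = budget_shares(household)

# Dichotomous regressor: one if the household has positive SSB spending.
ssb_buyer = 1 if household["SSB"] > 0 else 0

# Shares add up to one, so one equation is redundant; following the text,
# "Other" is dropped, and "SSB" enters only through the indicator.
dependent_shares = {
    name: w for name, w in shares.items() if name not in ("SSB", "Other")
}
```

Because the shares sum to one by construction, the dropped equation can always be recovered from the remaining ones, which is why the choice of the omitted category is arbitrary.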

A positive *β* for category *i* means that purchasing SSB is associated with a larger share of expenditure devoted to that category of goods/services. Conversely, a negative *β* indicates that spending on SSB is associated with a smaller share of expenditure devoted to that category.

Because households decide how to allocate their budget across categories simultaneously, the errors (\({\varepsilon }_{ih}\)) of the equations in the system may be correlated with one another (contemporaneous correlation). To account for this, the system was estimated using seemingly unrelated regression equations (SURE), as recommended in similar studies [19, 25,26,27,28]. SURE estimation allows the estimation of a system of equations in which the errors of each equation can be correlated with the errors of the other equations, an assumption that is reasonable in a context where budget allocations are made (almost) simultaneously, given some budget constraint [29].

When SURE is used with the same independent variables in all equations, the resulting coefficients are identical to, and can be interpreted as, those obtained from a set of ordinary least squares (OLS) regressions estimated independently. When the same SURE is re-estimated without the log of household size, the proportion of women and the proportion of children aged 15 and under as regressors in the equations for the budget shares of “Housing, water, electricity, etc.”, “Furniture, household equipment, etc.” and “Communication” (because these expenditures exhibit intra-household economies of scale in consumption), the results (presented in Table S3 of the Supplementary Material) remain qualitatively unchanged.
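The equivalence between SURE and equation-by-equation OLS under identical regressors is a standard textbook result, and a short derivation may help. The notation below is introduced only for this sketch and is not the authors': \(Z\) is the \(N \times K\) regressor matrix shared by all equations (constant, SSB indicator and \(X_h\)), \(M\) is the number of equations, \(y\) stacks the \(M\) share vectors, \(\theta\) stacks the coefficients, and \(\Sigma\) is the \(M \times M\) covariance matrix of the errors across equations. Survey weights are ignored.

$$y = (I_M \otimes Z)\,\theta + \varepsilon, \qquad \operatorname{Var}(\varepsilon) = \Sigma \otimes I_N$$

$$\hat{\theta}_{GLS} = \left[(I_M \otimes Z)'(\Sigma^{-1} \otimes I_N)(I_M \otimes Z)\right]^{-1}(I_M \otimes Z)'(\Sigma^{-1} \otimes I_N)\,y$$

$$\hat{\theta}_{GLS} = \left[\Sigma^{-1} \otimes Z'Z\right]^{-1}(\Sigma^{-1} \otimes Z')\,y = \left(\Sigma \otimes (Z'Z)^{-1}\right)(\Sigma^{-1} \otimes Z')\,y = \left(I_M \otimes (Z'Z)^{-1}Z'\right)y$$

The last expression applies the OLS formula \((Z'Z)^{-1}Z'\) to each equation separately, so \(\Sigma\) drops out of the point estimates. The cancellation relies on every equation having the same \(Z\); once regressors are removed from some equations only, the estimates generally differ from OLS.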

The estimation of both models takes into account the structure of the sampling design and the sampling weights. Models are estimated using Stata 16.1/MP. All methods were performed in accordance with current guidelines and regulations.