The purpose of harmonising physical behaviour data is to facilitate quantitative interpretation of diverse measurements of the same underlying latent behavioural construct. These measurements could come from studies using different physical activity questionnaires or different wearable devices for assessing behaviour. The process of data harmonisation depends on the original methods used, how diverse they are, and the availability of validation data.
The simplest form of data harmonisation is unit conversion, for example converting energy expenditure from calories to joules or body segment acceleration from gravitational units (g) to m.s-2.
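As a minimal sketch of unit conversion, the Python functions below convert kilocalories to kilojoules and gravitational units to m.s-2. The function names are illustrative. The energy factor assumes the thermochemical calorie (4.184 J); other calorie definitions differ slightly, so the definition used by the source data should be checked before converting.

```python
# Conversion factors. KJ_PER_KCAL assumes the thermochemical calorie;
# MS2_PER_G is the conventional value of standard gravity.
KJ_PER_KCAL = 4.184
MS2_PER_G = 9.80665


def kcal_to_kj(kcal: float) -> float:
    """Convert energy expenditure from kilocalories to kilojoules."""
    return kcal * KJ_PER_KCAL


def g_to_ms2(g_units: float) -> float:
    """Convert acceleration from gravitational units (g) to m/s^2."""
    return g_units * MS2_PER_G


# Hypothetical example values.
energy_kj = kcal_to_kj(100.0)
acceleration_ms2 = g_to_ms2(0.5)
```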
The reductionist approach converts measurements from different methods to the least common denominator. For example, if Study A reports a continuous estimate of PAEE and Study B reports three categories of PAEE (below 30 kJ.day-1.kg-1, 30-50 kJ.day-1.kg-1, and above 50 kJ.day-1.kg-1), then a new categorical PAEE variable can be derived in Study A using the cut-off values from Study B. This reduces the granularity of information in Study A, but it allows data from Studies A and B to be analysed together as categories of PAEE.
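The reductionist approach can be sketched in Python as follows. The PAEE values for Study A are hypothetical, and the handling of the boundaries is an assumption: values of exactly 30 or 50 kJ.day-1.kg-1 are placed in the middle category, which should be checked against how Study B defined its categories.

```python
def paee_category(paee: float, low: float = 30.0, high: float = 50.0) -> str:
    """Assign a continuous PAEE value (kJ/day/kg) to one of three categories.

    Assumption: both cut-offs are inclusive for the middle category.
    """
    if paee < low:
        return "below_30"
    if paee <= high:
        return "30_to_50"
    return "above_50"


# Hypothetical continuous PAEE values from Study A (kJ/day/kg).
study_a_paee = [22.5, 30.0, 41.2, 50.0, 63.8]

# New categorical variable in Study A, comparable with Study B.
study_a_categories = [paee_category(value) for value in study_a_paee]
```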
The reverse approach may also be used; this involves summarising the continuous PAEE data (e.g., as mean and SD) within each of the categories generated in Study A, and then applying those summary statistics to the corresponding categories in Study B.
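A minimal sketch of this reverse approach is shown below, using hypothetical data and category labels. Continuous PAEE in Study A is summarised within each category, and each Study B participant is then assigned the Study A mean for their reported category. This assumes that the within-category distributions are similar in the two studies.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical continuous PAEE values from Study A (kJ/day/kg).
study_a_paee = [20.0, 26.0, 35.0, 45.0, 40.0, 55.0, 65.0]


def category(paee: float) -> str:
    """Categorise PAEE with the Study B cut-offs (boundary handling assumed)."""
    if paee < 30.0:
        return "below_30"
    if paee <= 50.0:
        return "30_to_50"
    return "above_50"


# Group the Study A values by category.
groups = defaultdict(list)
for value in study_a_paee:
    groups[category(value)].append(value)

# Summarise each category as mean and SD (stdev needs at least two values).
summary = {
    label: {"mean": mean(values), "sd": stdev(values)}
    for label, values in groups.items()
}

# Hypothetical categorical data from Study B.
study_b_categories = ["30_to_50", "above_50", "below_30"]

# Assign each Study B participant the Study A mean for their category.
study_b_paee = [summary[label]["mean"] for label in study_b_categories]
```

The category SD is retained in the summary so that the uncertainty of the assigned values can be carried into later analyses.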
Physical activity data from two or more distinct sources may also be harmonised through validation against an external criterion measure. For example, the output of different accelerometers could be compared using the values measured at a certain walking speed, or using the relationship between average accelerometer output and PAEE from DLW during free-living.
Ideally, primary exposure or outcome variables would be robustly and directly validated against their respective criterion measures. However, such validation work has often not been conducted, or has only been conducted on a small scale or in population settings that differ from those where the methods have been deployed. This is where indirect validation can help with the interpretation of measurements and the harmonisation of different types of measurements.
Indirect validation refers to the principle that a method is compared to the gold-standard criterion via another method; in other words, it is not a direct comparison to the criterion. This principle is also referred to as network harmonisation, since it is possible to construct a network of inter-relationships between several methods. Such a network allows estimates from one method to be converted to those of another method elsewhere in the network, as long as the two are connected either directly or indirectly. Figure P.4.1 shows such a network of connections between wearables.
Figure P.4.1. Network of wearables
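The idea of converting estimates along a path through such a network can be sketched in Python as below. This is an illustration under strong assumptions, not the method behind Figure P.4.1: the method names and coefficients are hypothetical, each link is assumed to be a simple linear equation (target = intercept + slope * source), and the reverse direction is obtained by algebraic inversion, which is not equivalent to fitting the regression in the other direction. Prediction error, which accumulates along the path, is ignored.

```python
from collections import deque

# Hypothetical validation equations: target = intercept + slope * source.
links = {
    ("device_A", "device_B"): (2.0, 1.5),
    ("device_B", "criterion"): (5.0, 0.8),
}


def build_graph(links):
    """Build a bidirectional graph of linear conversions between methods."""
    graph = {}
    for (source, target), (intercept, slope) in links.items():
        graph.setdefault(source, []).append((target, intercept, slope))
        # Simplification: reverse link by algebraic inversion of the equation.
        graph.setdefault(target, []).append((source, -intercept / slope, 1.0 / slope))
    return graph


def convert(value, source, target, graph):
    """Convert a value along the shortest chain of links (breadth-first search)."""
    queue = deque([(source, value)])
    seen = {source}
    while queue:
        node, current = queue.popleft()
        if node == target:
            return current
        for neighbour, intercept, slope in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, intercept + slope * current))
    raise ValueError(f"No path between {source} and {target}")


graph = build_graph(links)

# device_A has no direct link to the criterion; it is reached via device_B.
criterion_estimate = convert(10.0, "device_A", "criterion", graph)
```

Breadth-first search is used so that the conversion follows the path with the fewest links, on the assumption that each additional link adds error; a real application would more plausibly weight links by the quality of the underlying validation data.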
This section outlines three case studies illustrating some of the harmonisation concepts described on the general data harmonisation page.
The first example is from the InterConnect consortium, which has used existing data from different study cohorts across the world to study the relationship between physical activity during pregnancy and anthropometry outcomes in the offspring.
The second example is a comparison of three different harmonisation algorithms, derived from direct validation between self-reported and device-based physical activity, used to study the association between physical activity and diabetes.
The third example illustrates the principle of network harmonisation, using validation data in which the data to be harmonised are not linked directly to criterion data but are linked indirectly through other data sources.