Eora Quality Reports

One challenge in building a globally balanced MRIO is that some portions of the system are underdetermined (values must be estimated or inferred) while other parts are overdetermined (different data sources provide conflicting values). Thus, all MRIOs are to some degree modeled (read more on this in our Eora Confidence whitepaper). The build process used to create Eora generates a number of quality check reports. Additionally, there have been several intercomparison efforts to measure convergence across the major IO databases (Eora, EXIOBASE, WIOD, GTAP, and OECD).

MRIO Intercomparison

A special issue of Economic Systems Research contained several papers that investigated model intercomparison and the importance of sectoral (dis)aggregation in the calculation of environmental footprints. At that time, we assembled top-level results from each of the major MRIOs and prepared a comparison of results across the models. Also in that special issue, we note the paper by Steen-Olsen et al. (2014), which clearly shows that high sector detail provides more accurate carbon footprints than more aggregate models.

View the Eora/MRIO intercomparison results

Sensitivity Analysis

For the paper "Convergence Between the Eora, WIOD, EXIOBASE, and OpenEU's Consumption-Based Carbon Accounts", we conducted a Monte Carlo sensitivity analysis of the carbon footprint calculation using Eora.

View the Eora sensitivity analysis results

Eora Quality Statistics

IO tables must be balanced: each sector's total outputs (row sum) must equal the sum of its inputs (column sum). Given disparate and conflicting input data, achieving a perfectly balanced table is not always possible. These reports document how well each national IO table within Eora is balanced.
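The balance condition can be illustrated with a small sketch. It assumes a toy two-sector table split into inter-industry transactions `Z`, final demand `Y` and value added `V`; these names and the numbers are illustrative only and do not reflect the layout of the Eora files.

```python
import numpy as np

def balance_ratios(Z, Y, V):
    """Ratio of total output (row sum) to total input (column sum) per sector.

    Z: inter-industry transactions (n x n)
    Y: final demand (n x k)
    V: value added / primary inputs (m x n)
    A perfectly balanced sector has a ratio of exactly 1.
    """
    total_output = Z.sum(axis=1) + Y.sum(axis=1)  # row sums
    total_input = Z.sum(axis=0) + V.sum(axis=0)   # column sums
    return total_output / total_input

# Hypothetical balanced two-sector table.
Z = np.array([[10.0, 20.0],
              [30.0, 5.0]])
Y = np.array([[70.0],
              [65.0]])
V = np.array([[60.0, 75.0]])
ratios = balance_ratios(Z, Y, V)

# Changing a single transaction breaks the balance of both sectors it touches.
Z_unbalanced = Z.copy()
Z_unbalanced[0, 1] = 25.0
ratios_unbalanced = balance_ratios(Z_unbalanced, Y, V)
print(ratios, ratios_unbalanced)
```

A ratio above 1 means the sector sells more than it buys in the table, and a ratio below 1 means the opposite.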

The GNI/GDP report shows how well the National Accounting Balance identity is realized for each country in each year. This identity says that Gross Domestic Product (GDP) plus imports should equal Gross National Expenditure (GNE) plus exports.

This histogram shows how well the IO table satisfies constraints posed by externally given raw data. The x axis shows the absolute deviations |GP-C| of the constraint realisations GP, computed from the MRIO transactions P, from the prescribed constraints C, as base-10 logarithms of values in '000 US$. An x-axis value of 6 means that the MRIO states a value GP, whilst an external data source says this value should really be C = GP±1,000,000 '000 US$, because 10⁶ = 1,000,000, and the units are '000 US$. The y axis shows frequency counts.

This histogram shows how well the IO table satisfies constraints posed by externally given raw data. The x axis shows the base-10 logarithm of the relative deviations |GP-C|/σ_C of the constraint realisations GP from the prescribed constraints C. An x-axis value of -1 means that the IO table states a value GP, whilst an external data source says this value should really be C = GP±0.1 σ_C, where σ_C is the standard deviation of the external value C. In other words, in this example, the MRIO would be "0.1 standard deviations away" from the value prescribed by an external data source. The y axis shows frequency counts. Note that the term "standard deviation" is - in a strict statistics sense - not applicable to the uncertainty information about the Eora tables. This is because the sigma values are not calculated using a full covariance matrix describing MRIO elements as dependent variables. On this issue, see the FAQ. Irrespective of this statistical technicality, we will refer to the values describing the uncertainty in the Eora tables as "standard deviations".
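As a rough sketch of how the x-axis values of these two histograms could be computed, the snippet below takes a hypothetical constraint matrix `G`, a table vector `P`, constraint values `C` and their standard deviations `sigma_C`. All numbers are invented, and the bin edges are an arbitrary choice rather than those used in the reports.

```python
import numpy as np

def log_deviations(G, P, C, sigma_C):
    """Base-10 logs of the absolute and relative constraint deviations."""
    realised = G @ P                 # constraint realisations GP
    abs_dev = np.abs(realised - C)   # |GP - C| in '000 US$
    rel_dev = abs_dev / sigma_C      # |GP - C| / sigma_C
    return np.log10(abs_dev), np.log10(rel_dev)

# Hypothetical example: three table elements, two aggregate constraints.
G = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
P = np.array([400.0, 600.0, 1000.0])   # '000 US$
C = np.array([1100.0, 1590.0])         # values from external sources
sigma_C = np.array([1000.0, 100.0])

log_abs, log_rel = log_deviations(G, P, C, sigma_C)

# Frequency counts per unit-wide bin, as plotted on the y axis.
counts, edges = np.histogram(log_abs, bins=np.arange(-1, 8))
print(log_abs, log_rel, counts)
```

Here the first constraint is missed by 100 '000 US$ (x-axis value 2) and both constraints are 0.1 standard deviations away from their realisations (x-axis value -1). A constraint that is met exactly has a deviation of zero and would need special handling before taking the logarithm.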

This histogram shows how much IO table transactions have moved from the initial estimate to the optimised IO table during the optimisation process. The x axis shows the base-10 logarithm of the relative changes |P-P₀|/P₀ of IO transactions P. An x-axis value of -1 means that the optimised IO value differs from its initial estimate P₀ by 0.1 P₀, that is, by 10% of the initial value. The y axis shows frequency counts.

This histogram shows how well the Eora tables satisfy constraints posed by balance rules. One example of a balance rule is that the sum of elements across a basic-price table row must equal the sum of elements across the corresponding table column plus taxes less subsidies on products. An IO table should always be balanced. The optimizer attempts to reconcile conflicting data into a balanced table. This chart shows how well the MRIO table is balanced prior to the addition of the "Rest of World". The Rest of World country absorbs these post-balancing statistical discrepancies to create a perfectly balanced MRIO.

This histogram shows how well the table of standard deviations σ_P of MRIO elements P satisfies the error propagation condition imposed by the estimates of the standard deviation σ_C of constraints C. The x axis shows the absolute deviations |√(Gσ_P²)-σ_C| of the propagated MRIO standard deviations σ_P from prescribed standard deviations σ_C of constraints C, expressed in logarithms of US$. An x-axis value of 3 means that the MRIO standard deviations propagate to a value of √(Gσ_P²), whilst they should really propagate to σ_C = √(Gσ_P²)±1,000 '000 US$. The y axis shows frequency counts.

For more information on how the error propagation condition σƒ(x,y,z)=√[ (∂ƒ/∂x*σ_x)² + (∂ƒ/∂y*σ_y)² + (∂ƒ/∂z*σ_z)² ] can be used to derive the standard deviation σƒ of a function ƒ(x,y,z) from the standard deviations σ_x, σ_y, σ_z of its arguments x, y, z, see

Lenzen, M., Wood, R., Wiedmann, T. (2010) Uncertainty Analysis for Multi-Region Input-Output Models - A Case Study of the UK's Carbon Footprint. Economic Systems Research 22(1), pp. 43-63. doi:10.1080/09535311003661226

In our case, the function is ƒ(P)=GP, with known σƒ=σ_C, and the error propagation formula is fitted to derive the σ_P.
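For a linear function ƒ(P)=GP the partial derivatives are simply the entries of G, so the propagated standard deviation of constraint i is √(Σ_j G_ij² σ_Pj²). When G contains only zeros and ones, G² = G and this reduces to the √(Gσ_P²) notation used above. The sketch below evaluates that forward direction only, on invented numbers, and assumes independent elements; it does not attempt the fitting step that derives σ_P from σ_C.

```python
import numpy as np

def propagate_sigma(G, sigma_P):
    """Propagate element standard deviations through the linear map f(P) = G P.

    Assumes the elements of P are independent (no covariance terms).
    """
    return np.sqrt((G ** 2) @ (sigma_P ** 2))

# Hypothetical 0/1 aggregation matrix: each constraint sums two table elements.
G = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
sigma_P = np.array([3.0, 4.0, 5.0, 12.0])   # '000 US$

sigma_prop = propagate_sigma(G, sigma_P)    # sqrt(9 + 16), sqrt(25 + 144)

# Prescribed constraint standard deviations (invented for the example).
sigma_C = np.array([15.0, 13.1])

# x-axis value of the histogram: log10 of |sqrt(G sigma_P^2) - sigma_C|.
log_gap = np.log10(np.abs(sigma_prop - sigma_C))
print(sigma_prop, log_gap)
```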

During the optimisation, or matrix balancing, process of the Eora MRIO tables, elements that are supported by only a few raw data points, and hence restricted by only a few constraints, can be subject to large adjustments, and hence their reliability is low. On the other hand, for virtually all large and important IO table elements, there exist supporting raw data, so that the adjustment of these elements is usually minimal, and hence their reliability is high. You can verify this feature in any of the many "hillside" or "hockey stick" graphs. These show that large transactions in the IO table (in vector notation, P) are relatively well constrained: their relative standard deviations (σ_rel) are small, whereas those of small transactions are large.

The same feature can be verified in any of the many "rocket" graphs. These graphs consider the constraint realisations GP. Because the externally fixed raw data C conflict, the optimiser can generally not find a solution P where the realisations GP perfectly match all constraints C; it can only find a solution that is optimal in some defined sense. In other words, conflicting raw data will result in discrepancies between constraints C and their realisations GP. However, these discrepancies are only large for unimportant small constraints. The thin "rocket tip" shows that for large constraints, constraint values are realised very well in the Eora tables.

This image shows the absolute adherence to the constraints in standard deviations.

This histogram shows the size distribution of the initial estimate. Note that the Eora IO tables are in units of '000 US$, so an x-axis value of 6 means a transaction size of US$ 1,000,000,000. The y axis shows frequency counts. Note that this graph is usually not presented for satellite accounts since satellite indicators use a mixture of units.

Uncertainty in the Eora tables is expressed in terms of standard deviations. The relative standard deviation σ_rel(P_0i) of an initial estimate transaction P_0i is defined as the ratio σ(P_0i)/P_0i of the absolute standard deviation σ(P_0i) of this transaction and the transaction value P_0i.

A well-behaved graph will exhibit a "hockey stick" or "hillside" shape, indicating that small transactions are less well known, whilst large transactions are better known, because they are usually based on a sum of many small transactions. For more information on this behaviour read the Eora Confidence whitepaper.
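The reason a sum of many small transactions is better known than each of its parts can be sketched with the standard error propagation rule for sums. The example assumes independent errors and identical components, which is a simplification made for illustration only.

```python
import math

def relative_sd_of_sum(values, rel_sds):
    """Relative standard deviation of a sum of independent components."""
    abs_sds = [v * r for v, r in zip(values, rel_sds)]
    total = sum(values)
    total_sd = math.sqrt(sum(s * s for s in abs_sds))  # errors add in quadrature
    return total_sd / total

# One small transaction known to within 50%.
small = relative_sd_of_sum([10.0], [0.5])

# A large transaction built from 100 such components.
large = relative_sd_of_sum([10.0] * 100, [0.5] * 100)

print(small, large)
```

With n equal, independent components the relative standard deviation falls by a factor of √n, here from 0.5 to 0.05, which is the downward slope of the "hillside".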

This histogram shows the size distribution of the optimised IO table. Note that the Eora IO tables are in units of '000 US$, so an x-axis value of 6 means a transaction size of US$ 1,000,000,000. The y axis shows frequency counts. Note that this graph is usually not presented for satellite accounts since satellite indicators use a mixture of units.

The relative standard deviation σ_rel(P_i) of an IO table transaction P_i is defined as the ratio σ(P_i)/P_i of the absolute standard deviation σ(P_i) of this transaction to the transaction value P_i.

This histogram shows the size distribution of the constraints posed by the raw data collected from a multitude of data sources. Note that the Eora IO tables are in units of '000 US$, so an x-axis value of 6 means a constraint value of US$ 1,000,000,000. The y axis shows frequency counts.

The relative standard deviation σ_rel(C_i) of a constraint C_i on the IO table is defined as the ratio σ(C_i)/C_i of the absolute standard deviation σ(C_i) to the constraint value C_i. The graph shows that small data points are typically not well known, whilst large data points are well known, because they are usually based on a sum of many small transactions. For more information on this behaviour read the Eora Confidence whitepaper.

MRIO Reports

To build the Eora table, we begin by assembling an initial table for 2000 using the best available data for that year. This is called the initial estimate. (This is not really an estimate because it is carefully assembled from the best available data, but mathematically it is the initial estimate of what the final MRIO table will be after all other conflicting data sources are considered.) For many sections of this table, several alternative data sources exist. For example, both the UN and a national statistical agency may supply slightly different IO tables. Or, to take another example, a set of transactions that should balance in fact does not. These alternative data sources and balancing rules are called constraints since they constrain the values of the elements in the table.

This initial estimate plus set of constraints is run through optimisation software. The optimiser finds compromise values which least disturb the initial table while respecting any conflicting constraints. This page reports on the performance and effects of the optimiser.
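The idea of a compromise that least disturbs the initial table while respecting conflicting constraints can be sketched as a generalised least-squares problem. This is an illustration under stated assumptions, not the optimiser used to build Eora: it assumes independent uncertainties, treats every constraint as soft, and minimises the sum of squared adjustments and squared constraint residuals, each scaled by its standard deviation. The function name `reconcile` and all numbers are hypothetical.

```python
import numpy as np

def reconcile(p0, sigma_p, G, c, sigma_c):
    """Compromise between an initial estimate and conflicting constraints.

    Minimises  sum(((p - p0) / sigma_p)**2) + sum(((G p - c) / sigma_c)**2),
    whose closed-form solution is
        p = p0 + Sp G^T (G Sp G^T + Sc)^-1 (c - G p0)
    with Sp = diag(sigma_p**2) and Sc = diag(sigma_c**2).
    """
    Sp = np.diag(sigma_p ** 2)
    Sc = np.diag(sigma_c ** 2)
    residual = c - G @ p0
    return p0 + Sp @ G.T @ np.linalg.solve(G @ Sp @ G.T + Sc, residual)

# Two table elements; two sources disagree about their total (114 vs 128).
p0 = np.array([40.0, 60.0])
sigma_p = np.array([10.0, 10.0])
G = np.array([[1.0, 1.0],
              [1.0, 1.0]])
c = np.array([114.0, 128.0])
sigma_c = np.array([10.0, 20.0])   # the first source is trusted more

p = reconcile(p0, sigma_p, G, c, sigma_c)
print(p, G @ p)
```

In this toy case the realised total of 112 lies between the initial estimate (100) and the two reported totals, and closer to the source with the smaller standard deviation.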

These reports show which constraints have been best, and least, respected in the final solution.

The Adherence Reports show which constraints are best respected in the final result. These are the data sources with the strongest influence on the results. Ideally, these reports should show that national agency data are well respected in relative value (multiples of standard deviation). The Violation Reports show which constraints were least respected by the optimizer. This occurs when a data source significantly disagrees with others. Ideally, the degree of violation is small; that is, the difference between the True RHS (the value reported by the data source) and the Realized RHS (the best compromise value achieved by the optimizer) is small. (RHS stands for Right-hand Side, or the vector c in the generalized optimization problem Gp=c.)

Each of these reports is offered in both absolute value and relative value (number of standard deviations). Start by checking the reports in absolute value terms. The reports in relative terms can be useful but can also be misleading, since the top violators in relative terms often have small absolute values and are thus of less importance to the overall results.
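The difference between the absolute and relative rankings can be seen in a small sketch. The record layout, the constraint labels and the numbers below are hypothetical; real labels follow the Constraint Nomenclature and the real reports may be organised differently.

```python
from dataclasses import dataclass

@dataclass
class Constraint:
    label: str           # hypothetical label, not the real nomenclature
    true_rhs: float      # value reported by the data source ('000 US$)
    realised_rhs: float  # value achieved in the final table ('000 US$)
    sigma: float         # standard deviation of the constraint

    @property
    def abs_dev(self) -> float:
        return abs(self.realised_rhs - self.true_rhs)

    @property
    def rel_dev(self) -> float:
        return self.abs_dev / self.sigma

def violation_report(constraints, relative=False, top=10):
    """Least respected constraints first."""
    key = (lambda c: c.rel_dev) if relative else (lambda c: c.abs_dev)
    return sorted(constraints, key=key, reverse=True)[:top]

def adherence_report(constraints, relative=False, top=10):
    """Best respected constraints first."""
    key = (lambda c: c.rel_dev) if relative else (lambda c: c.abs_dev)
    return sorted(constraints, key=key)[:top]

constraints = [
    Constraint("national total output", 1_000_000.0, 1_020_000.0, 50_000.0),
    Constraint("small sector import", 50.0, 80.0, 5.0),
]

worst_abs = violation_report(constraints)[0]
worst_rel = violation_report(constraints, relative=True)[0]
print(worst_abs.label, worst_rel.label)
```

The small import flow tops the relative ranking at 6 standard deviations, yet it is off by only 30 '000 US$, whereas the national total is off by 20,000 '000 US$ while staying within 0.4 standard deviations.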

The constraint label format in the reports is documented here: Constraint Nomenclature

	Absolute Value	Relative Value
Adherence Report
Violation Report

These same reports are also available for the initial estimate, that is, before the optimizer has been run:

	Absolute Value	Relative Value
Adherence Report (before optimizer)
Violation Report (before optimizer)

Imperfect balancing can result in Technical Coefficients (A matrix) values greater than 1. We cap these values at 1. This report, which lists the values that were capped at 1 together with their original values, may be useful for diagnosing specific sectors where the result is of low confidence due to imperfect balancing:
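A minimal sketch of the capping step, assuming the usual input-output definition of technical coefficients, A_ij = Z_ij / x_j (transactions divided by the total output of the purchasing sector). The function name, the report format and the numbers are illustrative assumptions, not the actual Eora build code.

```python
import numpy as np

def capped_coefficients(Z, x):
    """Compute A = Z / x column-wise, cap entries at 1, and list capped entries."""
    A = Z / x  # broadcasting divides column j of Z by x[j]
    rows, cols = np.nonzero(A > 1.0)
    report = [(int(i), int(j), float(A[i, j])) for i, j in zip(rows, cols)]
    return np.minimum(A, 1.0), report

# Hypothetical table in which sector 1 buys more from sector 0 than it produces.
Z = np.array([[10.0, 30.0],
              [20.0, 5.0]])
x = np.array([100.0, 25.0])   # total output per sector

A_capped, out_of_bounds = capped_coefficients(Z, x)
print(A_capped)
print(out_of_bounds)   # (row, column, original value) for each capped entry
```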

Out of Bounds Technical Coefficients

Balancing success, per sector, in this country is shown in this first graph. Each dot is a sector, and the dot size is scaled according to the sector's size. Ideally all sectors are perfectly balanced, but it is normal to see the larger sectors balanced (along the horizontal line at 1) and smaller, outlier sectors less well-balanced. Also, as Eora is built using year 2000 as a "base" year, it is typical that the balance is best around that period.

Macro-economic totals (GDP, gross imports, gross exports, final demand, and balancing ratio):

Data used to inform the national IO table

This report shows all the data sources (also called constraints) used in constructing the IO table.

The nomenclature used in this report is described here: Constraint Nomenclature.