An Exploration of Data Analysis in Scientific Research


President Xi Jinping proposed that carbon dioxide emissions should peak before 2030 and that China should strive to achieve carbon neutrality by 2060. This goal needs to be supported by results derived from carbon accounting data. The analysis in this article agrees well with recent frontier progress in the field of carbon emissions, which reflects the application of the Dialectics of Nature to frontier scientific research.

Key Words

Data analysis. Carbon accounting. Carbon neutrality

Main Text

The sixth chapter of the Dialectics of Nature$^{[1]}$, Science Technology and Innovation Methodology, introduces the following concepts concerning the structure of scientific theory. A basic concept is the basic unit of thinking: a form of thought that reflects the essential attributes of natural things and constitutes the cornerstone of a theory. A basic principle or law is a judgment about the connections among basic concepts; it reflects the basic relationships among the research objects and is the foundation on which scientific theories are established and developed. Logical inferences are the conclusions deduced from these concepts and principles, that is, various specific laws and predictions, and they give a theory its explanatory and predictive functions. The chapter also states that the methodological significance of mathematics is threefold: it provides a concise and precise formal language for research and a tool for abstract thinking; it provides effective means for organizing and developing existing knowledge and subsequently establishing a scientific theoretical system; and it provides means of quantitative analysis and theoretical calculation for scientific research. These discussions inspired the author to analyze, from a dialectical perspective, the carbon accounting and climate change issues that the author is currently studying.

Climate change is a major environmental science issue facing human society in the 21st century. Previous studies have pointed out that global anthropogenic greenhouse gas emissions are “very likely” the main cause of the increase in climate change events$^{[2]}$. Cumulative carbon dioxide emissions have a linear relationship with the cumulative temperature rise since the pre-industrial period$^{[3]}$, and carbon dioxide accounts for 80% of the total greenhouse gas emissions from human activities$^{[4]}$. In the “14th Five-Year Plan”, China clearly set its emission reduction targets: a carbon peak before 2030 and carbon neutrality by 2060. Accurate and reliable carbon emission data are therefore the basic support and scientific basis for formulating emission reduction policies and targets, and such data in turn require accurate carbon accounting methods.

National carbon dioxide emission accounting is the basis for taking emission reduction measures, issuing national emission reduction strategies, and conducting international verification and evaluation. However, carbon dioxide is not itself classified as an air pollutant, which means that there is no universal accounting system globally$^{[5]}$. To this end, international institutions such as the Intergovernmental Panel on Climate Change (IPCC) have formulated widely used methodologies that estimate national carbon dioxide emissions from the consumption of the various carbon-containing fossil fuels and the carbon content of each fuel$^{[6]}$. The IPCC recommends two types of carbon accounting methodology: the reference approach and the sectoral approach. The reference approach estimates emissions from a country's total energy consumption, which can be calculated from energy production, imports and exports, and stock changes. The sectoral approach calculates emissions sector by sector from each sector's energy consumption and then adds them together to obtain national emissions$^{[5]}$. At the core of these carbon accounting methodologies is a group of mathematical formulations. Carbon emissions can be disaggregated into fossil-fuel carbon emissions and industrial process carbon emissions:

$carbon\_emissions = fossil\_fuel\_emissions + industrial\_process\_emissions$ (1)

Fossil fuel emissions can be calculated by equation (2):

$fossil\_fuel\_emissions = activity\_data\ (energy\_consumption) \times emission\_factor\ (carbon\_content\_in\_the\_energy\_sources)$ (2)

Because the emission factor for a specific energy source is associated with the quality, property and combustion efficiency of that source, that factor can be further disaggregated:

$emission\_factor = heating\_value \times carbon\_content\_per\_heating\_value \times oxidation\_rate\ (combustion\_efficiency)$ (3)

Substituting equation (3) into equation (2), carbon emissions from fossil fuels can be estimated from equation (4):

$fossil\_fuel\_emissions = activity\_data\ (energy\_consumption) \times heating\_value \times carbon\_content\_per\_heating\_value \times oxidation\_rate\ (combustion\_efficiency)$ (4)

Most industrial process emissions come from cement production, whose carbon accounting follows a rule similar to that for fossil fuels, formulated as equation (5):

$industrial\_process\_emissions = activity\_data\ (cement\_production) \times emission\_factor$ (5)

From these equations, it can be seen that the uncertainty in carbon emission estimates comes from two sources: activity level data (energy consumption and industrial production) and emission factors.
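As a minimal sketch of how equations (1) to (5) fit together, the following Python snippet computes emissions for a hypothetical inventory. The names (`FuelRecord`, `fossil_fuel_emissions`, and so on), the units, the summation over several fuels, and every numerical value are assumptions made for illustration; they are not measured data and are not taken from any guideline.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FuelRecord:
    """Inputs to equation (4) for one fuel; all values used below are hypothetical."""
    name: str
    consumption: float        # activity data: fuel consumed (assumed unit: tonnes)
    heating_value: float      # energy per unit of fuel (assumed unit: TJ per tonne)
    carbon_per_energy: float  # carbon content per heating value (assumed: tC per TJ)
    oxidation_rate: float     # fraction of carbon oxidized, between 0 and 1


def emission_factor(fuel: FuelRecord) -> float:
    """Equation (3): heating value x carbon content per heating value x oxidation rate."""
    return fuel.heating_value * fuel.carbon_per_energy * fuel.oxidation_rate


def fossil_fuel_emissions(fuels: Iterable[FuelRecord]) -> float:
    """Equations (2) and (4), summed over fuels (the sum is an assumption of this sketch)."""
    return sum(f.consumption * emission_factor(f) for f in fuels)


def industrial_process_emissions(cement_production: float, factor: float) -> float:
    """Equation (5): cement production x process emission factor."""
    return cement_production * factor


def carbon_emissions(fuels: Iterable[FuelRecord],
                     cement_production: float,
                     cement_factor: float) -> float:
    """Equation (1): fossil-fuel emissions plus industrial-process emissions."""
    return (fossil_fuel_emissions(fuels)
            + industrial_process_emissions(cement_production, cement_factor))


# Placeholder inputs chosen only to make the arithmetic easy to follow.
coal = FuelRecord("coal", consumption=1000.0, heating_value=0.02,
                  carbon_per_energy=25.0, oxidation_rate=0.9)
gas = FuelRecord("gas", consumption=500.0, heating_value=0.05,
                 carbon_per_energy=15.0, oxidation_rate=1.0)
total = carbon_emissions([coal, gas], cement_production=2000.0, cement_factor=0.1)
print(f"total emissions (tonnes of carbon, illustrative): {total:.1f}")
```

Keeping the emission factor as its own function mirrors the structure of the equations: activity data and emission factors enter as separate inputs, so the two sources of uncertainty can be examined independently.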

From the basic rules and methodologies of carbon emission accounting, we can examine the following points.

First is the importance of basic concepts. Although the mathematical form of emission accounting is simple (e.g., it involves no concepts from advanced mathematics such as calculus), the accounting itself is far from easy and can even be daunting, as reflected by the diversity of its basic concepts. Take the core equations, equations (1) and (2), as an example. What is meant by fossil fuel? How should activity data be understood? Why is the content in parentheses after it given as energy consumption, and how are the two related? What is the emission factor? Understanding these terms is the most crucial foundation for correctly grasping how emissions are calculated.

Second are the basic rules and, following from them, the logical inferences. Although the equations do not involve many mathematical deductions, they demand a great deal from practitioners, who must be clear about the rules of carbon accounting despite the persistent difficulty of fully understanding the basic concepts. For instance, why are certain quantities in the equations added rather than multiplied? Answering this involves the “mathematization” of basic rules into formulations that are reasonable in practice, requires establishing logical relationships among the basic concepts and rules, and finally yields the “right answers” (the methodologies). These methodologies thus obey basic scientific rules, which can be further explained by logic.

This kind of thinking is brought out more clearly by equation (3). Dissecting that equation, the basic concepts are heating value, carbon content per heating value, and oxidation rate. Starting from the classification of disciplines, we can look more closely at the terms used in the equation. Heating value comes from physics, whereas oxidation rate comes from chemistry. Although carbon content per heating value is a newly named concept, its definition follows the ratio definition method widely used by physicists, and the result is a conceptually new term in the field of carbon accounting. Equation (3) is therefore effectively a case study in multidisciplinary or interdisciplinary research. It shows that, by building on the achievements of each discipline, a pathway can be found for integrating them to solve a particular scientific question, so that knowledge from different subjects is fully used to address current key questions.

However, in-depth, fine-grained scientific research needs not only to focus on the process of conceptualization and mathematization from rules to formulae, but also to grasp the essentials of how calculations are actually carried out on that basis. This transformation can be described by borrowing Kuhn's account of paradigms in The Structure of Scientific Revolutions: pre-paradigm, formation of a paradigm, normal science, anomaly, crisis, a new paradigm, and new normal science. Take carbon accounting as an example. The paradigm shift, from anomaly to crisis and from crisis to new paradigm, can be exemplified by the know-how of calculating the uncertainties of results. As noted above, two aspects influence these uncertainties: activity level data and emission factors. Many institutes and universities around the world have contributed to carbon emission accounting, the most notable of which are the Carbon Dioxide Information Analysis Center (CDIAC), the European Commission's Joint Research Centre (JRC), the Netherlands Environmental Assessment Agency (PBL), the Emissions Database for Global Atmospheric Research (EDGAR), the International Energy Agency (IEA), the U.S. Energy Information Administration (EIA), the United Nations Framework Convention on Climate Change (UNFCCC), and the World Resources Institute (WRI)$^{[7]}$. They provide annual estimates of carbon emissions, yet the amount of carbon actually emitted by human beings is a single definite value rather than a series of differing estimates, which implies the existence of one “true” value$^{[8]}$. So why do errors arise?

We can relate this to the analysis of uncertainties. Systematic errors in national statistical data, errors introduced by in-situ measurement or estimation of emission factors, and other factors are the sources of uncertainty. The assessment and correction of statistical data and measurement errors are analogous to the process of “anomaly->crisis->new paradigm”. Beyond errors, activity data and emission factors can also be analysed at finer granularity. In the field of carbon emissions, such granularity is described by the concept of resolution, either spatial or temporal (e.g., a more detailed temporal resolution). Generally, according to the theory of error propagation, increased resolution corresponds to increased sampling frequency, which helps to reduce the errors inherent in the sampling process and to lower the extra cost of analysing errors (in terms of data rather than methodologies). Carbon accounting depends on activity data and emission factors. An emission factor can be regarded as a property of an individual substance, because its physicochemical properties do not vary; its errors come mainly from expressing this property in a calculable mathematical form. Assuming that this property does not change at high temporal resolution, the emission factor can be treated as constant. Working backwards from the equations used to obtain emissions, we can conclude that one approach is to improve the temporal resolution of activity data. High-resolution carbon accounting is therefore one of the future directions, in which carbon accounting can be carried out more accurately to approximate the so-called true value.
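The dependence of the overall uncertainty on activity data and emission factors can be illustrated with first-order error propagation. The sketch below rests on several assumptions that are not stated in the text: the two inputs are independent, their relative uncertainties are small, the emission factor is fixed, and the activity data are reported in n equal-sized sub-periods whose errors are random and mutually independent. Function names and percentages are hypothetical.

```python
import math


def product_uncertainty(rel_activity: float, rel_factor: float) -> float:
    """Relative uncertainty of activity_data x emission_factor.

    First-order error propagation for a product of two independent
    quantities: the relative uncertainties combine as a root sum of squares.
    """
    return math.sqrt(rel_activity ** 2 + rel_factor ** 2)


def summed_activity_uncertainty(rel_per_period: float, n_periods: int) -> float:
    """Relative uncertainty of a total built from n equal-sized sub-period values.

    Assumes independent random errors in each sub-period (an assumption of
    this sketch). Systematic errors shared by all periods do not shrink
    this way.
    """
    if n_periods < 1:
        raise ValueError("n_periods must be at least 1")
    return rel_per_period / math.sqrt(n_periods)


# Hypothetical relative uncertainties, not taken from any inventory.
annual = product_uncertainty(0.10, 0.05)
monthly = product_uncertainty(summed_activity_uncertainty(0.10, 12), 0.05)
print(f"annual activity data:  {annual:.3f}")
print(f"monthly activity data: {monthly:.3f}")
```

Under these assumptions, finer temporal resolution reduces only the activity-data term, so the emission factor uncertainty sets a floor on the combined result. Where statistical errors are systematic rather than random, the reduction would be smaller than this sketch suggests.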

Nevertheless, carbon accounting is essentially a tool for measuring the impact of the operation of human society on the Earth's environment and climate, and in this role it contributes broadly to understanding the temperature rise caused by climate change. Human society is itself a macro-system, and so is the Earth. Behind carbon accounting lies a way to measure the impacts among macro-systems. According to systems science theory, this kind of research needs to treat the object as an organic whole, emphasizing the coordination between the local and the global, between the specific and the general, and between analysis and integrative thinking. It also needs to treat the object as a dynamic system, connecting its history, current status and future while carefully combining stage-by-stage thinking with continuity. Mixed optimization and systematic filtering are further needed to abstract the real system into one or more models and to obtain results from simulation and emulation. The high-resolution methodology considers the dynamic evolution of a system rather than results for a single period (e.g., a year). Some studies have addressed the abstraction of systems for modeling, including analyses of emission mitigation pathways that couple human socioeconomic development with carbon accounting. But those studies have not included temporal dynamics at fine granularity, which can be improved in the future. Finally, organic coordination is needed in scenarios where pathway analysis couples human society to carbon accounting through human activity data, which reflect how human society operates. How, then, can this coupling be represented and studied in the future? Must we keep to a scheme in which A is considered first and only after A is finished can we turn to B?
These are some points for carbon emission studies inspired by the Dialectics of Nature, and they can be examined in greater depth in future work.

Conclusion

This article analyses the application of the Dialectics of Nature to the field of carbon emission accounting by bringing some dialectical viewpoints to bear on questions of scientific data analysis in emission accounting. It also interprets the influence of this application on scientific advances and development in the field. The analysis suggests that dialectics plays a significant, if implicit and often unnoticed, role in such research and is an integral part of the development of science.


[1] Introduction to the Dialectics of Nature, 2018 version

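[2] Ma Cuimei, Xu Huaqing, Su Mingshan. Research progress on methods for compiling greenhouse gas inventories. Progress in Geography, 2013, 32(03): 400-407. (in Chinese)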

[3] Rogelj J, Forster P M, Kriegler E, et al. Estimating and tracking the remaining carbon budget for stringent climate targets[J]. Nature, 2019, 571(7765): 335-342.

[4] Solomon S. 2007. Climate Change 2007: The Physical Science Basis. Working Group I Contribution to the Fourth Assessment Report of the IPCC. Cambridge: Cambridge University Press

[5] Liu Zhu, Guan Dabo, Wei Wei. 2018. Accounting of carbon dioxide emission data in China. Scientia Sinica Terrae, 48: 878–887, doi: 10.1360/N072017-00009 (in Chinese)

[6] Eggleston H S, Buendia L, Miwa K, Ngara T, Tanabe K. 2006. IPCC guidelines for national greenhouse gas inventories. Hayama: Institute for Global Environmental Strategies. 2: 48–56

[7] Zhu S L. Comparison and analysis of CO2 emissions data for China. Advances in Climate Change Research, 2014, 5(1): 17-27.

[8] Andrew R M. A comparison of estimates of global carbon dioxide emissions from fossil carbon sources[J]. Earth System Science Data, 2020, 12(2): 1437-1465.


This article is the course paper for Introduction to the Dialectics of Nature, which received a final grade of 94/100. It reflects some thoughts on incorporating dialectics into my research field, something I had never considered before, and offers insights into how to approach scientific questions dialectically.