Cluster Analysis of Per Capita Gross Domestic Products
Objective: The purpose of this article is to show the value of exploratory data analysis performed on the multivariate time series dataset of gross domestic products per capita (GDP) of 160 countries for the years 1970-2010. New knowledge can be derived by applying cluster analysis to the time series of GDP to show how patterns in GDP can be explained in a data-driven way.
Research Design & Methods: Patterns characterised by distance- and density-based structures were found in a topographic map by using dynamic time warping distances with the Databionic swarm (DBS). The topographic map represents a 3D landscape of data structures. The number of clusters was derived by inspecting the topographic map. Then, a DBS clustering was performed and the quality of the clustering was verified.
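The distance computation mentioned above can be illustrated with a minimal sketch of dynamic time warping (DTW) in plain Python. This is not the author's implementation: the function names `dtw_distance` and `pairwise_dtw`, the absolute-difference local cost, and the unconstrained warping path are all assumptions made for illustration, and the DBS projection and clustering steps are not shown.

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """DTW distance with absolute-difference local cost (illustrative choice)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def pairwise_dtw(series: Sequence[Sequence[float]]) -> List[List[float]]:
    """Symmetric matrix of DTW distances between all pairs of time series."""
    k = len(series)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            d = dtw_distance(series[i], series[j])
            dist[i][j] = dist[j][i] = d
    return dist
```

A distance matrix of this kind, computed over one GDP time series per country, is the sort of input a distance-based projection or clustering method could consume.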
Findings: Two clusters are identified in the topographic map. The rules deduced from classification and regression tree (CART) show that the clusters are defined by an event occurring in 2001, at which time the world economy was experiencing its first synchronised global recession in a quarter-century. Geographically, the first cluster consists mostly of African and Asian countries and the second cluster consists mostly of European and American countries.
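How a CART-style rule separates two clusters can be sketched with a single split chosen by Gini impurity. The helper names `gini` and `best_split` and the toy values below are hypothetical and purely illustrative; they are not the article's data, and the actual rule found by the author is not reproduced here.

```python
from typing import Dict, List, Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts: Dict[int, int] = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (threshold, weighted impurity) of the best rule `value <= threshold`."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (float("nan"), float("inf"))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold fits between identical values
        threshold = (lo + hi) / 2.0
        left: List[int] = [y for _, y in pairs[:i]]
        right: List[int] = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy data (hypothetical): one feature value per observation and its cluster label.
toy_values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
toy_labels = [1, 1, 1, 2, 2, 2]
threshold, impurity = best_split(toy_values, toy_labels)
```

A full CART applies this search recursively over all features; in a time series setting, each year's value could serve as one feature, so that a split on a particular year yields a rule tied to that year.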
Implications & Recommendations: DBS can be used even by non-professionals in the field of data mining and knowledge discovery. DBS is the first swarm-based clustering technique that shows emergent properties while exploiting concepts of swarm intelligence, self-organisation, and game theory.
Contribution & Value Added: To the author's knowledge, this is the first time that worldwide similarities between 160 countries in GDP time series for the years 1970-2010 have been investigated in a topical context.
machine learning; cluster analysis; swarm intelligence; visualization; self-organization; gross domestic product
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.