The correlation matrix produced in the results is open to a wide range of interpretations. However, some hypotheses can be formulated about the configuration of this clustering.
Language may play a large role in the correlation between countries: the UK and the USA have a 68% correlation, and China and Taiwan a 70% correlation. Geographic proximity also seems to be a source of correlation: Costa Rica and Guatemala have a correlation of 71% with each other and a low correlation with every other country except Brazil. However, some relationships may be due to other factors, such as economic partnerships (South Korea and USA, 65%), climate (Brazil, Zimbabwe and Philippines, >80%), or simply the volume of research activity (the case of India). Moreover, some relationships would need further investigation to be understood, such as the high correlation (>58%) of Denmark with Portugal and Spain.
To provide further context on the reasons for the correlation between the capabilities of different countries, the GDP per capita of each country was used extensively, as suggested by the literature.
One general trend worth commenting on is that most country pairs are concentrated in an “area” characterized by a low GDP per capita difference (<$40,000) and a low capability correlation (<40%).
The highest capability correlations become more frequent as the GDP per capita difference decreases. Brazil and Zimbabwe, for example, have a similar GDP per capita and a very high capability correlation (88%). On the other hand, some country pairs, such as Denmark and India or the US and China, counter this trend, combining large differences in GDP per capita with high correlations, perhaps due to the unbalanced economic development of certain countries.
Generally speaking, introducing the average GDP per capita of a country pair confirms that lower GDP differences are related to a higher capability correlation. This may be because countries that are closer to each other tend to have more similar economic capabilities and to collaborate more, but this cannot be said with certainty.
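The pairwise comparison described above can be sketched as follows. This is a minimal illustration, not the procedure used to produce the results: the function names, the country labels A, B and C, and all numeric values are hypothetical placeholders. Only the two thresholds (a GDP per capita difference below $40,000 and a correlation below 40%) are taken from the text.

```python
from itertools import combinations


def pair_table(gdp_per_capita, correlation):
    """Build one row per country pair.

    gdp_per_capita: dict country -> GDP per capita (USD).
    correlation: dict frozenset({a, b}) -> capability correlation in [-1, 1].
    Returns a list of (pair, gdp_difference, gdp_average, correlation).
    """
    rows = []
    for a, b in combinations(sorted(gdp_per_capita), 2):
        diff = abs(gdp_per_capita[a] - gdp_per_capita[b])
        mean = (gdp_per_capita[a] + gdp_per_capita[b]) / 2
        rows.append(((a, b), diff, mean, correlation[frozenset((a, b))]))
    return rows


def in_dense_area(row, max_gdp_diff=40_000, max_corr=0.40):
    """True if the pair has both a low GDP difference and a low correlation."""
    _, diff, _, corr = row
    return diff < max_gdp_diff and corr < max_corr


# Hypothetical placeholder values, for illustration only.
gdp = {"A": 10_000, "B": 12_000, "C": 45_000}
corr = {
    frozenset(("A", "B")): 0.85,
    frozenset(("A", "C")): 0.20,
    frozenset(("B", "C")): 0.30,
}

rows = pair_table(gdp, corr)
dense = [row[0] for row in rows if in_dense_area(row)]
```

With these placeholder values, the pairs (A, C) and (B, C) fall in the low-difference, low-correlation area, while (A, B) combines a small GDP difference with a high correlation.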
When studying the collaborations and similarity of countries, a general trend surfaces: most countries differ in terms of capability and do not collaborate with one another. This was expected.
All countries that are similar and collaborate are European Union countries (Portugal, Germany, Austria, Italy, and Belgium). This could indicate that the EU is a driving force of collaboration, but that this collaboration is not necessarily “innovative”, since the countries are similar in terms of research. Moreover, in the global landscape, countries that are similar tend not to collaborate, which seems counterintuitive but makes sense for the purposes of innovation.
Furthermore, most collaborations are between countries that differ in capability (Canada-Kuwait, El Salvador-Germany, France-Lebanon) rather than between similar ones. This would indicate that most countries see collaboration as a way of accessing a research space they have not previously explored, rather than a way of intensifying research in their core areas of interest. Examples include pairs such as France-Lebanon, Canada-Kuwait, Germany-Ukraine, and Hungary-UK. The factors that lead to these relationships seem to be a mix of historical, economic and political factors.
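The grouping of country pairs into similar or different, and collaborating or not, can be sketched as below. The cut-off values are assumptions made for illustration (the text does not state the thresholds that were used), and the function name, country labels and numbers are hypothetical.

```python
def classify_pair(correlation, n_collaborations,
                  similar_from=0.40, min_collaborations=1):
    """Label a country pair by similarity and collaboration.

    similar_from: assumed correlation cut-off for calling a pair "similar".
    min_collaborations: assumed minimum number of joint records for a pair
        to count as "collaborating".
    """
    similar = correlation >= similar_from
    collaborates = n_collaborations >= min_collaborations
    return (
        ("similar" if similar else "different")
        + ", "
        + ("collaborating" if collaborates else "not collaborating")
    )


# Hypothetical placeholder pairs: (capability correlation, collaborations).
pairs = {
    ("A", "B"): (0.72, 5),
    ("A", "C"): (0.15, 3),
    ("B", "C"): (0.10, 0),
}
labels = {pair: classify_pair(c, n) for pair, (c, n) in pairs.items()}
```

Raising `min_collaborations` moves pairs with only a few joint records into the "not collaborating" group, so the resulting picture depends on where that cut-off is placed.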
When comparing countries, in the particular case of Brazil and Denmark there is one interpretation that could be made. The top term pairs in the Denmark capability matrix are highly related to outputs or processing technologies such as biogas, ethanol, or fermentation. The only feedstock term that appears in Denmark’s top terms is “straw”. On the other hand, looking at Brazil, there is a high prevalence of sugar, sugarcane and other feedstocks.
This may indicate that countries with a higher prevalence of a certain industry or raw material tend to focus their research on that material and on what it can be used for. Countries that are less reliant on a particular raw material, on the other hand, tend to focus on the outputs and results of that research, giving less importance to the raw material used.
The country spectrum is enlightening as to the extreme heterogeneity and inequality in the world of research.
Most of the spectrum is empty, which indicates that most countries have very little research compared to others. In contrast, the US is particularly impressive, as its spectrum is almost entirely filled. This would mean that the US uses almost all term pairs that appear in the database. This may be due to its tradition of leading the research field, its economic development, or even its intensive patenting culture. Other countries, such as China or India, only come close to it.
The spectrum also shows which of two strategies a country is adopting: focusing on certain areas, or distributing its capability across areas. Brazil, for instance, seems to follow the first strategy, concentrating its interest in certain areas of the spectrum, one of them probably related to sugar. The second strategy is the one used by bigger players such as the US, India, or even China, which have widely spread research interests.
The uniqueness index confirms the previous premise: the US leads with almost 50% uniqueness, which means that nearly half of the term pairs used by the country are used only by it. This shows not only the intensity of research but also the amount of innovation and exploration of unresearched areas. Compared to countries of similar size, such as China (10%), the US is 40 percentage points more unique, which sounds impressive. However, does more term usage necessarily indicate more innovation? Or more saturation?
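One reading of the uniqueness index, consistent with the description above, is the share of a country's term pairs that no other country uses. The sketch below assumes that definition; the function name, the country labels and the term-pair sets are hypothetical and only illustrate the calculation.

```python
from collections import Counter


def uniqueness_index(term_pairs_by_country):
    """Share of each country's term pairs that are used by no other country.

    term_pairs_by_country: dict country -> iterable of term pairs (tuples).
    Returns dict country -> value in [0, 1]; 0.0 for a country with no pairs.
    """
    # Count in how many countries each term pair appears.
    usage = Counter(
        pair
        for pairs in term_pairs_by_country.values()
        for pair in set(pairs)
    )
    result = {}
    for country, pairs in term_pairs_by_country.items():
        pairs = set(pairs)
        if not pairs:
            result[country] = 0.0
            continue
        exclusive = sum(1 for pair in pairs if usage[pair] == 1)
        result[country] = exclusive / len(pairs)
    return result


# Hypothetical placeholder data using terms mentioned in the text.
example = {
    "A": {("sugar", "ethanol"), ("straw", "biogas"),
          ("sugarcane", "ethanol"), ("sugar", "fermentation")},
    "B": {("sugar", "ethanol"), ("straw", "biogas")},
    "C": {("straw", "biogas"), ("straw", "fermentation")},
}
index = uniqueness_index(example)
```

Under this definition a country can score high either because it explores genuinely new areas or simply because it uses many more term pairs than the others, which is the ambiguity raised by the questions above.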
Interestingly, there is a very large number of small countries in the uniqueness ranking, such as Ukraine, Lebanon, Cyprus, or Bangladesh. This could mean that countries with no particular research intensity and low economic development tend to focus on areas that are specific to them or their location, and relatively “exotic”. Possibly this is achieved through collaborations (as in the France-Lebanon example), where a more established country accesses an untapped area through a small country with a particular capability, using it as a “research proxy”.
The first limitation worth mentioning is the use of the Pearson correlation index. As in the macro analysis, the Pearson correlation index can be criticized for not giving the full picture when comparing countries and establishing a correlation matrix between them.
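A country-by-country Pearson correlation matrix can be sketched as follows, assuming each country's capability matrix is flattened into a vector of term-pair counts. The function name, country labels and counts are hypothetical. The example also illustrates one property that may contribute to the incomplete picture: Pearson correlation ignores scale, so two countries with very different research volumes can still correlate perfectly.

```python
import numpy as np


def capability_correlation(capabilities):
    """Pearson correlation between flattened capability matrices.

    capabilities: dict country -> 2D array of term-pair counts
        (all arrays must have the same shape).
    Returns (names, matrix), where matrix[i, j] is the correlation between
    names[i] and names[j]. A country whose counts are all identical
    (e.g. all zero) yields NaN entries.
    """
    names = sorted(capabilities)
    vectors = np.array(
        [np.asarray(capabilities[name], dtype=float).ravel() for name in names]
    )
    # np.corrcoef treats each row as one variable (here: one country).
    return names, np.corrcoef(vectors)


# Hypothetical placeholder counts: B has exactly twice the counts of A.
example = {
    "A": [[0, 2], [2, 0]],
    "B": [[0, 4], [4, 0]],
    "C": [[3, 0], [0, 1]],
}
names, matrix = capability_correlation(example)
```

Here A and B have a correlation of 1 even though B has twice the volume of A, while C, which uses different term pairs, correlates negatively with both.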
The second limitation is related to the use of GDP as contextual information in addition to the correlation index. Throughout the literature, several other economic, social and political indicators are used to explain innovation in countries (Filippetti, A., Peyrache, A., 2011). Therefore, using only one indicator can be seen as naive. Other indicators could further explain the relation between the technological capabilities of countries: language, population, education, or even geography. Moreover, the relationships might be a combination of different factors rather than a consequence of a single one.
The third limitation is related to collaboration. Throughout the analysis, a division was made between countries that collaborate and countries that do not. However, there is no quantitative index for “collaboration”, only the number of collaborations between two entities. For example, should countries that worked together only once be seen as countries that collaborate, or should a relationship count as collaboration only if they collaborate more than X times?
The fourth and final limitation worth mentioning is the spectral representation used to display the capabilities of countries. The sheer number of term pairs in the database makes the visualizations rather difficult to read. Although an effort was made to reduce this number, some countries appear as almost empty spectra when in fact they did use term pairs (as can be seen in the data).