Patents and Publications

Jupyter Notebook Link

1. Patent and Publication Matrix: Characterisation and differences

In this first part of the complementary analysis, the focus is not on the macro, meso or micro level but on comparing two types of assets present in the database: patents and scientific publications. The first result of the analysis is a pair of capability matrices, one for patents and one for scientific publications. These matrices, represented as heat maps below, are obtained through the same term-pair approach, combined with a filter on the type of technological asset.
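The construction described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the record layout, the `capability_matrix` and `normalize` helpers, and the toy data are all assumptions, and normalization is assumed to mean dividing each pair count by the number of assets of that type.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: each asset has a type and the terms found in it.
assets = [
    {"type": "patent", "terms": {"ethanol", "fermentation", "sugar"}},
    {"type": "patent", "terms": {"ethanol", "fermentation"}},
    {"type": "publication", "terms": {"biogas", "anaerobic digestion"}},
    {"type": "publication", "terms": {"biogas", "anaerobic digestion", "waste"}},
    {"type": "publication", "terms": {"ethanol", "fermentation"}},
]


def capability_matrix(records, asset_type):
    """Count, for one asset type, how many assets contain each term pair."""
    counts = Counter()
    for record in records:
        if record["type"] != asset_type:
            continue  # filter the assets by type
        # Sorting gives each unordered pair a single canonical key.
        for pair in combinations(sorted(record["terms"]), 2):
            counts[pair] += 1
    return counts


def normalize(counts, records, asset_type):
    """Assumed normalization: share of assets of this type using the pair."""
    total = sum(1 for record in records if record["type"] == asset_type)
    return {pair: n / total for pair, n in counts.items()}


patent_counts = capability_matrix(assets, "patent")
publication_counts = capability_matrix(assets, "publication")
patent_share = normalize(patent_counts, assets, "patent")
publication_share = normalize(publication_counts, assets, "publication")
```

Storing the matrix as a sparse mapping from term pairs to counts keeps the sketch short; a dense array indexed by term would be the natural input for a heat map.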

Following the same approach as before, and to understand how the use of term pairs differs between patents and publications, the two normalized matrices were compared. The result is a table of the term pairs whose usage diverges most. It shows the term pairs that are more published than patented and vice versa, ordered by the absolute difference in normalized usage. For example, in the first position, the term pair “anaerobic digestion/biogas” has a normalized usage of 0.016448 in publications but only 0.000488 in patents. Moreover, most of the terms in this table are either processing technologies (e.g. anaerobic digestion, fermentation, pyrolysis, hydrolysis) or outputs (e.g. biogas, ethanol), while feedstock terms play only a minor role.

Top term-pair usage differences between patents and publications:

| First Term | Second Term | Patents | Publications | Difference |
| --- | --- | --- | --- | --- |
| anaerobic digestion | biogas | 0.000488 | 0.016448 | 0.015960 |
| pyrolysis | bio-oil | 0.000561 | 0.008202 | 0.007641 |
| bioethanol | fermentation | 0.002560 | 0.009282 | 0.006722 |
| hydrolysis | bioethanol | 0.001609 | 0.006748 | 0.005139 |
| biodiesel | catalysis | 0.000488 | 0.005337 | 0.004850 |
| biogas | waste | 0.001439 | 0.006182 | 0.004743 |
| butanol | fermentation | 0.006925 | 0.002377 | 0.004548 |
| ethanol | catalysis | 0.000317 | 0.004606 | 0.004289 |
| ethanol | butanol | 0.005364 | 0.001115 | 0.004250 |
| ethanol | pressing | 0.004828 | 0.000601 | 0.004227 |
| biodiesel | transesterification | 0.002975 | 0.007192 | 0.004217 |
| anaerobic digestion | waste | 0.000415 | 0.004441 | 0.004026 |
| ethanol | enzymatic hydrolysis | 0.002585 | 0.006574 | 0.003989 |
| hydrolysis | biogas | 0.001170 | 0.004937 | 0.003767 |
| methanol | catalysis | 0.000073 | 0.003474 | 0.003401 |
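The ranking behind this table can be reproduced with a short sketch. The values below are copied from three rows of the table; the `rank_divergence` helper and its output layout are illustrative assumptions rather than the notebook's own code.

```python
# Normalized usage values copied from three rows of the table above.
patents = {
    ("anaerobic digestion", "biogas"): 0.000488,
    ("pyrolysis", "bio-oil"): 0.000561,
    ("butanol", "fermentation"): 0.006925,
}
publications = {
    ("anaerobic digestion", "biogas"): 0.016448,
    ("pyrolysis", "bio-oil"): 0.008202,
    ("butanol", "fermentation"): 0.002377,
}


def rank_divergence(patent_share, publication_share):
    """Order term pairs by the absolute difference in normalized usage."""
    rows = []
    for pair in set(patent_share) | set(publication_share):
        p = patent_share.get(pair, 0.0)      # a missing pair counts as unused
        q = publication_share.get(pair, 0.0)
        side = "Patents" if p > q else "Publications"
        rows.append((pair, p, q, abs(p - q), side))
    # Largest absolute difference first.
    return sorted(rows, key=lambda row: row[3], reverse=True)


ranking = rank_divergence(patents, publications)
for pair, p, q, diff, side in ranking:
    print(f"{pair[0]} / {pair[1]}: {diff:.6f} (more used in {side})")
```

Ranking by the absolute difference keeps pairs that lean towards patents, such as butanol/fermentation, in the same list as pairs that lean towards publications.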

2. Evolution of asset types over time

To study how the two types of assets evolved over time, the total number of records of each type from 1990 until 2017 was plotted. The number of assets rises sharply from 2007, and there seems to be a sharp drop in the total number of assets from 2015. Comparing the two types, patents far outnumber publications in the years 2004-2010, but this trend is reversed after that period, following a sharp rise in publications.

Another focus of this subsection is to understand how different terms are patented versus published. As a proof of concept, 8 feedstock terms were chosen (waste, algae, cellulose, sugar, paper, wood, residues and corn) and their evolution over the period 1990-2017 was studied. The following graphs show both the absolute and the normalized versions: the former gives the total number of assets of each type containing the term, the latter gives that number as a ratio of the total number of assets in that year. The terms follow the general pattern: they appear more in patents until a certain point in time, and after that they appear more in scientific publications.
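The absolute and normalized series for a single term could be computed along these lines. The record layout, the `term_evolution` helper, and the toy data are hypothetical, and the denominator is assumed to be all assets of both types recorded in that year.

```python
from collections import defaultdict

# Hypothetical toy records: (year, asset type, terms found in the asset).
records = [
    (2006, "patent", {"algae", "sugar"}),
    (2006, "patent", {"corn"}),
    (2006, "publication", {"algae"}),
    (2012, "patent", {"algae"}),
    (2012, "publication", {"algae", "waste"}),
    (2012, "publication", {"algae"}),
    (2012, "publication", {"wood"}),
]


def term_evolution(rows, term):
    """Return {year: {type: (absolute count, share of that year's assets)}}."""
    totals = defaultdict(int)                      # all assets per year
    hits = defaultdict(lambda: defaultdict(int))   # assets containing the term
    for year, asset_type, terms in rows:
        totals[year] += 1
        if term in terms:
            hits[year][asset_type] += 1
    return {
        year: {kind: (n, n / totals[year]) for kind, n in by_type.items()}
        for year, by_type in sorted(hits.items())
    }


evolution = term_evolution(records, "algae")
for year, by_type in evolution.items():
    print(year, by_type)
```

In this toy data "algae" is split evenly between the two asset types in 2006 and appears in more publications than patents in 2012, which mirrors the shape of the pattern described above.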

3. Term distribution by type of asset

The final part of this complementary analysis seeks to understand how the different types of terms (feedstocks, processing technologies, and outputs) are balanced between patents and publications.

To understand this, the same approach was applied to the three groups. All terms of a group are plotted in a graph where the x axis corresponds to the percentage of patents and the y axis to the percentage of publications. A perfect balance would place every term on the x=y line (a term is as patented as published). The further a data point lies from the x=y diagonal, the stronger its bias towards one type of technological asset, either patents or publications.

Although all three groups were analyzed, this subsection focuses on feedstocks as a proof of concept. As the graph below shows, there seems to be a relative balance between publications and patents in the case of feedstocks. The table below lists the terms with the greatest distance to x=y. The term “starch”, for example, is biased towards patents (3.04%) compared to publications (0.88%). Overall, however, of the 170 feedstock terms, 80 appear more in patents and 90 appear more in publications.

Most unbalanced feedstock terms:

| Name | Patents | Publications | Distance to x=y | Bias |
| --- | --- | --- | --- | --- |
| starch | 0.030417 | 0.008816 | 0.015274 | Patents |
| grain | 0.029412 | 0.008946 | 0.014472 | Patents |
| agriculture | 0.008798 | 0.024893 | 0.011381 | Publications |
| sugar | 0.064605 | 0.050823 | 0.009745 | Patents |
| waste water | 0.006662 | 0.020420 | 0.009729 | Publications |
| algae | 0.077049 | 0.063788 | 0.009376 | Patents |
| paper | 0.031674 | 0.044470 | 0.009048 | Publications |
| blend | 0.023504 | 0.010956 | 0.008873 | Patents |
| energy crops | 0.004022 | 0.016271 | 0.008661 | Publications |
| sewage | 0.010935 | 0.022689 | 0.008311 | Publications |
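The distance values in this table are consistent with the perpendicular distance from the point (x, y) to the line x=y, that is |x - y| / √2. A minimal sketch under that assumption, using shares copied from three rows of the table (the `diagonal_bias` helper is illustrative, not the notebook's own function):

```python
import math

# Shares copied from three rows of the table above: (patents, publications).
feedstocks = {
    "starch": (0.030417, 0.008816),
    "agriculture": (0.008798, 0.024893),
    "sugar": (0.064605, 0.050823),
}


def diagonal_bias(patent_share, publication_share):
    """Perpendicular distance from (x, y) to the line x = y, and the bias side."""
    distance = abs(patent_share - publication_share) / math.sqrt(2)
    bias = "Patents" if patent_share > publication_share else "Publications"
    return distance, bias


# Most unbalanced terms first.
ranked = sorted(
    ((name, *diagonal_bias(x, y)) for name, (x, y) in feedstocks.items()),
    key=lambda row: row[1],
    reverse=True,
)
for name, distance, bias in ranked:
    print(f"{name}: {distance:.6f} ({bias})")
```

For “starch” this gives |0.030417 - 0.008816| / √2 ≈ 0.015274, matching the value listed in the table.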
