
Gender Workshop 11/18

I³ Fall Workshops, #4

Published on Oct 19, 2020

Part of the 2020 I³ Fall Workshops series

November 18, 1200-1310 ET, on Zoom


Stefan Pauly: Innovation and Gender Project

Alenka Guzman: Women inventors in Emerging Countries


Innovation and Gender Project

Investigating how different factors translate into a gender gap in patenting.

3 factors covered here:

  • national/socio-cultural norms

  • corporate culture

  • time zones

Linked to work by the UK Patent Office, which created a comprehensive gender classification dataset for all inventors in PATSTAT. However, name formats vary a lot, and accuracy is much lower for non-Western names. The global inference rate is 75%. Name data are drawn from Matias (UK/US birth certificates) and Tang (2011) (Facebook data).

This work extends the UKIPO data set, since its inference rate for GDR patents is below 50%. This took a lot of cleaning: we fixed format consistency, then used the R package gender (US Social Security data), the genderize API, and the NamSor API (most effective).
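The clean-then-look-up approach described above can be sketched as a cascade: normalise the name, try each source in turn, and report the share of names that get any gender at all (the "inference rate"). This is a minimal sketch under stated assumptions, not the project's actual pipeline. The three lookup tables are toy, hypothetical stand-ins for the real sources (the R gender package, genderize, NamSor); no real API is called, and the names `clean_first_name`, `infer_gender` and `inference_rate` are invented for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical toy tables standing in for the real name-gender sources.
# Keys are lower-cased first names; values are "F" or "M".
SOURCE_A: Dict[str, str] = {"anna": "F", "peter": "M"}
SOURCE_B: Dict[str, str] = {"heike": "F", "uwe": "M"}
SOURCE_C: Dict[str, str] = {"kerstin": "F"}
SOURCES: List[Dict[str, str]] = [SOURCE_A, SOURCE_B, SOURCE_C]

# Assumed list of titles to strip; a real list would be longer.
TITLES = {"dr", "dr.", "prof", "prof."}


def clean_first_name(raw: str) -> str:
    """Lower-case, drop titles, and keep the first part of the first given name."""
    tokens = [t for t in raw.strip().lower().split() if t not in TITLES]
    if not tokens:
        return ""
    # "hans-peter" -> "hans": look up the leading component only.
    return tokens[0].split("-")[0]


def infer_gender(raw: str, sources: List[Dict[str, str]]) -> Optional[str]:
    """Return the first gender found across the sources, or None if none matches."""
    name = clean_first_name(raw)
    for table in sources:
        if name in table:
            return table[name]
    return None


def inference_rate(names: List[str], sources: List[Dict[str, str]]) -> float:
    """Share of names for which some source returns a gender."""
    if not names:
        return 0.0
    hits = sum(infer_gender(n, sources) is not None for n in names)
    return hits / len(names)


sample = ["Dr. Anna", "  UWE ", "Kerstin", "Xyz"]
print(inference_rate(sample, SOURCES))
```

Ordering the sources matters in a cascade like this: the first source that knows a name wins, so a more reliable source would normally go first.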

Using data from Morrison, Riccaboni and Pammolli (2017) for inventor disambiguation.


Bitsy Perlman: Is the gender association of names likely to have differed, or changed, during the time East and West Germany were separate? (Certainly names like Ashley or Kelly have shifted in gender association over time and, presumably, between countries.)

Çağatay Bircan: Unlikely, but we can look more into this.

Bronwyn Hall: What exactly are the patents on which this is based? GDR and West German? EPO? families?

Çağatay Bircan: To define the “gap” between East and West Germany, we use all patents filed in 1980-1989 at the German patent offices, that is, at both the West German and East German patent offices.

Kate Black: What does High Resolution mean in this context?

Çağatay Bircan: The full address including street name, number etc. (as opposed to addresses that just include info on towns and cities for instance)

Q: Cultural origin estimates?

A: We assume they are from Germany, as a first proxy. But NamSor can be fed both first and last name; the last name can suggest a region, and the first name can give you a better gender hint.

Name disambiguation over time

Mike Andrews: You are looking at first-time female inventors. I’m interested in hearing more about disambiguating female names. Women's names tend to change after marriage more than men's, which makes it tough to disambiguate across patents, identify ‘first time’ inventors and look at mobility. A challenge for the wider community: how should we think about these issues?
Related: there have been some named projects (I can link to them) to link census records to marriage records. This could be something that eventually could be operationalized in patent data.

Adam Jaffe: We could also maybe surmise that what looks like a first patent for a woman might not actually be her first patent, vs for men.

Stefan Pauly: In that sense, disambiguation is harder for women… but because women don’t appear so much in the data, they can also be easier to disambiguate.
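To make the concern about surname changes concrete, here is a small sketch of how exact-name matching can overcount "first-time" inventors. The records are invented for illustration, and the `person_id` column is a hypothetical ground truth that real patent data would not contain; the function name is also made up.

```python
from typing import List, Tuple

# Invented toy records: (true person id, name on patent, filing year).
# The person id is a hypothetical ground truth, unavailable in real data.
records: List[Tuple[str, str, int]] = [
    ("p1", "Heike Schmidt", 1982),
    ("p1", "Heike Meier", 1986),  # assumed: same person after a surname change
    ("p2", "Uwe Braun", 1981),
    ("p2", "Uwe Braun", 1987),
]


def count_distinct(records: List[Tuple[str, str, int]], column: int) -> int:
    """Count distinct values in one column, i.e. apparent first-time inventors."""
    return len({row[column] for row in records})


apparent_first_timers = count_distinct(records, 1)  # keyed on exact name
true_first_timers = count_distinct(records, 0)      # keyed on true identity

print(apparent_first_timers, true_first_timers)
```

With these toy records, name matching finds three first-time inventors where there are only two people, and the extra "first-timer" is the inventor whose surname changed.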

Gender estimation from name + metadata

Alenka Guzman: Has USPTO made efforts to identify sex of the inventors?

Adam Jaffe: In general, the PTO does not see its job as providing everyone with data.

Bitsy Perlman: This is an argument for the PTO to ask specifically for gender, Lisa Cook has called for this too. This has had congressional support.

SJ: And there's been one update on the PTO data since then: Progress and Potential 2020

Emily Melluso: PatentsView at USPTO could do gender attribution pretty well for US inventors only, using Social Security data, but for international data we do not know of a similar registry linking names to gender.

Adam: We appreciate what PatentsView has done with the existing data, but could they collect more data during the process?

Bitsy: The USPTO should do a job of identifying inventors. Also of interest to lawyers and patent holders.
Adam: Something we can all advocate for in our

Deyun Yin: For Chinese patent data, rumor has it that CNIPA would collect the first inventor's citizen ID, in which the last digit indicates gender. However, since the citizen ID is very sensitive, no one has verified this rumor or gained access to the data.

Aside: USPTO uses a measure called the Average Women Inventor Rate (AWIR), which relies on an algorithm that estimates gender from a name. They issued a Progress + Potential report on women inventor-patentees in 2019, with an update in 2020. Here’s how that report describes testing the accuracy of these estimates:

Women Inventors in Emerging Countries

Analysing gender differences in creativity, innovation and science in emerging countries. Determining which factors influence the growing number of women inventors.

Many emerging countries have near gender parity among inventors. In general, women represent ~25% of inventors worldwide.


Adam Jaffe: In the regression, were you still measuring women's participation as a patent having at least one woman on it? You got a significant effect associated with the size of the team; how do you adjust for the fact that the likelihood of a woman being on the team itself depends on team size?

~ If you’re only looking for the presence of a single woman, that will rise on large teams even if those teams are not particularly encouraging or welcoming in proportional terms. (Also, see: reports on increases by # vs by proportion, per country.)
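The team-size point can be checked with a short calculation. As a simplifying assumption (not something claimed in the workshop), suppose each team member is independently a woman with probability p, using the roughly 25% worldwide share mentioned above. Then the chance that a team of n has at least one woman is 1 - (1 - p)^n, which grows with n even though the expected share of women on the team stays at p. The function name below is invented for illustration.

```python
def p_at_least_one_woman(share_women: float, team_size: int) -> float:
    """P(at least one woman) if members are drawn independently with the given share."""
    return 1.0 - (1.0 - share_women) ** team_size


# Assumed share of 25%, taken from the worldwide figure quoted in the notes.
SHARE = 0.25

for n in (1, 2, 4, 8):
    # The expected proportion of women is SHARE for every team size;
    # only the "at least one" indicator moves with n.
    print(n, p_at_least_one_woman(SHARE, n))
```

Under these assumptions the probability rises from 0.25 for a solo inventor to about 0.68 for a team of four and about 0.90 for a team of eight, so a presence-based measure will correlate with team size mechanically unless team size is controlled for or a proportion-based measure is used.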
