Form || Responses
Discussion (Google Hangouts) : Monday 8/16 1300-1430 ET
Ask: ‘Describe any USPTO data that are not currently available publicly, and would foster useful research if systematically assembled and released.’
General groups of requests:
Improve access to current data (public + private)
Ask for or mandate clean identifiers in submissions
Ask for or mandate new data in submissions (race + gender)
Better tracking / mapping (of typos to cleaner names; of assignments)
Summary / TOC
Additional data —
Create/use IDs for patent-related drug compounds (1.)
—> Policy proposal: Ask/mandate that applicants provide INNs for compounds
Use standardized company name/ID (2.)
—> Inform: Point to historical PTO work here (BHH)
—> Policy: Ask for self-reported ID? (SJK)
Ungranted patents pre-2000 (6.)
—> Policy: Small statutory change.
—> Process: Create a general environment for confidential research access (similar to Census data centers / via the Chief Economist’s Office?)
Characteristics of assignees + inventors (7., BHH)
—> Policy proposal: Ask for collecting race + gender info. Considerable interest among researchers; find out how far the PTO has gone with this
Timeline + clustering of related TM registrations (11.)
—> Research: Ask applicants to link related marks? Better treated as a research question
Mapping assignees + patent similarity —
Map assignees to company (2., 8.)
—> Ask: Code a correspondence between misspelled/nonstandard company names and their standardized forms
Track assignee/acquisition history (3, 8, 9) - cf. dynamic assignee panel (9)
—> Policy ask: Add obligation of recording change in assignment
—> Inform: In principle this is in PAIR but it would be nice if it were more convenient to access. Of course that doesn’t solve the nonreporting problem - I am not sure how big this is. (BHH)
Publish pairwise (or closest historical) similarity (5.)
—> Research: Very popular request. Algorithm challenge; better imagined as a research API
Litigation data —
Outcomes of patent litigation (4., 10.)
—> Research: Mostly outside USPTO scope. Ask David Schwartz about this
PTAB data:
Data on ex post outcomes specifically + related metrics of quality (10.)
Data currently hard to access via public PAIR (new), with only limited release of rich metadata, e.g. correspondence between applicant and examiner and disclosure of NPLs
—> Policy: recently this has become harder to access, no batch access
Todo: Write this up in detail? (OAJ)
Policy change requests — addressed above
Offer a streamlined way for students/researchers to access private data (6.)
Invite TM registrants to indicate related marks (11.)
Require assignees to reveal the Real Party in Interest, for public benefit. (9.)
Other —
Data feed of new USPTO data products (12.)
—> Research: in the PTO newsletter; add to catalog?
Hold more educational workshops for students (13.)
—> Req to the CEO: Workshop on what data is available; how to find new data
Where a request can’t be done yet, link to related resources (14.)
—> Research: add to catalog
Help researchers run periodic surveys (new)
—> Policy: maintain a process for this? (BS) PTO could maintain a panel of applicants to opt into surveys
Full responses
[Lucy Xiaolu Wang]
IDs for patented or patent-related drug compounds
Category: additional data (pharma ID)
Data wish: clean & processed identifiers of patented or patent-related drug compounds
Key features: understanding IP-related issues in the pharmaceutical sector
Research this facilitates: firms' strategic patenting behavior; drug prices and patents; currently most health economists don't use patent data given the lack of relevant training and easily accessible compound-specific patent data.
[Josh Krieger 1]
Data wish: Better mapping from patent assignees to standardized company name, company type and location (i.e., public: ticker, private company: address, incorporated/registered year; individual).
Category: mapping + company ID
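As a rough illustration of the kind of correspondence table this wish describes, the sketch below groups raw assignee strings under a crude normalized key. It is a minimal example under stated assumptions, not a description of any existing PTO process: the suffix list, the function names, and the sample company names are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical list of legal-form suffixes to strip; a real list would be longer.
LEGAL_SUFFIXES = {
    "inc", "incorporated", "corp", "corporation",
    "co", "company", "ltd", "limited", "llc",
}


def normalize_assignee(name: str) -> str:
    """Return a crude matching key for a raw assignee string."""
    # Lowercase and replace anything that is not a letter, digit, or space.
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    tokens = cleaned.split()
    # Drop trailing legal-form suffixes such as "Inc" or "Co Ltd".
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def build_correspondence(raw_names):
    """Map each normalized key to the list of raw spellings that produced it."""
    groups = defaultdict(list)
    for raw in raw_names:
        groups[normalize_assignee(raw)].append(raw)
    return dict(groups)


# Made-up assignee strings, for illustration only.
raw = [
    "Acme Widgets, Inc.",
    "ACME WIDGETS INC",
    "Acme Widgets Co., Ltd.",
    "Beta Labs LLC",
]
table = build_correspondence(raw)
```

This only collapses formatting variants (case, punctuation, legal suffixes). True misspellings would need fuzzy matching on top of the key, and linking a key to a ticker or address would need an external company file.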
[Josh Krieger 2]
Data wish: Assignee and acquisition history
Research this facilitates: Ex: for a Wyeth Pharma patent granted in 2008, it would be nice to have a file indicating whether or not the patent became owned by Pfizer in the 2009 merger
Category: additional data
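One way to picture the requested file is as a dated ownership history per patent that can answer "who owned this patent on a given date?". The sketch below assumes such records exist in a simple (patent, effective date, owner) form. The record layout, the function names, and the sample rows are hypothetical placeholders, not actual assignment data.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical assignment records: (patent_id, effective_date, owner).
records = [
    ("P1", date(2008, 3, 1), "Firm A"),
    ("P1", date(2009, 10, 1), "Firm B"),  # e.g. ownership moves after a merger
    ("P2", date(2010, 6, 1), "Firm C"),
]


def build_history(records):
    """Group records by patent, each list sorted by effective date."""
    history = {}
    for patent_id, effective, owner in sorted(records):
        history.setdefault(patent_id, []).append((effective, owner))
    return history


def owner_on(history, patent_id, when):
    """Return the owner of record on `when`, or None if none is recorded yet."""
    events = history.get(patent_id, [])
    dates = [effective for effective, _ in events]
    # Index of the first event strictly after `when`; the one before it applies.
    i = bisect_right(dates, when)
    return events[i - 1][1] if i else None


history = build_history(records)
before_merger = owner_on(history, "P1", date(2008, 12, 31))
after_merger = owner_on(history, "P1", date(2010, 1, 1))
```

A lookup like this is only as good as the recorded assignments, so it would not address unrecorded transfers.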
[Josh Krieger 3]
Data wish: Outcomes of patent litigation (to go with the Patent litigation dockets file)
Category: additional data
[Josh Krieger 4]
Data wish: Pairwise patent text similarity files (or at least max backwards similarity, overall and within CPC)
Category: additional data + mapping
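To make "max backwards similarity" concrete, the sketch below computes, for each patent, its most similar earlier-filed patent using plain term-count cosine similarity. This is a toy stand-in chosen for brevity; the request does not specify a similarity measure, and the function names, the tuple layout, and the three sample texts are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Bag-of-words vector: lowercase alphabetic tokens and their counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def max_backward_similarity(patents):
    """patents: list of (patent_id, filing_year, text).

    Returns {patent_id: (closest_earlier_id, similarity)}; the entry is
    (None, 0.0) when no earlier patent shares any term.
    """
    vectors = {pid: term_counts(text) for pid, _, text in patents}
    result = {}
    for pid, year, _ in patents:
        best = (None, 0.0)
        for other_id, other_year, _ in patents:
            if other_year < year:  # only look backwards in time
                sim = cosine(vectors[pid], vectors[other_id])
                if sim > best[1]:
                    best = (other_id, sim)
        result[pid] = best
    return result


# Made-up texts, for illustration only.
toy = [
    ("A", 2001, "coated stent for drug delivery"),
    ("B", 2005, "drug delivery stent with polymer coating"),
    ("C", 2006, "wireless antenna array"),
]
scores = max_backward_similarity(toy)
```

The all-pairs loop is quadratic, which is part of why the full pairwise file is hard to publish. Restricting comparisons to patents sharing a CPC class would give the "within CPC" variant.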
[Heidi Williams]
Pre-Nov. 2000 data on PTO applications that were not granted patents
+ A streamlined process to visit/access this data for research.
Category: additional (private) data, policy
Data wish: Pre-Nov 2000 data on applications to the USPTO that were not granted patents
Key features: The post-Nov 2000 "unsuccessful patent applications" data has been incredibly useful in facilitating a variety of research projects
Other requests: I understand that these data can't be made publicly available, but it would be great to set up -- if possible -- a standardized, streamlined process through which students or others could visit the USPTO to analyze this data for research purposes.
[Yi Qian]
Characteristics of assignees or innovators, merging individual + firm data.
+ Links to where else (EPO, etc.) IP is being protected.
+ Providing panel/merged data if available. (or listing relevant sources)
Category: additional data
Data wish: assignees' or innovators' characteristics, and where else (e.g., EPO) the IP is being protected.
Key features: The merge of individual and firm (not just public firms but also private ones) characteristics and other syndicated databases
Research this facilitates: These characteristics could help analyze incentives and responses to innovate in different environments
Other requests: Availability of panel data or merged data, if available
[Xixi Hu]
Map assignees to companies (name, ticker, other ID).
Track assignment changes over time (e.g., between companies)
Category: mapping, company ID, assignment tracking
Data wish: Better mapping from patent assignee to company name or ticker, as well as better tracking of the assignment changes between companies.
Key features: To understand the intellectual property development in the pharmaceutical industry.
Research this facilitates: why firms strategically choose research/patent areas; firm behaviors. Currently, there is still a gap in research on firms' motivation, efficiency and behaviors.
Other requests: A list of the different data sources where students can go and merge data if the data cannot be released publicly.
[Tim Simcoe 1]
Standard data on the ultimate patent owner (“Real Party in Interest” [RPI])
+ Improve the dynamic assignee panel
+ Adopt rules requiring assignees to reveal RPI, for public benefit.
Category: mapping, assignment tracking, policy change
Data wish: Standardized data on ultimate owner (or what attorneys call “Real Party in Interest”).
Key features: Improvements to the dynamic assignee panel
Research this facilitates: Better measurement for papers with assignee effects, and improved patent-to-firm matching
Other requests: Ideally, PTO should adopt rules requiring assignees to reveal RPI (for public benefit).
[Tim Simcoe 2]
Data on ex post outcomes in PTAB and in courts.
Category: new data (trial outcomes)
Data wish: Curated data on ex post outcomes in PTAB and courts.
There is lots to unpack here, but the idea is to track: Was it ever asserted?
Was it challenged at PTAB? Did a court rule on validity or infringement?
Did the owner make a public licensing commitment? See: https://patentlyo.com/patent/2021/06/contreras-shepardizing-patents.html
Key features: Systematic collection of ex post "quality" metrics
Research this facilitates: Would enable more research into relationship between prosecution and long-term indicators of "quality" as interpreted by PTAB and district courts
[Other 1]
Timeline of TM registrations.
Clustering of related marks assigned to the same entities.
Adopt rules inviting registrants to identify clusters of related marks
Category: mapping, cluster ID, policy
Data wish: Timeline of trademark registrations, clustered by related marks, showing when each comes into and falls out of force
Key features: Explicit rather than implicit end dates; explicit clustering of related marks; some cluster ID
Research this facilitates: Understanding the evolution of a family of marks over time, seeing gaps in registration coverage; distinguishing when similar marks appear in different fields vs. when a founding company branches out into a new field under the old mark.
Other requests: A feed of new PTO data products of all kinds, w/ links to existing products that it enhances or replaces
[Other 2]
A feed of new PTO data products, referencing related / superseded products
Category: new data feed
[Other 3 : Lucy Wang]
Hold more educational workshops (online) for interested learners
Category: new workshops
[Other 4 : Xixi Hu]
Other request: For the above: where requested data exists but can’t [yet] be compiled for lack of time, add pointers to related data sources on the PTO site for the closest existing dataset
Category: website update
[Template]
Data wish:
Key features:
Research this facilitates:
Other requests: