Examples from fields with great practices for documentation or provenance
Astronomy: all observations cited with location, date, time, telescope + weather
CERN: datasets up to 1TB shared via Zenodo (open to everyone)
Approaches from other scientific fields
Principles for open data: FAIR, Open Access guidelines
Current + changing EU policies
How dirty is your data?
Open Data for Research — NSF +
Using GitHub for open science
Cost and permissions
Storing and sharing code along with shared data
Specific challenges with historical patent data
Optical readers (OCR) + the challenge of different scripts, esp. pictograms
Historical patent data
New data: associating ages w/ patents (scraper code, data)
Discoverability
How to make in-process or new research easy to find?
How to profile and collaborate w/ younger researchers?
How do you find new research?
Favourite datasets (that aren’t commonly found online)
Biggest bottlenecks in documenting and sharing your data?
Legacy of data - Guidelines: how long should it be useful, who maintains it?
Documenting and sharing data in a manner useful to researchers outside the (possibly) narrow silo it was produced in
Experiments: something interesting/different; compare data.world seminars
More seminar-like: presenting on specific research? Less like chats on a topic.
-> people will want a series, one way or another. [reflect after data.world session]
-> write up a summary [draft invite]; update late Sept [name the series]
Build relationships across the community!
Maybe: have a matt-osmat conversation about patent citations [invite Cyril/patcit]