Published on Aug 19, 2020
Models for sharing data from other fields

Examples from fields with great practices for documentation or provenance

  • Astronomy: every observation is cited with location, date, time, telescope, and weather

  • CERN: datasets up to 1TB shared via Zenodo (open to everyone)

  • Approaches from other scientific fields

Checklists for making work accessible

Open data

Open source code

Choosing repositories

  • Cost and permissions

  • Storing and sharing code along with shared data

  • Specific challenges with historical patent data

  • Optical readers and the challenge of different scripts, especially pictograms

Underserved + new datasets

  • Historical patent data

  • New data: associating ages with patents (scraper code, data)


  • Discoverability

    • How to make in-process or new research easy to find?

    • How to profile and collaborate with younger researchers?

    • How do you find new research?

  • Favourite datasets (that aren’t commonly found online)


  • Biggest bottlenecks in documenting and sharing your data?

  • Legacy of data: guidelines on how long it should remain useful and who maintains it

  • Documenting and sharing data in a way that is useful to researchers outside the (possibly) narrow silo it was produced in

Presentation / seminar series

  • Experiments: try something interesting or different; compare seminars

  • More seminar-like: presenting on specific research? Less like chats on a topic.
    -> People will want a series, one way or another. [reflect after d.w session]
    -> Write up a summary [draft invite]; update in late Sept [name the series]

  • Build relationships across the community!

  • Maybe: have a matt-osmat conversation about patent citations [invite Cyril/patcit]

