Our data collaborative:
Maintaining I³ datasets (pub version): regular check-ins w/ contributors
-> Quarterly updates: what are people up to? See post-workshop mail
-> Checklists for adding and refining data
-> Who else should be invited? How to include known public datasets?
Focus on work that will be useful by July: submissions and participants (a soft but handy deadline: the in-person committee meeting)
Mailing list note from Adam - a call for listserv discussion + basics of what to expect in the coming months.
Use as discussion forum, w/ more ways to be visible to each other
Data sharing guidelines + practices
Share and shape this
Shape the Dec meeting
Share draft questions and guidelines
Revisit + share Matt’s short paper on his data sharing thinking and workflow
Reiterate the call for discussion beyond the subject of data sharing - we might think of prompts.
Revisit “How To” talks/videos (Bhaven’s idea) — do these fit in the DV?
~ Using historical patent datasets
Steering committee - early July (after list discussions begin)
Offer a pub / forum for people looking to connect around the summer institute
Listserv threads + December ideas formalized / recapped in a pub
Zoom community forum - as needed (Aug/Sept)
List serve + December ideas recapped on pubpub
Steering committee - planning for call for Dec meeting
Call for submissions for Dec meeting. Deadline end of Sept.
For the committee: what do we want to use that time for?
If we want a separate bulletin board, implement now for the Fall.
Zoom community forum as needed
Announce Harvard Dataverse as an I³ repository - tbd on community demand
Steering Committee Mid-Oct: discuss submissions, set Dec program
Laura + SJ to connect w/ Heidi Williams prior to sharing the draft guidelines, so she has eyes on them and they can incorporate her comments.
Other content options - with corona - more likely to initiate in Spring 2021:
Bhaven had the wishlist item of I³ compiling a library of short How To videos - ideally member-created along a template, with topics such as:
How to work with Historical Patent Data - where are they, what are the unique issues? (compare Wiki Patent work)
How to link patent data globally? (flag related issues)
Workflow for updating the site
Ex: add the OECD datasets. How should we represent ‘external’ data?
Should we package it in some way beyond how it’s found online?
Commercial data: be neutral. Examples? point to a spectrum. Jay E asked about a G workshop on BigQuery + G.Pat.Data — Invite PatStat, others? MS, Orbis, Lens?
Invite Dataverse to run a hosted-storage/query workshop; invite others (Z, G/BQ).
Talk through a roadmap for the community
What insights or how-to steps (from Matt’s fine example) would be helpful to this community in terms of “next steps”?
Talk to the committee 1:1 w/ each member, then a group meeting
A checklist to consider: hierarchy of recs + timelines:
Core that are required; penumbra of strong, weak, qualified recs
One for openness (for creators), one for utility (poll users: Trinh Le)
Asides: who puts in the effort to validate, doc, test; set timelines?
How can we help/support people doing this well, in their career?
Increase professional kudos you get for this activity. [develop this!]
1. Awards; 2. invitations to speak; 3. publishing data-description papers (don’t just release the data; increase journals’ willingness to publish)
Are there socsci funders that fund data-construction + infra?
Similar orgs / luminaries to join/support/participate?
Would it be useful to keep a running census of datasets that members either have or want, and use it as a target list to triage?
[Start w/ committee, datasets referenced in the presentations, reach out to presenters? ~ Heidi’s how-tos]
Likewise: census of similar/related datasets [superseded, complementary, different/incomparable, …]
For the summer meeting agenda: [AB] test ways to ask attendees (at the big-circle meeting) to participate in our data workflow
Set up an I³ mailing list [Adam?]
Interviewing committee members:
Introducing Laura to the committee
Use cases + aspirations for each
Notes on data: from Matt and Phil Durbin
Emails after the convening: community notes and comments
Next followup to all attendees?
Specifics from the email feedback
Do we want to include a [final? end-of-life?] update to the original NBER dataset -- perhaps with pointers to how to regenerate updates oneself for future years? Mentioned in a past chat [SJ]
Ideas from the attendee emails.