In September-October, Openscapes led a 2-month Champions Cohort with the National Oceanic and Atmospheric Administration (NOAA) National Marine Fisheries Service (NMFS), working with over 30 fisheries scientists across four NMFS fisheries science centers. These scientists were interested in exploring new approaches to scientific and data workflows, data analysis and stewardship, and project management, particularly as they apply to the complex analyses and reports that involve diverse teams and data flows.
This cohort followed one we led this spring for the Northwest Fisheries Science Center (NWFSC) and was co-led by Dr. Eli Holmes, who participated in that spring cohort and has a long history of open science leadership and teaching at NOAA and beyond. This post focuses on how the cohort was set up and what participants achieved. It is co-authored with Eli, Eric Ward, and Hélène Scalliet from NOAA NWFSC, who coordinated the cohort, and Corey Clatterbuck from California Waterboards, who assisted. This opportunity is funded by participating science centers and coordinated by NWFSC. See openscapes.org/champions for more background on the Champions program.
Quick links:
- openscapes.github.io/2021-noaa-nmfs - Cohort webpage
- github.com/nmfs-openscapes - New NMFS GitHub Organization for coordinating and highlighting open science at NOAA NMFS
- github.com/nmfs-openscapes/.github/wiki - Wiki with examples and onboarding resources of open science at NOAA NMFS
NMFS Cohort setup and structure
We worked with Eric Ward earlier this year to lead the Northwest Fisheries Science Center (NWFSC) cohort with fisheries researchers from Seattle. We were keen to lead a second cohort to follow up and broaden participation in Openscapes, and were able to include researchers from four fisheries science centers: Southeast (SEFSC), Northeast (NEFSC), Northwest (NWFSC), and Alaska (AFSC). We iterated on the Champions program based on what we learned from leading three cohorts in the spring, as we described in a separate post about Fall Cohorts. We were excited about this mix of new and returning champions and eager to support them in making progress on their open data science pathways, no matter where they were starting from.
During the Openscapes Champions sessions, the entire cohort discussed inefficiencies, reproducibility and documentation problems, and other team roadblocks in their current scientific workflows. The teaching component of each session covered reproducible workflows, team culture, and structuring open team communities that can collaborate effectively with their current and future selves.
Two guest teachers shared their open data science efforts at NOAA. Emily Markowitz (NOAA AFSC) shared how the AFSC is using R Markdown to automate stock assessment reports, improving efficiency and reproducibility and reducing report errors (slides: Data to Product Workflows). Chanté Davis (NOAA Fisheries’ West Coast Region) shared how the West Coast Region (WCR) has approached staff training in R to increase efficiency in developing WCR reports and responding to data requests (slides: Expanding OuR Communities). Emily said that with R Markdown workflows for making reports, “We will never guess again”, and that they save so much time and emailing to track things down. Chanté said “Don’t R alone!” and that R has to be in your work plan. She suggests approaching and mapping project management and workflows like a recipe: What is the end product? What are the steps to create the end product? Which of these steps can be automated?
In between the Champions sessions, the cohort learned the ins and outs of using RStudio, Git, and GitHub for team and project management through clinics and 1-on-1 sessions with Eli. Each team also met separately during Seaside Chats (AFSC example) to work on their own goals for improving the team workflow, using GitHub Issues and Pathways documents to organize together.
What did the participants achieve?
This cohort was an opportunity for NOAA staff to explore and practice open data science approaches, but it was also an opportunity to meet and build relationships with folks from different fisheries science centers. It was exciting to see participants discuss common pain points and learn from each other about strategies that could be reused across centers. Discussion topics included how diverse teams can use GitHub together and organize R code, as well as the differences between final NOAA products and ongoing work/projects, and how open science relates to both.
Many topics/themes that resonated with the 2021 NOAA NMFS Cohort overlapped and reinforced each other. Here are a few examples:
Documenting onboarding & offboarding processes. Investing time to formally document onboarding and offboarding processes (for federal employees as well as contractors) was a big topic. Both are important for capturing institutional knowledge and critical for preparing teams both when folks leave their positions and when new folks join. Written documents help others get up to speed faster and find available datasets, data description guides, and methodologies. It’s critical that they are easily accessible and updatable “living documents”, especially for core or team processes that may occur infrequently.
GitHub projects. While multiple tools for project collaboration and management exist, this cohort was particularly enthusiastic to learn more about GitHub projects. The cohort noticed that GitHub projects are easily used by team members who have different skill sets, job titles, and areas of expertise. Teams started offering other ways to contribute, including discussion boards, GitHub issues, and Slack. These tools enable asynchronous collaboration that goes “beyond the meetings” and can improve hybrid working conditions. GitHub’s Project Boards (beta) were released just as our cohort ended, and Paul McElhaney (NWFSC) shared examples of how they could be used by NOAA teams. Some potential uses for GitHub Projects included:
- Organize and track data request tasks
- Coordinate on/offboarding efforts
- Hands-on workflow practice using GitHub
Strategies for community building. Collaboration is essential within teams and with external partners, but collaborating with people around new technology is challenging and takes time. Discussing next steps for carrying this work forward, one participant summarized it as “not just skills, also culture”, and said that helping folks get excited is worth the effort. Strategies included show-by-doing (screensharing), co-working, office hours, and breaking down silos by reaching beyond the people we are used to working with (and beyond job titles). This includes sharing work earlier (not waiting for work to be perfect-ish before sharing) and building better shared relationships with Google folders.
“We want to change ‘our next steps’ to ‘our now’” - NMFS team member
nmfs-openscapes GitHub Organization. Eli Holmes (NWFSC) and Em Markowitz (AFSC) have spearheaded the nmfs-openscapes GitHub Organization to help coordinate and give visibility to all the open data science work at NMFS and beyond (see R-Govys). We especially love the Wiki, a growing resource with examples of GitHub in gov and project boards.
The NOAA-NMFS Champions Cohort was a really thoughtful and dedicated group, and we are so excited for the momentum at NOAA NMFS. The following is a bit more about each NMFS team in the Cohort. And look out for further blogs about what the Openscapes team learned and what’s coming next.
NMFS Cohort teams
The NWFSC Eco/Stock Assessment Team. This team is working on analyses and visualization of fisheries-dependent and -independent data to inform both groundfish stock assessments and integrated ecosystem assessments (IEA). Potential projects include 1) analysis of ecosystem drivers and associated responses, 2) estimation of species-specific habitats in the context of fishing gear utilization, 3) incorporation of ecosystem considerations into stock assessments, 4) visualization of IEA indicator distributions and trends, and/or 5) development of a strata explorer app.
The NWFSC FEAT Team. This team will focus on streamlining data tracking/sharing/processing within the Fisheries Engineering and Acoustic Technologies (FEAT) team, improving onboarding/offboarding of personnel, and coordinating the Pacific hake biomass calculation dataflow for stock assessors, MSE, and other interested parties.
The NWFSC Protected Salmonids Team. This team will be working on improving data workflows related to protected PNW salmonids. The team's top priority is more robust data workflows, including 1) better data tracking, 2) personnel onboarding systems, 3) preventing the loss of critical connections with our diverse data partners when staff retire or leave, and 4) implementing more automated data workflows. Team members are working on the PNW Viability Reports and on the salmon bycatch RM&E program in the west coast groundfish fisheries.
The NWFSC WRAP Team is a cross-divisional team from the Fish Ecology and Fisheries Resource and Monitoring divisions. This team is working to coordinate sharing of data for marine ecosystem and life cycle modeling projects.
The AFSC Team is from the Resource Assessment and Conservation Engineering (RACE) division. This team will be working to improve data, code, and report workflows for survey data. Data: curation, processing, and dissemination, including implementing continuous integration. Code: improving collaboration on code packages for survey data processing and prep, and streamlining code development and sharing. Reports: automating reports, collaborating on proposal writing, and automating standard data products.
The SEFSC Team is part of the SouthEast Data, Assessment, and Review (SEDAR) in the Sustainable Fisheries division at SEFSC. This team will be working on improving data workflow, automating reports, and collaborating on packages and functions for assessment model output and diagnostics for Southeast stock assessments for the Gulf of Mexico, Caribbean, and Atlantic.
The NEFSC Team is involved in collecting fisheries-dependent data in collaboration with the fishing industry. The team is interested in strengthening open science approaches to increase the value of these data by opening the door to new users who have previously been unable to use the datasets due to their complex nature.