Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Pages

Posts

The Research Software Alliance (ReSA) and the Community Landscape

5 minute read

Published:

(This post is cross-posted on the UK Software Sustainability Institute blog, the Netherlands eScience Center blog and the US Research Software Sustainability Institute blog.) ReSA’s mission is to bring research software communities together to collaborate on the advancement of research software. Its vision is to have research software recognized and valued as a fundamental and vital component of research worldwide. Given our mission, there are multiple reasons that it’s important for us to understand the landscape of communities that are involved with software, in aspects such as preservation, citation, career paths, productivity, and sustainability. One of these reasons is that ReSA seeks to be a link between these communities, which requires identifying and understanding them. We want to be sure that there aren’t significant community organizations that we don’t know about to involve in our work. Also, identifying where there are gaps will help us create opportunities and communities of practice as required. When thinking about these communities, it’s clear that in addition to those that focus on software, there are others for which software is just a small part of their interest. Some examples are communities that focus on open science, reproducibility, roles and careers for people who are less visible in research, publishing and review, and other types of scholarly products and digital objects. ReSA also wants to define how we fit into and interact with that broader scholarly landscape.

How was this work undertaken?

In September 2019, a ReSA taskforce consisting of the authors of this blog post came together to map the software community landscape. This group distributed a survey to ReSA Google group members to identify other groups interested in software. Other useful sources were also consulted. The taskforce then met to consider the results and how to analyze them. The ReSA list of research software communities is now publicly available as a living community resource, with the version of this list used by the ReSA taskforce in February 2020 and a copy of this post archived in Zenodo. Additions or corrections can be suggested by commenting in the list. Some of the issues we’ve had in assembling this list are:
  • How much interest in software does an organization need to have to be listed?
  • When is an organization sufficiently research focused to be included?
  • What momentum/scale does an organization need to have so that we consider it relevant in the global picture?
On the other hand, once we started adding entries to the list, for many we found that we immediately thought of other similar organizations that should be added. For example, some organizations have a geographic aspect, and this led us to think of other similar organizations with different geographic aspects, such as all the national and regional RSE associations.

What did we learn?

The analysis produced a range of interesting outcomes:
  • There are many, many communities that support research software, emphasizing the need for a coordinating organization such as ReSA. The importance of community development is captured in articles such as Community Organizations: Changing the Culture in Which Research Software is Developed and Sustained by Daniel S. Katz et al., which provides an overview of key groups and discusses opportunities to leverage their synergistic activities.
  • There is an increasing (and wide) range of community initiatives. For example, the Open Science Grassroots Community Networks list has evolved into the Community of Open Scholarship Grassroots Networks (COSGN), whose networks communicate and coordinate on topics of common interest. COSGN has submitted an NSF proposal to formalize governance and coordination of the networks to maximize impact and establish standard practices for sustainability.
  • The increasing focus on open software makes it hard to separate research and non-research initiatives. As per the points above, it is very hard to define which initiatives are part of the research software community, and which aren’t.
  • Some organizations that were originally data-centric now include a software focus. For example, the Research Data Alliance now includes the Software Source Code Interest Group, which provides a forum to discuss issues on management, sharing, discovery, archiving, and provenance of software source code.

What are the next steps?

We invite readers to continue to add or make corrections to the ReSA list of research software communities by making comments in the list, which will continue to be curated by ReSA. We are also interested to hear from community members who would like to engage with us in writing a landscape paper based on further analysis and work. This could address questions such as what are the axes that create the space, where do the currently-known organizations fit in the space, and are there gaps where no organization is currently working? We also invite readers to consider involvement in other ReSA activities, including Taskforces.

Conclusion

The ever-growing number of constituents of the research software community both reflects and demonstrates the increasing recognition of research software. The research software community is now a complex ecosystem composed of a wide variety of organizations and initiatives, some of which are community networks themselves. Collaboration and coordination across these initiatives are important to enable the broader community to work together to achieve bigger goals. ReSA aims to coordinate across these efforts to leverage investments and to achieve the shared long-term goal of research software being valued as a fundamental and vital component of research worldwide. Join the ReSA Google group to stay up-to-date on our activities. Read more

portfolio

ExPaNDS

The ambitious ExPaNDS project is a collaboration between 10 national Photon and Neutron Research Infrastructures (PaN RIs) as well as EGI. The project aims to deliver standardised, interoperable, and integrated data sources and data analysis services for Photon and Neutron facilities.
Read more

publications

The FAIR Guiding Principles for scientific data management and stewardship

Published in Scientific Data, 2016

This is the first formalisation of the FAIR guiding principles for data management and stewardship, which aim at making data Findable, Accessible, Interoperable and Reusable (FAIR). Read more

Recommended citation: Wilkinson, Mark D. and Dumontier, Michel and Aalbersberg, IJsbrand Jan and Appleton, Gabrielle and Axton, Myles and Baak, Arie and Blomberg, Niklas and Boiten, Jan-Willem and da Silva Santos, Luiz Bonino and Bourne, Philip E. and Bouwman, Jildau and Brookes, Anthony J. and Clark, Tim and Crosas, Mercè and Dillo, Ingrid and Dumon, Olivier and Edmunds, Scott and Evelo, Chris T. and Finkers, Richard and Gonzalez-Beltran, Alejandra and Gray, Alasdair J. G. and Groth, Paul and Goble, Carole and Grethe, Jeffrey S. and Heringa, Jaap and ’t Hoen, Peter A. C and Hooft, Rob and Kuhn, Tobias and Kok, Ruben and Kok, Joost and Lusher, Scott J. and Martone, Maryann E. and Mons, Albert and Packer, Abel L. and Persson, Bengt and Rocca-Serra, Philippe and Roos, Marco and van Schaik, Rene and Sansone, Susanna-Assunta and Schultes, Erik and Sengstag, Thierry and Slater, Ted and Strawn, George and Swertz, Morris A. and Thompson, Mark and van der Lei, Johan and van Mulligen, Erik and Velterop, Jan and Waagmeester, Andra and Wittenburg, Peter and Wolstencroft, Katherine and Zhao, Jun and Mons, Barend. "The FAIR Guiding Principles for scientific data management and stewardship", Scientific Data, https://doi.org/10.1038/sdata.2016.18

Data discovery with DATS: exemplar adoptions and lessons learned

Published in Journal of the American Medical Informatics Association, 2017

This paper analyses the implementation of the DATS model for data discovery in a set of exemplar data sources. Read more

Recommended citation: Alejandra N Gonzalez-Beltran, John Campbell, Patrick Dunn, Diana Guijarro, Sanda Ionescu, Hyeoneui Kim, Jared Lyle, Jeffrey Wiser, Susanna-Assunta Sansone, Philippe Rocca-Serra. "Data discovery with DATS: exemplar adoptions and lessons learned" Journal of the American Medical Informatics Association, Volume 25, Issue 1, 1 January 2018, Pages 13–16, https://doi.org/10.1093/jamia/ocx119

PhenoMeNal: processing and analysis of metabolomics data in the cloud

Published in GigaScience, 2018

This paper presents PhenoMeNal, a cloud e-infrastructure solution for analysing metabolomics data. It provides easy-to-use web interfaces that can be scaled to any custom public and private cloud environment. Read more

Recommended citation: Kristian Peters, James Bradbury, Sven Bergmann, Marco Capuccini, Marta Cascante, Pedro de Atauri, Timothy M D Ebbels, Carles Foguet, Robert Glen, Alejandra Gonzalez-Beltran, Ulrich L Günther, Evangelos Handakas, Thomas Hankemeier, Kenneth Haug, Stephanie Herman, Petr Holub, Massimiliano Izzo, Daniel Jacob, David Johnson, Fabien Jourdan, Namrata Kale, Ibrahim Karaman, Bita Khalili, Payam Emami Khoonsari, Kim Kultima, Samuel Lampa, Anders Larsson, Christian Ludwig, Pablo Moreno, Steffen Neumann, Jon Ander Novella, Claire O'Donovan, Jake T M Pearce, Alina Peluso, Marco Enrico Piras, Luca Pireddu, Michelle A C Reed, Philippe Rocca-Serra, Pierrick Roger, Antonio Rosato, Rico Rueedi, Christoph Ruttkies, Noureddin Sadawi, Reza M Salek, Susanna-Assunta Sansone, Vitaly Selivanov, Ola Spjuth, Daniel Schober, Etienne A Thévenot, Mattia Tomasoni, Merlijn van Rijswijk, Michael van Vliet, Mark R Viant, Ralf J M Weber, Gianluigi Zanetti, Christoph Steinbeck; PhenoMeNal: processing and analysis of metabolomics data in the cloud, GigaScience, Volume 8, Issue 2, 1 February 2019, giy149, https://doi.org/10.1093/gigascience/giy149

Discovering Data Access and Use Requirements Using the Data Tag Suite (DATS)

Published in bioRxiv, 2019

This paper is about the representation of data access and data use requirements for the Data Tag Suite (DATS) model. Read more

Recommended citation: Discovering Data Access and Use Requirements Using the Data Tag Suite (DATS) Model. George Alter, Alejandra Gonzalez-Beltran, Lucila Ohno-Machado, Philippe Rocca-Serra. bioRxiv 518571; doi: https://doi.org/10.1101/518571

Interoperable and scalable data analysis with microservices: applications in metabolomics

Published in Bioinformatics, 2019

Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed using the Kubernetes container orchestrator. Read more

Recommended citation: Payam Emami Khoonsari, Pablo Moreno, Sven Bergmann, Joachim Burman, Marco Capuccini, Matteo Carone, Marta Cascante, Pedro de Atauri, Carles Foguet, Alejandra N Gonzalez-Beltran, Thomas Hankemeier, Kenneth Haug, Sijin He, Stephanie Herman, David Johnson, Namrata Kale, Anders Larsson, Steffen Neumann, Kristian Peters, Luca Pireddu, Philippe Rocca-Serra, Pierrick Roger, Rico Rueedi, Christoph Ruttkies, Noureddin Sadawi, Reza M Salek, Susanna-Assunta Sansone, Daniel Schober, Vitaly Selivanov, Etienne A Thévenot, Michael van Vliet, Gianluigi Zanetti, Christoph Steinbeck, Kim Kultima, Ola Spjuth, Interoperable and scalable data analysis with microservices: applications in metabolomics, Bioinformatics, btz160, https://doi.org/10.1093/bioinformatics/btz160

FAIRsharing as a community approach to standards, repositories and policies

Published in Nature Biotechnology, 2019

Read more

Recommended citation: FAIRsharing as a community approach to standards, repositories and policies. Susanna-Assunta Sansone, Peter McQuilton, Philippe Rocca-Serra, Alejandra Gonzalez-Beltran, Massimiliano Izzo, Allyson L. Lister, Milo Thurston & the FAIRsharing Community. Nat Biotechnol. 2019 Apr;37(4):358-367. doi: https://doi.org/10.1038/s41587-019-0080-8

Software Citation Checklist for Authors

Published in Zenodo, 2019

This document provides a simple, generic checklist that authors of academic work (papers, books, conference abstracts, blog posts, etc.) can use to ensure they are following good practice when referencing and citing software they have used, both created by themselves for their research as well as obtained from other sources. It may also be used and adapted by journal editors, publishers and conference chairs as the basis of more specific guidance for their contributors and reviewers. Read more

Recommended citation: Chue Hong, Neil P., Allen, Alice, Gonzalez-Beltran, Alejandra, de Waard, Anita, Smith, Arfon M., Robinson, Carly, … Pollard, Tom. (2019, October 15). Software Citation Checklist for Authors (Version 0.9.0). Zenodo. https://doi.org/10.5281/zenodo.3479199

Special Issue on Scholarly Data Analysis (Semantics, Analytics, Visualisation)

Published in Data Science Journal, 2019

The increasing interest in analysing, describing, and improving the research process requires the development of new forms of scholarly data publication and analysis that integrate lessons and approaches from the fields of Semantic Technologies, Science of Science, Digital Libraries, and Artificial Intelligence. This editorial summarises the content of the Special Issue on Scholarly Data Analysis (Semantics, Analytics, Visualisation), which aims to showcase some of the most interesting research efforts in the field. This issue includes extended versions of the best papers of the last two editions of the “Semantics, Analytics, Visualisation: Enhancing Scholarly Dissemination” (SAVE-SD 2017 and 2018) workshop at The Web Conference. Read more

Recommended citation: https://content.iospress.com/journals/data-science/2/1-2

The Data Tags Suite (DATS) model for discovering data access and use requirements

Published in GigaScience journal, 2020

This paper is about the representation of data access and data use requirements for the Data Tag Suite (DATS) model. Read more

Recommended citation: George Alter, Alejandra Gonzalez-Beltran, Lucila Ohno-Machado, Philippe Rocca-Serra, The Data Tags Suite (DATS) model for discovering data access and use requirements, GigaScience, Volume 9, Issue 2, February 2020, giz165, https://doi.org/10.1093/gigascience/giz165 https://doi.org/10.1101/518571

COPO: a metadata platform for brokering FAIR data in the life sciences

Published in F1000, 2020

COPO is a computational system that attempts to address some of these challenges by enabling scientists to describe their research objects (raw or processed data, publications, samples, images, etc.) using community-sanctioned metadata sets and vocabularies, and then use public or institutional repositories to share them with the wider scientific community. Read more

Recommended citation: Shaw F, Etuk A, Minotto A et al. COPO: a metadata platform for brokering FAIR data in the life sciences [version 1; peer review: awaiting peer review]. F1000Research 2020, 9:495 https://doi.org/10.12688/f1000research.23889.1

Draft extended data policy framework for Photon and Neutron RIs

Published in Zenodo, 2020

We review the FAIR data policy landscape at European and national levels, consider the current state of data policy adoption and implementation at ExPaNDS partner facilities, and examine existing FAIR ecosystem data policy recommendations, in particular, from the Turning FAIR into reality report and the recent FAIRsFAIR Deliverable 3.3: Policy enhancement recommendations. In response, we make twenty-six recommendations of our own that serve to translate these recommendations to the local level of photon and neutron research infrastructures. Read more

Recommended citation: Matthews, Brian, McBirnie, Abigail, Vukolov, Andrei, Ashton, Alun, Collins, Stephen, Da Graca Ramos, Sylvie, Gagey, Brigitte, Gonzalez-Beltran, Alejandra, Johnsson, Maria, Krahl, Rolf, Ounsy, Majid and Van Daalen, Mirjam 2020. Draft extended data policy framework for Photon and Neutron RIs. Zenodo https://doi.org/10.5281/zenodo.4014811

Report on status, gap analysis and roadmap towards harmonised and federated metadata catalogues for EU national Photon and Neutron RIs

Published in Zenodo, 2020

The ExPaNDS project aims at deploying data catalogues and data analysis services into EOSC. This document describes the status, a gap analysis, and a roadmap required to achieve harmonised and federated (meta)data catalogues within EOSC of the participating national Photon and Neutron (PaN) Research Infrastructures (RIs). Read more

Recommended citation: Ashton, Alun, Da Graca Ramos, Sylvie & Gonzalez-Beltran, Alejandra. Report on status, gap analysis and roadmap towards harmonised and federated metadata catalogues for EU national Photon and Neutron RIs. (Zenodo, 2020). https://doi.org/10.5281/zenodo.4146819

Fostering global data sharing: highlighting the recommendations of the Research Data Alliance COVID-19 working group

Published in Wellcome Open Research, 2020

The systemic challenges of the COVID-19 pandemic require cross-disciplinary collaboration in a global and timely fashion. Such collaboration needs open research practices and the sharing of research outputs, such as data and code, thereby facilitating research and research reproducibility and timely collaboration beyond borders. Read more

Recommended citation: Austin CC, Bernier A, Bezuidenhout L et al. Fostering global data sharing: highlighting the recommendations of the Research Data Alliance COVID-19 working group [version 2; peer review: 1 approved, 2 approved with reservations]. Wellcome Open Res 2021, 5:267 (https://doi.org/10.12688/wellcomeopenres.16378.2) https://doi.org/10.12688/wellcomeopenres.16378.1

Ten Simple Rules for making a vocabulary FAIR

Published in arXiv, 2020

We present ten simple rules that support converting a legacy vocabulary – a list of terms available in a print-based glossary or table not accessible using web standards – into a FAIR vocabulary. Various pathways may be followed to publish the FAIR vocabulary, but we emphasise particularly the goal of providing a distinct IRI for each term or concept. A standard representation of the concept should be returned when the individual IRI is de-referenced, using SKOS or OWL serialised in an RDF-based representation for machine-interchange, or in a web-page for human consumption. Guidelines for vocabulary and item metadata are provided, as well as development and maintenance considerations. By following these rules you can achieve the outcome of converting a legacy vocabulary into a standalone FAIR vocabulary, which can be used for unambiguous data annotation. In turn, this increases data interoperability and enables data integration. Read more

Recommended citation: Simon J D Cox, Alejandra N Gonzalez-Beltran, Barbara Magagna, Maria-Cristina Marinescu. "Ten Simple Rules for making a vocabulary FAIR" https://arxiv.org/abs/2012.02325

ExPaNDS ontologies v1.0

Published in Zenodo, 2021

We present ontologies for the domain of photon and neutron (PaN) science. With the primary goal of supporting PaN FAIR data catalogue services, we have developed three ontologies: PaN experimental techniques (PaNET), an ontology of NeXus definitions (NeXusOntology), and a semantic integration ontology for the PaN domain (PaNmapping). The ontologies are presented as initial versions, supported by community development workflows. The work represents deliverable D3.2 of the Horizon 2020 ExPaNDS project. Read more

Recommended citation: Collins, Steve P., da Graça Ramos, Silvia, Iyayi, Daniel, Görzig, Heike, González Beltrán, Alejandra, Ashton, Alun, Egli, Stefan, and Minotti, Carlo, 2021, ExPaNDS ontologies v1.0: Zenodo, doi:10.5281/zenodo.4806026. https://doi.org/10.5281/zenodo.4806026

Radical collaboration during a global health emergency: development of the RDA COVID-19 data sharing recommendations and guidelines

Published in Open Research Europe, 2021

The purpose of the present work was to explore how the RDA succeeded in engaging the participation of its community of scientists in a rapid response to the EC request. The three constructs of radical collaboration (inclusiveness, distributed digital practices, productive and sustainable collaboration) were found to be well supported in both the quantitative and qualitative analyses of the survey data. Other social factors, such as motivation and group identity were also found to be important to the success of this extreme collaborative effort. Read more

Recommended citation: Pickering B, Biro T, Austin CC et al. Radical collaboration during a global health emergency: development of the RDA COVID-19 data sharing recommendations and guidelines [version 1; peer review: awaiting peer review]. Open Research Europe 2021, 1:69 (https://doi.org/10.12688/openreseurope.13369.1)

Ten Simple Rules for making a vocabulary FAIR

Published in PLoS Computational Biology, 2021

We present ten simple rules that support converting a legacy vocabulary—a list of terms available in a print-based glossary or in a table not accessible using web standards—into a FAIR vocabulary. Various pathways may be followed to publish the FAIR vocabulary, but we emphasise particularly the goal of providing a globally unique resolvable identifier for each term or concept. A standard representation of the concept should be returned when the individual web identifier is resolved, using SKOS or OWL serialised in an RDF-based representation for machine-interchange and in a web-page for human consumption. Guidelines for vocabulary and term metadata are provided, as well as development and maintenance considerations. The rules are arranged as a stepwise recipe for creating a FAIR vocabulary based on the legacy vocabulary. By following these rules you can achieve the outcome of converting a legacy vocabulary into a standalone FAIR vocabulary, which can be used for unambiguous data annotation. In turn, this increases data interoperability and enables data integration. Read more

Recommended citation: Cox SJD, Gonzalez-Beltran AN, Magagna B, Marinescu MC (2021) Ten simple rules for making a vocabulary FAIR. PLOS Computational Biology 17(6): e1009041. https://doi.org/10.1371/journal.pcbi.1009041
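
To make the central idea of this paper concrete (one resolvable identifier per term, with a SKOS description serialised in an RDF format), here is a minimal sketch in Python. It is not taken from the paper: the base IRI, the two glossary entries and the helper names are all invented for illustration, and the Turtle is assembled by hand with the standard library only. A real project would more likely use an RDF library and follow the full set of rules, including vocabulary-level metadata.

```python
from urllib.parse import quote

# Hypothetical base IRI and legacy glossary, invented for this illustration.
BASE_IRI = "https://example.org/vocab/"

LEGACY_GLOSSARY = {
    "Sample holder": "A device that keeps a sample in position.",
    "Detector": "An instrument that records a signal.",
}


def term_iri(label: str) -> str:
    """Build a distinct IRI for a term by slugifying its label."""
    slug = quote(label.strip().lower().replace(" ", "-"))
    return f"{BASE_IRI}{slug}"


def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes and newlines for a Turtle string."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def to_skos_turtle(glossary: dict) -> str:
    """Serialise a {label: definition} glossary as SKOS concepts in Turtle."""
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "",
        f"<{BASE_IRI}> a skos:ConceptScheme .",
    ]
    for label, definition in glossary.items():
        lines += [
            "",
            f"<{term_iri(label)}> a skos:Concept ;",
            f'    skos:prefLabel "{escape_literal(label)}"@en ;',
            f'    skos:definition "{escape_literal(definition)}"@en ;',
            f"    skos:inScheme <{BASE_IRI}> .",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_skos_turtle(LEGACY_GLOSSARY))
```

Running the script prints one `skos:Concept` block per glossary entry, each with its own IRI under the assumed base. Making those IRIs actually resolve on the web is a hosting concern that this sketch does not cover.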

service

talks

What was the plan? A role for data standards, models and computational workflows in scholarly data publishing

Published:

This talk explores how principles derived from experimental design practice, data and computational models can greatly enhance data quality, data generation, data reporting, data publication and data review. For this, I presented a case study on reproducibility that was a collaboration between the GigaScience journal and the ISA-commons, Research Object and Nanopublication communities. You can read more about my presentation in Scott Edmunds’s blog post for the GigaScience journal, and my slides can be found below. Read more

EBI Metagenomics Bioinformatics Course

Published:

Within the Metagenomics Bioinformatics Course, Eamonn Maguire and I gave a tutorial on “Metagenomic Data Provenance and Management using the ISA infrastructure — overview, implementation patterns & software tools”. Read more

Better software + better data = better research

Published:

In April 2019, I gave this talk at CIFASIS. CIFASIS (in Spanish, Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas) is a research institute of the Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), the National Council for Science and Technology of Argentina. Read more

Research Reproducibility

Published:

I was invited to give a talk on “Research Reproducibility” during the “Research Day” of the EPSRC- and MRC-funded Oxford-Nottingham Centre for Doctoral Training in Biomedical Imaging (ONBI CDT). Read more

teaching

EBI Metagenomics Bioinformatics Course

Training, EMBL-EBI, 2014

Within the Metagenomics Bioinformatics Course, Eamonn Maguire and I gave a tutorial on “Metagenomic Data Provenance and Management using the ISA infrastructure — overview, implementation patterns & software tools”. Read more