
1. Summary

The Global Water Futures (GWF) program unifies heterogeneous research on Canada’s water resources. One of the most important legacies of this program will be the repository that aggregates the data collected by each project.  The data, spanning significant temporal and spatial domains, will be collected through observation, instrumentation, and human interaction; generated using various environmental models; and produced through analysis and visualization.  Properly archived, secured, and easily discoverable, this body of data will be the foundation of current and future research efforts. Consequently, our Data Management Strategy will ensure that QA/QC, security, processing, and data sharing procedures are standardized and comply with the GWF Data Management Policy.

The GWF Data Management team works with the Strategic Management Committee (SMC), Core Teams, and Projects to develop a process- and technology-driven framework that manages GWF-related data effectively and efficiently throughout the research cycle and beyond.  The Data Management team will also support and promote a culture of data stewardship within the program. This approach is key to the success of the program because science is, at its core, a data-driven endeavor.  The quality of the available data directly affects the quality of the science-based knowledge it can generate and the soundness of the decision-making processes that rest on that data and science.  The vast investment of time and money in collecting environmental data warrants significant attention to its preservation. To this end, the Data Management team will work with the SMC to champion a data governance strategy that includes a framework for developing data standards and processes to collect, manage, and preserve GWF data.  This will help streamline current practices, create efficiencies in current projects, and culminate in the required end-of-program data deliverables.


2. The Team

The GWF Data Management Team consists of 4 faculty leads, 4 data managers, and 2 support staff spread across the 4 major partner universities of the GWF program.  The faculty leads provide advice and direction to the data managers.  The data managers are tasked with developing and implementing the data management strategy.  We assist researchers and projects with the management of their data throughout the research lifecycle.  The data managers meet every 2 weeks to discuss topics of interest.  Our members liaise with experts from library and technology organizations (Portage & ICT) and with research ethics boards.

University: University of Saskatchewan | University of Waterloo | McMaster University | Wilfrid Laurier University

Faculty Lead: John Pomeroy | Jimmy Lin | Mike Waddington | Michael Steeleworthy

Data Manager: Bhaleka Persaud, Krysha Dukacz, Gopal Saha

Support Staff: Laleh Moradi (1), Stephen O'Hearn (2), Juliane Mai (3)

(1) Research Analyst; (2) IT Coordinator and Specialist and Data Management Team Lead; (3) Assistant Professor, Big Data Dissemination

3. Objectives

3a. Management of Data throughout the Research Lifecycle

GWF is a broad consortium of investigators from allied research fields, together with government, industry, Indigenous, and community partners, so its data management needs are diverse.  An overarching goal of the Data Management Team is to develop data management plans, workflows, methods, and strategies that standardize practices while leaving the different subject domains and stakeholder groups the flexibility they need to operate. Data management needs within GWF can be articulated in terms of planning, description / markup (metadata), storage / backup, security and access, training, and preservation and discovery activities.

Planning. As a large, interdisciplinary research program, GWF requires a significant amount of data management planning, both prior to and after project inception.  The success of GWF depends on identifying, categorizing, and addressing storage needs, security concerns, architecture and metadata requirements, access and transfer needs, and preservation and discovery platforms.

Description / markup (metadata).  Given the size and scope of the GWF program, workflows for metadata creation, description, and enrichment are required.  Project-level, data-level, and administrative-level metadata should be collected and, whenever possible, aligned with existing practices and standards.

Storage / backup. Securing centralized backup facilities to house the original as well as the processed and derived research data collections is of the utmost importance for the GWF program. Given the high number of co-Is and HQP, the different methods of data transfer, and the geographic distance from field to lab, facilities with efficient data recovery procedures would create a reliable work environment for the GWF research community.

Security and Access. GWF collects a large amount of sensitive data, including traditional knowledge and data collected with Indigenous partners, data from government and industry partners, and data collected from human participants.  There is a clear need for security provisions and embargoes, and for controlled access for the researchers, HQP, and community stakeholders working with the data.

Training.  HQP and Investigator data management training has been identified as a key Data Management requirement. 

3b. Data Governance 

In order to assist researchers in managing their data and to facilitate the consistency of data from across the program, structures will be developed to serve as guidance tools.  These include the following:

Define Best Practices and Processes. Working with the SMC and subject matter experts, develop best practices documents to guide researchers.

Formalize a structure of roles and responsibilities. Define the rights and obligations of participants (data embargoes, expectations for movement of data into the repository, etc.).

Communications.  Develop means of communicating data management information to researchers.

3c. Data Stewardship

To assist projects in meeting the goals of the Data Management Strategic plan, the Data Management team will work to equip researchers with the techniques and tools to properly steward their data throughout their research program.  Such initiatives will include:

Managing Data Assets. The GWF Data Management Team will develop a framework to manage and oversee the data assets in order to provide high-quality data in an accessible and consistent manner.

Preservation and Discovery. Providing the means and mode for the preservation and discovery of research data collected by GWF is a key requirement of the Data Management program.  

Data and Metadata Enrichment. The program needs practices and workflows for data and metadata enrichment prior to ingest, QA/QC patterns, and solutions for hosting, indexing, and enabling access to and sharing of data.

3d. Provision of Input and Assistance to Data Related Initiatives

The GWF Data Management Team will work with the Computer Science Team on the following initiatives:

Improvement of delivery of data and knowledge. Assistance in delivery of large datasets and complex model inputs and outputs to and from diverse stakeholders. Support in testing and delivering the specialized information queries as well as the visualization and interaction modules.

Improvement of management of sensor data. Help in developing an efficient approach to integrating spatially and temporally dense environmental data streams from terrestrial and remote sensing monitoring platforms in order to support science-based decision making for end-users from diverse geographic settings.

Data Management toolset and framework.  Help in development of the framework and tools to support an at-scale system that is responsible for storing, managing, processing and analyzing data collected throughout the life of the project.

GWF Cloud Components. Help in building an understanding of existing and future data to inform the development of the PaaS (Platform-as-a-Service), which provides a data management platform for collecting, storing, integrating, sharing, and managing GWF data. Support for the SaaS (Software-as-a-Service) platform for analyzing, visualizing, and supporting scientific workflows for GWF, and data-related support for mobile end-user and citizen science applications. Carrying out all activities necessary to ensure the data is appropriately integrated into the GWF cloud.

The GWF Data Management Team will work with the Core Modelling Team on the following initiatives:

Input / Output / Format. Obtaining and managing the inputs and outputs of the environmental models; assisting as needed with the data extraction and archival processes; assisting as needed in providing standardized input and output formats for the most common models.

Data Access Support. Providing support to create accounts and access various storage and processing resources on the Compute Canada Graham allocation designated for GWF use; providing support to access other processing and archival space as well as services available through the University of Saskatchewan, McMaster University, Wilfrid Laurier University, and University of Waterloo networks.

In summary, the Data Management Team will work with Core Teams and Individual Projects to identify or develop procedures and processes to assist researchers in managing their data. The team, in collaboration with other personnel, will also work to support Reproducible Research through workshops on managing and reusing the scripts for various levels of data processing. Additional workshops will be offered to help researchers move toward a tidy data (or similar) approach that allows data to be more easily ingested and shared.
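
As an illustration of the tidy data idea mentioned above, the following minimal sketch reshapes a small "wide" table (one column per date) into one row per observation. The station names, dates, values, the `water_level_m` column name, and the `to_tidy` helper are all hypothetical and are not drawn from any GWF dataset or tool.

```python
# Hypothetical wide-format table: one row per station, one column per date.
wide_rows = [
    {"station": "A", "2020-06-01": 1.2, "2020-06-02": 1.4},
    {"station": "B", "2020-06-01": 0.8, "2020-06-02": None},  # None marks a missing reading
]


def to_tidy(rows, id_column, value_name):
    """Reshape wide rows into tidy rows: one observation per row."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every tidy row instead
            tidy.append({id_column: row[id_column], "date": column, value_name: value})
    return tidy


tidy_rows = to_tidy(wide_rows, "station", "water_level_m")
for record in tidy_rows:
    print(record)
```

In the tidy form every row carries the same three fields, so records from different stations or projects can be appended, filtered, and validated with the same script.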

4. Contact Us

Please direct questions, ideas, and/or concerns to your local GWF data manager from the list below.  We look forward to assisting you with implementing data management processes in your research.
