Streamlined data consolidation: Users will be able to upload their raw data files directly to an extended version of the LONI infrastructure, where the files will automatically be classified, converted, and annotated. Automating much of this process will spare researchers, whether they are uploading or downloading data, the time and effort currently involved in accessing and sharing epilepsy data. This streamlined data consolidation will increase the financial efficiency and scientific productivity of the CWOW and the broader epilepsy research community. In addition, physical samples will be brought together in a single biobank, further reducing coordination challenges. Many of these data have already been collected by the PIs presenting this proposal; however, their sheer size, combined with the many different file formats, currently makes effective navigation labor-intensive and error-prone.
User-friendly data search and navigation: By converting data to consistent file formats and tagging them with metadata, the CWOW will enable Google-style search of all available epilepsy data. However, because data will be interlinked and co-registered across data sets and modalities, the search functionality will not simply match data against individual items as Google does; rather, it will find interlinked combinations of data (even across modalities and data sources) that match the desired criteria. This enables sophisticated custom searches that match the functionality of predefined query forms. Users will be able to browse data in their most appropriate visual representation and pivot from one data view or modality to another. Through our experience with LONI, we have learned which access control and sharing mechanisms the community requires and how to enable inter-project as well as community-scale data sharing effectively. Key components include giving users explicit access control over their data and results and providing project groups for larger-scale permissions management. Furthermore, comparable tools developed for LONI as part of our planning grant and projects such as ADNI can be repurposed.
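As a minimal sketch of what searching for interlinked combinations could look like, the following example returns only those subjects for which every per-modality criterion is satisfied by some linked record. The `Record` type, the use of a shared `subject_id` as the link key, and the tag vocabulary are all assumptions made for illustration; they do not describe the actual LONI data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical metadata record; field names are illustrative only."""
    record_id: str
    subject_id: str   # assumed link key shared across modalities
    modality: str     # e.g. "EEG", "MRI"
    tags: frozenset   # metadata tags attached at upload time


def linked_search(records, criteria):
    """Find subjects whose linked records satisfy every modality criterion.

    criteria maps a modality name to the set of tags a record of that
    modality must carry. Returns {subject_id: {modality: record_id}}.
    """
    by_subject = defaultdict(list)
    for rec in records:
        by_subject[rec.subject_id].append(rec)

    results = {}
    for subject, recs in by_subject.items():
        combo = {}
        for modality, required in criteria.items():
            hit = next(
                (r for r in recs
                 if r.modality == modality and required <= r.tags),
                None,
            )
            if hit is None:
                break  # this subject fails one criterion; skip it
            combo[modality] = hit.record_id
        else:
            results[subject] = combo
    return results


records = [
    Record("e1", "rat01", "EEG", frozenset({"hfo"})),
    Record("m1", "rat01", "MRI", frozenset({"hippocampal_lesion"})),
    Record("e2", "rat02", "EEG", frozenset({"hfo"})),
    Record("m2", "rat02", "MRI", frozenset({"normal"})),
]
matches = linked_search(
    records, {"EEG": {"hfo"}, "MRI": {"hippocampal_lesion"}}
)
```

Here only `rat01` is returned, because `rat02` matches the EEG criterion but not the MRI one. An item-by-item search would have returned both EEG records without regard to the linked imaging.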
Automated analysis: The LONI Pipeline contains a common framework for visual and programmatic construction of data-driven workflows for electrophysiology, imaging, and biosample data. We will customize these tools for the study of epilepsy data. With the aid of LONI’s workflow builder, complex analyses are represented visually, further supporting researchers’ investigations. Examples of Pipeline applications include developing a unified coordinate space for seizure locations across organisms, using string similarity and value overlap to predict when metadata fields from different contributors are the same, and providing graphical interfaces for linking data. Co-registration algorithms will typically be invoked at upload time but may be triggered manually later for further refinement. The IAC will supervise and integrate MRI data from different scanners and centers by overseeing phantom studies, assessing quality, and correcting problems with heterogeneity. While LONI has primarily used Pipeline for human data, it has also been used to study neural networks of the mouse neocortex. The CWOW would further expand these capabilities, providing robust workflow pipelines for both human and animal-model data.
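The metadata-matching idea mentioned above (string similarity plus value overlap) can be sketched as follows. This is an illustrative example only, not the Pipeline's implementation: the function names, the equal weighting of the two signals, and the 0.6 decision threshold are hypothetical choices. String similarity uses `difflib.SequenceMatcher` from the Python standard library, and value overlap is measured as the Jaccard index of the distinct values seen in each field.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Similarity of two field names in [0, 1], ignoring case and separators."""
    def norm(s):
        return "".join(ch for ch in s.lower() if ch.isalnum())
    return SequenceMatcher(None, norm(a), norm(b)).ratio()


def value_overlap(xs, ys):
    """Jaccard overlap of the distinct values observed in two fields."""
    sx, sy = set(xs), set(ys)
    if not sx and not sy:
        return 0.0
    return len(sx & sy) / len(sx | sy)


def same_field_score(name_a, values_a, name_b, values_b, w_name=0.5):
    """Weighted mix of both signals; w_name is an assumed, tunable weight."""
    return (w_name * name_similarity(name_a, name_b)
            + (1.0 - w_name) * value_overlap(values_a, values_b))


def predict_same_field(name_a, values_a, name_b, values_b, threshold=0.6):
    """Predict that two contributor fields are the same (assumed threshold)."""
    return same_field_score(name_a, values_a, name_b, values_b) >= threshold


likely_same = predict_same_field(
    "seizure_type", ["focal", "generalized"],
    "SeizureType", ["focal", "generalized", "unknown"],
)
likely_different = predict_same_field("age", [34, 51], "sex", ["M", "F"])
```

In practice the weight and threshold would need to be tuned against fields whose correspondence is already known, and borderline pairs could be routed to a curator rather than merged automatically.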
Iterative improvement using novel analytical tools: The sheer quantity of data and the noise inherent in the data necessitate the development of novel analytical tools that incorporate the most recently developed mathematical and statistical methods to discover previously undetected biomarkers. Dr. Bragin et al. recently discovered a novel biomarker, repetitive high-frequency oscillations and spikes (rHFOSs). Dr. Gotman, a consultant on this project, has established tools to study the relationship between spikes and HFOs and has shown their links to epileptogenesis. Dr. Duncan has developed sophisticated mathematical methods to analyze animal and human data, both separately and for trans-species comparisons.
We are keenly aware of multiple conceptual and technical issues regarding data analyses with biomarkers, including within-subject correlation, multiplicity, multiple clinical endpoints, and selection bias. Using multiple statistical approaches will allow us to address these concerns in full.
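To make the multiplicity concern concrete, the sketch below applies the Benjamini-Hochberg step-up procedure, one widely used way to control the false discovery rate when many candidate biomarkers are tested at once. The proposal does not specify which correction will be used, so this choice, the function name, and the example p-values are assumptions made purely for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: find the largest rank k (1-based, in ascending order of
    p-value) with p_(k) <= alpha * k / m, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            max_k = rank

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for four candidate biomarkers.
strict = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
step_up = benjamini_hochberg([0.02, 0.03, 0.035, 0.04])
```

In the first call only the smallest p-value survives correction. In the second, all four hypotheses are rejected even though the smallest p-value misses its own threshold, because the step-up rule is driven by the largest rank that passes. This correction addresses multiplicity only; within-subject correlation and selection bias require separate methods.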
Standardized sample collection, shipping, and biobank storage protocols: The project will define methods for harvesting, freezing, and storing tissue and other biosamples (e.g., serum). Data and tissue stored for collaborating preclinical trials, such as TRACK TBI, ALLO, PPMI, TRACKHD, ICBM, AIBL, ACE, ABIDE, 4RTNI, Mapp, and the Human Connectome Project, already have these protocols in place, with informatics provided by LONI. Animal samples that parallel the human samples will be stored and treated in a similar fashion so that findings from parallel human and animal studies can be compared.
A data safety monitoring board (DSMB) will advise us as we perform a rigorous multicenter preclinical antiepileptogenesis trial using a blinded, vehicle-controlled randomized study design to determine the antiepileptogenic effect of the lead compound. The results of the three projects integrated with the IAC, following the close guidance of the DSMB, will assist in planning the optimal design of a future clinical antiepileptogenesis trial for successful drugs.