Pistoia Alliance SEED project to unlock valuable data trapped in electronic notebooks


The Pistoia Alliance, a global, not-for-profit alliance that advocates for greater collaboration in life sciences R&D, today announced the second phase of its Semantic Enrichment of ELN Data (SEED) project. The project addresses the challenge facing R&D from the vast volumes of captured experimental data locked in Electronic Lab Notebooks (ELNs). These unusable and unsearchable data sets are a significant barrier to digital transformation, resulting in duplicated experiments and time spent tracking down and wrangling data. Phase 2 of the SEED project will build on the success of phase 1 by developing relationship mapping and a prototype of an agnostic solution for semantic enrichment for use by ELN vendors. This solution will make ELN data searchable and reusable by semantically enriching free text in ELNs with metadata for every relevant term, unlocking its value for future analysis and aligning with the FAIR principles.

“Currently, pharmaceutical companies can only use a limited amount of valuable data held in ELNs, even though semantic technology is available. This situation has to change, and the Pistoia Alliance is in the unique position of being able to bring together multiple large pharma companies pre-competitively to address the challenge,” commented Gabrielle Whittick, Project Leader and Consultant, Pistoia Alliance. “This kind of cross-industry collaboration is made possible under the umbrella of the Pistoia Alliance as every member can contribute with their experience and knowledge gained and this shared input drives improvements that the whole community can benefit from. The SEED project will annotate and enrich the text to make it searchable – offering the possibility of uncovering new insights that can accelerate drug discovery and lead to new innovations. We are now calling for more companies to get involved in and provide funding for the second phase of the project so we can scale up our work.”

Phase 1 of the project developed new standard assay ontologies for ADME, PD and drug safety, which have now been added to BioAssay Ontology (BAO), an open-source database of common assay metadata terms and definitions, and are freely available to the life science community. The project contributors include Pfizer, AstraZeneca, Bristol Myers Squibb, SciBite, Bayer, Biogen, Southampton University, GSK, CDD, Elsevier, Linguamatics, Merck, Sanofi, and Takeda. Pistoia Alliance member Sanofi is already realizing value from the project and plans to align its ADME assay metadata with the new ontology classes added by the SEED project to BAO. This will make Sanofi’s assay data compliant with the FAIR principles.

“The driving motivation behind the initiation of the SEED project is to create a set of open standards for structuring ELN data across Pharma and the life sciences. Delivery of Pharmacokinetic-Pharmacodynamic (PK/PD) and Drug safety assay standards has been a tremendous start and of exceptional benefit across the many partners involved, as well as for those yet to join,” commented Steve Penn, SEED project champion and Medicinal Sciences Information Strategy Lead, Pfizer. “We are now looking to increase the benefit across pharma, incorporating additional data in the form of attributes, mappings and annotations to create relationships between ontology classes to help to describe and define them. The relationships formed between objects within an ontology and to other ontologies and/or standards form a framework for the creation of a graph ontology/knowledge graph, enabling a plethora of opportunities associated with the use of these data standards, both for legacy and go-forward data.”
