Fifty thousand dollars is up for grabs for the winners of the Materials Science and Engineering Data Challenge, launched earlier this year by the US Air Force Research Laboratory (AFRL), the National Institute of Standards and Technology (NIST), and the National Science Foundation. The goal: to leverage existing digital data to incentivize advancements in materials science and engineering knowledge.
The Challenge calls upon participants, or “solvers,” to submit data analysis approaches that significantly accelerate either the discovery of new materials to meet an application need, or the development of new models describing processing–structure–property relationships. Data used in the Challenge must be publicly accessible and have sufficient supporting information, or metadata, to enable reuse by researchers who did not originally generate it.
The US federal government’s Challenge.gov platform, which has facilitated hundreds of federal agency competitions since it was established in 2010, is hosting the Challenge. It is the first such challenge that focuses broadly on materials science, and also the first that expressly supports the Materials Genome Initiative (MGI), a multiagency effort now in its fifth year that intends to significantly reduce the time and cost to bring new materials to market by more closely linking experimental tools, computational tools, and digital data.
The Challenge emphasizes two aspects of digital data—data access and the application of computer science techniques to analyze materials data. In addition, the Challenge criteria place a priority on gaining new insights through data reuse—an approach more common in fields like biology and astronomy.
“Something like two-thirds of the research done on Hubble data wasn’t done by the original PIs [principal investigators]. It was done by other people going back into the data,” says Jim Warren, Director of the Materials Genome Program at NIST. “If [the participants] use their own data, it’s a small change from the typical research modality. We’re looking for someone who’s been much more creative in assembling information from at least one source outside their own work.”
Chuck Ward, Lead for Integrated Computational Materials Science and Engineering at AFRL, sees an increased return on investment from analyzing reused data. “For example, numerous tensile experiments have been performed on single-crystal materials, yet typically only one point off the tensile curve gets published. Now for someone to study another portion of the curve, they have to reproduce all these tests to inform the development of their new model. . . . It’s terribly inefficient.” Ward is quick to caution that the Challenge does not address everything related to data and MGI. “We can also get value from data reuse through secondary hypothesis testing or corroboration of your own experimental results. Another area would be model validation.”
The Challenge’s emphasis on digital data and access coincides with a broader movement toward open data. In response to a 2013 White House memo requiring public access to the results of federally funded scientific research, US science funding agencies have been updating their data management policies for grantees.
In the spirit of open data, Citrine Informatics, a materials data analytics platform, is providing Challenge solvers with access to its database containing almost 3 million materials-property pairs aggregated from a variety of sources. “It became clear that there aren’t that many publicly available data sources from which teams could draw,” says Greg Mulholland, one of Citrine’s founders. “We saw this as an opportunity to be a provider of that programmatic, structured data.”
Mulholland adds, “People have a philosophical sense that data should be open, but very little open data in the materials community has yielded massive discoveries yet. . . . The biggest possible success of the Challenge would be for the community to understand why it’s important for these data to be made available.”
Other organizations are also providing resources to participants. Elsevier’s Materials Today has partnered with HPCC Systems, a high-performance computing platform, to provide training and computation time. Springer will give solvers free, limited-time access to its SpringerMaterials database of over 3000 physical and chemical properties of more than 250,000 materials and chemical systems. The Materials Accelerator Network, a partnership among Georgia Institute of Technology, the University of Michigan, and the University of Wisconsin, has assembled a comprehensive list of available resources on its website.
For the federal agencies, the challenge mechanism offers advantages over a traditional grant solicitation. Both Warren and Ward expect the concept of a challenge to garner excitement and better awareness of the value of digital data in the materials community. “We’re putting $50,000 in and . . . we’ll leverage far more effort than with a single $100,000 grant,” Ward says.
He also believes the mechanism will be effective because of its broad scope. “We don’t specify you must solve this problem. . . . We’re giving folks the opportunity to demonstrate possibilities that we may not be able to put down on paper in a normal solicitation, and I’m hopeful we’ll see new insights or discoveries, or even new ways of thinking about how you might use data that we haven’t thought of yet.”
The Challenge is open to everyone, including participants outside the United States. Solvers should submit their entries, in the form of a written research report, by March 31, 2016. Along with receiving monetary prizes, winners will be invited to present their work at the Materials Science & Technology 2016 Conference in Salt Lake City, Utah.