Background
Today, scientists, engineers, educators, citizens, and decision-makers have unprecedented amounts and types of data available to them. Data come from many disparate sources, including scientific instruments, medical devices, telescopes, microscopes, and satellites; digital media including text, video, audio, e-mail, weblogs, Twitter feeds, image collections, click streams, and financial transactions; dynamic sensor, social, and other types of networks; scientific simulations, models, and surveys; and computational analyses of observational data. Data can be temporal, spatial, or dynamic; structured or unstructured. Information and knowledge derived from data can differ in representation, complexity, granularity, context, provenance, reliability, trustworthiness, and scope. Data can also differ in the rate at which they are generated and accessed. The phrase “big data” refers to the kinds of data that challenge existing analytical methods due to size, complexity, or rate of availability.
Managing and analyzing “big data” can require fundamentally new techniques and technologies to handle the size, complexity, or rate of availability of these data. At the same time, the advent of big data offers unprecedented opportunities for data-driven discovery and decision-making in virtually every area of human endeavor. A key example is the scientific discovery process, a cycle involving data analysis, hypothesis generation, the design and execution of new experiments, hypothesis testing, and theory refinement. Realizing the transformative potential of big data requires addressing many challenges in the management of data and knowledge, in computational methods for data analysis, and in the automation of many aspects of data-enabled discovery processes. Combinations of computational, mathematical, and statistical techniques, methodologies, and theories are needed to enable these advances.
On March 29, 2012, the White House announced the Big Data Research and Development Initiative to mobilize research and development in big data analytics toward solving many of the nation's most pressing challenges. Many agencies are involved, ranging from the National Science Foundation (NSF) and the National Institutes of Health to the Department of Defense and the Department of Energy. The signal processing and systems engineering communities can be important contributors to big data research and development, complementing computer and information science-based efforts in this direction. Big data analytics entail high-dimensional, decentralized, online, and robust statistical signal processing, as well as large, distributed, fault-tolerant, and intelligent systems engineering. There is both a need and an opportunity for the signals and systems communities to jointly pursue big data research and development.