Applications and experiments across all areas of science are becoming increasingly complex and more demanding in their computational and data requirements. Some applications generate data volumes reaching hundreds of terabytes and even petabytes. Sharing, disseminating, and analyzing these petascale data sets becomes a major challenge, especially when distributed resources are used. The Data Intensive Distributed Computing Laboratory (DIDCLab) at SUNY Buffalo (UB) aims to develop cutting-edge research tools and software to mitigate the data handling bottleneck in distributed environments.
The research tools and software developed at DIDCLab will not only impact computer science research by changing the way data-intensive computing is performed, but will also dramatically change how domain scientists conduct their research by facilitating rapid analysis and sharing of large-scale data and results. Future applications will be able to rely on these tools to manage data movement and storage reliably, efficiently, and transparently. They will also help scientists envision entirely new scenarios in which simulations are closely coupled with large volumes of observational and experimental data, transforming data-intensive science.