Friday, October 6, 2017, 12:00pm to 1:00pm
100F Pierce Hall, 29 Oxford St., Cambridge

Atmospheric & Environmental Chemistry Seminar

"Six Million Deep Wells – Insights into 158 Years of Hydrocarbon Operations thru Machine Learning and Geospatial Analytics" with Kelly Rose, National Energy Technology Laboratory.

Reliable and accessible data are fundamental to scientific inquiry and the basis for all current empirical understanding of our physical, natural, and socio-cultural world. The annual volume of new data created amounts to thousands of exabytes and is forecast to continue to grow at a nearly exponential rate through 2020. As a result, more data are available to researchers, and the volume is rapidly outpacing the traditional methods used to ask and answer scientific questions. The data revolution offers the opportunity for unprecedented inquiry but also poses unprecedented challenges with regard to finding, accessing, integrating, and utilizing data to meet a range of end-user needs. These challenges are particularly acute when dealing with spatio-temporal datasets. Here we explore how spatio-temporal data science methods and custom big data algorithms have been applied to the acquisition and analysis of global oil and gas infrastructure data.

One database, developed via traditional data acquisition and integration methods, contains information for the global catalog of deep subsurface wells, including oil and gas, groundwater, research, carbon storage, and geothermal drilling records. This global dataset spans over two centuries of drilling and includes more than six million wellbore records. Spatial and temporal analyses performed using this dataset provide insights into the implications of human engineering of the subsurface worldwide, including the degree to which the subsurface has been perturbed by drilling-related activities and how exploration of and interaction with the Earth’s crust have evolved due to technology and historical trends.

A subsequent database contains a catalog of open, online oil and gas infrastructure data. This database was developed through a combination of conventional search approaches and custom machine learning and cluster computing tools that parse the web and return priority datasets. The resources found using these two approaches were integrated into a geodatabase and used to assess the quality and quantity of oil and gas infrastructure data and associated attributes worldwide. Analysis of infrastructure features and data quality for the top 46 oil and gas producing countries was performed to facilitate understanding of infrastructure distribution globally and to provide a foundation for more advanced data gap mitigation efforts and infrastructure-environmental analyses going forward.

Research, scientific, and engineering data, including spatio-temporal datasets, are increasingly available through open, online systems. The deep global wells and open oil and gas infrastructure data acquisition and spatial data studies are examples of how collaborations among scientists, engineers, computer scientists, and other data science disciplines can more efficiently find, access, and utilize these types of resources in advanced data-driven analyses.

Contact Name: Jianxiong Sheng

Harvard University
Center for the Environment

Address: 26 Oxford Street, 4th Floor, Cambridge
Phone: (617) 495-0368
