26 January 2021
Los Angeles, CA

 

Data Den is a thought-leadership alcove within the world of Beyond Limits where we provide an opportunity to dive into the minds of our gifted data scientists to get a better understanding of their domain. Keep reading to catch a glimpse of their essential expertise; without it, artificial intelligence wouldn’t be possible.
 
 
 
 
 
Let’s talk a little about a project you worked on recently. What were the biggest challenges you faced as a data scientist working on that project?
 
I’ve been working on a well management project for an oil and gas supermajor for a while now, and it involves the kind of tech that technical people get really excited about: challenges at the interface of science and machine learning. The real challenges, though, emerge when you try to put all the relevant pieces together into a coherent system that intuitively executes a complicated workflow.
 
All data scientists have their own specific tasks, and it can become really easy to fall prey to the minute details of any one of them. At the end of the day, though, everyone is creating a lot of moving pieces that need to work together. It’s always fun to dig into the tech; just don’t get bogged down in it, because the process isn’t only about data science. Carrying out an effective evaluation also means working with databases, the software team, and perhaps internet-based APIs if the project calls for them. There are a lot of moving pieces that all need to be in sync and talking with each other for the system to truly come together.
 
 
Steering into the topic of the data itself. How do you handle missing data? What techniques do you recommend? Is there a specific project, as an example, that comes to mind?
 
There are a lot of different methods for dealing with missing data, ranging from simple statistical and interpolation approaches to advanced neural-network techniques. With machine learning, you can use neural networks or autoencoders to learn behavior from a group of signals and infill the missing data as much as possible. You can also use other analog data you do have access to in order to predict what the missing parts might look like. Those are fairly commonly implemented techniques, but everything depends on the particular application and on determining the appropriate method for the project at hand.
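To make the two ends of that spectrum concrete, here is a minimal sketch in Python. It is not drawn from the project discussed here; the sensor data is synthetic, and the variable names are hypothetical. It contrasts a simple interpolation fill with a model-based fill that predicts the gaps from a correlated analog signal, the same idea described above.

```python
import numpy as np
import pandas as pd

# Hypothetical example: a sensor series with gaps, plus a complete,
# correlated "analog" sensor we can lean on for a model-based fill.
rng = np.random.default_rng(0)
analog = pd.Series(np.linspace(0.0, 10.0, 50))
target = 2.0 * analog + 1.0 + rng.normal(0.0, 0.05, 50)
target.iloc[[5, 6, 20, 35]] = np.nan  # simulate missing measurements

# Simple end of the spectrum: linear interpolation across the gaps.
interp_fill = target.interpolate(method="linear")

# Model-based fill: fit a model on the observed pairs, then predict
# the missing values from the analog signal.
mask = target.notna()
slope, intercept = np.polyfit(analog[mask], target[mask], 1)
model_fill = target.copy()
model_fill[~mask] = slope * analog[~mask] + intercept
```

In practice the "model" would be something richer than a line fit (an autoencoder, say), but the workflow is the same: learn behavior from what you do observe, then infill what you don’t.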
 
 
 
 
One thing that’s different about the projects I’ve worked on (while we do use those methods a lot) is that, at Beyond Limits, we tend to handle large gaps using knowledge-based data. In the attempt to supplement missing data, we tease out models for certain behaviors and feed those to a knowledge base that then interprets them. This comes in very handy when I don’t have a direct measurement because there aren’t enough measurement locations in a system, or I don’t know the system state well enough to build a model. For example, an expert with twenty years of experience can tell me, based on the signals they do have, exactly what is happening underneath. We then embed that expertise in a knowledge base that interprets other signals in the context of that knowledge to provide a more meaningful interpretation.
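A toy illustration of that idea, with the obvious caveat that this is a hypothetical sketch and not Beyond Limits’ actual knowledge-base format: expert heuristics are captured as explicit rules that interpret indirect signals when no direct measurement exists. The signal names and thresholds below are invented placeholders.

```python
def interpret(signals: dict) -> str:
    """Map indirect sensor readings to a domain interpretation.

    The rules are illustrative stand-ins for expert knowledge,
    not real field heuristics.
    """
    # Rule captured from an expert: rising pressure with falling flow
    # suggests a restriction downstream.
    if signals["pressure_trend"] == "rising" and signals["flow_trend"] == "falling":
        return "possible downstream blockage"
    # Rule: both trends falling suggests declining reservoir support.
    if signals["pressure_trend"] == "falling" and signals["flow_trend"] == "falling":
        return "declining reservoir pressure"
    return "normal operation"

print(interpret({"pressure_trend": "rising", "flow_trend": "falling"}))
# prints "possible downstream blockage"
```

The point is that the interpretation comes from encoded expertise rather than from a trained model, which is exactly what makes it useful where the data is too sparse to train one.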
 
 
Can you talk a little about codifying domain expert knowledge into an AI system?
 
There are several different avenues we have taken in the past. The most commonly traversed path is probably the one we took to create one of our geological modeling solutions. Essentially, in most situations, we gather an expert’s (or a set of experts’) expertise, along with common industry knowledge captured from research papers, textbooks, and the like. If the gathered information is in a clean state, we can ingest it fairly easily, but that is a rare occurrence. In most cases, we first want to understand the problem and how the existing knowledge relates to it. Once we grasp that, we can codify the information into our proprietary IP format.
 
 
 
 
For example, with the aforementioned solution, the idea is to model a geological environment from a number of measurements. We use high-resolution, one-dimensional measurements of the asset in question along with large-scale, low-resolution, three-dimensional seismic measurements that portray the inside of the earth. Then we correlate these measurements to hard data samples that may exist pointwise throughout the three-dimensional space. However, we may not have sufficient data to understand the kind of system we are dealing with or to determine what the actual environment looks like.
 
To complete the analysis, we supplement this data with scientific knowledge, such as a particular physical process that has taken place over a long period of time; we can look at systems and know that those processes occurred in a certain order, as well as how they relate to one another. How we interpret such indirect measurements differs with the type of system we are looking at, but the principles of how we incorporate knowledge to supplement the interpretation remain the same. We build that knowledge and those relationships into our knowledge base so that interpretations of the indirect measurements stay consistent with the actual physical processes and theories about the system. What we are left with, essentially, is the ability to build in the true physical relationships of a geological system while using machine learning to automatically understand and build a 3D representation of that system. We’re not just training neural networks; we’re using expert knowledge and common principles of science to do so.
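The general pattern described above, a data-driven estimate regularized by a physical constraint, can be sketched very simply. This is a hypothetical illustration, not the actual geological modeling solution: sparse "hard data" samples along a well are interpolated in depth, and a stand-in physical constraint (a property that cannot decrease with depth, mimicking an ordering imposed by the depositional process) is then enforced on the noisy estimate.

```python
import numpy as np

# Hypothetical hard data: a property measured at a few depths along a well.
rng = np.random.default_rng(1)
depths = np.arange(10)                    # depth grid to model
well_depths = np.array([1, 4, 8])         # depths with hard data
well_values = np.array([0.2, 0.5, 0.9])   # measured property there

# Data-driven step: interpolate between the sparse measurements.
estimate = np.interp(depths, well_depths, well_values)

# Mimic an imperfect learned prediction with noise, then apply the
# physical constraint: project back onto the admissible (monotone) set.
noisy = estimate + rng.normal(0.0, 0.05, depths.size)
physical = np.maximum.accumulate(noisy)   # enforce non-decreasing with depth
```

The real system works in three dimensions with far richer physics, but the design choice is the same: the machine-learned estimate is never allowed to contradict what the encoded science says is physically possible.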
 
 
 
 
 
 
Dr. Michael Krause is Senior Manager of AI Solutions at Beyond Limits, a pioneering artificial intelligence engineering company creating advanced software solutions that go beyond conventional AI. Michael specializes in subsurface machine learning with experience spanning major initiatives at supermajors to next-generation digital transformations at small independents. Prior to joining Beyond Limits, Michael was Director of Analytics at Tiandi Energy in Beijing, China, and later at Energective in Houston, Texas. Michael holds a Ph.D. in Energy Resources Engineering from Stanford University.