The Challenges of Managing Big Data for Resource Estimation

As we discussed in Article #1, in mining, billion-dollar decisions are typically based on the physical analysis of a very small amount of material, while the bulk of the material to be mined, both overburden/waste and the mineralised orebody itself, remains unexamined. To help improve the quality of resource estimates, especially as economic orebodies become increasingly complex, geologists need more data from other sources.

But managing big data can be challenging.

Storage

The process of acquiring, validating and analysing the base data for resource estimation is time-consuming and expensive, which means mining companies must consider the value of the information and knowledge derived from that data when determining how they will store it.

They must also decide how long to store it: it may take years or even decades before a company makes the decision to mine, and the mining operation itself can then run for decades more, so the lifecycle of the data is correspondingly long.

Even mining data that is decades old can remain valid and useful for analysis/modelling if appropriately stored and, most importantly, still available. Currently, however, geologists often store the initial data they collect during the exploration phase on a laptop, which both limits access to this data by other project teams and increases the risk that the data, and its potential value, could be lost at any time if a geologist changes roles or the device is retired.

Multiple sources

The sheer range of data available for geological modelling and resource estimation also presents its own challenges, both for storage and for easy retrieval and use by geologists and other stakeholders.

The base data for geological modelling and resource estimation today comes in a wide variety of types — from lab results supplied directly by Laboratory Information Management Systems (LIMS) to the description of the diamond drill core from which the physical samples were extracted — and can be classified as either hard or soft. Hard data can be used directly in the estimation process, while soft data can assist with identifying trends, such as increasing levels of deleterious elements or areas with poor processing properties, and provide additional insights into correlations between all the data.

Typically, data acquired via direct measurement or analysis using robust methodologies, such as rigorous quality assurance/quality control (QA/QC) processes, is considered hard data. Soft data, on the other hand, is inferred from other measurements, such as:

  • downhole geophysical surveys to obtain an indication of density (often used to automatically select the top/bottom lithological contacts), or
  • multi-/hyper-spectral core scanning data used to estimate mineralogical composition rather than directly determining it in a laboratory using a physical sample.

Soft data can also include indications of the quality or amount of mineralisation from hand-held/portable X-ray fluorescence (XRF) analysers, while relatively new sources of data, such as drill penetration rates captured by data historians on drilling rigs or readings from real-time shovel/belt scanners, can also be informative.
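As a minimal sketch of how the hard/soft distinction might be carried into the resource database, the Python snippet below tags each record with the way it was acquired and derives a data class from that. The field names, acquisition categories and values are illustrative assumptions only, not the schema of any particular package.

```python
# Hypothetical sketch: deriving a hard/soft classification from how each
# measurement was acquired. Field names and categories are illustrative only.
import pandas as pd

# "hard" = direct measurement/analysis under rigorous QA/QC;
# "soft" = inferred from another measurement (geophysics, core scans, pXRF, ...)
ACQUISITION_CLASS = {
    "lab_assay_qaqc": "hard",
    "downhole_geophysics": "soft",
    "hyperspectral_core_scan": "soft",
    "portable_xrf": "soft",
    "drill_penetration_rate": "soft",
}

samples = pd.DataFrame({
    "hole_id": ["DH001", "DH001", "DH002", "DH002"],
    "from_m": [10.0, 12.0, 5.0, 7.5],
    "to_m": [12.0, 14.0, 7.5, 10.0],
    "fe_pct": [58.2, 61.7, 47.3, 55.1],
    "acquisition_method": [
        "lab_assay_qaqc", "lab_assay_qaqc",
        "portable_xrf", "downhole_geophysics",
    ],
})

samples["data_class"] = samples["acquisition_method"].map(ACQUISITION_CLASS)

# Only hard data feeds the estimate directly; soft data informs trends/domaining.
estimation_inputs = samples[samples["data_class"] == "hard"]
trend_inputs = samples[samples["data_class"] == "soft"]
print(estimation_inputs)
```

Keeping the classification as an explicit column means soft data can still feed trend analysis and domaining without ever being mixed into the estimation inputs.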

Then there is the metadata — additional details such as the time of day the data was collected and the person, company or piece of equipment that collected it — which is vital for confirming whether the data is in its original form, has been manipulated or adjusted, or is a calculated average.
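To make this concrete, one possible shape for such a metadata record is sketched below; the fields shown (collector, equipment, processing state and so on) are assumptions for illustration rather than a prescribed standard.

```python
# Hypothetical metadata record attached to every measurement so that its
# provenance (who/what/when) and processing state remain queryable.
from dataclasses import dataclass
from datetime import datetime
from enum import Enum

class ProcessingState(Enum):
    ORIGINAL = "original"            # as delivered by the lab/instrument
    ADJUSTED = "adjusted"            # manipulated or corrected after collection
    CALCULATED_AVERAGE = "average"   # composited/averaged from other records

@dataclass(frozen=True)
class MeasurementMetadata:
    collected_at: datetime           # time of day / date the data was collected
    collected_by: str                # person or company responsible
    equipment_id: str                # instrument or rig that produced the value
    processing_state: ProcessingState
    source_record_ids: tuple = ()    # parent records, if adjusted or averaged

meta = MeasurementMetadata(
    collected_at=datetime(2018, 6, 4, 9, 30),
    collected_by="Acme Assay Laboratories",
    equipment_id="ICP-MS-02",
    processing_state=ProcessingState.ORIGINAL,
)
```

Recording the processing state explicitly is what later allows a reviewer to tell an original lab value from an adjusted or averaged one.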

The result is a database made up of a diverse collection ranging from text files, Excel spreadsheets, resource models in proprietary binary file formats and data stored in geoscientific information management software packages to core scans — which alone can run to terabytes — often collected at different times and by different people/equipment.

Data lifecycle

In addition, large amounts of data from multiple sources acquired over many years add challenges to both data domaining (dividing the rock mass into volumes with similar characteristics that are distinct from each other) and the Mineral Resource classification process. The geologist must consider the lifecycle of the data used in resource classification, and find a way to accommodate and flag drilling results and other data with lower confidence (or which failed validation) without losing portions of that dataset, such as lithological/structural interpretations, that could still be used for resource modelling purposes.

Also, as more data is collected, the geologist may deem historical data with no or inappropriate QA/QC less reliable for use in mineral resource estimation, and must have a way to incorporate this finding into the database to ensure only the highest quality data is used. For example, if newer, more accurate collar/downhole surveys or laboratory analysis methods highlight weaknesses in previously collected data, that new data could make the use of historical data (such as lithological contact positions or assay information) inappropriate, depending on how the data is used in the resource definition and estimation process.
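A minimal sketch of how such flags might work in practice is shown below: records that failed QA/QC or carry low confidence are excluded from grade estimation but retained for geological modelling. The column names, statuses and selection rules are hypothetical and would differ in any real system.

```python
# Hypothetical sketch: keep the whole dataset, but select subsets by intended use.
# Records failing QA/QC are excluded from grade estimation, yet their
# lithological/structural information can still support geological modelling.
import pandas as pd

drillholes = pd.DataFrame({
    "hole_id":     ["DH001", "DH002", "DH003", "DH004"],
    "campaign":    [1998, 1998, 2015, 2021],
    "qaqc_status": ["none", "failed", "passed", "passed"],
    "confidence":  ["low", "low", "high", "high"],
    "lithology_logged": [True, True, True, True],
})

def usable_for(df: pd.DataFrame, purpose: str) -> pd.DataFrame:
    """Return the subset of records appropriate for a given purpose."""
    if purpose == "grade_estimation":
        # Only validated, QA/QC-passed data feeds the resource estimate.
        return df[(df["qaqc_status"] == "passed") & (df["confidence"] == "high")]
    if purpose == "geological_modelling":
        # Lithological interpretations remain useful even if assays failed QA/QC.
        return df[df["lithology_logged"]]
    raise ValueError(f"unknown purpose: {purpose}")

print(usable_for(drillholes, "grade_estimation")["hole_id"].tolist())    # ['DH003', 'DH004']
print(usable_for(drillholes, "geological_modelling")["hole_id"].tolist())
```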

The same might happen with biased historical data: bias usually only becomes apparent, and downgrades confidence in the data, after a considerable period of time. It is therefore crucial to maintain all metadata so that the data does not have to be revalidated before each resource update cycle.

Requirements for good database management

Properly managing a resource estimation database that draws on such a wide array of big data helps ensure that technical and financial risks — from borehole core to sale of the commodity — are identified and addressed over the life of the mine and well into closure. To do this, geologists need to be able to:

  • Discriminate between robust (hard) data, secondary (soft) data and the metadata that also needs to be included in the resource database, and record their reasons for considering the data suitable for estimation or not.
  • Maintain the integrity of the resource data to ensure that the level of confidence (low to high) in the data can be used to appropriately:
    • classify the confidence level of the resource estimates, and
    • determine the risk profile of the decisions based on those estimates.
  • Control access to the database to:
    • ensure that only validated and approved data (as opposed to raw data on which the QA/QC has not been verified) is used in the resource estimation process, and
    • identify where other data has been confirmed as suitable only for modelling the geology (such as the extent of the mineralised lithologies) as opposed to estimating the mineral content itself.
  • Provide proof of a strong chain of custody for all data that will confirm, for example, that assay data has not been manipulated. This proof will increase confidence in the estimates during external independent reviews, and illustrate that the database is being well governed — a vital consideration for financing.
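One way to support the chain-of-custody point above is to fingerprint each assay record at the moment it is approved, so that any later manipulation of the stored values can be detected. The sketch below uses a SHA-256 hash over a canonical form of the record; this is an illustrative assumption, not how any particular resource database implements custody tracking.

```python
# Hypothetical sketch: fingerprint approved assay records so that any later
# change to the stored values can be detected during audits or external reviews.
import hashlib
import json

def record_fingerprint(record: dict) -> str:
    """Deterministic SHA-256 hash of a record's approved contents."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

assay = {"hole_id": "DH003", "from_m": 42.0, "to_m": 44.0, "au_gpt": 1.85}
stored_fingerprint = record_fingerprint(assay)   # stored at approval time

# Later, an auditor recomputes the hash; a mismatch means the data changed.
assert record_fingerprint(assay) == stored_fingerprint

tampered = dict(assay, au_gpt=2.85)
print(record_fingerprint(tampered) == stored_fingerprint)   # False
```

During an external review, recomputing the fingerprints and comparing them with the values stored at approval time gives auditable evidence that the approved data has not changed.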

What’s next

The two articles that follow in this technical series discuss in more detail methods for responding to the challenges of managing big data, including:

  • How to use big data in resource estimation, and
  • How to choose the right technology platform for integrating big data

A related technical series, called How to Use Machine Learning in Resource Estimation, will follow after this one. Topics in this second series include what machine learning is and how it works, along with how it can be used in automatic data domaining to provide geologists with the most suitable sub-datasets to use in estimating distinct volumes of the orebody.

Author


Michael Mattera is a Mining Industry Process Consultant at Dassault Systèmes GEOVIA with 30 years of experience in the industry. Michael holds an MSc (Engineering) in Mineral Economics from the University of the Witwatersrand. He has experience across a wide range of commodities and geographies, leading to a broad understanding of multiple mining disciplines and associated technical systems. This experience includes resource modelling and estimation, multi-disciplinary project reviews focusing on Mineral Resources (PFS to post-investment stages), public reporting of Mineral Resources and Ore Reserves (R&R) in multiple jurisdictions, associated governance and assurance processes, and the development of multiple R&R reporting systems.
