We've all probably heard the term big data, another term that can mean many things to many different people. I'm referencing here a definition of big data that is in a paper called "Data Analytics in the Era of Industry 4.0" by James Arrow and some other authors, published in the AACE Source magazine in April 2019. The graphics in this presentation that I grabbed from Booz Allen Hamilton actually appeared in that paper, and there are several other things in this presentation that I'm referencing from it, so I would recommend you search out that paper and read it as well. But again, I'm using the term, in general, to refer to the capture of all the data that may be useful to our organizations and that provides value in one way or another.

Of course, we've always been able to capture data. When I first started working for this EPC firm back in the 1980s, they had a vast library, almost half a floor of a building, dedicated to collecting project closeout reports. At that time, each of those project closeout reports consisted mainly of multiple four- or five-inch-wide binders containing a combination of mostly handwritten reports, maybe some typed reports, and usually some very poor-quality copies of supporting documents from the original projects. So obviously it was difficult and time consuming for me as an estimator to find the data I was looking for, and at that time it was very difficult to work up any analysis of that data; I was basically performing calculations manually, from whatever information I could find, using a ten-key calculator. Boy, have things changed since then.

Just a few years ago, we may have thought we were all stars for being able to create pivot tables and maybe run regression analysis using spreadsheets. Well, now we're on the verge of using artificial intelligence and machine learning capabilities that can quickly sort through years of digital project records, finding the specific projects and data that match the characteristics of our new project. I'm not trying to do that in any manual way; I simply specify the characteristics and let artificial intelligence and machine learning sort through all of it, collect just the information that is pertinent to me, and present it back. It's going to automatically normalize that data for time, location, and currency by accessing both historical and current economic indices. Again, that can support discovering the best set of metrics I might be able to use to validate my estimates.
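As a rough sketch of what that normalization step might look like in practice, here is a minimal Python example. The function name, index values, and factors are hypothetical placeholders; a real system would pull them from cost-index and exchange-rate databases rather than hard-coding them.

```python
# Minimal sketch of normalizing a historical cost to a current basis.
# All index values and factors below are hypothetical placeholders.

def normalize_cost(historical_cost, index_then, index_now,
                   location_factor_from, location_factor_to, fx_to_target):
    """Adjust a historical cost for time (escalation), location, and currency."""
    escalated = historical_cost * (index_now / index_then)               # time adjustment
    relocated = escalated * (location_factor_to / location_factor_from)  # location adjustment
    return relocated * fx_to_target                                      # currency conversion

# Example: a historical cost record adjusted to a current, local, target-currency basis.
print(round(normalize_cost(1_200_000, index_then=182.0, index_now=214.5,
                           location_factor_from=0.95, location_factor_to=1.08,
                           fx_to_target=1.0)))
```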
So our organizations need to prepare, then, for the collection of big data and for big data technologies. The technologists, those information specialists, are basically concerned with the following characteristics, often known as the five V's.

They need to be concerned with the volume of data to be collected and stored, as well as identifying where to store it. Obviously, data storage capabilities have increased exponentially, and they're still increasing while getting less expensive all the time. So we're all going to get used to terms like terabytes, petabytes, exabytes, and zettabytes, but the bottom line really is that we can now inexpensively store vast amounts of our project histories. So we need to start collecting all that data. The analysis may come later, but let's start collecting that data now so we have it available.

We're going to be concerned with the variety of data. What that really means is that collecting the data now matters more than worrying about how much there is; we're no longer constrained by the volume of data. We can start collecting both structured data and unstructured data. That unstructured data can be analyzed by the computing power of our AI and machine learning systems, which can organize it as needed in the future. So greater and greater computing power is going to make sense out of all this unstructured data. We can start collecting not only paper data and computer records; we can also start collecting video presentations of meetings and audio meeting minutes. All of that data will eventually be able to be handled by these new technologies.
We also need to be concerned with the velocity of data, the speed at which it's created and captured. Nowadays we can collect a lot of data in real time, as opposed to just at set intervals, which has been common in the past. So now I can track the actual hours that construction equipment is being utilized, on a minute-by-minute basis, for a current project. I might then be able to refer to that information and use it to help support estimating construction equipment costs as part of the overall construction indirect costs for a new project.

I do need to be concerned with the veracity, the quality, of the data collected. But again, I'm going to be able to rely on artificial intelligence and machine learning algorithms that can sort through this data and discover and identify where I might have gaps in or errors in the data, and then I can deal with those accordingly.

And finally, I need to deal with the variability: considerations of the consistency of the data collected and the potential range of values within all these different types of data. But this is largely what the technologists are going to worry about. I need to understand these issues, but we're going to rely on the information technologists to solve them.
I've mentioned some of the terms already, but there are a lot of technologies related to big data: obviously computing power, data storage, and software, such as applications incorporating artificial intelligence and machine learning concepts. The Internet of Things relates to the connectivity of all of these things. And yes, the Internet of Things is a real term. It's what makes your smart home work: it turns on the light when you enter a room, and it may automatically start your car as you approach the door. It's a technology supported by things such as near-field communication devices, radio frequency identification (RFID) tags, sensors, and the like.

These are the things that will let me determine how much, where, and how often construction equipment is utilized on my project, because the equipment will have a near-field communication device attached to it, communicating back to a database. It's tracking idle time, when it's used, and where it's used. And again, all of that information may be useful to me as an estimator to help prepare my estimates.
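As a simple sketch of how that kind of telemetry might be rolled up for an estimator, assume the tracking database can export timestamped status records; the record format and field names here are hypothetical.

```python
# Sketch: roll up minute-level equipment status records into working vs. idle hours.
# The (equipment_id, minutes, status) format is a hypothetical database export.
from collections import defaultdict

records = [
    ("EXC-101", 55, "working"), ("EXC-101", 35, "idle"),
    ("CRN-204", 30, "working"), ("CRN-204", 90, "idle"),
    ("EXC-101", 60, "working"),
]

hours = defaultdict(lambda: {"working": 0.0, "idle": 0.0})
for equipment_id, minutes, status in records:
    hours[equipment_id][status] += minutes / 60.0

for equipment_id, h in hours.items():
    total = h["working"] + h["idle"]
    print(f"{equipment_id}: {h['working']:.1f} h working, "
          f"{h['idle']:.1f} h idle, {h['working'] / total:.0%} utilization")
```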
It's not unrealistic to consider that eventually all the major components installed in a facility will have a radio frequency identification device attached to them, identifying when each was purchased, when it was installed, and when it was last maintained. Every worker at the construction site may have a badge or an ID also containing an RFID device, reporting on their daily activities, the tools they're using, and the materials they're requesting. We're going to have access to all of this real-time data for similar projects, which we as estimators can then use to better determine the best hours to use in our estimate to install that lineal meter of pipe or that cubic meter of concrete foundation. I'll have better information to understand the true productivity adjustment when piping is installed on the fifth level of a structure as opposed to at ground level. We're going to be in a position to vastly improve our estimating capabilities to cost, price, and validate our estimates.
In this data-driven future, then, we're going to be able to collect, in a digital environment, much more data and information than we ever could with paper. Again, what do I do with it? Well, we're going to be able to enhance our data collection through data analysis. This amount of data will be overwhelming, and it's data mining, artificial intelligence, data analytics, and the development of predictive cost models that are going to turn all of that collected data into useful information, to help us improve our estimating capabilities and thereby support better decision making by our stakeholders. Artificial intelligence and machine learning capabilities are going to find the data for us, sort through it, normalize it, see that it meets quality expectations, and provide the analytics to get us the answers that we as estimators are looking for. I'll be able to access every purchase order for every project across the organization, on any continent, to develop a model for estimating the cost of a heat exchanger, or quickly access the online vendor catalogs of 50 different vendors to obtain a current price for a six-inch gate valve.
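As an illustration of one kind of model that could be fitted from those purchase orders, here is a sketch that fits a simple cost-capacity power law (cost = a * area^b) to a handful of hypothetical heat exchanger prices. The data points are invented, and a production model would use far more records and more explanatory variables.

```python
# Sketch: fit a cost-capacity power law, cost = a * area**b, to purchase-order data.
# The purchase-order data points below are hypothetical, already normalized.
import numpy as np

areas_m2 = np.array([50, 100, 200, 400, 800])                   # heat transfer area
costs = np.array([42_000, 64_000, 98_000, 150_000, 230_000])    # purchase cost

# Linear regression in log space gives the exponent b and log(a).
b, log_a = np.polyfit(np.log(areas_m2), np.log(costs), 1)
a = np.exp(log_a)

def predict_cost(area_m2):
    return a * area_m2 ** b

print(f"cost model: a = {a:,.0f}, exponent b = {b:.2f}")
print(f"Estimated cost for a 300 m2 exchanger: {predict_cost(300):,.0f}")
```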
I'll be able to access government and other economic indices to generate the best escalation forecast for every category of commodity in my estimate, based on macroeconomic models of raw material pricing, expected labor conditions, and other market conditions. So this AI engine, this artificial intelligence engine, is going to be able to research detailed project records to more accurately determine the adjustments that affect labor productivity for height, congestion, distance of the work from the laydown areas, and a myriad of other considerations.
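As a simple illustration of how such adjustments get applied once an analytics engine has derived them, here is a sketch assuming multiplicative productivity factors; the base rate and factor values are hypothetical placeholders.

```python
# Sketch: apply multiplicative productivity adjustment factors to a base unit rate.
# The base rate and factor values are hypothetical placeholders for values an
# analytics engine might derive from detailed project records.
base_unit_hours = 1.20   # work-hours per lineal meter of pipe at grade, uncongested

adjustments = {
    "elevation: fifth level": 1.15,   # hoisting, access, travel time
    "congested work area": 1.10,
    "long haul from laydown": 1.05,
}

adjusted = base_unit_hours
for reason, factor in adjustments.items():
    adjusted *= factor

print(f"Adjusted unit rate: {adjusted:.2f} work-hours per lineal meter")
```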
We're going to have the data available; we need to be able to describe what we want out of the data, and then our computing power is going to be able to find, sort, and present that information back to us. Fuzzy logic, artificial neural networks, regression analysis, case-based reasoning, random forests, evolutionary computing, machine learning: these are all concepts and technologies that are going to enable the advanced analytics we can apply in our estimating processes.
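To make one of those techniques concrete, here is a toy sketch of a random forest cost model using scikit-learn as one possible library. The feature set and the handful of training records are invented purely for illustration; a real model would be trained on a large, normalized project history.

```python
# Toy sketch: a random forest model predicting installed cost from project characteristics.
# The features and training records are invented for illustration only.
from sklearn.ensemble import RandomForestRegressor

# Features: [capacity (units), complexity score 1-5, remote-site flag 0/1]
X = [
    [100, 2, 0], [250, 3, 0], [400, 3, 1],
    [150, 4, 1], [500, 5, 1], [300, 2, 0],
]
y = [1.8e6, 3.9e6, 6.5e6, 3.2e6, 9.8e6, 4.4e6]  # installed cost

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, y)

# Predict the cost of a hypothetical new project.
new_project = [[350, 3, 1]]
print(f"Predicted installed cost: {model.predict(new_project)[0]:,.0f}")
```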
But again, as I've mentioned, we don't have to be experts in any of those technologies. We're going to rely on the information management specialists to implement those solutions for us. What we do need to understand as estimators is the data that is out there and available, and which specific data to combine to provide the information and the metrics we're interested in. Then the technology will work for us to get the answers we're looking for. We're going to be in a much better position to compare similar current and historical projects based on cost, schedule, and other performance metrics to support estimate validation, and we're going to be able to access normalized historical data far more readily.
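As a small sketch of what that kind of validation check might look like, here is an example comparing a new estimate's unit metric against a normalized historical benchmark; the historical values and the tolerance are hypothetical.

```python
# Sketch: compare a new estimate's unit metric against normalized historical benchmarks.
# The historical values and the 15% tolerance below are hypothetical placeholders.
import statistics

historical = [4150, 4480, 4620, 4900, 5100, 5350]  # normalized cost per tonne, past projects
estimate = 6200                                     # the new estimate's cost per tonne

median = statistics.median(historical)
deviation = (estimate - median) / median

print(f"Historical median: {median:,.0f} per tonne")
print(f"Estimate deviates {deviation:+.0%} from the median")
if abs(deviation) > 0.15:
    print("Outside tolerance: investigate scope, market, or productivity differences")
```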
These technologies are not going to replace us as estimators. They're going to augment our capabilities so we can spend more time on analysis rather than on the tedious work of scope definition, costing, and pricing. So these advanced analytics are going to support improved information and validation, and the result is that we can provide better information to support the decision making that relies on our cost estimates. So again, we're going to be able to enhance our estimates through these types of technologies and these improved predictive metrics.