How Machine Data Can Predict Downtime

Frank Raczon, Senior Editor

Predicting machine-down-type faults before they happen may be the Holy Grail of fleet management.

A wealth of telematics data is available to help both off-road and on-road fleets, but what are the best ways to look at that data? What principles can guide the process to make it uniform, repeatable, and predictive?

Data science is the answer, but don’t be intimidated—a data scientist doesn’t have to be a doctor with a pocket protector brought into your organization at a high price.

In a presentation given at Trimble Dimensions 2016 in Las Vegas by Trent Lezer, VP, Vusion Data Science, and Anne Hunt, statistician with Vusion Data Science, Lezer points out that the training and background of a data scientist can vary. They may be a programmer with math skills, a mathematician with programming skills, someone formally educated in the field of data science, or, importantly, someone who learned on the job. “It’s an emerging field and all of these are valid paths into the career,” he says.

All the paths have certain intangibles in common, as well, such as the ability to synthesize technology and business; the tendency to ask good questions; and the desire to tell a story with data.

Also, there’s the ability to manage relationships with business partners and technical teams. But most of all, there’s a natural sense of curiosity to explore the unknown. It sounds a little like a job description for a fleet manager. Since managing data is already a requirement for many managers, why not refine it to realize the greatest benefits? Let’s start with some basics.

In explaining the discipline itself, Lezer begins with a basic definition: “Data is something that can be found, acquired, cleansed, processed, combined, and rearranged,” he says. Other elements include programming, which allows the creation of custom code, and algorithms to find and analyze patterns. Data science also uses the scientific method: asking questions, formulating hypotheses, designing experiments, collecting data, learning from the results, and repeating.

The statistics involved are formulated from many mathematical specialties with the purpose of finding unusual data and evaluating its importance. Data science also uses visualization, the display of data in a way that conveys meaning intuitively—and immediately. This is one of the most important elements when it comes to conveying results to upper management.

Finally, there’s domain knowledge. That’s applying your knowledge of the industry to interpret results, focus on relevant areas, and recommend actions for the fleet.

“Data science requires both art and craft from its practitioners,” Lezer says. “Data visualization aids clarity, communicates information, and calls out major findings. Mathematical models can use different methods to tune for specific goals and purposes.”

Much like implementing telematics as a whole, or technology such as machine control, it’s important to establish your goal going in.

“Don’t start analysis before clearly defining your goals,” Lezer says. “Why? Because misaligned goals may result in confusion rather than insight.”

Lezer says goal-setting includes asking the following questions: What is your objective? What decisions will you make? What constraints are in place? And a big one, how will you measure results?

He calls data “the food that fuels your business. Like a good meal, data insights take preparation. Lots of hard work occurs before that data ever arrives at your desk.”

And like many foods, raw data needs to be harvested, collected, and processed to be ready for use. Quality control throughout the process yields successful results.

But always beware that excellent analysis of bad data yields useless results.

Combining data in an on-highway fleet application illustrates how each data source is powerful alone, but all are stronger in combination. Consider multiple readings coming from the truck fleet, such as load weight, driver attributes, truck attributes, GPS location, road data, the dispatched route, and the truck’s maintenance history.
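
As a rough illustration of what combining sources can look like, the following Python sketch attaches truck attributes and maintenance history to each trip record. The field names, IDs, and values are hypothetical and were not part of the presentation.

```python
# Hypothetical sample records; field names and values are illustrative only.
trucks = {
    "T100": {"model_year": 2014, "engine": "13L"},
    "T200": {"model_year": 2016, "engine": "15L"},
}
maintenance = {
    "T100": {"last_pm_miles": 18000},
    "T200": {"last_pm_miles": 4000},
}
trips = [
    {"truck_id": "T100", "load_lbs": 42000, "route": "R7", "driver": "D1"},
    {"truck_id": "T200", "load_lbs": 31000, "route": "R2", "driver": "D2"},
    {"truck_id": "T100", "load_lbs": 38000, "route": "R7", "driver": "D3"},
]


def combine(trips, trucks, maintenance):
    """Attach truck attributes and maintenance history to each trip record."""
    combined = []
    for trip in trips:
        truck_id = trip["truck_id"]
        row = dict(trip)                           # copy so the source stays untouched
        row.update(trucks.get(truck_id, {}))       # truck attributes
        row.update(maintenance.get(truck_id, {}))  # maintenance history
        combined.append(row)
    return combined


rows = combine(trips, trucks, maintenance)
for row in rows:
    print(row)
```

Each combined row now carries load, route, driver, truck, and maintenance details together, which is the form an analysis would typically start from.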

“Moving and arranging such data for analysis is 80 to 90 percent of the effort,” Lezer says. “And the more that burden falls on practitioners of data science, the less time they have to analyze that data.”

That means automating the gathering and processing as much as possible. “Small gains in data processing efficiency yield big gains in time available for analysis.”

Other tips for data: Keep it fresh. Fresher data allows current and timely analysis, but keeping it fresh can be difficult. Quality checks throughout data processing contribute heavily to the final result. Lezer says a robust solution includes data-processing tools, integrated quality assurance, home-grown tools and queries, and occasional manual research.
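
One simple form an integrated quality check might take is a range test applied to every incoming reading. The Python sketch below is only an assumption of how that could look; the field names and plausibility limits are hypothetical, not values from the presentation.

```python
# Hypothetical plausibility ranges; real limits depend on the engine and sensors.
LIMITS = {
    "engine_rpm": (0, 3000),
    "oil_temp_f": (-40, 300),
    "coolant_level_pct": (0, 100),
}


def check_reading(reading):
    """Return a list of problems found in one reading (empty list means clean)."""
    problems = []
    for field, (low, high) in LIMITS.items():
        value = reading.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif not low <= value <= high:
            problems.append(f"{field}: {value} outside {low}..{high}")
    return problems


# Hypothetical readings: one clean, one out of range, one with a missing field.
readings = [
    {"truck_id": "T100", "engine_rpm": 1400, "oil_temp_f": 210, "coolant_level_pct": 95},
    {"truck_id": "T200", "engine_rpm": 9999, "oil_temp_f": 205, "coolant_level_pct": 90},
    {"truck_id": "T300", "engine_rpm": 1250, "coolant_level_pct": 88},
]

clean = [r for r in readings if not check_reading(r)]
rejected = [(r["truck_id"], check_reading(r)) for r in readings if check_reading(r)]
print(len(clean), "clean;", rejected)
```

Rejected readings are kept alongside the reason they failed, so the occasional manual research Lezer mentions has something to start from.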

One reason data movement is difficult is that it hasn’t been kept simple, which results in lots of pieces cobbled together without a cohesive design. It’s easy to imagine how that can choke the process.

Think ahead to develop a cohesive solution to scale and deliver rapid insights. A well-planned architecture can support multiple areas more efficiently than multiple standalone projects.

There are tools of the trade here, broken down into three categories: basic needs, niche specialties, and some interchangeable options, according to Lezer.

Database tools include data extraction, transformation, and loading (Extract, Transform, Load), data storage, and data querying (Hadoop, SQL). Mathematical tools include statistical modeling and machine learning. And there is also data visualization.
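
To make the Extract, Transform, Load idea concrete, here is a minimal Python sketch using only the standard library's csv and sqlite3 modules. The sample data, table name, and column names are hypothetical; a production pipeline would use whatever tools the fleet has chosen.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; a real feed would come from the telematics provider.
RAW_CSV = """truck_id,oil_temp_f,idle_hours
T100,210,1.5
T200,,2.0
T100,215,0.5
"""


def extract(text):
    """Extract: parse the raw CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing temperature and convert types."""
    cleaned = []
    for row in rows:
        if row["oil_temp_f"] == "":
            continue
        cleaned.append(
            (row["truck_id"], float(row["oil_temp_f"]), float(row["idle_hours"]))
        )
    return cleaned


def load(conn, rows):
    """Load: store the cleaned rows in a SQL table."""
    conn.execute(
        "CREATE TABLE readings (truck_id TEXT, oil_temp_f REAL, idle_hours REAL)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))

# Query: average oil temperature and total idle hours per truck.
summary = conn.execute(
    "SELECT truck_id, AVG(oil_temp_f), SUM(idle_hours) "
    "FROM readings GROUP BY truck_id"
).fetchall()
print(summary)
```

Keeping the three steps as separate functions makes it easier to automate each one and to add quality checks between them.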

Niche tools help round out your process by providing focused ways to perform specific tasks that are difficult with generalized tools. Examples of these would be text analytic tools, single-method modeling tools, and Web analytics tools. Interchangeable tools depend on the preferences of data scientists (among the most common decisions to make are R versus SAS versus Python).

“Each tool has strengths and weaknesses,” Lezer advises. “Pick the tools that fit the need and work well together over personal favorites.”

Once you receive the data, it comes down to the presentation. “Tuning the message,” Lezer calls it.

Good visuals communicate results efficiently and highlight key results, but he offers the following warning: “There is a fine line between tuning the analysis to address the right question and cherry-picking data to fit the conclusion.”

“Graphics can be more effective than numbers alone,” Lezer says. “And when the data begins to flow, remember your key questions.”

Before launching into an example of using data for predictive maintenance for on-highway trucks, the presenters shared a quote from Frost & Sullivan: “Every dollar spent on advanced analytics delivers an 8 to 11 times ROI and a restructuring of total cost of ownership (TCO) of trucks that yields a 2 to 3 percent reduction in TCO per vehicle per year.” In short, data science pays.

“As far as priority of maintenance, the top four costs have consistently been fuel, wages and benefits, truck and trailer payments, and repair and maintenance,” Lezer says. The American Transportation Research Institute’s “An Analysis of the Operational Costs of Trucking” says repair and maintenance has been 8 to 9 percent of the total average marginal cost since 2009.

The current state of maintenance for many truck fleets is a scheduled PM cycle, usually determined by mileage, a routine checklist of components and fluids, and a constant effort to identify trending issues within the fleet, all with a minimized impact on productivity.

Lezer and Hunt think managers can do better at scheduling maintenance. Without data, managers’ intervention decisions on maintenance are a balancing act between early intervention and deferred intervention. Each involves risks (see the graphic on the previous page). Where does it make the most sense to intervene? Severe faults happen less than 1 percent per month. The key is finding that 1 percent.

This can be done by introducing proactive predictive maintenance, which uses statistical analysis of engine data to pinpoint which trucks are highly likely to fault. The output must be specific, and the approach requires periodic performance readings from the vehicle. The system used to capture information must have dozens of performance-related data elements. Ideally, these would include engine speed, fuel temperatures, any warnings, oil temperature, SCR ratio, rail pressure, turbo speed, coolant level, idle hours, trip distance, and location (Trimble’s system is called PeopleNet).

The challenge then becomes how to classify hundreds of performance measurements into faulting behaviors, or how to see the future. The presentation says this is done with statistical decision trees, in which splitting rules are applied iteratively to create a hierarchy of branches. The rules give each data record a unique path into a defined class, and they help predict values for new or unseen data. Random forests are committees of hundreds of trees, built from randomly drawn samples of data and inputs, along with reweighting techniques. “A ‘random forest’ results in a robust data model correlating a large number of input values to a predicted outcome,” Lezer says.
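
For readers who want to see the mechanics, the following Python sketch builds a very small random forest from scratch. It is not the presenters' model: the data is synthetic, the two input readings are hypothetical, the forest has 25 shallow trees rather than hundreds, and the reweighting techniques mentioned above are left out. It shows only the core ideas of iterative splitting rules, random samples of data, and random choices of inputs.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, labels, features):
    """Find the (feature, threshold) pair that minimizes weighted Gini impurity."""
    best = None
    best_score = gini(labels)  # a split must improve on the unsplit node
    for f in features:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, threshold), score
    return best


def build_tree(rows, labels, rng, depth=0, max_depth=3):
    """Apply splitting rules recursively; leaves store the fraction of faults."""
    n_features = len(rows[0])
    # Randomly drawn inputs: each node may only split on a random subset.
    features = rng.sample(range(n_features), max(1, n_features // 2))
    split = best_split(rows, labels, features) if depth < max_depth else None
    if split is None:
        return sum(labels) / len(labels)
    f, threshold = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= threshold]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > threshold]
    return (
        f,
        threshold,
        build_tree([r for r, _ in left], [y for _, y in left], rng, depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right], rng, depth + 1, max_depth),
    )


def tree_score(tree, row):
    """Follow the splitting rules down to a leaf and return its fault fraction."""
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if row[f] <= threshold else right
    return tree


def build_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest


def forest_score(forest, row):
    """Average the trees' leaf values to get a fault likelihood between 0 and 1."""
    return sum(tree_score(t, row) for t in forest) / len(forest)


# Synthetic data: a fault is labeled when both hypothetical readings run high.
data_rng = random.Random(42)
rows = [(data_rng.uniform(0, 1), data_rng.uniform(0, 1)) for _ in range(200)]
labels = [1 if a > 0.6 and b > 0.6 else 0 for a, b in rows]

forest = build_forest(rows, labels)
print(forest_score(forest, (0.9, 0.9)))  # both readings high
print(forest_score(forest, (0.1, 0.1)))  # both readings low
```

A truck whose readings both run high should receive a much higher score than one whose readings both run low, and sorting trucks by that score produces the kind of ranked list a maintenance team could act on.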

Data are run through a forest of statistical decision trees to develop a “proof of concept.” A strong proof-of-concept example would include 60 days of performance data on 16,000 vehicles (nearly 1 million “truck days”), with aftertreatment faults examined in detail and training results verified against the test set. In testing for faults regarding regens/aftertreatment, the data reveal the timeline of a fault, and the commonalities with typical operating zones and elevated operating zones (left).
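
The “truck days” figure is simple arithmetic, and verifying against a test set means holding some data back from training. The Python sketch below shows both. The 80/20 split by vehicle and the vehicle IDs are assumptions made for illustration; the article does not say how the presenters divided their data.

```python
import random

DAYS = 60
VEHICLES = 16_000

# 60 days of data on 16,000 vehicles.
truck_days = DAYS * VEHICLES
print(truck_days)  # 960000, i.e. "nearly 1 million"

# Hypothetical vehicle IDs; hold out 20 percent of vehicles as the test set.
vehicle_ids = [f"V{n:05d}" for n in range(VEHICLES)]
rng = random.Random(0)  # fixed seed so the split is repeatable
rng.shuffle(vehicle_ids)

cut = int(len(vehicle_ids) * 0.8)
train_ids = set(vehicle_ids[:cut])
test_ids = set(vehicle_ids[cut:])
print(len(train_ids), len(test_ids))
```

Splitting by vehicle rather than by individual reading keeps all of one truck's days on the same side, so the model is tested on trucks it has never seen.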

In the proof-of-concept results, the presentation’s model, scoring for one aftertreatment fault, correctly ranked 50 of the top 50 faults and 93 of the top 100. When you consider that the 10 most frequent faults account for 95 percent of the overall volume of severe faults, you are able to engage in event prediction. Predictive maintenance analysis is now possible.
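
One plausible way to read those figures is as precision at the top of a ranked list: of the trucks the model scored highest, how many actually faulted. The Python sketch below computes that measure. The outcome list is a placeholder arranged only to match the reported counts; it is not the presenters' data, and this interpretation of their metric is an assumption.

```python
def precision_at_k(ranked_outcomes, k):
    """Fraction of the k highest-scored trucks that actually faulted."""
    top = ranked_outcomes[:k]
    return sum(top) / len(top)


# Placeholder outcomes, ordered from highest model score to lowest:
# all of the first 50 faulted, and 43 of the next 50 did (93 of 100 overall).
ranked = [True] * 50 + [True] * 43 + [False] * 7

print(precision_at_k(ranked, 50))   # 1.0
print(precision_at_k(ranked, 100))  # 0.93
```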

“Analytics can provide valuable insight into your fleet’s health and proactively manage maintenance opportunities ahead of unscheduled downtimes,” Lezer concludes.