Utilizing Machine Learning to Optimize Data Management

Marc Ender

How to Use Machine Learning for Data Management

How do you feel about benchmarking your systems against a few hundred thousand others in real time? Thanks to machine learning, this is now possible. Since data storage systems supply telemetry data, they can actually be evaluated using artificial intelligence (AI).

NetApp’s Active IQ® cloud service allows hybrid systems to benefit from swarm intelligence, so you can learn from the best, automatically and in real time.

Take, for example, motor racing as a test environment under extreme conditions. Enormous amounts of data about the condition of the engine and vehicle are transmitted in split seconds to a central system where they’re analyzed immediately.

This is pretty much what NetApp has been doing with its customer base since 1995 – at a time when the term big data was rarely used. Today, more than 300,000 data management systems provide telemetry data in the form of logs and system performance information. In practice, this means 200 billion data points come together every day, which amounts to approximately 200TB per month. This entire multi-petabyte data lake is ideal for calibrating machine learning models.

A dashboard provides analysis and recommendations in real time

What you’re looking for is an overview of your systems’ performance in the most practical form. The Active IQ dashboard offers real-time monitoring of your system’s health. Through a web browser, you’re given access to analytics and forecasting, while the mobile app provides the same insights anytime and anywhere. On the platform’s main screen you’ll find the status of the overall system environment, as well as proactive recommendations for intervention.

A health trending widget summarizes the current storage infrastructure risks, while the risk advisor determines whether they can be remedied by, for example, updating the operating system to a newer version. In addition, a table shows the percentage of NetApp systems where the acute risk could be eliminated by merely updating the operating system, leveraging knowledge from the community and the extensive installation base of NetApp.

Another widget provides a capacity forecast and lists the systems that have reached 90 percent capacity or more; additional storage can then be added with a mouse click. Further functionality shows the status of support contracts, highlighting contracts that have expired or will soon expire.
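The capacity check described above can be illustrated with a minimal sketch. It is not Active IQ's actual forecasting method: the `SystemUsage` record, the monthly usage samples, and the simple linear extrapolation are all assumptions made for illustration. Only the 90 percent threshold comes from the article.

```python
from dataclasses import dataclass
from typing import List, Optional

CAPACITY_THRESHOLD = 0.90  # "90 percent capacity or more", as stated in the article


@dataclass
class SystemUsage:
    """Hypothetical record: monthly used-capacity samples, oldest first."""
    name: str
    used_tb: List[float]
    total_tb: float


def utilization(system: SystemUsage) -> float:
    """Fraction of total capacity in use at the latest sample."""
    return system.used_tb[-1] / system.total_tb


def systems_at_or_above(systems: List[SystemUsage],
                        threshold: float = CAPACITY_THRESHOLD) -> List[str]:
    """Names of systems whose latest utilization meets the threshold."""
    return [s.name for s in systems if utilization(s) >= threshold]


def months_until_threshold(system: SystemUsage,
                           threshold: float = CAPACITY_THRESHOLD) -> Optional[float]:
    """Linear extrapolation of average monthly growth (an assumed model).

    Returns 0.0 if the threshold is already reached, and None if there
    is too little history or usage is not growing.
    """
    if utilization(system) >= threshold:
        return 0.0
    samples = system.used_tb
    if len(samples) < 2:
        return None
    growth_per_month = (samples[-1] - samples[0]) / (len(samples) - 1)
    if growth_per_month <= 0:
        return None
    return (threshold * system.total_tb - samples[-1]) / growth_per_month
```

With this sketch, a system that grew from 40TB to 60TB over two months on a 100TB array would be forecast to reach the threshold in about three months. A real forecast would likely account for seasonality and workload changes rather than a straight line.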

Identify and avoid disruption in advance

The storage efficiency widget compares the efficiency of your system with the average of all Flash FAS systems in the NetApp installation base. If your system’s efficiency is below average, Active IQ will suggest how you can upgrade, for example by switching to a faster all-flash system. From the overview on the dashboard, you can drill down to see the data for individual sites, clusters, or storage grids.
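The comparison behind such a widget can be sketched in a few lines. This is an illustration under stated assumptions, not NetApp's implementation: the function name `compare_to_fleet` and the use of a single numeric efficiency ratio per system (for example, logical data stored divided by physical capacity used) are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Union


def compare_to_fleet(own_ratio: float,
                     fleet_ratios: List[float]) -> Dict[str, Union[float, bool]]:
    """Compare one system's efficiency ratio with the fleet average.

    The ratio itself is an assumed metric; higher means more efficient.
    """
    fleet_average = mean(fleet_ratios)
    return {
        "fleet_average": fleet_average,
        "below_average": own_ratio < fleet_average,
        # Positive gap: how far the system trails the fleet average.
        "gap": fleet_average - own_ratio,
    }
```

A system flagged as `below_average` would be a candidate for the kind of upgrade suggestion described above.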

A continuous risk assessment, meanwhile, enables you to intervene before risks impact system stability. In the event of system bottlenecks, real-time monitoring ensures that performance issues can be resolved before failures occur.

As a developer, you’re always one step ahead since your data management capacity is monitored and future use is forecasted. If a problem occurs, Active IQ will immediately check if it’s a known phenomenon or a new type of event.
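One simple way to picture the "known phenomenon or new type of event" check is a lookup of normalized event signatures. The sketch below is an assumption for illustration only: the article does not describe how Active IQ performs this check, and the `signature` and `classify` functions, the normalization rules, and the sample messages are all hypothetical.

```python
import re
from typing import Dict, Optional, Tuple


def signature(message: str) -> str:
    """Reduce a log message to a signature by masking variable parts.

    Hex values and numbers are replaced with placeholders so that
    messages differing only in IDs map to the same signature.
    """
    text = message.lower()
    text = re.sub(r"0x[0-9a-f]+", "<hex>", text)
    text = re.sub(r"\d+", "<n>", text)
    return re.sub(r"\s+", " ", text).strip()


def classify(message: str,
             known: Dict[str, str]) -> Tuple[str, Optional[str]]:
    """Return ("known", recommendation) or ("new", None).

    `known` maps signatures to a recommendation (hypothetical data).
    """
    sig = signature(message)
    if sig in known:
        return ("known", known[sig])
    return ("new", None)
```

For example, if `known` contains the signature of "Disk 12 failed on shelf 3", a later message "disk 7 failed on shelf 1" resolves to the same entry, while an unseen message is reported as new. A production system would presumably rely on learned models rather than exact signature matching.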

Continuous 24/7 monitoring relieves your IT staff of repetitive tasks. Technical support is alerted when an intervention is required and, at that point, receives specific recommendations for tackling the issue (guided problem solving). All this is possible because the anonymized log and configuration files, as well as the telemetry data, are continuously analyzed using machine learning algorithms. The models also compare workloads between similar systems, and thus they become more intelligent over time.

The interface between man and machine is now more natural

Machine learning also helps to considerably improve the interface between man and machine. With the help of IBM Watson, a chatbot was created to act as a virtual support agent. This ensures fast answers and shortens waiting times. In the near future, short versions of best practice reports will also be generated by machine learning and delivered as prose text.

Active IQ is a data-based service that combines artificial intelligence, machine learning, and community wisdom. Customers receive predictive analytics, proactive support, and hands-on, on-the-fly recommendations to optimize their data management. Active IQ is constantly learning so that you can take advantage of your data’s full potential.

System warnings can be processed immediately, without the delay of conventional batch processing, and 98 percent of technical problems are solved automatically. If a problem cannot be fixed automatically, the support team can use telemetry to troubleshoot 60 percent faster. As determined by IDC, the avoided downtime costs and the time saved are together equivalent to $600 million. The development cycle for new analytics services has been reduced from six months to just one. In addition, using Active IQ results in 85 percent fewer support cases worldwide thanks to forward-looking recommendations.

Marc Ender

Marc Ender heads the Solutions Engineering Team and is responsible for the Cloud Solution Business at NetApp Switzerland. Since 2004 he has worked for major customers across all industries, advising them on data management and future-oriented hybrid cloud solutions. Marc is also a dedicated handball player, applying the motto ‘together to the goal’ to both sport and business challenges.
