NetApp DataOps Toolkit—What’s in a name?

The NetApp DataOps Toolkit simplifies operations relating to the data that data science frameworks use to train models

Mike Oglesby

What’s in a name? Quite a lot, actually. The name of a product or tool often implies its functionality, and it can also imply its intended use. A name can be enlightening, or it can be confusing.

After extensive conversations with our customer community, we have decided to rename the NetApp Data Science Toolkit to the NetApp DataOps Toolkit. We know that a name change can be disruptive, so we did not make this decision lightly. We believe that the new name better reflects the function of the toolkit.

The name

Why the name NetApp® DataOps Toolkit? Although the toolkit was originally developed for data scientists and is still very much targeted toward data scientists, it is not a data science framework like TensorFlow or PyTorch. The name Data Science Toolkit could be misunderstood to imply that the product is similar to those frameworks. Instead, our toolkit simplifies data and storage operations for data scientists, data engineers, and developers. The DataOps Toolkit works with data science frameworks like TensorFlow and PyTorch, which simplify the training of deep learning (DL) models. The NetApp DataOps Toolkit, on the other hand, simplifies operations relating to the data that data science frameworks use to train models.

Accelerating AI workflows

For those of you who are new to the NetApp DataOps Toolkit, let’s review the toolkit’s capabilities. The toolkit has two key features that can greatly streamline AI workflows: near-instant cloning of data volumes, and space-efficient, read-only copies of data volumes.

With the NetApp DataOps Toolkit, a data scientist can almost instantaneously create a space-efficient data volume that’s an exact copy of an existing volume, even if the existing volume contains terabytes or even petabytes of data. Data scientists can quickly create clones of datasets that they can reformat, normalize, and manipulate, while preserving the original “gold-source” dataset. Under the hood, these operations use highly efficient and battle-tested NetApp cloning technology, but they can be performed by a data scientist without storage expertise. What used to take days or weeks (and the assistance of a storage administrator) now takes seconds.
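To give a feel for why a clone can be created quickly and without duplicating data, here is a toy sketch of copy-on-write cloning in general. It is an illustration only and an assumption on our part: it is not NetApp's implementation, and the `Volume` class and its methods are hypothetical names invented for this example. The clone copies only a map of block references, so its cost depends on the size of the map rather than on the size of the data, and the original "gold-source" blocks are never modified.

```python
class Volume:
    """Toy volume (hypothetical): maps block numbers to immutable blocks of bytes."""

    def __init__(self, blocks=None):
        # Copy the mapping only; the block contents themselves are shared.
        self._blocks = dict(blocks or {})

    def clone(self):
        # A clone starts out referencing exactly the same blocks as its parent.
        return Volume(self._blocks)

    def write(self, block_no, data):
        # A write swaps one reference in this volume's own map;
        # any other volume that shared the old block still sees the old data.
        self._blocks[block_no] = bytes(data)

    def read(self, block_no):
        return self._blocks[block_no]

    def shares_block_with(self, other, block_no):
        # True when both volumes point at the very same block object.
        return self._blocks[block_no] is other._blocks[block_no]


# A "gold-source" dataset and a working clone that gets modified.
gold = Volume({0: b"raw images", 1: b"raw labels"})
work = gold.clone()
work.write(1, b"normalized labels")
```

After the write, `gold` still holds its original labels, and block 0 is still shared between the two volumes, so only the changed block consumes extra space in this toy model.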

Data scientists can also save a space-efficient, read-only copy of an existing data volume. Based on the famed NetApp Snapshot™ technology, this functionality can be used to version datasets and implement dataset-to-model traceability. In regulated industries, traceability is a baseline requirement, and implementing it is extremely complicated with most other tools. With the NetApp DataOps Toolkit, it’s quick and easy.

The NetApp DataOps Toolkit comes in two different flavors: one for Kubernetes-based environments, and one for traditional virtualized or bare-metal environments. Users can take advantage of the toolkit’s capabilities in whichever of these environments they operate in.
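The dataset-to-model traceability described above can be pictured as a small bookkeeping step: each time a model is trained, record which read-only dataset copy it was trained on. The sketch below is a minimal illustration under our own assumptions, not the toolkit's API. The `TrainingRecord` type, the `find_snapshot` helper, and all volume, snapshot, and model names are hypothetical, and the snapshot itself is assumed to have been created separately.

```python
import json
from dataclasses import dataclass, asdict
from typing import Iterable, Optional


@dataclass(frozen=True)
class TrainingRecord:
    """Hypothetical record linking one trained model to one dataset snapshot."""
    model_name: str
    model_version: str
    dataset_volume: str
    dataset_snapshot: str


def to_json(record: TrainingRecord) -> str:
    # Serialize with sorted keys so the stored record is stable and diffable.
    return json.dumps(asdict(record), sort_keys=True)


def find_snapshot(records: Iterable[TrainingRecord],
                  model_name: str,
                  model_version: str) -> Optional[str]:
    # Answer the audit question: which dataset version trained this model?
    for record in records:
        if (record.model_name, record.model_version) == (model_name, model_version):
            return record.dataset_snapshot
    return None


# Example records; every name here is made up for illustration.
records = [
    TrainingRecord("fraud-detector", "1.0", "transactions", "snap_2021_03_01"),
    TrainingRecord("fraud-detector", "1.1", "transactions", "snap_2021_04_15"),
]

snapshot_for_v1_1 = find_snapshot(records, "fraud-detector", "1.1")
```

Because the snapshot is read-only, the name stored in the record keeps pointing at exactly the data the model saw, even if the live volume changes afterward.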

Other use cases

Another reason that we decided to change the name of the toolkit is that its usefulness is not limited to data science use cases. Together with our customers, we have been hard at work applying the toolkit to other use cases involving other types of users. What other use cases, you ask? Stay tuned to NetApp blogs to find out! We have another blog coming soon that focuses on some additional benefits of the toolkit.

The name may have changed, but the features and benefits haven’t. With the NetApp DataOps Toolkit, data management is not an impediment to a fast, streamlined AI process. To learn more about the toolkit, visit its GitHub repository. To learn more about all of NetApp’s AI solutions, visit www.netapp.com/ai.

Mike Oglesby

Mike is a Technical Marketing Engineer at NetApp focused on MLOps and Data Pipeline solutions. He architects and validates full-stack AI/ML/DL data and experiment management solutions that span a hybrid cloud. Mike has a DevOps background and a strong knowledge of DevOps processes and tools. Prior to joining NetApp, Mike worked on a line-of-business application development team at a large global financial services company. Outside of work, Mike loves to travel. One of his passions is experiencing other places and cultures through their food.