Dataset life cycle: Multi-cloud at Cohesity

In the previous post of this series I talked about how enterprise data is consumed.
In this one, the second about Cohesity presenting at CFD4, I’ll focus on Cloud Adoption Trends, the session led by Sai Mukundan and Jon Hildebrand.
Since this part also included a demo, it was quite exciting (as is any session where a demo is performed).
We saw Jon handling those PowerShell commands just as I expected: brilliantly.


Proceeding with the life cycle of the previously mentioned Dataplatform, a customer typically starts his journey to the cloud by approaching the “big ones”: AWS, Azure and Google Cloud. The use cases he presents are long-term retention and VM migration, with the cloud complementing his on-prem storage. But if the on-site platform is, for example, VMware, adopting one of those three for long-term retention also means moving the VMware VMs onto those platforms, which use different formats.
The demo led by Jon shows how simple it is for the end user to manage this conversion.


Further along the life cycle path, the second use case for the customer is using the cloud for test/dev. This one is application-based: real mobility of the application from and to the different clouds. In this way he can reproduce the behaviour of an app for testing purposes or for further development, and finally put it into production directly from the cloud. As came up in a question, some of the networking aspects are also replicated if needed.

The next step is backing up these migrated, or simply hosted, applications. The Dataplatform accomplishes that through APIs, in a cloud-native way. Often, in fact most of the time, this isn’t enough: the customer also asks for full disaster recovery of them, whatever the destination is from a technical point of view (on-prem, AWS, Azure, etc.).

To close the cycle, there is the possibility of moving across clouds (multi-cloud mobility). That means being vendor-agnostic and, consequently, having a wider horizon when evaluating the economics of all the cloud options.

Now, the demo. It starts from the first step of the life cycle, “Long-term Data Retention & Granular Recovery”. Archiving data in the cloud allows it to be rehydrated at the place where the backup was taken, but also in another environment in the cloud or, again, in a totally new on-prem environment: in all these cases the platform remains the same.

Before the backed-up data is sent to the cloud archive, it is deduplicated and indexed to simplify recovery starting from a search. The index is also needed because the recovery could be performed in an environment different from the original one, carrying over the same metadata collected and created during backup. Subsequent datasets (incremental backups) aren’t sent in full: only the modified blocks are sent, and they are indexed accordingly.
Building this reference and managing datasets in this way noticeably reduces network traffic, which is not free in almost all the public clouds.
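
To make the idea concrete, here is a minimal sketch of block-level deduplication with incremental sends. It is only my illustration of the general technique, not Cohesity's implementation: the fixed 4-byte block size, the in-memory `cloud_store` dictionary and the function names are all assumptions made up for the example.

```python
import hashlib

BLOCK_SIZE = 4  # assumption: tiny fixed-size blocks, only to keep the example readable


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(block: bytes) -> str:
    """Content hash used to recognise a block that is already archived."""
    return hashlib.sha256(block).hexdigest()


def archive(data: bytes, store: dict) -> tuple:
    """Archive one backup into `store`.

    Returns (recipe, bytes_sent): the recipe is the ordered list of block
    fingerprints needed to rebuild the backup, and bytes_sent counts only
    the blocks that were not in the store yet.
    """
    recipe = []
    bytes_sent = 0
    for block in split_blocks(data):
        key = fingerprint(block)
        if key not in store:       # unseen block: it has to travel to the archive
            store[key] = block
            bytes_sent += len(block)
        recipe.append(key)         # known block: only the reference is recorded
    return recipe, bytes_sent


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild a backup from its recipe, in any environment that can read the store."""
    return b"".join(store[key] for key in recipe)


cloud_store = {}                     # hypothetical stand-in for the cloud archive
full_backup = b"AAAABBBBCCCCDDDD"    # first (full) backup
next_backup = b"AAAABBBBXXXXDDDD"    # later backup: one block changed

full_recipe, full_sent = archive(full_backup, cloud_store)
incr_recipe, incr_sent = archive(next_backup, cloud_store)

print(full_sent, incr_sent)
```

The second call only ships the one block that changed, yet both backups can still be rebuilt in full from their recipes, which is where the saving on network traffic comes from.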


Granularity is the first use case: the customer needs a specific file at a specific point in time. The source depends on retention: if the file is present only in the cloud it will be picked up from there, and vice versa. Again, recovery only affects the modified blocks.
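
A rough way to picture how retention decides the source is the sketch below. Everything in it is hypothetical (the `BackupCopy` class, the location names, the dates); it only shows the logic of "restore from wherever a copy of that point in time still exists".

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class BackupCopy:
    """One retained copy of a snapshot (hypothetical model, for illustration only)."""
    location: str      # e.g. "on-prem" or "cloud-archive"
    taken_on: date     # point in time the snapshot represents
    expires_on: date   # last day the copy is kept under its retention policy


def available_sources(copies, wanted: date, today: date) -> list:
    """Return the locations that still hold the snapshot taken on `wanted`."""
    return [
        c.location
        for c in copies
        if c.taken_on == wanted and today <= c.expires_on
    ]


copies = [
    # assumption: short retention on-prem, long retention in the cloud archive
    BackupCopy("on-prem", date(2018, 8, 1), date(2018, 8, 31)),
    BackupCopy("cloud-archive", date(2018, 8, 1), date(2019, 8, 1)),
]

# Two weeks after the backup, both copies exist and either can be the source.
soon = available_sources(copies, date(2018, 8, 1), today=date(2018, 8, 15))

# Two months later the on-prem copy has expired: only the cloud can serve the file.
later = available_sources(copies, date(2018, 8, 1), today=date(2018, 10, 1))

print(soon, later)
```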

The demo run by Jon showed the creation of a new job for a VM to be backed up and then archived to a public cloud. From the definition of the public clouds to the creation of the SLA, everything is available through both the GUI and the APIs (PowerShell in Jon’s case). A new job is then responsible for the operation.

The indexing engine is very powerful and works in a Google-like way: whatever the search key is, the results include all the items matching that key, whether they are VMs, vCenters or datastores.
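
This behaviour is what you would expect from an inverted index that does not care about the object type. The sketch below is a toy version of that idea, with made-up item names; it is not how Cohesity's engine is built, just a way to see why one key can return VMs, datastores and files together, and how type-ahead suggestions can come from the same structure.

```python
import re
from collections import defaultdict


def build_index(items):
    """Map every lowercase word found in an item's name to the items containing it.

    `items` is a list of (kind, name) pairs; the kind is stored but never
    used to restrict the lookup.
    """
    index = defaultdict(list)
    for kind, name in items:
        for token in set(re.findall(r"[a-z0-9]+", name.lower())):
            index[token].append((kind, name))
    return index


def search(index, key):
    """Return every item whose name contains the word `key`, whatever its kind."""
    return index.get(key.lower(), [])


def suggest(index, prefix):
    """Type-ahead: indexed words that start with what has been typed so far."""
    prefix = prefix.lower()
    return sorted(token for token in index if token.startswith(prefix))


# Hypothetical inventory, invented for the example.
items = [
    ("vm", "finance-db-01"),
    ("datastore", "finance-datastore"),
    ("vcenter", "vcenter-lab"),
    ("file", "/home/reports/finance-q3.xlsx"),
]

index = build_index(items)
hits = search(index, "finance")
kinds = {kind for kind, _ in hits}

print(hits)
print(suggest(index, "d"))
```

Searching for "finance" brings back a VM, a datastore and a file in one go, because the lookup is by word and the type is just an attribute of each hit.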

Immediately after the backup completed, Cohesity sent the archived data simultaneously to all three configured public clouds. Jon then went for a granular search of files and folders. Again, the index engine is impressive: it offers suggestions while you type in the search field, and it can filter the results by custom words.

Since the requested data is present both on-prem and in all three clouds, the choice of where to pick it up from is up to the customer.

Another aspect of the life cycle is VM migration, and the following demo focused on it, using CloudSpin. This is performed through policies. During this second demo two of the delegates asked for new features. This is one of the reasons I love these events: direct contact and feedback with the vendor, and interaction with key people who are able to answer in a technical way, not only with marketing stuff.

My first consideration is that multi-cloud today is invaluable: letting the customer move his data where and when he wants is a critical feature for whoever manages that data.
Second, the cloud should be considered for on-prem environments too. Backups of on-prem environments taken and archived on the same premises don’t cover all the cases of disaster, DevOps and other situations where a duplicate of the same dataset is needed.

I’ll keep an eye on Cohesity, since its development is getting a boost…
