To boldly go where nobody has gone before, you will need a transporter.
SpaceX CEO Elon Musk has said that to survive long term we humans must become a “space-faring species.” And he has set a goal to build the transport infrastructure necessary to colonize new planets, starting with Mars. Likewise, Cirrus Data CEO Wayne Lam says that for data systems to survive long term, storage data must be able to travel effortlessly throughout the IT galaxy. Accordingly, Wayne and his team have set a goal to build the most efficient transport infrastructure for data to colonize new storage platforms.
To know where we are going, we must know where we have been.
Technology changes every five years or so, and data storage is no exception. Prior to 2000, the great majority of workstations and servers were launched with local disk drives inside the box. In the early 2000s, the cost and complexity of Storage Area Networks (SANs) dropped to levels that allowed wider adoption. A SAN is a specialized network that provides access to consolidated, block-level data storage.
The next idea was to externalize data storage within an on-premises SAN outside the box, networked using Fibre Channel or iSCSI. To prevent data transfer bottlenecks, SAN networks are maintained separately from local area networks (LANs). Later on, Network Attached Storage (NAS) arrays became popular. They provided LAN access to consolidated shared disk file systems. Regardless, data storage was a local resource in those days.
Things changed in the mid-2000s with the emergence of Amazon Web Services (AWS), the first pay-per-use Internet-based computing service. Designed to make web-scale computing easier for developers, AWS paved the way for delivery of world-class infrastructure via the internet, featuring complete end-user control of all computing resources, including pay-per-use data storage. A few years later, Google App Engine and Microsoft Azure launched cloud platforms for developers to build, test, deploy, and manage their distributed web apps. In the IT Galaxy today, we see data steadily moving from local hardware into cloud data centers that allow for resource scaling according to traffic requirements. At the same time, local data storage is by no means going to disappear. We must increasingly deal with both sides.
Where do we go and how do we get there? First, we must address the three challenges:
- Maintaining data that must remain local
- Moving data that must scale into the cloud
- Living and operating in the cloud
The critical issue around how we accomplish this centers on the efficiency of the transport.
Keeping data local: Every journey starts with small steps.
In 2012, Cirrus Data Solutions invented and patented a revolutionary new kind of transport for data called Transparent Data Interception (TDI). TDI uses standard storage ports to create a transparent data path that works like a virtual storage network cable. With TDI, engineers can insert an appliance in-band between a host and its SAN without causing downtime and without requiring any changes to the host, the switch, or the storage. The result was a true plug-and-play technology that provides a fast and non-intrusive way to move on-premises block-level data between hosts. With the launch of its Data Galaxy Migration product, Cirrus Data became the best solution for local, remote and cloud data migrations, helping IT managers to migrate their existing systems to larger storage systems with no data loss, no downtime, and in hours, not weeks. It was a necessary step forward.
Imagine you are a Virtual Machine (VM) living on a data volume that lives on a SAN connected using iSCSI. You need to move to a new machine but cannot afford to shut down your apps. After Data Galaxy Migrate is inserted and your source volume is discovered, your new volume is automatically created, and replication begins. When replication is done, the pMotion feature of Data Galaxy Migrate is triggered, effectively retiring the old volume and redirecting all storage I/O to the new volume. From your point of view, however, nothing changed. You and your apps see the same volume throughout the process. For Star Trek fans, it’s like the transporter technology on the Starship Enterprise. You just did a ship-to-ship transfer, magically appearing on the other side like nothing happened.
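The replicate-then-cut-over sequence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Cirrus Data's implementation: the `MigrationShim` class, its method names, and the use of plain lists as volumes are all hypothetical. It shows the general idea of copying blocks in the background, tracking blocks the host rewrites in the meantime, and switching I/O to the new volume once everything is in sync.

```python
class MigrationShim:
    """Hypothetical in-band layer between a host and its storage.

    Volumes are modeled as plain Python lists of byte blocks.
    """

    def __init__(self, source, target):
        self.source = source
        self.target = target
        self.active = source                  # host I/O goes here
        self.dirty = set(range(len(source)))  # blocks still to be copied

    def write(self, index, data):
        self.active[index] = data
        if self.active is self.source:
            # A write during replication makes this block stale on the target.
            self.dirty.add(index)

    def read(self, index):
        return self.active[index]

    def sync(self, limit=None):
        """Copy up to `limit` dirty blocks; return how many remain dirty."""
        for index in sorted(self.dirty)[:limit]:
            self.target[index] = self.source[index]
            self.dirty.discard(index)
        return len(self.dirty)

    def cutover(self):
        """Copy the remaining dirty blocks, then redirect I/O to the target."""
        self.sync()
        self.active = self.target


source = [b"a", b"b", b"c", b"d"]
target = [b""] * 4
shim = MigrationShim(source, target)

shim.sync(limit=2)       # background replication copies blocks 0 and 1
shim.write(0, b"A")      # the host keeps writing; block 0 is dirty again
shim.cutover()           # final sync, then I/O moves to the new volume
shim.write(3, b"D")      # this write lands on the new volume only
```

From the host's side, `read` and `write` behave the same before and after `cutover`, which is the point of the analogy: the application never sees the switch.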
How to move data: Why take the long way when there is a faster and cheaper route?
Economy of movement is the key in data migration. Cirrus Data added major value to the data center by adding TDI to Fibre Channel and iSCSI storage networks, and local data migration was just the beginning of what they could do with TDI. Cirrus Data subsequently released a variant of TDI technology, a software agent called mTDI. Using standard storage ports and internet transports to create a transparent data path, mTDI works like a storage network cable that stretches across the internet, connecting the Cirrus Data Galaxy Migrate appliance to cloud-hosted storage volumes provided by Amazon and Microsoft. With mTDI and pMotion, frictionless data migration to/from the cloud with no data loss and no downtime became a reality. Advanced data protection and data copy features became possible with mTDI and the Data Galaxy platform.
In 2019, Cirrus Data Solutions released version 5.0 of its platform and named it Data Galaxy Complete. Data Galaxy Complete is much more than a data migration solution. This platform is a data management and data protection console that helps IT managers to move data volumes in and out of the cloud or physical servers with block-level mTDI efficiency. The Data Galaxy platform has been warmly received for many reasons, not the least of which is the way it improves data protection, the tedious task of backing up and restoring data to/from physical or cloud-hosted storage.
Prior to Data Galaxy, most organizations backed up their volumes by copying files or with backup and restore programs. Legacy backup software with host-side agents moves files periodically, not continuously, and requires a full restore on the other side. This typical process can take hours or even days to accomplish. With Data Galaxy CDP there is near-zero RPO and very low RTO because files are not being copied across the Internet. With Data Galaxy, a physical or cloud-hosted copy of your volume is being surgically backed up at block level using mTDI on a continuous basis, tracking only the changes to copy 0. A corrupted volume is repaired by recovering precisely the blocks that are needed. Why restore when you can surgically repair only the damaged or infected data, which is an order of magnitude faster than a typical restore?
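To make the difference between a full restore and a block-level repair concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not the Data Galaxy mechanism: the function name `surgical_repair`, the list-of-blocks volume model, and the use of SHA-256 digests to find differing blocks are all choices made for this example.

```python
import hashlib


def surgical_repair(damaged, good_copy):
    """Overwrite only the blocks that differ from the known good copy.

    Both volumes are lists of byte blocks of equal length.
    Returns the indexes of the blocks that were repaired.
    """
    repaired = []
    for index, (bad, good) in enumerate(zip(damaged, good_copy)):
        if hashlib.sha256(bad).digest() != hashlib.sha256(good).digest():
            damaged[index] = good
            repaired.append(index)
    return repaired


# A 1000-block volume in which two blocks have been corrupted.
good = [bytes([i % 256]) * 8 for i in range(1000)]
damaged = list(good)
damaged[17] = b"\xff" * 8
damaged[512] = b"\xff" * 8

repaired = surgical_repair(damaged, good)
blocks_moved_by_full_restore = len(good)
blocks_moved_by_repair = len(repaired)
```

In this toy case a full restore would move all 1000 blocks, while the repair moves only the two that changed.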
Data Galaxy has also proven to significantly reduce replication, transmission, ingress and egress costs associated with large public cloud offerings like Amazon AWS and Microsoft Azure. In the cloud, the minute you initiate a volume the meter runs and you start paying. Data volumes are typically operating at only 50% of their storage capacity, but in the cloud, you always pay for 100% of capacity for every copy. With Data Galaxy, the first factor of 2:1 comes from deduplication, which eliminates redundant data. The second factor of 2:1 is accomplished with data compression, and the third factor of 2:1 comes from volumes that are 50% empty (aka thin provisioning). Therefore, the actual cloud storage used by DMS Data Galaxy to back up your volume is reduced by a factor of up to 8x! The reduction in storage volume equates to a capacity billing that is accordingly reduced to 1/8 of the original amount. That’s real money being saved.
Let’s go back to our VM illustration. It’s time to back up your data to a remote facility. This is the equivalent of putting your friend on a spaceship and sending him or her to a remote planet. If moving a human from Earth to the Alpha Centauri system is like a migration from Storage A to Storage B, then, rather than putting the passengers into hibernation and waking them up 20 years later, wouldn’t it be nice to have a method that takes only 2 years and allows the passengers (data) to stay awake throughout the journey and enjoy the scenery? Who can afford the “downtime” for so long? So, for storage migration, it is the same thing: if there is a way to move the data so quickly and unobtrusively that it can be done totally without downtime, and it takes only hours and not days, that would be a winner. The Cirrus Data Galaxy Migration solution, with TDI, accomplishes exactly that.
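The 8x figure is simply the three 2:1 factors multiplied together. A short Python sketch of that arithmetic follows; the function names and the 1000 GB example volume are assumptions made for illustration, and the result is an upper bound that holds only if each factor really reaches 2:1 on your data.

```python
def effective_reduction(dedup=2.0, compression=2.0, fill_fraction=0.5):
    """Combine the three reduction factors described in the text.

    fill_fraction is the share of the provisioned volume that holds data;
    a volume that is 50% empty contributes a factor of 1 / 0.5 = 2.
    """
    return dedup * compression * (1.0 / fill_fraction)


def billed_capacity_gb(provisioned_gb, reduction):
    """Capacity actually stored, and therefore billed, after reduction."""
    return provisioned_gb / reduction


reduction = effective_reduction()              # 2 x 2 x 2
billed = billed_capacity_gb(1000, reduction)   # hypothetical 1000 GB volume
```

With the default factors, a 1000 GB provisioned volume is stored as 125 GB, which is the 1/8 billing reduction stated above.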
Data living in the cloud: If things go wrong out there, you may need to roll back time.
The cloud’s first promise is that the platform is resilient, but there are certain problems the cloud does not solve. What if you are careless enough to corrupt your own data? Public cloud providers such as Amazon, Microsoft, or Google cannot stop you from doing that. Even in the cloud, we need data protection and recovery tools to overcome operational accidents and repair corrupted or infected data.
Your data storage is encapsulated within the cloud. What if something’s wrong? Houston, we have a problem. What if ransomware has encrypted the system block on drive L of our cloud-hosted VM? The server is running, but the data is now encrypted. The heart has stopped, but all other organs are good. What can we do, Captain?
Fortunately, we can protect and repair at the block level using Data Galaxy. In the background, TDI/mTDI has been continuously streaming the data changes from the production copy 0 to Data Galaxy’s copies 1 to N. The CDP Journal provides visualization of all the bursts of writes, the transactions or file copies, and IO markers can bookmark the last known good points. We will seamlessly roll back drive L to its last known good state two hours ago, just before ransomware corrupted the system. And this doesn’t require a full restore from the backup. Aye Captain! How is this possible? Here is a walk-through of how Data Galaxy is typically set up and put into use for undoing the ransomware damage.
In the Data Galaxy console, the local Fibre Channel SAN and our cloud storage are visible.
Our drive L is in there. We see copy 0 of drive L on our local SAN.
Next we created copy 1 of drive L on the Data Galaxy deduped storage system:
- We copied 1GB of data that is expected to have a 4:1 reduction. TDI mirrored the writes and showed that 250MB of data was stored. The original drive L is 2GB in size, so we stored a 2GB volume using only 250MB. That’s 8x capacity savings.
- The UBR dashboard statistic shows 8x dedupe.
- We made a second manual copy of the same data and set a trip meter: dedupe was 577 to 1 because the bulk of the data was not unique, and the copy took 1 minute.
Then we created copy 2 in cloud storage:
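Why does a second copy of the same data dedupe so well? A minimal Python sketch of content-hash deduplication shows the effect. The `DedupStore` class is hypothetical and is not how the Data Galaxy storage system is built; it only assumes that identical blocks are detected by their SHA-256 digest and stored once.

```python
import hashlib


class DedupStore:
    """Toy block store that keeps one copy of each distinct block."""

    def __init__(self):
        self.unique = {}          # digest -> block contents
        self.logical_blocks = 0   # blocks written by clients

    def write(self, block):
        digest = hashlib.sha256(block).hexdigest()
        self.unique.setdefault(digest, block)  # store only if not seen before
        self.logical_blocks += 1
        return digest

    def ratio(self):
        """Logical blocks written per unique block stored."""
        if not self.unique:
            return 1.0
        return self.logical_blocks / len(self.unique)


store = DedupStore()
blocks = [bytes([i % 4]) * 16 for i in range(8)]  # 8 blocks, 4 distinct

for block in blocks:
    store.write(block)
ratio_after_first_copy = store.ratio()

for block in blocks:           # a second copy of the same data
    store.write(block)
ratio_after_second_copy = store.ratio()
```

The second pass adds no new unique blocks, so the ratio doubles while the stored capacity stays the same.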
- Replicate drive L from local DGS instance to Azure hosted DGS instance
- 95% bandwidth savings according to the replication statistics report
- Azure DGS instance consumed only 250MB of blocks, rather than 2GB.
- There’s the 8x capacity savings at the cloud.
Then the app sets IO markers at last known good points:
- The app calls Data Galaxy APIs to set an IO marker during the system health check
- TimeWalker shows us Last Known Good Point
After the Ransomware Attack:
- Select our data-protected drive L volume; the graphic shows data flows and recovery speed
- Using TimeWalker, we browse the CDP journal and zoom in.
- Select the Last Known Good IO marker on CDP Journal
- Launch the surgical, block-level volume repair. Completion time: under 1 minute!
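The marker-and-rollback steps above can be sketched as a simple undo journal in Python. Everything here is a labeled assumption: the `CdpJournal` class, its method names, and the marker name are invented for illustration and do not represent the Data Galaxy APIs or the internals of TimeWalker. The sketch records the previous contents of every block that is written, lets an application bookmark a position in the journal, and undoes only the writes made after that bookmark.

```python
class CdpJournal:
    """Toy continuous-data-protection journal for one volume."""

    def __init__(self, volume):
        self.volume = volume   # list of byte blocks
        self.entries = []      # (block index, previous contents), in write order
        self.markers = {}      # marker name -> journal position

    def write(self, index, data):
        self.entries.append((index, self.volume[index]))
        self.volume[index] = data

    def set_marker(self, name):
        """Bookmark the current journal position as a known good point."""
        self.markers[name] = len(self.entries)

    def roll_back(self, name):
        """Undo every write made after the marker; return repaired indexes."""
        position = self.markers[name]
        touched = set()
        while len(self.entries) > position:
            index, previous = self.entries.pop()
            self.volume[index] = previous
            touched.add(index)
        # Markers that pointed past the rollback position no longer exist.
        self.markers = {n: p for n, p in self.markers.items() if p <= position}
        return sorted(touched)


volume = [b"boot", b"data1", b"data2"]
journal = CdpJournal(volume)

journal.write(1, b"data1-v2")           # normal application write
journal.set_marker("last-known-good")   # set during a health check
journal.write(0, b"####")               # ransomware encrypts blocks
journal.write(2, b"####")

repaired = journal.roll_back("last-known-good")
```

Only the two blocks written after the marker are touched; the legitimate write to block 1 made before the marker is preserved.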
What did we learn?
- The efficiency of the transport matters. With the right transport, you can explore the IT Galaxy with confidence. You and your data will be going places efficiently and without risk.
- Block-level data transfer is an order of magnitude faster than file transfer. With Data Galaxy and TDI, data synchronization can be done in minutes, not hours. You will have fresh data when you need it.
- Data Galaxy replication is an order of magnitude cheaper than file copy. CDS delivers a natural data reduction engine that cuts public data storage and transport costs by a factor of 8.
- While travelling the IT Galaxy, you don’t have to hibernate your passenger to make it to the end of the journey. The magic of pMotion is no downtime during the data migration and immediate removal of the old storage equipment. Data Galaxy impersonates the old storage while building the new storage. You don’t have to postpone your trip. Stay alive and keep on working while you travel.
- When things go wrong, it’s much better to repair your data than to completely restore it. Even in the worst failure, we can use the power of Data Galaxy to undo block-level differences. With Data Galaxy and mTDI continuously streaming and recording the data changes from cloud storage to on-premises, you can always roll back in time to a known good copy of the data. That is the beauty of TimeWalker.
Want to learn more about Cirrus Data Solutions and its Data Galaxy platform? Visit https://www.cirrusdata.com/technology/