How Do Facebook, Google, And Amazon Plan To Solve The Problem Of Storing Big Data?


by Aubrey Hansen

Hyperscale data centers are expanding rapidly as ever-growing streams of data are collected, stored, and analyzed on a scale never before imagined. But this approach offers no guarantee for the future, and other options may have to be explored.

The Super 7 companies are well placed for now. Smaller organizations struggle to match the pace at which they refresh their hardware: most of the seven data giants have moved to three-year server upgrade cycles rather than the traditional four or five.

But data centers are wildly expensive, too. If Google’s reported capital expenditure for 2016 is to be believed, the company sank over $10 billion into data centers over the course of the year, and its figures for 2018 are likely to be far higher.

Yes, this is capitalism at play, but Big Data is becoming ever more prevalent, and – as the recent uproar has shown – too much data in the hands of one entity can be a very bad thing indeed. The tech space might be well served by a much more level playing field.

The big question one must ask is this: is it sustainable to keep adding more hardware into already vast data centers?

DAGs may help to replace the hyperscale data center

The power of Directed Acyclic Graph (DAG) architecture means that, in theory, a near-infinitely scalable storage system can be designed in which data finds a home on the computers and devices of millions of users.

There is a beautiful simplicity behind the concept of DAGs as opposed to the traditional blockchain. The idea is that each node, or user, who contributes disk space will store part of a dataset on the network.

Unlike standard blockchains, there is no need for miners, as each node forms part of the distributed database. Instead, each transaction broadcast on a DAG architecture validates two previous transactions.

So every piece of storage a node contributes also serves as a transaction recorder, and both live within the same ledger, which is formed from two layers: a transaction layer and a data layer.
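
To make that idea concrete, here is a minimal Python sketch of a DAG-style ledger in which every new transaction validates two earlier unconfirmed "tips". The class names, the batch model, and the random tip selection are illustrative assumptions for this article, not any particular project's protocol:

```python
import hashlib
import random

class Transaction:
    def __init__(self, payload, parents):
        self.payload = payload
        self.parents = parents  # the earlier transactions this one validates
        self.id = hashlib.sha256(
            (payload + "".join(p.id for p in parents)).encode()
        ).hexdigest()

class DagLedger:
    def __init__(self):
        genesis = Transaction("genesis", [])
        self.transactions = {genesis.id: genesis}
        self.tips = [genesis]  # transactions not yet validated by anyone

    def add_batch(self, payloads):
        # Transactions arriving concurrently all see the same tip set,
        # which is how the DAG fans out into multiple branches.
        snapshot = list(self.tips)
        new = []
        for payload in payloads:
            # Validate two tips; right after genesis only one exists.
            parents = random.sample(snapshot, min(2, len(snapshot)))
            tx = Transaction(payload, parents)
            self.transactions[tx.id] = tx
            new.append(tx)
        approved = {p.id for tx in new for p in tx.parents}
        self.tips = [t for t in self.tips if t.id not in approved] + new
        return new

ledger = DagLedger()
ledger.add_batch(["store chunk 0", "store chunk 1"])
ledger.add_batch(["store chunk 2", "store chunk 3", "store chunk 4"])
print(len(ledger.transactions), "transactions,", len(ledger.tips), "unconfirmed tips")
```

Note that the work of validation is done by whoever submits the next transaction, which is why no separate class of miners is needed.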

Who is developing DAG architectures? 

Most startups choose to remain reliant on Ethereum and Bitcoin, which have as-yet-unresolved scaling issues, but some are exploring DAG as a solution offering faster transactions and far greater room to grow.

Dagcoin is one such cryptocurrency, hoping to become a cheaper alternative to Bitcoin with quicker transactions while scaling to a very large number of users, and it appears to be the only serious platform attempting this.

Closer to a storage solution, IOTA envisions a world where people take ownership of the data they generate in real time and can sell that data for crypto tokens. The idea is that companies that want to buy data – for targeted advertising, among a range of other potential uses – pay for it directly from the data producer.
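
As a rough illustration of that direct producer-to-buyer exchange, the Python sketch below models a data listing paid for with tokens. The names, prices, and balance logic are invented for the example and do not reflect IOTA's actual API:

```python
from dataclasses import dataclass

@dataclass
class DataListing:
    producer: str       # producer's wallet address
    price_tokens: float
    payload: bytes      # e.g. a sensor reading

# Toy token balances standing in for a real ledger.
balances = {"producer_wallet": 0.0, "advertiser_wallet": 10.0}

def buy(listing, buyer):
    if balances[buyer] < listing.price_tokens:
        raise ValueError("insufficient balance")
    balances[buyer] -= listing.price_tokens
    # The producer is paid directly; no platform sits in the middle.
    balances[listing.producer] += listing.price_tokens
    return listing.payload

reading = DataListing("producer_wallet", 0.5, b"temp=21.3C")
data = buy(reading, "advertiser_wallet")
print(data, balances)
```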

CyberVein, based in Shenzhen, China, has meanwhile teamed up with Hadoop, which leading academics back as a potential way to help solve the future problem of analyzing Big Data. Through a DAG architecture, the Chinese startup aims to create a decentralized database using storage from the devices of its platform’s users.

Theoretically, CyberVein could eliminate data centers and replace them with a peer-to-peer network spanning the globe. In place of the proof-of-work system Ethereum and Bitcoin use to reward miners, the CyberVein crypto token is awarded on proof of contribution (PoC): rather than having nodes solve otherwise useless cryptographic puzzles, PoC measures the storage capacity a node donates to the network.
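
To show what rewarding storage instead of hashing power might look like, here is a toy Python sketch of a proof-of-contribution reward split. The function name and the simple pro-rata rule are assumptions made for clarity, not CyberVein's actual mechanism:

```python
def distribute_rewards(donated_gb, epoch_reward):
    """Split an epoch's token reward in proportion to donated storage."""
    total = sum(donated_gb.values())
    if total == 0:
        return {node: 0.0 for node in donated_gb}
    return {node: epoch_reward * gb / total for node, gb in donated_gb.items()}

# Alice donates half the network's storage, so she earns half the tokens.
nodes = {"alice": 500.0, "bob": 250.0, "carol": 250.0}
print(distribute_rewards(nodes, epoch_reward=100.0))
```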

Things move quickly in the tech space, now more than ever, which is all the more reason to find solutions to how Big Data is stored. Massive data centers could be only a temporary fix, and it is hard to imagine that the sheer amount of hardware required is a sustainable and efficient answer for the future.

Only time will tell, but within the next year we may be close to finding out whether a DAG architecture holds the key.