FIFTH DOMAIN CYBER
The Defense Advanced Research Projects Agency (DARPA) has awarded a contract to five organizations in a bid to develop a real-time threat intelligence capability at a time when the amount of raw digital data continues to increase exponentially.
The five contract awardees include large commercial electronics firms Intel Corporation and Qualcomm Intelligent Solutions, the Pacific Northwest National Laboratory, the Georgia Institute of Technology, and defense-industry company Northrop Grumman.
The award, announced earlier this month, commissions the development of a graph analytics processor, a new type of hardware component that will provide 1,000x the processing power of existing technologies, according to DARPA.
In addition to improving computational speed, the new processor will enable the real-time analysis and visualization of the already vast and ever-growing amount of information that’s given rise to the Zettabyte Era. Currently, there are over 1.2 billion websites online, and internet traffic exceeded one zettabyte for the first time in 2016, reaching an annual run rate for global Internet Protocol (IP) traffic of 1.2 zettabytes, according to Cisco.
For perspective, in the decimal system, a gigabyte is 1000^3 bytes, a terabyte is 1000^4, a petabyte is 1000^5, and an exabyte is 1000^6. A zettabyte is 1000^7, or 1,000,000,000,000,000,000,000 bytes. By analogy, one zettabyte is equivalent to 152 million years of high-definition video.
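The unit ladder above can be checked with a few lines of Python; this is a simple illustration of the decimal (SI) byte units, where each step is a factor of 1,000:

```python
# Decimal (SI) byte units: each named unit is the previous one times 1000.
UNITS = ["kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte"]

for power, name in enumerate(UNITS, start=1):
    print(f"1 {name} = 1000^{power} = {1000 ** power:,} bytes")

# A zettabyte is 1000^7 = 10^21 bytes, a 1 followed by 21 zeros.
assert 1000 ** 7 == 10 ** 21 == 1_000_000_000_000_000_000_000
```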
Trung Tran, a program manager in DARPA’s Microsystems Technology Office (MTO), said, “Today’s hardware is ill-suited to handle such data challenges, and these challenges are only going to get harder as the amount of data continues to grow exponentially.”
The new processor will enable a type of Big Data analytics called graph, or network, analysis. Graph analysis is facilitated by graph databases, which are fundamentally different from traditional relational databases. Relational databases store data in rows and columns, expressing relationships only indirectly through keys and joins, which makes those relationships difficult, if not impossible, to visualize directly. By contrast, graph databases represent the relationships between data points explicitly, using nodes, edges and properties.
Nodes are the individual data points, such as rogue servers, legitimate servers and end-user devices. Edges show the relationships between nodes, such as which rogue servers are communicating with legitimate servers and devices inside a federal agency’s network. Properties describe useful attributes about nodes, such as a rogue server’s IP address, geographical location and the type of malware it’s distributing.
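The node/edge/property model described above can be sketched with plain Python dictionaries (in practice a graph database or a library such as networkx would be used). The server names, IP addresses and attributes below are invented for illustration, not drawn from the article:

```python
# Minimal property-graph sketch: nodes carry properties; directed edges
# carry properties of their own. All identifiers here are hypothetical.
nodes = {
    "rogue-1":   {"type": "rogue server", "ip": "203.0.113.7",
                  "country": "unknown", "malware": "dropper"},
    "web-01":    {"type": "legitimate server", "ip": "198.51.100.2"},
    "laptop-42": {"type": "end-user device"},
}

# Edges: (source, target, properties) representing "communicates with".
edges = [
    ("rogue-1", "web-01",   {"protocol": "https", "port": 443}),
    ("web-01",  "laptop-42", {"protocol": "smb"}),
]

# A simple graph query: which rogue servers are talking to
# legitimate servers inside the network?
suspicious = [
    (src, dst) for src, dst, _ in edges
    if nodes[src]["type"] == "rogue server"
    and nodes[dst]["type"] == "legitimate server"
]
print(suspicious)  # [('rogue-1', 'web-01')]
```

In a real graph database this query would be expressed declaratively (for example, as a pattern match over edge types) rather than as a list comprehension, but the underlying model, nodes with properties joined by typed edges, is the same.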
As illustrated, graph databases are handy for cyber threat intelligence, but they also have a range of governmental and commercial use cases, including terrorist network mapping, epidemiology and charting critical dependencies in complex critical infrastructure systems. When combined with machine learning, graph analysis could eventually provide predictive cybersecurity.
Graph databases are not new; innovations over the past two decades have steadily made them more useful in enterprise contexts. It was the rise of Big Data analytics, however, that revealed the graph database’s full potential and drove its widespread adoption.
Why are graph databases suddenly so popular? The value of Big Data comes not merely from gathering and storing it, but from revealing the connections and causal relationships between data points that uncover behavioral patterns and trends. DARPA’s latest contract award aims to fund the development of a hardware processor capable of showing, in real time, the subtle, meaningful connections and relationships between vast amounts of seemingly unrelated data.
“This should empower data scientists to make associations previously thought impractical due to the amount of processing required,” said Tran.
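The kind of association-finding Tran describes can be illustrated with a breadth-first search that links two seemingly unrelated nodes through a chain of intermediaries. This is a toy sketch with invented host names; real HIVE-scale workloads would involve billions of edges:

```python
from collections import deque

# Toy graph: two apparently unrelated endpoints ("analyst-pc" and
# "rogue-c2") turn out to be linked through intermediaries.
graph = {
    "analyst-pc":  ["file-server"],
    "file-server": ["backup-host"],
    "backup-host": ["rogue-c2"],
    "rogue-c2":    [],
}

def connection_path(graph, start, goal):
    """Breadth-first search returning the shortest chain of hops."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no connection found

print(connection_path(graph, "analyst-pc", "rogue-c2"))
# ['analyst-pc', 'file-server', 'backup-host', 'rogue-c2']
```

On small graphs this is trivial; the point of a specialized graph analytics processor is to make this class of traversal tractable in real time across data at zettabyte-era scale.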
The latest initiative is part of the Hierarchical Identify Verify Exploit (HIVE) program, which DARPA announced last summer. HIVE’s mandate is “to develop a powerful new data-handling and computing platform specialized for analyzing and interpreting huge amounts of data with unprecedented deftness.”