In-memory data grids enable instant responses to financial transactions, shopping cart contents, monitoring streams, and other operational data
Operational systems manage our finances, shopping, devices, and much
more. Adding real-time analytics to these systems enables them to
instantly respond to changing conditions and provide immediate, targeted
feedback. This use of analytics is known as operational intelligence,
and the need for it is growing fast.
For example, financial trading applications must rapidly respond to
fluctuating market conditions as market data flows through trading
systems. E-commerce systems must reconcile orders with inventory changes
on a second-by-second basis, and they need to quickly respond to
shopping behavior to offer personalized recommendations. Smart
grid-monitoring systems need to continuously analyze telemetry from many
sources to anticipate and respond to unexpected changes in power grids.
In all of these examples, live, fast-changing data sets churn in active,
ongoing operations. The advantages of responding to this live data in
real time -- to present shoppers with promotions based on the contents
of their shopping carts, for example -- are both compelling and within
reach. The combination of in-memory computing and data-parallel
analysis, running on a cluster of commodity servers, allows systems to
continuously track and analyze live data, extract important patterns,
and generate immediate feedback that steers the system’s behavior. This
technology can be found within a category of software called in-memory
data grids (IMDGs), which have been evolving over the last decade to
help manage operational systems.
What are in-memory data grids?
IMDGs store data in memory and distribute it across a cluster of
commodity servers (or virtual servers in a cloud environment). Using an
object-oriented data storage model, IMDGs provide APIs for reading and
updating data objects with very low latency, typically in less than a
millisecond, depending on the size of the object. This enables
operational systems to use IMDGs for storing, accessing, and updating
fast-changing, “live” data that track the system’s state, while
maintaining quick access times even as the storage workload grows.
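As a rough illustration of the storage model described above, the following Python sketch spreads objects across a few in-process "servers" by hashing their keys. The class `ToyGrid` and its methods are hypothetical names invented for this sketch and do not correspond to any product's API. A real IMDG would place each partition on a separate machine and reach it over the network.

```python
import zlib


class ToyGrid:
    """Toy stand-in for an IMDG: each 'server' is just a local dict."""

    def __init__(self, server_count):
        self.servers = [{} for _ in range(server_count)]

    def _server_for(self, key):
        # A stable hash of the key decides which server owns the object.
        index = zlib.crc32(key.encode("utf-8")) % len(self.servers)
        return self.servers[index]

    def put(self, key, obj):
        self._server_for(key)[key] = obj

    def get(self, key):
        return self._server_for(key).get(key)

    def update(self, key, change):
        # Apply a function to the stored object and store the result.
        server = self._server_for(key)
        server[key] = change(server[key])
        return server[key]


grid = ToyGrid(server_count=3)
grid.put("cart:alice", {"items": ["book"]})
grid.update("cart:alice", lambda cart: {"items": cart["items"] + ["lamp"]})
print(grid.get("cart:alice"))
```

Because the owning server is computed from the key alone, a read or update goes straight to one server without a cluster-wide search, which is one plausible reason access times can stay flat as the workload grows.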
IMDGs have “elastic” storage in the sense that you can grow or shrink
both storage capacity and throughput simply by adding or removing
servers. In addition, they keep in-memory data highly available:
servers can fail and recover -- or otherwise be added to or removed
from the cluster -- without disrupting operations.
Perhaps most important, IMDGs can take advantage of the cluster’s
computing power to perform data-parallel computations on stored data.
Because data and computing power reside together, thereby avoiding data
motion, IMDGs can provide fast results (often in less than a second)
with minimal overhead. This makes IMDGs well suited to quickly analyzing
the state of an operational system and providing immediate feedback.
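The data-parallel pattern described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the `partitions` list stands in for the objects held on three servers, and the function names `analyze_partition` and `merge` are hypothetical. Threads are used here only to mimic servers working at the same time.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each dict holds the shopping carts stored on one server.
partitions = [
    {"cart:alice": ["book", "lamp"], "cart:bob": ["book"]},
    {"cart:carol": ["lamp", "pen"]},
    {"cart:dave": ["book", "pen"]},
]


def analyze_partition(partition):
    # Runs "where the data lives": only this small summary leaves the server.
    counts = Counter()
    for items in partition.values():
        counts.update(items)
    return counts


def merge(partial_results):
    # Combine the per-server summaries into one global result.
    total = Counter()
    for counts in partial_results:
        total.update(counts)
    return total


with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
    item_counts = merge(pool.map(analyze_partition, partitions))

print(item_counts.most_common(1))
```

Only the small per-partition counters move between steps, never the carts themselves, which is the sense in which keeping data and computation together avoids data motion.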
Modeling operational systems for operational intelligence
Operational systems usually comprise a large population of highly
dynamic entities, such as stock portfolios in a financial trading
system, online shoppers browsing an e-commerce website, or viewers
controlling set-top boxes in a cable TV network. These entities create a
stream of events that must be correlated, enriched with offline data
(for example, customer preferences or history), and analyzed to discover
patterns and trends.
If this analysis is completed in real time, feedback can be provided to
the operational system to enhance its functionality and improve its
effectiveness. For example, stock trades can be triggered to capture
market fluctuations, shoppers can be offered relevant, personalized
recommendations, and cable TV viewers can be alerted to special
promotions based on their viewing preferences and current selection.
Popular approaches to implementing real-time analytics focus on
analyzing incoming streams of data and reacting to the data within those
streams. Examples include complex event processing used in financial
services and stream processing using Apache Storm, a parallel platform
originally designed to analyze Twitter streams.
However, focusing on event processing does not provide a complete
framework for modeling the behavior of real-world entities, which, in
addition to event streams, have both history and context that must be
taken into account. Using an in-memory model of the real-world entities
managed by an operational system, the IMDG can correlate incoming events
and enrich them with offline information to maintain a comprehensive
context that can be subjected to real-time analysis. The output of this
analysis then can be fed directly back to the system to add value to its
operations. It also can be provided to personnel monitoring the system.
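To make the idea of an in-memory entity model more concrete, here is a small Python sketch, assuming an e-commerce scenario. Every name in it (`Shopper`, `CATEGORY`, `handle_event`) and the promotion rule are hypothetical choices for illustration, not part of any IMDG product. Each incoming event is correlated with its entity, combined with offline preference data, and analyzed to produce feedback.

```python
from dataclasses import dataclass, field


@dataclass
class Shopper:
    """In-memory model of one entity: offline context plus live state."""
    shopper_id: str
    preferred_category: str                     # enrichment from offline data
    viewed: list = field(default_factory=list)  # correlated event history


# Hypothetical product catalog and offline customer data.
CATEGORY = {"tent": "outdoors", "stove": "outdoors", "novel": "books"}
shoppers = {"s1": Shopper("s1", preferred_category="outdoors")}


def handle_event(event):
    # Correlate the event with its entity, update state, then analyze.
    shopper = shoppers[event["shopper_id"]]
    shopper.viewed.append(event["product"])
    matches = [p for p in shopper.viewed
               if CATEGORY[p] == shopper.preferred_category]
    # Feedback rule (an assumption for this sketch): two views in the
    # preferred category trigger a promotion.
    if len(matches) >= 2:
        return f"offer {shopper.preferred_category} promotion to {shopper.shopper_id}"
    return None


events = [
    {"shopper_id": "s1", "product": "novel"},
    {"shopper_id": "s1", "product": "tent"},
    {"shopper_id": "s1", "product": "stove"},
]
feedback = [handle_event(e) for e in events]
print(feedback)
```

The third event triggers feedback only because the model remembers the earlier views and knows the shopper's stored preference, which is context a stream-only approach would have to rebuild for every event.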
Using IMDGs to implement operational intelligence
IMDGs provide exactly the technology needed to implement an in-memory
model for active entities within an operational system and continuously
track incoming events from these entities, enriching them with relevant
historical information and structuring a parallel analysis of aggregate
behavior. This in-memory representation takes advantage of the IMDG’s
object-oriented storage model to organize in-memory data representing
the entities.
Because the IMDG is both elastic and highly available, it can handle
highly variable workloads and run within a mission-critical operational
system. The IMDG’s data-parallel computation engine enables it to
quickly analyze state changes within the model and provide immediate
feedback to the system, while capturing aggregate trends emerging across
all entities.