Stream processing refers to processing data in motion. It may be used to analyze data and derive decisions, or to run data through algorithms that surface patterns for subsequent analysis.
Data streams can originate from any raw source of data: sensors, online activities and events, or mobile applications. Before stream processing begins, the data may need to be siphoned off into an intermediate stream, so that you can tap into it without interrupting the original flow. This contrasts with the traditional approach, in which data had to be stored at rest before it could be processed.
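A minimal sketch of this idea in Python, assuming a made-up sensor event source: events are handled as they are produced, and itertools.tee provides a second, independent view of the same stream so it can be tapped without interrupting the original flow. The event source and field names are illustrative assumptions, not part of any particular platform.

```python
# Sketch: "data in motion" processed as it is produced, with a tapped copy.
import itertools
import random
import time


def sensor_events():
    """Simulate an unbounded stream of sensor readings (illustrative only)."""
    while True:
        yield {"sensor_id": random.randint(1, 5),
               "temperature": round(random.uniform(15.0, 35.0), 1),
               "ts": time.time()}


source = sensor_events()
# Split the stream into the main flow and an intermediate, tappable copy.
main_stream, tapped_stream = itertools.tee(source, 2)

# Process a handful of events from each view; in a real system both
# branches would run continuously and independently.
for event in itertools.islice(main_stream, 3):
    print("processing in motion:", event)

for event in itertools.islice(tapped_stream, 3):
    print("tapped copy for analysis:", event)
```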
Now the emergence of big data analytics is sparking a third revolution in decision-making, opening up new possibilities. It has long been accepted that many low-level routine decisions - deciding whether a customer is eligible for a discount, or rerouting orders to achieve greater speed - can be automated through an analytics-fueled rules engine. But big data also reaches higher up the organization. All up and down the corporate ladder, decision-making is shifting outward, either to new players in the enterprise or to machines.
Real-time stream processing provides new ways of doing things, including:
Real-time action: React to an event as it happens and take action immediately. This helps address problems surfacing in KYC (Know Your Customer) and AML (Anti-Money Laundering), where events can be processed as they occur with minimal latency (see the first sketch after this list).
You can now handle extremely large volumes of data because you no longer need to store it at rest first. This makes it practical to deal with the flood of events received from sensors and mobile applications.
There is no longer a need for batch processing, or for waiting for the data to land in a particular location, because the data is computed while it is still in the stream.
The last and most important aspect is the decentralization of data. With multiple streams being processed in real time, you can combine data attributes across streams without centralizing the data in a single repository, which lowers overall infrastructure and operational costs (see the second sketch after this list).
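A minimal sketch of the real-time action idea from the first point above, assuming a hypothetical stream of payment events and a simple AML-style threshold rule; the field names, threshold, and alert function are illustrative, not a real compliance engine.

```python
# Sketch: react to each payment event the moment it arrives, flagging
# transactions above an AML-style reporting threshold without batching
# or storing them first. Threshold and fields are assumptions.
AML_THRESHOLD = 10_000  # illustrative reporting threshold


def raise_compliance_alert(event):
    # Placeholder side effect; a real system would open a case or ticket.
    print(f"ALERT: payment {event['id']} of {event['amount']} "
          f"from account {event['account']} needs AML review")


def on_payment(event):
    """Evaluate a single payment event as soon as it is received."""
    if event["amount"] >= AML_THRESHOLD:
        raise_compliance_alert(event)


# Simulated incoming events; in practice these would arrive one at a
# time from a message broker or event bus.
incoming = [
    {"id": "t1", "account": "A-17", "amount": 250},
    {"id": "t2", "account": "B-02", "amount": 12_500},
]
for evt in incoming:
    on_payment(evt)
```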
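A second sketch, for the decentralization point: combining attributes from two hypothetical streams (customer profiles and transactions) keyed by customer, using only lightweight in-memory state rather than a central repository. The stream shapes and field names are assumptions.

```python
# Sketch: join two live streams by key without landing them in a database.
from collections import defaultdict

profile_state = {}                       # customer_id -> profile attributes
transaction_state = defaultdict(list)    # customer_id -> recent amounts


def emit_joined(customer_id):
    """Emit a combined view whenever either stream updates this key."""
    profile = profile_state.get(customer_id)
    txns = transaction_state.get(customer_id, [])
    if profile and txns:
        print(customer_id, profile, "total spend:", sum(txns))


def on_profile_event(event):
    profile_state[event["customer_id"]] = event["attributes"]
    emit_joined(event["customer_id"])


def on_transaction_event(event):
    transaction_state[event["customer_id"]].append(event["amount"])
    emit_joined(event["customer_id"])


on_profile_event({"customer_id": "c1", "attributes": {"segment": "retail"}})
on_transaction_event({"customer_id": "c1", "amount": 42.0})
```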
With an event-driven approach to stream processing, you can combine stream processing and real-time analytics into rich dashboards that operate on massive data volumes. This would not have been possible with databases at rest, which required massive processing power and large storage volumes.
For example:
Flagging a bank transaction as fraudulent as it is analyzed (a sketch follows below)
Raising a compliance ticket for a payment processed in real time
Sensors sending out instructions that drive behavior at headquarters with regard to product pricing
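As a rough illustration of the first example, the following sketch scores each incoming bank transaction against a rolling per-account average and flags outliers as it streams in; the factor-of-five rule and the event fields are illustrative assumptions, not a real fraud model.

```python
# Sketch: flag a transaction as potentially fraudulent while it is in flight,
# by comparing it to a rolling average of that account's recent amounts.
from collections import defaultdict, deque

recent = defaultdict(lambda: deque(maxlen=20))  # per-account history


def score_transaction(event):
    history = recent[event["account"]]
    if history:
        avg = sum(history) / len(history)
        if event["amount"] > 5 * avg:  # illustrative threshold
            print(f"possible fraud: {event['id']} "
                  f"({event['amount']} vs avg {avg:.2f})")
    history.append(event["amount"])


for evt in [
    {"id": "t1", "account": "A-17", "amount": 40},
    {"id": "t2", "account": "A-17", "amount": 55},
    {"id": "t3", "account": "A-17", "amount": 900},
]:
    score_transaction(evt)
```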