Since Big Data has influenced and changed the IT world, data streaming has been seen as an alternative to traditional batch processing.
This blog entry takes a look at both types of processing and describes how the tcVISION solution is successfully used in both traditional batch processing and modern data streaming.
Batch Processing
Batch processing has existed since the beginning of data processing. It is a form of processing in which input data is read, saved, and then processed together with the master data. Everyone who grew up with data processing knows it, and it is still carried out daily in virtually every data center. The so-called "batch window" is strategically important and very often too short to process all of the data.
With the help of tcVISION, some of our customers have solved the problem of the small batch window. They replaced existing procedures that transfer changed master data from the mainframe to open systems with direct replication during processing. The traditional procedures extract the master data and send it to the target system via file transfer. The data is then imported into the target system using utilities or in-house applications, where it serves as a basis for reporting, data warehousing, and analytics, for example.
This entire process was replaced by direct synchronization between mainframe and target system in real time, saving a lot of valuable batch time.
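The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the dictionaries stand in for the mainframe master data and the target system, and the function names and the change-record layout (`op`, `key`, `value`) are hypothetical, not tcVISION's actual interface or format.

```python
# Hypothetical in-memory stand-ins for mainframe master data and a target system.

def full_extract_load(source: dict) -> dict:
    """Batch style: copy every record to the target, changed or not."""
    return dict(source)

def apply_change(target: dict, change: dict) -> None:
    """Replication style: apply one captured change to the target."""
    op, key = change["op"], change["key"]
    if op == "delete":
        target.pop(key, None)
    else:  # "insert" or "update"
        target[key] = change["value"]

source = {1: "Miller", 2: "Smith", 3: "Jones"}
target = full_extract_load(source)  # one-time initial load

# Assumed change records, applied as they occur instead of in a nightly run.
changes = [
    {"op": "update", "key": 2, "value": "Smith-Lee"},
    {"op": "insert", "key": 4, "value": "Brown"},
    {"op": "delete", "key": 1},
]
for change in changes:
    apply_change(target, change)

print(target)
```

In the batch variant the cost grows with the size of the master data on every run. In the replication variant it grows only with the number of changes, which is why the batch window is no longer the limiting factor.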
Data Streaming
The online encyclopedia Wikipedia describes data streaming as follows:
In computer science, a stream is a sequence of data elements made available over time. A stream can be thought of as items on a conveyor belt being processed one at a time rather than in large batches.
The main difference from batch processing is that data is processed directly after it is created or received. This processing takes place continuously and almost in real time. How close to real time it gets depends, of course, on the application that processes the streamed data. Various applications process and analyze streaming data. Open-source options include Apache Spark, Apache Flink, Apache Storm, and Kafka Streams. The big players in Cloud computing and Big Data also offer streaming services: Amazon Kinesis, Google Cloud Dataflow, and Microsoft Azure Stream Analytics.
The main advantage of data streaming is that the data is up-to-date and can be processed as soon as it is created. This is a big advantage for analytics in particular, compared to batch processing, where the received data is first collected and then processed as a whole.
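A minimal Python sketch may make the contrast concrete. It computes an average once over a complete batch and, alternatively, as a running value that is updated with every arriving element. The function names and sample values are assumptions chosen for illustration.

```python
from typing import Iterable, Iterator, List

def batch_average(values: List[float]) -> float:
    """Batch style: needs the complete data set before it can answer."""
    return sum(values) / len(values)

def streaming_average(values: Iterable[float]) -> Iterator[float]:
    """Streaming style: emits an up-to-date result after every element."""
    count = 0
    total = 0.0
    for value in values:
        count += 1
        total += value
        yield total / count

readings = [10.0, 20.0, 30.0]

batch_result = batch_average(readings)
stream_results = list(streaming_average(readings))

print(batch_result)    # 20.0
print(stream_results)  # [10.0, 15.0, 20.0]
```

Both variants arrive at the same final value, but the streaming variant already has a usable intermediate result after the first element.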
Data streaming is used across industries, for example in banking, e-commerce, and retail.
The "Internet of Things (IoT)" is also a classic area of application, because the received sensor data has to be processed in a time-critical manner. For example, the need for maintenance work on a machine can be detected immediately, so that the work can be carried out right away.
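As a rough sketch of the maintenance example, the following Python generator checks each incoming sensor reading against a moving average and reports the moment a threshold is exceeded. The threshold, window size, and sample readings are assumptions for illustration only.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple

def maintenance_alerts(
    readings: Iterable[Tuple[int, float]],
    threshold: float = 80.0,  # assumed temperature limit
    window: int = 3,          # assumed number of readings to average
) -> Iterator[int]:
    """Yield the timestamp of each reading whose moving average exceeds the threshold."""
    recent = deque(maxlen=window)  # keeps only the latest `window` values
    for timestamp, temperature in readings:
        recent.append(temperature)
        if len(recent) == window and sum(recent) / window > threshold:
            yield timestamp

# Hypothetical (timestamp, temperature) pairs from a machine sensor.
sensor_data = [(1, 70.0), (2, 78.0), (3, 85.0), (4, 90.0), (5, 95.0)]

alerts = list(maintenance_alerts(sensor_data))
print(alerts)  # [4, 5]
```

Averaging over a small window instead of reacting to a single value is a design choice that avoids alerts caused by one outlier reading.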
In practice, companies that use Big Data rely on both forms of processing. Data that is not time-critical is still processed in batch mode, while time-critical data is evaluated immediately via streaming.
tcVISION plays an important role in data streaming.
tcVISION supplies the data: data that is created in an online processing environment on a mainframe system (CICS, IMS/DB, Adabas/Natural, CA IDMS) is captured in real time by tcVISION and streamed into a Big Data environment. B.O.S. Software introduced this solution with Apache Kafka in 2017.
Our tcVISION solution is perfectly suited to connect the traditional mainframe (no matter whether the operating system is z/OS or z/VSE) with a Big Data environment or the Cloud.
The solution has been proven in practice, and both acceptance and demand are high. tcVISION already supports the strategically important Big Data systems and applications. More will follow in the future.
An overview of all supported input and output destinations can be found here.