Batch System - Design Essentials

Modified on Sun, 26 Feb 2017 17:34 by Biswajit Dash — Categorized as: classical architecture, design

Problem Statement

What is a batch system? What are the essential design elements of such a system?


Design Abstract

What? When?

What is a batch system? A batch process is the non-interactive execution of a series of steps or programs on a set of inputs. Its three key elements are therefore: non-interactive execution, a series of steps or programs, and a set of inputs.
A batch system provides the containing infrastructure for such batch processing.
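
As a minimal sketch of the definition above, the following Python snippet runs a series of steps over a set of inputs with no user interaction. The function `run_batch` and the sample steps are hypothetical names chosen for illustration; they are not taken from any particular batch framework.

```python
from typing import Callable, Iterable, List

# A step is any function that takes one input and returns the transformed input.
Step = Callable[[str], str]


def run_batch(inputs: Iterable[str], steps: List[Step]) -> List[str]:
    """Apply every step, in order, to every input, without user interaction."""
    results = []
    for item in inputs:
        for step in steps:
            item = step(item)
        results.append(item)
    return results


# Hypothetical steps: trim whitespace, then upper-case the record.
steps = [str.strip, str.upper]
print(run_batch(["  alpha ", "beta  "], steps))  # ['ALPHA', 'BETA']
```

The caller supplies the whole input set and the step list up front, which is what makes the run non-interactive.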

When to use a batch? A batch system is used to process large volumes of input in offline mode. Some of the day-to-day usages include :-

The Inner-workings

Batch Flow - Basic

At a basic level, a batch system consists of three steps :-
[Image: basic batch flow]

Batch Flow - Advanced

What are the key challenges? Beyond the basic steps described above, a real-life batch system needs to answer some key questions, such as :-
The sketch below depicts the detailed flow of a real-life (but simple and sequential) batch system.

[Image: detailed flow of a simple, sequential batch system]

| Step | Description |
| --- | --- |
| Config Settings | The set of configuration parameters the batch system uses for run-time decision making. |
| Init Batch Context | Initializes the batch execution context, on which the run-time decisions are based. |
| Source: Database/Files | The source of inputs to the batch process. |
| Detect Duplicate Execution | Detects whether the batch is being executed again on the same set of inputs. |
| Connect Data Source | Connects to the input source and buffers/reads the inputs. |
| Read Input | Reads or picks a single input from the input set for processing. |
| Verify Input Format | Verifies that the current input complies with the expected format. This is mostly useful for file-based inputs, specifically to verify field lengths, field count, data types, etc. |
| Log Error | Logs a run-time error. |
| Log Format Error | Logs an input that does not comply with the format specification. This log is used to take corrective action on the failing inputs. |
| Process Input | Processes the current input. This step is functionality-specific and may be composed of one or more programs/steps. |
| Abort Batch | Aborts the current batch execution. It can perform clean-up tasks such as logging and closing connections. |
| Close Batch | Completes the batch successfully. It can perform clean-up tasks such as logging and closing connections. |
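
The steps listed above can be sketched as a single sequential loop. This is only an illustration under stated assumptions, not the paper's implementation: inputs are comma-separated lines held in memory, duplicate execution is detected by hashing the whole input set, any run-time error aborts the batch, and all names (`BatchContext`, `fingerprint`, `run_batch`) are hypothetical.

```python
import hashlib
import logging
from dataclasses import dataclass, field
from typing import Callable, List, Set

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("batch")


@dataclass
class BatchContext:
    """Init Batch Context: run-time state built from the config settings."""
    field_count: int  # assumed config setting: expected fields per input
    seen_fingerprints: Set[str] = field(default_factory=set)
    format_errors: List[str] = field(default_factory=list)


def fingerprint(lines: List[str]) -> str:
    """Hash the input set so a repeat run on the same inputs can be detected."""
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def run_batch(ctx: BatchContext, lines: List[str],
              process: Callable[[List[str]], None]) -> str:
    # Detect Duplicate Execution (assumption: fingerprints live in the context;
    # a real system would persist them between runs).
    fp = fingerprint(lines)
    if fp in ctx.seen_fingerprints:
        log.error("duplicate execution detected")      # Log Error
        return "ABORTED"                               # Abort Batch
    ctx.seen_fingerprints.add(fp)

    # Connect Data Source is simulated by the in-memory list of lines.
    for line in lines:                                 # Read Input
        fields = line.split(",")
        if len(fields) != ctx.field_count:             # Verify Input Format
            ctx.format_errors.append(line)             # Log Format Error
            continue                                   # keep processing
        try:
            process(fields)                            # Process Input
        except Exception as exc:
            log.error("run-time error on %r: %s", line, exc)  # Log Error
            return "ABORTED"                           # Abort Batch
    return "CLOSED"                                    # Close Batch


ctx = BatchContext(field_count=2)
processed: List[List[str]] = []
print(run_batch(ctx, ["a,1", "bad", "b,2"], processed.append))  # CLOSED
print(run_batch(ctx, ["a,1", "bad", "b,2"], processed.append))  # ABORTED
```

The malformed line `bad` is collected in `ctx.format_errors` for later correction instead of stopping the run, while the second call is rejected because it repeats the same input set.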

Implementation Notes

Besides the basic flow above, the design and implementation of a real-life, high-performing batch system also needs to support :-

Glossary

| Term | Definition |
| --- | --- |
| Input | A "single input element" to which processing is applied. It can be a record from a database record-set or a line from a file. |
| Node | A machine hosting the batch system, capable of independently executing a batch process end-to-end. |
| Continue vs. Abort | The decision to either "continue processing" or "abort processing". |
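
One way to make the "Continue vs. Abort" decision explicit is a small policy function driven by a configuration parameter. The sketch below assumes an error-ratio threshold; the names `Decision`, `decide`, and `max_error_ratio` are hypothetical, and a real system may apply a different rule.

```python
from enum import Enum


class Decision(Enum):
    CONTINUE = "continue"
    ABORT = "abort"


def decide(errors: int, processed: int, max_error_ratio: float) -> Decision:
    """Abort once the share of failed inputs exceeds the configured ratio."""
    if processed == 0:
        # Nothing processed yet, so there is no basis for aborting.
        return Decision.CONTINUE
    ratio = errors / processed
    return Decision.ABORT if ratio > max_error_ratio else Decision.CONTINUE


print(decide(errors=1, processed=100, max_error_ratio=0.05))   # Decision.CONTINUE
print(decide(errors=10, processed=100, max_error_ratio=0.05))  # Decision.ABORT
```

Keeping the rule in one function means the threshold can come from the config settings rather than being scattered through the processing code.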


Paper Code: TWP_1003.10, Version: 1.0, Author: Biswajit Dash, License: CC-BY-ND, Published: Aug-2016