I've been working with a few customers who use WXS to load large quantities of data into the grid from a database. This can be quite time consuming, as fetching the data from the backend is usually the bottleneck. Loading hundreds of gigabytes of data can take well over an hour. Given the cost of reloading the data, a full grid restart is not something such a customer wants to do often.
The bottleneck here isn't WXS, it's the source data system. Many of the customer scenarios I see use the grid as the system of record. If the grid is down then the data doesn't change. We'll discuss when it does change later.
I've written a utility as part of wxsutils (if you're not using this with WXS then you are missing out...). It makes a disk-based snapshot of a grid map. It writes each partition of each map to a file, storing the key/value pairs as JSON strings. Each physical box hosting the grid container JVMs stores all the partitions hosted on that box in files in a particular folder on that box's file system. The current code does this for the maps you specify, so just execute the code once for each map you want 'snapshotted'.
Later, the customer can execute the read snapshot code to reload that data set from those files into the grid. This will happen at grid speed. Each box in the grid will read its local files and load them into the grid as fast as possible. The bottleneck here will likely be the network, depending on the boxes.
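To make the idea concrete, here is a minimal sketch of a per-partition write and read round trip. It is not the wxsutils code: the class name, the file naming scheme and the two-lines-per-entry layout are all assumptions made for illustration, and the keys and values are treated as JSON text that has already been serialized (the real utility uses Jackson for that step).

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; names and file layout are hypothetical, not the wxsutils implementation.
class PartitionSnapshotSketch {

    // Hypothetical naming scheme: one file per map per partition in a local folder.
    static Path fileFor(Path dir, String mapName, int partitionId) {
        return dir.resolve(mapName + "." + partitionId + ".snapshot");
    }

    // Writes each entry as two lines: the key's JSON text, then the value's JSON text.
    // Assumes the JSON is compact, i.e. contains no raw line breaks.
    static void writePartition(Path dir, String mapName, int partitionId,
                               Map<String, String> jsonEntries) throws IOException {
        Files.createDirectories(dir);
        Path file = fileFor(dir, mapName, partitionId);
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (Map.Entry<String, String> e : jsonEntries.entrySet()) {
                out.write(e.getKey());
                out.newLine();
                out.write(e.getValue());
                out.newLine();
            }
        }
    }

    // Reads the key/value line pairs back in the order they were written.
    static Map<String, String> readPartition(Path dir, String mapName, int partitionId)
            throws IOException {
        Map<String, String> entries = new LinkedHashMap<>();
        Path file = fileFor(dir, mapName, partitionId);
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String key;
            while ((key = in.readLine()) != null) {
                String value = in.readLine();
                if (value == null) {
                    throw new IOException("Truncated snapshot: key without value in " + file);
                }
                entries.put(key, value);
            }
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("snapshot-sketch");
        Map<String, String> partition0 = new LinkedHashMap<>();
        partition0.put("\"cust-1\"", "{\"name\":\"Ann\"}");
        partition0.put("\"cust-2\"", "{\"name\":\"Bob\"}");

        writePartition(dir, "Customer", 0, partition0);
        Map<String, String> reloaded = readPartition(dir, "Customer", 0);
        System.out.println("Round trip matches: " + reloaded.equals(partition0));
    }
}
```

In a real grid the write and read would run inside each container JVM against the partitions it hosts, which is what lets every box work on its own local files in parallel.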
This approach can save a lot of time in many scenarios:
- The customer wants to cycle the whole grid for maintenance on the WXS runtime or the application artifacts. The customer should snapshot the grid before stopping it and then read the snapshot after restarting the grid post maintenance. This is much faster than a reload from the backend.
- The customer wants a grid with a well-known starting data set during testing. Load it once, snapshot it, and then use the snapshot to reset the grid every time you want to run a test case from a well-known start state.
If the data is changing all the time, even when the grid is down, then we can deal with that too. Probably the 'best' (and best is a loose word here) way to synchronize a grid with a database is to use a change data capture type product. For example, here is IBM's link. These products watch the transaction logs of a database and allow an application to intercept changes in the data without slowing the database down or using triggers. The application listener would then typically build a message describing the change and send it to the grid, where the grid processes the message and applies the change. This keeps the staleness between the database and the grid to a minimum.
If this listener stores these messages in a JMS queue or an MQSeries queue, the changes are tracked even while the grid is down. So when the grid is restarted, do not start the message listener straight away. Instead, read the snapshot first and then start the listener, which will then apply all the changes made since the snapshot was written. This gives a delta update of the data and keeps the snapshot useful.
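The ordering matters, so here is a small sketch of it. Everything in it is a stand-in: an in-memory `Queue` plays the role of the JMS or MQ queue, a plain `Map` plays the role of the grid, and the `Change` message type (with a null value meaning "deleted") is hypothetical rather than anything wxsutils defines.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.TreeMap;

// Illustrative sketch only; the queue, the map and the Change type are stand-ins.
class SnapshotThenReplaySketch {

    // Hypothetical change message: a null value means the row was deleted.
    static final class Change {
        final String key;
        final String value;

        Change(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Step 1: reload the snapshot. Step 2: only then drain the queued changes, in arrival order.
    static Map<String, String> restart(Map<String, String> snapshot, Queue<Change> queuedChanges) {
        Map<String, String> grid = new HashMap<>(snapshot);
        Change change;
        while ((change = queuedChanges.poll()) != null) {
            if (change.value == null) {
                grid.remove(change.key);
            } else {
                grid.put(change.key, change.value);
            }
        }
        return grid;
    }

    public static void main(String[] args) {
        Map<String, String> snapshot = new HashMap<>();
        snapshot.put("cust-1", "Ann");
        snapshot.put("cust-2", "Bob");

        // Changes captured from the database while the grid was down.
        Queue<Change> queued = new ArrayDeque<>();
        queued.add(new Change("cust-1", "Anne"));  // update
        queued.add(new Change("cust-2", null));    // delete
        queued.add(new Change("cust-3", "Cara"));  // insert

        Map<String, String> grid = restart(snapshot, queued);
        System.out.println(new TreeMap<>(grid));
    }
}
```

If the listener were started before the snapshot had been read, the older snapshot values could overwrite newer changes, which is why the snapshot load comes first.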
You can see how to use this snapshot capability on GitHub at this link.
You will need to add the PerContainerGrid definition to your objectgrid.xml and deployment.xml files, and put wxsutils.jar and the Jackson JSON jars on your container classpaths.