Quite a few customers have been looking for a fault-tolerant rules engine. I've been working on this with Victor Moore, a Distinguished Engineer in Telco, and we now have something that looks good. Fault tolerant here means that a specific rules engine runs in exactly one Java Virtual Machine (JVM) at a time and fails over to another if its current JVM fails or stops. The working memory for that rules engine is stored in replicated memory so that no data is lost if the JVM hosting it fails.
Clients need a way to add or remove items from the working memory of a rules engine and, if needed, to interrogate the working memory for results. The rules engine can of course execute Java code within its current JVM when rules fire. WebSphere eXtreme Scale transparently routes these client requests to the JVM currently hosting the rules engine.
The prototype (built for the Telco and Chemical industries, but usable in any industry) lets the customer define a set of rules engines and then runs them across as many JVMs as are required. WebSphere eXtreme Scale manages the availability of the rules engines and ensures each one is running in exactly one of the JVMs. We wrote some code to back the working memory of each rules engine with a memory-replicated Map. If a JVM fails, the rules engines that were running in it fail over to the JVMs holding the replicas of their working memory. WebSphere eXtreme Scale then automatically creates additional replicas to restore full fault tolerance.
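To make the failover idea concrete, here is a minimal sketch in plain Java. It is not the prototype's code and does not use the WebSphere eXtreme Scale API. The class and method names (ReplicatedWorkingMemory, insertFact, retractFact, failOver) are hypothetical, and two in-process HashMaps stand in for the primary and replica copies that would really live in different JVMs.

```java
import java.util.HashMap;
import java.util.Map;

public class WorkingMemoryFailoverSketch {

    /** Hypothetical working memory: a primary map plus one synchronously updated replica. */
    static final class ReplicatedWorkingMemory {
        private Map<String, Object> primary = new HashMap<>();
        private Map<String, Object> replica = new HashMap<>();

        /** Adds a fact to the primary and copies it to the replica in the same call. */
        void insertFact(String id, Object fact) {
            primary.put(id, fact);
            replica.put(id, fact);
        }

        /** Removes a fact from both copies. */
        void retractFact(String id) {
            primary.remove(id);
            replica.remove(id);
        }

        Object getFact(String id) {
            return primary.get(id);
        }

        int size() {
            return primary.size();
        }

        /**
         * Simulates losing the JVM that held the primary: the replica is promoted
         * and a fresh replica is built from it so the data is protected again.
         */
        void failOver() {
            primary = replica;
            replica = new HashMap<>(primary);
        }
    }

    public static void main(String[] args) {
        ReplicatedWorkingMemory memory = new ReplicatedWorkingMemory();
        memory.insertFact("call-1", "dropped");
        memory.insertFact("call-2", "connected");
        memory.retractFact("call-2");

        memory.failOver(); // the "JVM" holding the primary is gone

        System.out.println(memory.size() + " fact(s) after failover, call-1 = " + memory.getFact("call-1"));
    }
}
```

In a real grid the replica sits in another JVM and the grid, not the application, decides when to promote it. The sketch only shows why writing every change to both copies means nothing is lost at failover.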
WebSphere eXtreme Scale spreads the rules engines evenly across the set of JVMs and moves them around to keep that balance when JVMs are added or fail. This is elastic scaling. The prototype could easily be extended to allow rules engines to be added or removed while everything is running.
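As a rough illustration of what "spread evenly" means, the sketch below assigns each rules engine to exactly one JVM round-robin and recomputes the assignment when the JVM list changes. This is an assumption made for illustration, not how WebSphere eXtreme Scale actually places work. The names place and load and the engine and JVM labels are hypothetical, and a real grid would also try to move as few engines as possible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class EnginePlacementSketch {

    /** Assigns each engine to exactly one JVM, round-robin, so per-JVM counts differ by at most one. */
    static Map<String, String> place(List<String> engines, List<String> jvms) {
        if (jvms.isEmpty()) {
            throw new IllegalArgumentException("at least one JVM is required");
        }
        Map<String, String> placement = new TreeMap<>();
        for (int i = 0; i < engines.size(); i++) {
            placement.put(engines.get(i), jvms.get(i % jvms.size()));
        }
        return placement;
    }

    /** Counts how many engines each JVM hosts under a given placement. */
    static Map<String, Integer> load(Map<String, String> placement) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String jvm : placement.values()) {
            counts.merge(jvm, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> engines = Arrays.asList("billing", "routing", "fraud", "alarms");
        List<String> jvms = new ArrayList<>(Arrays.asList("jvm-A", "jvm-B"));

        System.out.println("two JVMs:     " + load(place(engines, jvms)));

        jvms.add("jvm-C");     // a JVM joins: engines spread out
        System.out.println("JVM added:    " + load(place(engines, jvms)));

        jvms.remove("jvm-A");  // a JVM fails: its engines move to the survivors
        System.out.println("JVM failed:   " + load(place(engines, jvms)));
    }
}
```

Recomputing the whole placement keeps the sketch short. It trades away the minimal-movement behaviour you would want in production, where moving an engine also means moving or re-replicating its working memory.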
If anyone is interested in seeing the code, email me or Vic. This will likely end up as an article/blog entry with the sample code over the next few weeks. The prototype basically gives you an XTP business rules grid.
Event Processing + Rules in a fault-tolerant, scalable way. This probably deserves a mention on TSS.
Posted by: Grid curious | June 02, 2009 at 09:08 AM
err, what's your email address? Definitely interested in looking at the code.
Posted by: Grid curious | June 02, 2009 at 09:15 AM
In case you're not aware of it, there's quite a bit of prior art related to fault-tolerant rule engines. I did a proof of concept with my rule engine Jamocha and Coherence back in September of 2006. Actually, Jamocha was designed from the beginning to take advantage of data grids. I know Oracle has also successfully combined JESS with Coherence to build a fault-tolerant rule engine. Back when Openspaces had their contest, I submitted this idea: http://www.openspaces.org/display/REC/Rule+Engine+Compute+Grid.
There's also a paper on DJess: http://www.waset.org/pwaset/v4/v4-18.pdf. I know Daniel and his group also explored fault-tolerant techniques for JRules.
Posted by: Peter Lin | June 16, 2009 at 07:52 PM