A common objection from customers is "Your APIs are proprietary". There is no real standard in the caching space. The JCache JSR doesn't look very attractive to me. I know I'm not on the expert group, so who am I to complain, but from what I've seen it's a very 90s API, and not attractive given how programming models are advancing with POJOisms, annotations and so on. Our EntityManager-style POJO interface is much easier to use and less invasive on the application.
Architecturally and conceptually, products like ObjectGrid (aka WebSphere eXtreme Scale) and Oracle Coherence are similar, even though the implementations are very different, with pros and cons on both sides. So, worst case, if a customer wanted to jump ship, it's possible with some API changes. A pain to do, but doable. An application designed to work on one should run on the other with no architectural changes needed.
There are also products that claim to have no APIs, and customers frequently take the fact that ObjectGrid has an API to mean that ObjectGrid is proprietary and the other product isn't. I beg to differ. Portability is really what proprietary is all about. If something isn't portable (ish) to other products then I say it's proprietary. The product with no API (and yes, this is Terracotta) locks you in to a unique way of doing this stuff, which for me is even more proprietary than an API: architecturally, it's different from the other products, so porting is very, very difficult. Much more so than porting from Coherence to ObjectGrid or vice versa, because at least those two are similar conceptually.
I won't be changing ObjectGrid to be like Terracotta, because I need ObjectGrid to scale out linearly to any number of boxes with absolutely constant response times, and the APIs we provide give structure to good patterns for designing these kinds of scale-out applications. We recently demonstrated a 1000 JVM grid with the same response times under proportionally greater load as a two JVM cluster. The response time curve stayed absolutely flat as we scaled up the grid and the client load proportionally.
These APIs can guide customers down the path to designing applications that scale out. Without APIs that enforce these patterns, customers can end up thinking this is easy: "I used a couple of JVMs, marked all my existing POJOs as clusterable and hey, it all works". Not so fast. We're only talking two JVMs here. Let's try 100 JVMs with a write-intensive benchmark and see what kind of trouble we can get into. The answer is quite a bit of trouble.
Unless you've thought about how you will scale up your data structures (i.e. partition them cleanly), it's going to be very difficult to scale this up at all. The synchronization traffic between the servers would basically kill you, and even more so when it's all funneled through a single central hub server (or even a set of them). Customers need to think it through, and the APIs provide frameworks for patterns which will scale as the grid/cluster grows in size. So, does using an API guarantee success? Of course not, but it does provide a starting point for getting there.
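To make "partition them cleanly" concrete, here is a minimal sketch of key-based partitioning. It is not the ObjectGrid or Coherence API; the class `PartitionedStore` and its methods are hypothetical names made up for illustration, and each in-memory map stands in for what would be a separate JVM in a real grid. The point is only that every key has exactly one home partition, so a read or write never needs to consult the others.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the ObjectGrid or Coherence API.
// Each map below stands in for one partition (one JVM in a real grid).
final class PartitionedStore {
    private final List<Map<String, String>> partitions = new ArrayList<>();

    PartitionedStore(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    // Route a key to exactly one partition. floorMod keeps the index
    // non-negative even when hashCode() is negative.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // A write touches only the key's home partition.
    void put(String key, String value) {
        partitions.get(partitionFor(key)).put(key, value);
    }

    // A read touches only the key's home partition.
    String get(String key) {
        return partitions.get(partitionFor(key)).get(key);
    }

    // Number of entries held by one partition.
    int sizeOf(int partition) {
        return partitions.get(partition).size();
    }

    public static void main(String[] args) {
        PartitionedStore store = new PartitionedStore(8);
        store.put("customer-42", "Alice");
        System.out.println("customer-42 lives in partition "
                + store.partitionFor("customer-42"));
        System.out.println("value = " + store.get("customer-42"));
    }
}
```

A fixed modulo like this is the simplest possible routing rule. It assumes the partition count never changes; a real product also has to handle replicas, failover and moving partitions between servers, none of which is shown here.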
The trick to scaling is ensuring that a transaction against some data doesn't require other servers (besides replicas) to process work for that transaction as well (invalidation, synchronizing data, etc). Transactions can't use ANY common resources if you want them to scale. If they do, that common resource is a single point of failure for every server using it, even if it's replicated, because any outage impacts all of those servers for a period of time. The common resource will also cause response times to grow as load grows.
When a single transaction runs against one JVM and a backup, it won't matter whether the cluster has 2 or 2000 JVMs in it. That transaction will run at the same speed, and that's scalability. Transactions should not use any common/central servers, so that the failure of one server only impacts the subset of the data stored there and not all servers. If that transaction required invalidating a lot of peer servers or pushing updates to them, then the response time will rise, even if multicast is used, because you are still going to be burning CPU and network on the other peer boxes processing the multicast messages.
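A back-of-the-envelope model shows why. The sketch below is only arithmetic, not a benchmark: the class `ScaleModel` and its method names are hypothetical, and it assumes keys are spread evenly, that client load grows in proportion to cluster size, and that every server a transaction touches does one unit of work. It ignores network latency and hub servers entirely.

```java
// Hypothetical cost model for illustration; the numbers are units of work,
// not measurements from any product.
final class ScaleModel {
    private ScaleModel() {}

    // Partitioned design: a transaction touches its primary plus replicas only.
    static long partitionedLoadPerServer(int clusterSize, long txPerServerPerSec, int replicas) {
        long totalTx = txPerServerPerSec * clusterSize;   // load grows with the grid
        long serversTouched = Math.min(clusterSize, 1 + replicas);
        return totalTx * serversTouched / clusterSize;    // work landing on each server
    }

    // Broadcast design: every peer processes each invalidation or update.
    static long broadcastLoadPerServer(int clusterSize, long txPerServerPerSec) {
        long totalTx = txPerServerPerSec * clusterSize;
        return totalTx * clusterSize / clusterSize;       // every server sees every transaction
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 100, 2000}) {
            System.out.printf("%5d JVMs: partitioned=%d broadcast=%d%n",
                    n,
                    partitionedLoadPerServer(n, 100, 1),
                    broadcastLoadPerServer(n, 100));
        }
    }
}
```

With 100 transactions per second per server and one replica, the partitioned figure stays at 200 units per server whether the cluster has 2 or 2000 JVMs, while the broadcast figure grows from 200 to 200,000. Under these assumptions, that is the difference between a flat response time curve and one that climbs with cluster size.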
So, no API with a unique architecture means it's proprietary, because it isn't portable to other products. If you later need to scale and try to port to ObjectGrid or Coherence, you may end up having to redesign the application to scale out. That is likely to be more difficult at that point, given it's already written and may be doing things that worked on a couple of JVMs but won't work on a larger cluster.