May 10, 2008

WebSphere eXtreme Scale (ObjectGrid) positioning with in memory databases like SolidDB

I'm getting asked a lot since IBM bought SolidDB, why do customers still need WebSphere eXtreme Scale (new name for ObjectGrid) if customers can have an in memory database? Thats a great question and it seems a lot of people are confused. So, here is my attempt to explain whats what.

History

To answer, lets do a little history. Electronic trading systems in the late 90s started switching to in memory database technologies with replication. This meant the application ran on a single box usually with a backup. This has worked for a while now but over the last couple of years these architectures are hitting performance walls.

The walls are that they do not scale horizontally, only vertically, i.e. add more processors. The events that need to be processed by those applications are exploding in volume from thousands/sec 5-10 years ago to 250k-500k/sec today. Single boxes simply cannot keep up anymore and it's very difficult to write applications that can scale linearly on large SMPs. Lets not talk about the cost of large SMPs either. The other issue is they do not address managing clouds or grids of boxes running those databases. That is left to the customers imagination. It's just a pair of boxes and thats it. They do not understand grid or cloud computing or the challenges of hosting applications in that kind of infrastructure.

This isn't limited to electronic trading anymore. Many industries are facing a data transaction performance crisis today:

  • Credit card fraud detection
  • Travel iterinery search and pricing
  • Product recommendations on retail sites
  • State management for batch processing
  • Insurance policy pricing
  • Web 2.0 applications and conversational state
  • Telco customer service management.

The present

Customers are looking to technologies like WebSphere eXtreme Scale to handle this issue. They are re-architecting the applications as constrained tree applications and hosting the applications on eXtreme Scale like infrastructures. Why?

An in memory database is a simple beast. It runs on a box and usually has a backup box. Thats what you buy. You typically try to run your application on the same box if not the same address space as the in memory database to get maximum benefit from the in memory design.

If a box dies then it fails over but doesn't repair the issue. You'd need to get another box and update  the remaining box to replicate there instead. Now, imagine a 100 boxes running the high end XTP stuff, are you going to do that manually when you have a 100 boxes also?

WebSphere eXtreme Scale automatically splits the dataset in to partitions (constrained tree schemas) and has primary and backup copies for each partition. It automatically places the primaries and backups for the partitions on the available boxes saving the customer from the task. An in memory database requires the customer to build infrastructure to handle this themselves, every customer.

WebSphere eXtreme Scale incorporates placement rules to make sure primaries and backups are on different boxes, on different floors, power grids, networks, buildings cities, what ever is needed by the customer. In memory database products require the customer to again do all this manually. Their world is a simple world where all they understand is typically a pair of boxes acting as a primary and backup. The customer is responsible to placing those servers on appropriate boxes.

If you want to scale out the application then just start more server instances and WebSphere eXtreme Scale scales up automatically and transparently to incorporate the extra memory, CPU and network in to the existing grid. Compare that with an in memory database which does nothing.

Routing events/requests in to the grid in the next issue and this can be done through direct routing clients, gateways or direct subscription within the grid.

These approaches can be made to work with an in memory database but the application would need to track what partitions are running where and write all the routing code. WebSphere eXtreme Scale virtualizes the partition primaries and 'knows' where everything is. You don't need to write this code, you get it for free.

Once you start putting application code with the data then eventually you'll want a richer environment application server and these are far and few between for C based products. The last successful C based application servers were Encina/Tuxedo or Component Broker and I don't see any customer looking to roll out new applications on those platforms. Databases offer very poor environments for running applications. Java applications using WebSphere eXtreme Scale can run in J2SE, a Spring container or a full blown application server like WebSphere ND. You can't compare the programming environment within ANY database with an application server, period.

Bottom line, in memory databases were state of the art in the late 90s for XTP systems but this is no longer so. They have no automatic scaling, self repair, and no concept of partitioning. They are simply a normal database optimized to work faster because it's in memory only. They are awesome for replacing a traditional database under the following conditions:

  • Application already uses SQL
  • Everything fits in a single address space
  • Application can't be rearchitected to be a constrained tree application for true linear scalability so slotting in an in memory database gives a temporary relief from performance issues.
  • Data model isn't partitionable
  • A small footprint, zero admin embedded C SQL database is needed.

If you need scale out then I'd seriously reexamine whether you really need SQL and want to write all the supporting logic to force an in memory database in to a scenario it's simply not designed for. It will usually be better to forgo SQL and just use something like WebSphere eXtreme Scale which just does it out of the box.

May 10, 2008 in XTP | Permalink | Comments (0) Sphere

May 06, 2008

Bob Lozanos (Appistry) take on the Spring App server, nice summary

Check it out here.

May 6, 2008 | Permalink | Comments (0) Sphere

SpringSource application server, very nice, shame about the license

First, hats off to Rod, Adrian and the others. It's very nice and very much in line with where most people see this all going in the next few years.

It looks like what others and I have been planning/hoping to do over the next few years. Most of us are looking for an OGSi based distributed platform with a commercial friendly license (EPL, BSD or Apache). I don't think there is an appetite to develop a proprietary OSGi runtime in any camp right now and IBM, through eclipse, has been placing a very good OSGi implementation into real open source as Equinox, i.e. a commercial friend license, no dual licensing fooling around. WebSphere already has an OSGi enabled distributed runtime. I don’t see much of an appetite to develop another one that does almost the same thing as before, I think we’ll use something rather than build it.

The platform should allow middleware or profiles to run on top of it like JavaEE, WebSphere eXtreme Scale, ESBs, Event processing, process engines and so on. The platform should be distributed, i.e. allow me to install a bundle set or profile on a set of machines matching a filter (this is a cluster) and then start/stop machines, do distributed monitoring on it also. Stuff like Wily or JXInsight is pretty cool for monitoring given they don't require code changes as they use aspects to monitor existing code. I'd like to run Oracle and IBM profiles on the same platform at the same time.

I see this as the new JVM, a module or bundle oriented runtime that’s also distributed. I don't want to see this runtime controlled by one vendor and licensed, I'd like to see it available for free as a community project with a commercial friendly license (Apache, EPL, BSD). Most vendors have absolutely no interest in a GPL type license. IBM has already contributed a lot of work towards this end in Eclipse.

I know SpringSource are doing valuable work making OSGi consumable by the mass market. OSGi is too hard for most people. Spring-DM is a great step in the right direction and I think Adrian is standardizing some, if not all of this, in OSGi anyway.

I think the commercial vendors see the value as not in the platform but in the profiles running on top. I expect to see a commoditized runtime for hosting profiles or middleware/applications. Springsource for now seems to be treating the platform as the valuable thing and I just don't see that. Of course, this is what they have so it's valuable to them. IBM and other vendors have a library of profiles like messaging, process flow etc that is why there is a different perspective. I understand why they are doing it but I don't see it being widely adopted by the large commercial vendors. IBM has already contributed a lot of code in Equinox etc under commercial friendly licenses and this will likely continue (I don't speak for IBM in this blog).

I’d like to see monitoring of this also pluggable and I don’t see a need for APIs here. What I’ve seen of Wily and William Louths JXInsight has convinced me that we do not need APIs to do monitoring. Aspects are hands down for Java, the way to go and SpringSource has tackled getting AOP to work in an OSGi environment and hopefully that gets standardized also.

I think monitoring will be a valuable profile on top of the common runtime and I fully expect vendors to be selling this to monitor what-ever people deploy on this infrastructure.

So, to summarize, I don’t see the platform as the valuable thing. I see it as a commodity. I see profiles and monitoring profiles as the valuable thing and I’d like to see a commercial friendly licensed OSGi distributed runtime as the new JVM that vendors build middleware/profiles for. Given, Spring DM is Apache licensed, I can see the extra work in the Spring server being clean roomed and made available with EPL or Apache pretty soon and this will limit the value from selling the SpringSource server. Duplicating higher value profiles like process flow etc is clearly not so easy and this is why these are the valuable things that vendors will continue to charge for. I don't want to trivialize what SpringSource has done, it's very cool and it's needed but given most of its components are Apache 2.0 or EPL then the last gap is a lot simpler than building a Java EE or BPEL flow engine, thats all.

May 6, 2008 in J2SE | Permalink | Comments (6) Sphere

April 25, 2008

Sometimes no API is more proprietary than a vendor API

A common question from customers is "Your APIs are proprietary". There is no real standard in the caching space. The JCache JSR doesn't look very attractive to me and I know I'm not on the expert group so who am I to complain about it but from what I'm seen it's a very 90s API, not so attractive when looking at how programming models are advancing with POJOisms and annotations etc. Our EntityManager style POJO interface is much easier to use and less invasive on the application.

Architecturally or conceptually (while the implementations are very different with pros and cons on both sides), products like ObjectGrid (aka WebSphere eXtreme Scale) and Oracle Coherence are similar so worst case if a customer wanted to jump ship then it's possible with some API changes. A pain to do but it's doable. An application designed to work on one should run on the other with no architectural changes needed.

There are also products that claim to have no APIs and customers frequently view the fact the ObjectGrid has an API as meaning ObjectGrid is proprietary and the other product isn't but I beg to differ. Portability is really what proprietary is all about. If something isn't portable (ish) to other products then I say it's proprietary. The product with no API (and yes, this is terracota) locks you in to a unique way of doing this stuff which for me is even more proprietary than an API because architecturally, it's different than the other products and so porting is very, very difficult. Much more so than porting from Coherence to ObjectGrid or vice versa because at least they are similar conceptually.

I won't be changing ObjectGrid to be like Terracota because I need ObjectGrid to scale out linearly to any number of boxes with absolutely constant response times and the APIs we provide give structure to good patterns for designing these kinds of scale out applications. We recently demonstrated a 1000 JVM grid with the same response times under proportionally greater load as a two JVM cluster. The response time curve is absolutely flat as we scaled up the grid and client load proportionally.

These APIs can guide customers down the path to designing applications that scale out. The lack of these APIs for enforcing patterns can lead to customers thinking this is easy, "I used a couple of JVMs, marked all my existing POJOs are clusterable and hey it all works". Not so fast. We're only talking two JVMs here. Lets try a 100 JVMs with a write intensive benchmark and see what kind of trouble we can get into and the answer is quite a bit of trouble.

Unless you've thought about how you will scale up your data structures (i.e. partition them cleanly) then it's going to be very difficult to scale this up at all. The synchronization traffic between the servers would basically kill you and more so when it's all funneled through a single (or even set) of central hub servers. Customers need to think it through and the APIs provide frameworks for patterns which will scale as the grid/cluster grows in size. So, using an API guarantees success? Of course not but it does provide a starting point to getting there.

The trick to scaling is ensuring that a transaction against some data doesn't require other servers (besides replicas) to be processing work on that transaction also (invalidation, synchronizing data etc). Transactions can't use ANY common resources if you want it to scale. If they do then that common resource is a single point of failure for all servers using it even if it's replicated because any outage for it impacts all servers using it for a period of time as well as the common resource will cause response times to grow as load grows.

When a single transaction runs against one JVM and a backup then it won't matter whether the cluster has 2 or 2000 JVMs in it. That transaction will run at the same speed and thats scalability. All transactions should not use any common/central servers so a failure to one server only impacts the subset of the data stored there and not all servers. If that transaction required invalidating a lot of peer servers or pushing updates to them then the response time will rise even if multicast is used because even with multicast, you are still going to be burning CPU and network on the other peer boxes processing the multicast messages.

So, no API with a unique architecture means it's proprietary because it isn't portable to other products. If later when you needed scaling if you tried to port to ObjectGrid or Coherence then you may end up having to redesign the application to scale out and it's likely going to be more difficult at that point given its written already and it may be doing things that worked on a couple of JVMs but won't work on a larger cluster.

April 25, 2008 in XTP | Permalink | Comments (0) Sphere

April 22, 2008

Video: Introduction to WebSphere eXtreme Scale/ ObjectGrid with patterns and use cases

This is a video pod cast showing 40 or so slides with audio. They cover WebSphere eXtreme Scale or ObjectGrid and outline how the product works, the basic patterns we see employed at customers and some customer scenarios and which patterns apply. It's around 41 minutes long.

Download wxs_pattern_and_use_cases.mov

April 22, 2008 in WebSphere eXtreme Scale Videos | Permalink | Comments (2) Sphere

April 07, 2008

ObjectGrid gets renamed to WebSphere eXtreme Scale

IBM relaunched XD today at the Impact conference in Vegas. ObjectGrid aka DataGrid will now be called "WebSphere eXtreme Scale" from a product point of view. As names go, it could be worse :) but I'll be shortening it to WXS in my slides as it's pretty long. I have two speaking sessions at Impact. The first is tomorrow at 16:45. This is about the new stuff over the past 12 months in ObjectGrid aka WXS and the second is on Wednesday at 10:30 with some of the ways customers are using it. Both sessions start with a gentle introduction to what the product does for the first 20 minutes and then gets in to the different topics of the sessions.

April 7, 2008 | Permalink | Comments (2) Sphere

March 29, 2008

Will conversational flow engines and AJAX help with the coming multi-core problem?

A lot has been said about the coming issues as multi-core processors become mainstream if that hasn't happened already. Processors are getting more cores but individual cores are getting slower. This means that applications will need to be written in a parallel fashion to really take advantage of this and the majority of people will find that difficult, it isn't easy to do.
I've written quite a bit about this but I was thinking yesterday about the impact of all and AJAX and conversational state engines might just a a solution for a lot of web applications.
AJAX basically changes the way web applications are written. Pages before AJAX were monolithic. A single button resulted in a servlet processing the information a page at a time. A page might have had quite a bit of information on it with a lot of fields to validate against a database or caches etc. All of this logic would run on a single thread in the web container. Customer could multi-thread it but that wasn't easy.
AJAX changes this because each field can be validated 'live' using RPC calls to the server that process fields at a time. As you type the various fields on the page, each field can be validated separately using it's own RPC call from the page to the server. This results in many more server RPCs per page than pre-AJAX days but each RPC is much simpler than before also. This allows us to process the page using more smaller RPC and these RPCs can potentially overlap depending on how quickly information is filled in or how the page does things. Breaking up monolithic logic in to potentially parallel streams of smaller units of work is a good recipe for success on todays multi-core processors. Yep, there is more overhead because of the RPCs but it's probably a good trade off, those MIPs have to be used for something, right?
JavaEE does very little for people having to parallelize applications to run well on newer processors. But, Web 2.0/AJAX style applications and the new wave of POJO conversation centric flow engines may provide the building blocks to allow the masses to write parallel code without even realizing it. Conversation centric flow engines should also allow interactions with backend systems to be parallelized also in a natural way thats coded in the same way as people code web flow conversations using frameworks like Spring WebFlow. I can see how as the conversation flow engines like WebFlow evolve then they can also be used to parallelize many tasks in a familiar way and products like ObjectGrid can make that flow reliable and fault tolerant. By evolve, I mean, the conversation needs to be made independent of the web context and instead just be usable in a web context or any other context such as pulling state from several back-ends in parallel and blocking waiting for the results. Conversations may just become the next programming style and this should naturally allow parallel applications to be written easily/tooled and debugged. The current work on closures and conversational state need to be integrated also, lots of similarities and potentially programming model simplifications should be possible by combining both of those.

March 29, 2008 in J2SE | Permalink | Comments (2) Sphere

March 03, 2008

Understanding DataGrid query capabilities

I get asked a lot about the query capabilities of ObjectGrid and DataGrids in general. Customers are used to a normal database query capability and this is different than what a DataGrid provides. A normal database does not have a partitioned schema and it runs normally on a single machine. This means you can run queries against it that join any table to any other table and with enough time then you'll get the correct result.

DataGrids typically use a partitioned schema and run on multiple servers using a shared nothing architecture. They offer a partition aligned query capability. This means that within a single partition, the data can be indexed and then you can run complex queries against all data within that partition, a single partition. This is different than a normal database where a query can use all data. The query engines on DataGrids are usually limited to a single partitions worth of data with some aggregation capability on these results. This applies to all products on the market to my knowledge.

DataGrids allow cross partition queries by usually running the same query across all partitions in parallel and then combining the results using an aggregator.

This is a partition aligned query and is basically composed of two parts. One part runs on every partition producing a set of results from each partition which is then processed and then combined with the results from the other partitions and finally returned to the client. This pattern is usually described as a Map/Reduce type pattern. ObjectGrid has two ways to invoke this type of query.

Option one is a MapAgent which returns a Map<Key,Result> when it's complete. It runs on every partition and then produces results for all records matching a filter. This Map is then basically a simple union of the results from each partition. Some partitions might return no entries depending on the filter.

Option two is a ReduceAgent which returns a single object result. This basically runs on every partition and then processes all entries on that partition matching a query. These entries are processed together to produce a single object result. An example might be the sum of a particular attribute or the min/max value of a single attribute. Finally, these single object results from each partition are reduced again to produce a single final result such as the overall sum of the attribute. A ReduceAgent can be used to implement a MapAgent (just return the Map as the result and do a union) and we provided MapAgent also as a convenience.

Queries need to be rewritten using option 1 or 2 to run on a partitioned datagrid like ObjectGrid. You can't write arbitrary SQL and give it to the grid for execution, the query engine isn't smart enough to figure out how to run any query efficiently on the partitioned architecture. Products like IBM DB/2 EEE/DPF can do this kind of query but the query engine implementation is extremely complex. I was entertaining adding that kind of query engine to ObjectGrid but after talking with Bruce Lindsey (an IBM Fellow and expert on all things databasey) I was talked out of it.

So, I hope this blog post shows how queries should be implemented on a DataGrid type product. It's different than a normal database and once this difference is appreciated then efficient queries can be easily implemented to exploit the parallelism possible using a DataGrid. DataGrids can process data faster than a conventional database because they scale out and use this simplified query model. Clearly, the set of queries is restricted when compared with a normal database but this is necessary from a performance perspective. A database can only use the CPUs within a single box and must access the data from a single store, the disks on that machine. A datagrid has no such limitations as it can run on hundreds of servers using a shared nothing architecture but it can't usually run any query because it uses a partitioned architecture.

March 3, 2008 in XTP | Permalink | Comments (0) Sphere

February 28, 2008

Speaking at IBM's Impact Conference

I'm doing two sessions at Impact on ObjectGrid. One on whats new in ObjectGrid and the other on customer use cases and what they are using it for. Hopefully, I'll meet a few of you there.

February 28, 2008 | Permalink | Comments (1) Sphere

February 25, 2008

Good article on DataGrids in Syscon

Google reader just spotted this article and while written by Oracle, it all applies equally to ObjectGrid. Nice article.

February 25, 2008 in XTP | Permalink | Comments (2) Sphere