Posted at 05:55 PM in Cloud | Permalink | Comments (0)
Memcached is a popular free, open source distributed cache. Its design is quite different from that of the commercial data grids.
Memcached is basically a dumb server. It's a process, typically running on a Linux box, which accepts client requests for keys and returns the data. Updates and deletes are also supported. This is ALL the function in the server, hence, dumb server. The client is the only piece which is 'distributed' or even knows about other servers. The clients typically get a list of host:port addresses (they all NEED to use the same list) and hash the key over the set of online servers. If there are 4 servers running then all clients will hash the key over these 4 servers. Clients will write data for a key K to a particular server using this hash algorithm. One server.
Now, that's what memcached does out of the box. Note, there is no replication on the server side. If a server fails then the clients will hash keys over 3 servers, but this changes everything from a hashing point of view. You're likely to get a very high miss rate because most keys no longer hash to the same server as before, given the modulo part of the hashing is now 3 instead of 4.
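To put a rough number on that miss rate, here is a minimal sketch of modulo hashing in Python. It is not the code of any particular memcached client: the MD5-based hash, the `server_for` helper and the `user:N` key names are all assumptions made for illustration, and it assumes the failed server was the last one in the list so the surviving servers keep their indexes.

```python
import hashlib


def server_for(key: str, num_servers: int) -> int:
    """Pick a server index using a stable hash taken modulo the server count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def fraction_remapped(keys, before: int, after: int) -> float:
    """Fraction of keys whose server index changes when the count changes."""
    moved = sum(1 for k in keys if server_for(k, before) != server_for(k, after))
    return moved / len(keys)


# Hypothetical keys; any reasonably large set behaves the same way.
keys = [f"user:{i}" for i in range(10_000)]
moved = fraction_remapped(keys, before=4, after=3)
print(f"{moved:.0%} of keys now hash to a different server")
```

With a well-mixed hash, a key keeps its index only when the hash gives the same remainder mod 4 and mod 3, which is 3 cases out of every 12. So roughly three quarters of the keys move and show up as misses, even though only one server in four went away.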
Clients work around this by adding a consistent hashing layer on top of the normal client. This works better, and writes may go to a server and to the next server in the hash ring. This slows down writes by 2x but allows some recoverability when a server fails. The next server in the ring has the data from the dead server so it's not as bad as before, BUT if this server also fails then the cache has lost the data. It's not replicated, remember; it's just written twice. After the first server has failed, the clients write changes to the next server and the following server in the hash ring. The new next server in the ring does not have a copy of all the data. It was never written there. It will just have new changes since the failure. There is no active replication going on here.
Most of the material on the web shows various workarounds to try to make memcached 'reliable'. A lot of the Facebook engineering blog entries focus on this aspect: what they had to do to make it reliable. This works for Facebook but won't work in general, because I don't think the majority of customers share their build-your-own mentality. I was talking to someone from one of the big internet shops last week and they always run two memcached clusters for HA. If a server in cluster A fails then they stop using cluster A and switch to B. This works, but that's a 50% loss in hardware because of a single failure.
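The write-twice scheme can be sketched with a small hash ring. This is an illustrative sketch only, not a real client library: `HashRing`, `servers_for`, the `cache-a` style server names and the 100 points per server are all hypothetical choices.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Stable 64-bit hash used to place servers and keys on the ring."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, servers, points_per_server=100):
        # Each server gets many points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{s}#{i}"), s) for s in servers for i in range(points_per_server)
        )
        self._hashes = [h for h, _ in self._ring]

    def servers_for(self, key, copies=2):
        """Return the primary server, then the next distinct server(s) clockwise."""
        start = bisect.bisect(self._hashes, _hash(key))
        found = []
        for offset in range(len(self._ring)):
            server = self._ring[(start + offset) % len(self._ring)][1]
            if server not in found:
                found.append(server)
                if len(found) == copies:
                    break
        return found


servers = ["cache-a", "cache-b", "cache-c", "cache-d"]  # hypothetical hosts
before = HashRing(servers)
after = HashRing([s for s in servers if s != "cache-d"])  # cache-d has failed

keys = [f"user:{i}" for i in range(2_000)]
orphaned = [k for k in keys if before.servers_for(k)[0] == "cache-d"]
stale_backups = sum(
    1 for k in orphaned if after.servers_for(k)[1] not in before.servers_for(k)
)
print(f"{len(orphaned)} keys lost their primary; "
      f"{stale_backups} of them now have a backup that never saw the old value")
```

Keys whose primary survived still map to the same primary, and keys that lived on the dead server fall through to what used to be their second copy. The new second server for those keys was never written to before the failure, which is exactly the gap described above.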
Smart servers
Commercial data grids like IBM WebSphere eXtreme Scale are very different. They handle making the data fault tolerant on the server side with built-in replication and elastic scale out and scale in. Their clients understand the server side data layout or routing. If a server fails with WebSphere eXtreme Scale then no data is lost. The software actively maintains replicas on other servers. Clients that stored an entry with a key K before will still find it. It just works. You don't need consistent hashing on the client; our client does it using routing tables instead. Adding more server processes causes the product to automatically redistribute the minimum amount of data to leverage the new memory, CPU and network available through the new JVMs.
The smart servers allow proper replication, which means that the cache isn't as brittle as the memcached cache when servers come and go. If nothing ever fails then the memcached approach works, but the falling cache hit rate over time as multiple failures occur can make it a liability. The last Facebook entry I read on this showed them crawling the logs to try to recover lost data to make the 'next server' more complete. They get this issue and are jumping through hoops to handle it but again, most commercial customers won't do this. They expect the product to do this sort of stuff, like replication, and they view it as table stakes.
So, memcached for sure has its uses but it's got some limitations. If it's not exactly what you want, its users need to do a fair bit of work to handle those limitations correctly, or risk some bad scenarios as the cache miss rate climbs. It's very, very fast but you gotta remember, it's not doing much either... No transactions, no consistency, no replication, no isolation and so on. If you don't need these features then it's pretty fast, but if you do need this stuff then it starts to slow as you add code on the client side to try and compensate for the limitations.
Posted at 02:18 PM in Cloud, High Availability, nosql, Persistence, Web/Tech, WebSphere eXtreme Scale, XTP | Permalink | Comments (2)
VMWare has a problem. Consolidation is the usual angle that's used to sell it. Most boxes run at 5% load so ideally you want to consolidate 15 machines or so on to a single box, pushing its utilization to 75%. But that means you need memory for 15 virtual machines on the box. That's a lot of memory and memory isn't cheap.
So, they usually say overcommit the memory and everything will be OK. That's not true, especially for Java applications. We run into this ALL the time, so don't believe the advice that it's OK to overcommit. Why swapping and Java DO NOT MIX right now is another blog post. VMWare needs a way to reduce the cost of RAM so that it lowers the cost of the platform and lets people host as many virtual machines as possible on a single box without breaking the bank on memory cost.
A normal unvirtualized box runs an operating system and the OS hosts processes. One per application maybe. Memory utilization is pretty efficient. All processes can share libraries in memory, file system caches and the memory of the operating system is amortized over many processes.
This isn't the case with VMWare, a hypervisor. Each virtual machine runs its own copy of the operating system with its own file system cache. There is usually no sharing of memory between virtual machine instances (although that's changing). This means a virtualized box is inherently less memory efficient than a single operating system instance running the same applications. The multiple operating system instances are costing VMWare a lot of memory. A hypervisor may be more CPU efficient but it's not as memory efficient as a single operating system running the same applications. You can enable swapping to avoid the need for so much memory, but Java applications don't work well with swapping so let's assume we are not swapping. Now, there are advantages to using a hypervisor, but we are talking about the resources needed to run X applications here and memory factors in.
VMWare could get in to the operating system business but that's going to fail. Windows and Linux basically own that space. Microsoft is busy adding hypervisor capabilities to their own stack which will cost less than VMWare, so hosting Microsoft applications is probably not a long term business for them.
I said at the beginning that Java is important to VMWare. Normal applications are built against a specific operating system that's evolving. VMWare is unlikely to be able to host normal applications without an operating system because they can't emulate the operating systems. They would need to host the operating system as a virtual machine and, as we said earlier, that's not memory efficient.
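A back-of-the-envelope version of that argument, as a small Python sketch. Only the 15 machines and the 5% load come from the post; the per-application and per-operating-system memory figures are made-up assumptions, so substitute your own measurements.

```python
# Figures taken from the consolidation example above.
MACHINES = 15
LOAD_PERCENT_PER_MACHINE = 5

# Hypothetical figures, chosen only to make the arithmetic concrete.
APP_MEMORY_MB = 1024  # assumed memory per application
OS_MEMORY_MB = 512    # assumed memory per OS instance (kernel, file cache, ...)

# CPU: the consolidation story works out nicely.
combined_load_percent = MACHINES * LOAD_PERCENT_PER_MACHINE

# Memory on one conventional OS: the OS cost is paid once.
single_os_mb = MACHINES * APP_MEMORY_MB + OS_MEMORY_MB

# Memory under a hypervisor without sharing or overcommit: paid once per VM.
hypervisor_mb = MACHINES * (APP_MEMORY_MB + OS_MEMORY_MB)

print(f"combined CPU load:    {combined_load_percent}%")
print(f"single OS memory:     {single_os_mb} MB")
print(f"hypervisor memory:    {hypervisor_mb} MB")
print(f"extra for hypervisor: {hypervisor_mb - single_os_mb} MB")
```

Whatever numbers you plug in, the difference is always (machines - 1) copies of the operating system overhead, and that is the memory a native Java host would be trying to win back.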
Now, there are a lot of Enterprise applications written in Java. This means that if VMWare could host Java applications natively, i.e. if VMWare can host a Java virtual machine without an operating system then potentially they can be as memory efficient as an operating system. They can use shared memory between JVM instances for JITed code. They would only need to implement the Java API which is basically portable across machines and operating systems. This is significantly easier than emulating Linux or Windows. A Java skin gives them a way to host Enterprise Java applications natively and cut the memory overhead a hypervisor brings to the table by removing the overhead of an operating system.
A hypervisor has advantages over a native operating system in that it can move virtual machines from one box to another and suspend/resume them. Application or virtual machine mobility is a big selling point for a hypervisor. It's possible that VMWare sees Java as a way to lower the cost of virtualizing Enterprise applications. However, operating system zones and WPARs offer application mobility also, so this advantage is not such a big deal anymore. A big advantage WPARs/Zones have over virtual machine mobility is that you're not bound to an operating system version. You can upgrade an application by moving it from a Linux Version N to a Linux Version N+1 machine. Moving a VMWare image doesn't upgrade the OS; the OS is still the same after you move the virtual machine. You'd need to upgrade it in the virtual machine, which is more hassle than just moving a WPAR/Zone.
This native Java Virtual Machine puts them in a similar situation to Azul. Azul can run standard Java code, but unless the applications are certified to be supported on it, customers are reluctant to use it and vendors are reluctant to support their products on that configuration, mostly because of support/testing cost. A VMWare JVM would suffer the same consequences. Application server vendors would have to certify and support it before customers could move applications over, and that can be a tough thing to get them to do.
Buying Spring gives VMWare Spring TC and Spring DM. These are Java application containers which can probably run 80% of departmental and many enterprise Java applications (Servlets, JDBC, JPA, JMS). They're not JavaEE certified but they're probably enough to run most Java applications. If customers can move Java applications to such a container and host them on a virtualized environment then potentially you can lower VMWare's TCO as it'll require less memory to host them.
I'm actually skeptical that VMWare offers many advantages over just running an operating system natively on the box for Java applications. Best case, VMWare would match the memory footprint of the operating system and it would likely run the applications a little slower. Application mobility is still an issue for operating system hosted applications but technologies like Solaris Zones or AIX WPARs are starting to address application mobility without the memory cost of a full virtualized environment such as VMWare.
So, to summarize. VMWare could host Java applications natively to lower TCO. This would allow them to host multiple Java applications on a single box with, best case, similar TCO to a conventional operating system. Hosting commercial application servers like this would likely be a support problem but, as I said, many Java applications would run on a native Spring TC environment. Application mobility between boxes was a big advantage for hypervisors such as VMWare, but operating system features such as Solaris Zones and AIX WPARs are rapidly eroding that advantage. Applications installed in a zone or WPAR are basically portable between machines anyway, and they have the advantage of not being bound to an operating system version as a virtual machine image is. Once customers figure this out, they may decide that using a conventional operating system with WPARs or Zones is actually a more cost efficient and easier path to take than picking up a new platform based on a hypervisor, which would not be as memory efficient as a normal operating system.
Will VMWare ultimately fail because what customers really want is not virtual machine mobility, they want application mobility?
Posted at 11:07 PM in Cloud, virtualization | Permalink | Comments (7)
This is long overdue. You can read about it here. Thanks to Geva Perry for spotting this announcement. This is very cool. This lets people use a cloud hosting service in a much more intuitive way. Let's look at a basic system. You might have an LDAP service, a Tomcat based service and a MySQL service. A working system is at least three virtual machines (I know it could be combined but work with me here). This means three systems are permanently running. I may add virtual machines to the LDAP and WEB services based on load. A front end HTTP sprayer is another machine which may be permanent.
Posted at 03:20 PM in Cloud | Permalink | Comments (0)