November 21, 2006

Comments

Billy

Chris,
1) Clock speeds are coming down. The onus now is on performance per watt, not performance per thread as was the case until AMD kicked Intel around. We'll start to see chips with more and more cores, and the clocks will drop to control heat. You can believe me or not, but I don't think this is in dispute industry-wide.

2) Very few people write very fast Java code. C people don't have the high-level libraries that Java has, and their code tends to be tighter as a result. I agree that JITs have advantages over statically compiled code and can do things impossible with a conventional compiler, given profiling info etc.

3) Simple threading, if it works, is best, but not all apps work like that. We didn't make commonj for nothing; we did it because some applications needed very high performance and needed a thread model suited to the application to get it. One size does not fit all. If performance

4) Concurrent GC is indeed a beautiful thing, but if applications have large caches or fill those 64-bit address spaces with variable data, then it too will run into issues.

4a) Trust me, you are not going to see tight coupling between #cores and #caches. That's not the direction. It's transistors for cores. If someone is building a 64-core system, it won't have 64x the cache of a single core.

5) You're assuming only a single process on that set of cores then right? This isn't normal. This will cause issues with the cache and negate some of the gains you speak of.

The point of these articles isn't to present a dogmatic point of view. It's to air issues and promote debate and awareness and as such I welcome all of the comments :)

Alessandro

Very interesting blog.

I believe few languages are as ready for multi-core and multi-threading as Java, thanks to its native support for multithreading.

I agree we have a new scene now, and we all must be prepared to face it. I guess we'll need:

* Better profiler tools to debug and optimize multithreaded applications;
* Enhanced JVMs and GC algorithms that use CPU power spread across several cores and make better use of L2 and L3 cache;
* And, perhaps, enhancements to the Thread API and the Concurrency Utilities package, as well as good docs and examples.

A final point: I also agree that clock speed per core will come down. But this can be a good thing. It reminds me, for example, of how a Pentium M at nearly half the clock speed could do more than an old Pentium 4: doing more with fewer clock cycles.

Tarik Guelzim

Nice reflection to begin with. What you said seems logical, since clock speeds will eventually come down as a compromise for power consumption.

But it seems that this isn't only a Java problem; it affects all languages that rely on a GC to do the clean-up, so C# is in the same boat! However, I think that the JVM will scale up if used in cluster mode.

Dileep Kumar

Hi Billy,

Nice blog !
Do you have any data to substantiate your claim, or is it just your perception? As Scott said earlier, the T2000 server from Sun has outpaced all the 3.x (and higher) GHz processors in terms of performance for most multithreaded Java apps. We are heading to a 64-bit JVM, where GC will take a lot of cycles and time, but there are changes happening to the JVMs as well, like the introduction of the ParallelOldGC flag. As said earlier, there will be some slowness in single-threaded activity, but we don't do a lot of that with J2EE-based servers ...

Billy

Hi Dileep,
The whole what's-better story is really interesting. The new quad-core Intel in 2-socket form is expected to beat the pants off Niagara, but it illustrates the differences. 'Only' 4 cores but a high clock, many cores and a lower clock, or AMD with networked dual-core high-clock CPUs?

It will be interesting, moving forward, to see if highly multi-core CPUs like Niagara can run at full speed in typical scenarios.

Dileep Kumar

I agree with your comments. Niagara is not the end of the road, and of course Intel and the like can make better or worse processors than what exists today. All I wanted to point out is that so far there don't seem to be any cases to support your initial claim of "multi-core being bad for java".

James

It would be great if you could post a part-two follow-up to this blog. My thought is that a slower clock speed is OK if there are hundreds of cores, as the time difference would be made up by avoiding preemption.

Architectures such as Azul where a J2EE application can gain access to 384 cores shouldn't slow down response times as you state...

Rob

This seems to fall in line with some of your reasoning Billy.

http://www.dailytech.com/article.aspx?newsid=5201

Essentially the 'next' processors are lower-clocked Woodcrests. Granted, these are specifically low voltage, but since that also equates to heat, sooner or later we're going to see this type of thing as a necessity for higher-SMP chips.

Oleg Konovalov

I had practical experience with a dual-processor IBM IntelliStation PC with Pentium 700MHz (Windows 2000) in 2001-2002. It ran a heavy J2EE application (JDK 1.3, EJB/Servlets/JSP), WebSphere, Oracle, CVS, many developers connecting to the VisualAge repository, some performance monitoring tools, anti-virus, and often ERwin, and it was just going and going and going. I don't remember it showing CPU utilization of more than 50%, and it ran for weeks without a crash or reboot, even with load tests. That was just amazing...
A normal single-processor PC would have choked on half of it in 2001.
So from that I think that multi-core PCs should greatly benefit Java apps with their normal multi-threading.
Or do you think that a multi-core PC is so different from a dual-processor one?

Jack Rogers

Interesting reply here:
http://dev2dev.bea.com/blog/hstahl/archive/2006/12/multicore_is_go.html

H. Stahl seems to think Multi-Core Java is good for you, and has some results to support it...

Billy

We've actually been emailing about it. We both agree that highly parallel code can be written for GC and systems code, but the problem remains that clock speeds look set to fall, which means Java needs a way to bring multi-threading in applications to the masses, and it currently doesn't do that. See my response:

http://www.devwebsphere.com/devwebsphere/2006/12/new_java_langua.html

Emilio

I'm happy to say there's another way to stress test Java's ability to scale on multicore.

There are plenty of frameworks for OLTP and SOA-like patterns, but have you ever tried to do a bulk data processing app with Java? Say, 10 million rows of data that are 100 fields wide, with a batch window of 20 seconds?

Perhaps some of the folks on this thread have time to help build more benchmarks?? I've got 1 on the site and many more in review about to be published.

http://www.pervasivedatarush.com/beta

Piotr Kołaczkowski

Hello,

I cannot agree with your statements. Enterprise applications benefit more from multi-core architectures than desktop applications do, because the threading model is very simple there - each request is handled by a different thread. If the threads don't share too much data, such applications scale pretty well. This is a matter of better or worse application architecture rather than of Java itself. And the new Java 5 multithreading classes are great at simplifying things.
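To make the request-per-thread point concrete, here is a minimal sketch using the `java.util.concurrent` classes introduced in Java 5. The class name `RequestPoolSketch` and the `handle` method are hypothetical stand-ins for a real request handler, and the lambda syntax assumes Java 8 or later. The only thing it tries to show is that requests which share no mutable state can be spread across a pool sized to the number of cores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class RequestPoolSketch {
    // Hypothetical request handler: touches no shared mutable state.
    static String handle(int requestId) {
        return "handled-" + requestId;
    }

    // Runs requestCount independent requests on a pool with one thread per core.
    static List<String> handleAll(int requestCount) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (int i = 0; i < requestCount; i++) {
                final int id = i;
                tasks.add(() -> handle(id));
            }
            // invokeAll returns futures in the same order as the submitted tasks.
            List<String> results = new ArrayList<>();
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handleAll(8));
    }
}
```

Once handlers start sharing data (caches, counters, session state), contention on that data limits how well this scales, which is the architectural caveat above.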

You can switch GC into parallel mode. This is not the default, but you can, and it works fine on our servers (we use J2EE to process lots of SMS messages from all over Poland).
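For anyone who wants to check which collectors their JVM actually ended up with after changing the GC settings (on HotSpot, for example, by starting the JVM with `-XX:+UseParallelGC`), the standard `java.lang.management` API can list them. This is only a sketch; the class name `GcReportSketch` is made up, and the collector names printed depend on the JVM vendor, version and flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcReportSketch {
    // One line per collector the running JVM exposes, with its counters so far.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

Comparing the accumulated collection time before and after a load test gives a rough first idea of how much time a given GC configuration costs.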

Lower clock speeds don't make our applications unresponsive. No one notices whether a request is handled in 50 ms or 100 ms. And if it weren't multicore, we would have to handle more requests on the same core - some requests would simply wait for others to finish. Then it could be noticeable.
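A toy calculation shows the waiting effect. It assumes identical requests that all arrive at the same moment and are served first-come-first-served, with no context-switch or cache costs; the class name `CoreQueueSketch` and the numbers are illustrative only, not measurements from our servers.

```java
class CoreQueueSketch {
    // Mean completion time in ms when `requests` identical requests arrive together
    // and are served FIFO by `cores` cores, each needing `serviceMs` per request.
    static double meanCompletionMs(int requests, int cores, double serviceMs) {
        double total = 0;
        for (int i = 0; i < requests; i++) {
            int wave = i / cores + 1; // requests run in waves of `cores` at a time
            total += wave * serviceMs;
        }
        return total / requests;
    }

    public static void main(String[] args) {
        // Eight simultaneous requests: one fast core versus four cores at half the speed.
        System.out.println(meanCompletionMs(8, 1, 50.0));  // prints 225.0
        System.out.println(meanCompletionMs(8, 4, 100.0)); // prints 150.0
        // A single request: the fast core wins.
        System.out.println(meanCompletionMs(1, 1, 50.0));  // prints 50.0
        System.out.println(meanCompletionMs(1, 4, 100.0)); // prints 100.0
    }
}
```

Under this model the slower multi-core box is worse for a lone request but better once several requests arrive together, which is the usual situation on a server.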

--
Our company website: http://www.dinf.pl/

Billy

Piotr,
Thanks for your comment. Here's a thought: imagine right now you are running on a POWER5 or a 3.8GHz Intel CPU and are happy. Now I tell you that you are deploying on a PIII/800MHz CPU. That is going to hit your response time pretty significantly. The problem is that path lengths are getting way longer due to richer programming models, frameworks, convenience frameworks etc. Programmers are being saved by Moore's law. That is coming to an end. When massively multi-core chips hit the market, you'll see a big drop in per-core performance. Code path lengths that were acceptable before will not be now.
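The argument can be put as a rough Amdahl's-law style estimate. The sketch below is a toy model, not a benchmark: `AmdahlSketch`, `serialFraction` and `perCoreSpeed` are names invented for illustration, and the model ignores instructions per cycle, caches and memory bandwidth. It computes the time for one unit of work relative to a single core at full speed, when a fraction of the path is serial and each core runs at some fraction of the original speed.

```java
class AmdahlSketch {
    // Time for one unit of work relative to a single core at speed 1.0.
    // serialFraction: share of the code path that cannot run in parallel (0..1).
    // perCoreSpeed:   speed of each core relative to the original core (e.g. 0.5).
    static double relativeTime(double serialFraction, int cores, double perCoreSpeed) {
        double parallelFraction = 1.0 - serialFraction;
        return (serialFraction + parallelFraction / cores) / perCoreSpeed;
    }

    public static void main(String[] args) {
        // Fully serial request path on 8 half-speed cores: twice as slow.
        System.out.println(relativeTime(1.0, 8, 0.5));  // prints 2.0
        // A quarter of the path serial: still faster than the original core.
        System.out.println(relativeTime(0.25, 8, 0.5)); // prints 0.6875
    }
}
```

The single-threaded path of an individual request is the first case: extra cores don't help it, so any drop in per-core speed shows up directly in its response time.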

Rob

I agree with the original post. These multicore servers take time to do the initial mark and remark phases, which are stop-the-world operations in the CMS garbage collector. It's not uncommon to spend a week trying to tune GC on a big enterprise application, only to find you still get 6-second pauses whenever a 4GB old generation gets full or fragmented.

Andy

I agree that Java's path length is long, but you gotta know that new multicore CPUs are designed to execute more instructions per cycle. So even a C2D 6600 at 2.4GHz runs faster than a P4 HT at 4.0GHz.

martinval

I have two questions:

1) Will hardware vendors really decrease clock speeds noticeably in order to increase core counts if they end up seeing that this makes e.g. Java applications behave noticeably more "sluggish"? (Note: they could instead decide to increase core count only as far as clock speeds stay reasonable - why opt for the hard way?)

2) Will "simpler" languages (like C or Fortran) and "less sophisticated" frameworks really not just "shorten path length" but also make multithreaded design easier and less error-prone for the average developer? (Note: even for memory management, going back to malloc/free did not yield better code, less heap fragmentation or shorter allocation times - why go back to the stone age in terms of concurrency support?)

Vijay

My cheap Linux AMD box is twice as fast as a T2000 when running a Java app in WebSphere. Why is that?

bvh

Billy,

I disagree with you that multi-core is bad for Java.

You are saying that:

"GC has elements that are multi-threaded but the main task remains single threaded and this thread will be slower on a multi-core CPU and get slower as clock speeds drop as more cores are added."

1) You are very wrong about the way GC works. Maybe you are talking about earlier versions of GC. A good article on GC tuning can be found here: http://java.sun.com/docs/hotspot/gc1.4.2/
In all my years of experience I have still not met a developer who has complained that the bottleneck is in Java's GC and not in his/her code.

2) Clock speed means something if and only if you are comparing clock speeds on the same architecture. Usually that is not the case, so a 2-core CPU will be quite a different animal from a 4-, 8-, etc. core CPU. So forget about clock speed comparisons.

If you visit Dublin give me a buzz and we can continue this dispute in Temple Bar ;).

The comments to this entry are closed.