Massive Distributed System!

843793, Nov 1 2005 (edited Nov 1 2005)
As part of my research, I have built a large distributed system in Java for running calculations.

The system consists of a single "Sender", which sends jobs to a farm of "Calculators" running on about 1000 different host machines. The Calculators send their results to a single "Receiver". Jobs can be easy, e.g. 1+1, or hard, e.g. something like log(1/[cos(ln3)]). As a result, some jobs finish in seconds while others may take as long as an hour.
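
As a rough illustration of this layout, the three roles could be expressed as RMI interfaces along the following lines. This is only a sketch under assumptions: the names `Job`, `Calculator`, `Receiver`, `EasyJob` and `HardJob` and all method signatures are hypothetical and are not taken from the actual system, and the hard job reads "log" as base 10 and "ln" as the natural logarithm.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;

/** A unit of work shipped from the Sender to a Calculator (hypothetical name). */
interface Job extends Serializable {
    String id();
    double compute();
}

/** Remote view of a Calculator: accepts a job from the Sender. */
interface Calculator extends Remote {
    void submit(Job job) throws RemoteException;
}

/** Remote view of the Receiver: collects results from the Calculators. */
interface Receiver extends Remote {
    void deliver(String jobId, double result) throws RemoteException;
}

/** The easy example from the post: 1+1. */
class EasyJob implements Job {
    public String id() { return "easy-1"; }
    public double compute() { return 1 + 1; }
}

/** The hard example; log is read as base 10 and ln as natural log (an assumption). */
class HardJob implements Job {
    public String id() { return "hard-1"; }
    public double compute() {
        return Math.log10(1.0 / Math.cos(Math.log(3.0)));
    }
}
```

Because `Job` is `Serializable`, a job object can be passed by value as an argument of a remote call, which is what lets the Sender hand arbitrary calculations to a Calculator.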

I have implemented this system and it works fine, but I am sure there is room for improvement and optimisation.

It needs to be:
more robust, e.g. I need good monitoring to deal with hosts crashing or dying in the middle of a calculation. How do I detect this? Should I use the old heartbeat method? I don't think I can risk too many messages on the network.
more scalable, e.g. what if I increase the number of hosts? Can RMI handle it, or is there an alternative?
more adaptive to the type of job, etc.
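
To make the heartbeat question concrete, here is a minimal sketch of the bookkeeping such a monitor would need on the Sender or Receiver side. The class `HeartbeatMonitor` and its methods are hypothetical, and timestamps are passed in explicitly so the logic can be exercised without real clocks or a network. On message volume: with 1000 hosts each sending one small heartbeat every 30 seconds, the monitor would see roughly 33 messages per second in total; the 30-second interval is only an example value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: tracks the last heartbeat seen from each Calculator host. */
class HeartbeatMonitor {
    private final long timeoutMillis;
    // host name -> timestamp (ms) of the most recent heartbeat from that host
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Called whenever a heartbeat (or any other message) arrives from a host. */
    void recordHeartbeat(String host, long nowMillis) {
        lastSeen.put(host, nowMillis);
    }

    /** Returns the hosts that have been silent for longer than the timeout. */
    List<String> findDeadHosts(long nowMillis) {
        List<String> dead = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastSeen.entrySet()) {
            if (nowMillis - e.getValue() > timeoutMillis) {
                dead.add(e.getKey());
            }
        }
        return dead;
    }
}
```

A periodic task could call `findDeadHosts(System.currentTimeMillis())` and re-queue the jobs assigned to any host it returns. Treating a delivered result as an implicit heartbeat would keep extra traffic down for short jobs, so that only long-running calculations need dedicated heartbeat messages.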

Any other ideas or links to more info are welcome.

Thanks.