[Beowulf] lost in parallel computing
CHEN, XIAOMING CHEN25 at engr.sc.edu
Wed Dec 7 12:53:16 PST 2005
Dear all,

I've been practicing scientific parallel computing for 3-4 years, but as a remote user I have never really dealt with parallel computer management. Things work out when the remote computers I am working on are managed well. However, when they are not in good hands, they go on 'strike' for a long time. This is what I am experiencing now.

One remote cluster was relocated recently and lost its Myrinet. A new cluster purchased from Dell hasn't been working since it was installed 3 months ago. Another one shows some strange behavior. For example, it sometimes writes data twice into a file in a random order, and a user cannot kill his process unless he terminates the X Window session (i.e., exits).

I guess that during this holiday season nobody will step up to solve the problem. But it seems such problems will continue to exist and evolve as computer technologies themselves evolve. I am wondering whether it is possible to build an inexpensive but robust parallel execution environment. If it is so difficult to maintain a parallel computer, how can we persuade people to invest money in parallel computers?

This is the first time I have posted a message. Please kindly remind me if I do not follow the rules. I appreciate your response.

Xiaoming Chen
University of South Carolina
