PTP running on my second Linux box


Beth Tibbitts

I just got PTP running on my second Linux machine (IBM's version of Red Hat
- 4.2.1.6, I think).
I built OpenMPI and the PTP C projects from the command line.
It was a lot easier this time.  Experience, I guess.

Observations:
1. The C projects (orte, sdm, proxy, utils) all show problems in the
Eclipse PDE in my workspace.
      The complaints are "No rule to make target" for 'all' and 'clean'.
      I built them from the command line, though, and am ignoring these.
Seems to be OK.
2. The first time I brought it up, it complained about not finding ORTE and
set everything to simulated mode.
      However, I changed it all back to ORTE, ran my cleanupOmpi script
(kills orte daemons and deletes temp files),
      relaunched, and all was well.
3. I had to remember to set the debugger to SDM in the launch
configuration.  It always defaults to nothing.
4. I have to keep remembering to run cleanupOmpi after each run, and then
relaunch my runtime workspace (the one that runs PTP).
      The user sees "Invocation Target Exception", which is the symptom of
needing to do this.
      We could probably catch this and give a more meaningful error
message.
      When my committer status is final (REAL SOON NOW), I could probably
do that.
5. I get a lot of "workspace in use" errors and have to delete the .lock
file in my runtime workspace.
      I don't know if that has something to do with PTP, or is just
flakiness in Eclipse on Linux.

I haven't run anything more than "hello MPI world,"  but I'm a happy camper
right now.

=============cleanupOmpi.sh==============  per Greg's instructions a while
back.
#!/bin/sh
# Clean up after a PTP run: kill leftover daemons and remove temp files.
echo "OpenMPI cleanup for PTP..."

echo "killing orted processes"
killall -9 orted

echo "killing orte_server processes"
killall -9 orte_server

# Only relevant after a debug session.
echo "killing sdm or gdb processes if we were debugging"
killall -9 sdm gdb

echo "removing any directories starting with openmpi in /tmp"
rm -rf /tmp/openmpi*
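
A slightly more defensive variant of the script above might look like the sketch below. It only kills a process name when something by that name is running, and it only removes openmpi* directories owned by the current user. The process names come from the script above; the function names and the optional directory argument are made up for illustration, and it assumes pgrep, pkill and a find that supports -maxdepth are installed.

```shell
#!/bin/sh
# Sketch only: function names and the tmp-dir argument are hypothetical.

# Kill every process with exactly this name, but only if one is running.
kill_if_running() {
    name=$1
    if pgrep -x "$name" >/dev/null 2>&1; then
        echo "killing $name processes"
        pkill -9 -x "$name"
    else
        echo "no $name processes running"
    fi
}

# Remove top-level openmpi* entries owned by the current user.
# The directory defaults to /tmp, as in the original script.
remove_openmpi_dirs() {
    tmpdir=${1:-/tmp}
    find "$tmpdir" -maxdepth 1 -name 'openmpi*' -user "$(id -u)" \
        -exec rm -rf {} +
}

# Same order as the original script: daemons first, then temp files.
cleanup_ompi() {
    for name in orted orte_server sdm gdb; do
        kill_if_running "$name"
    done
    remove_openmpi_dirs "${1:-/tmp}"
}
```

Calling cleanup_ompi with no argument cleans /tmp; passing a directory is mainly useful for trying it out on a scratch copy first.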



...Beth

Beth Tibbitts  (859) 243-4981  (TL 545-4981)
IBM T.J.Watson Research Center
Mailing Address:  IBM Corp., 455 Park Place, Lexington, KY 40511

_______________________________________________
ptp-dev mailing list
[hidden email]
https://dev.eclipse.org/mailman/listinfo/ptp-dev

Re: PTP running on my second Linux box

Clement Chu-3
Hi Beth,

Beth Tibbitts wrote:

>I just got PTP running on my second Linux machine (IBM's version of Red Hat
>- 4.2.1.6, I think).
>I built OpenMPI and the PTP C projects from the command line.
>It was a lot easier this time.  Experience, I guess.
>
>Observations:
>1. The C projects (orte, sdm, proxy, utils) all show problems in the
>Eclipse PDE in my workspace.
>      The complaints are "No rule to make target" for 'all' and 'clean'.
>      I built them from the command line, though, and am ignoring these.
>Seems to be OK.
>2. The first time I brought it up, it complained about not finding ORTE and
>set everything to simulated mode.
>      However, I changed it all back to ORTE, ran my cleanupOmpi script
>(kills orte daemons and deletes temp files),
>      relaunched, and all was well.
>3. I had to remember to set the debugger to SDM in the launch
>configuration.  It always defaults to nothing.
>4. I have to keep remembering to run cleanupOmpi after each run, and then
>relaunch my runtime workspace (the one that runs PTP).
>      The user sees "Invocation Target Exception", which is the symptom of
>needing to do this.
>  
>
Could you send me the error message for the Invocation Target Exception? I
think you can find it in the "Error Log" view. Thanks.

>      We could probably catch this and give a more meaningful error
>message.
>      When my committer status is final (REAL SOON NOW), I could probably
>do that.
>5. I get a lot of "workspace in use" errors and have to delete the .lock
>file in my runtime workspace.
>      I don't know if that has something to do with PTP, or is just
>flakiness in Eclipse on Linux.
>
>I haven't run anything more than "hello MPI world,"  but I'm a happy camper
>right now.
>
>=============cleanupOmpi.sh==============  per Greg's instructions a while
>back.
>#!/bin/sh
>echo OpenMPI cleanup for PTP...
>echo killing orted processes
>killall -9 orted
>
>echo killing orte_server processes
>killall -9 orte_server
>
>echo killing sdm or gdb processes if we were debugging
>killall -9 sdm gdb
>
>echo Remove any directories starting with openmpi in /tmp
>rm -rf /tmp/openmpi*
>

Regards,
Clement

--
Clement Kam Man Chu
Research Assistant
School of Computer Science & Software Engineering
Monash University, Caulfield Campus
Ph: 61 3 9903 1964


Re: PTP running on my second Linux box

Greg Watson-2

On Feb 2, 2006, at 3:37 PM, Clement Chu wrote:

> Hi Beth,
>
> Beth Tibbitts wrote:
>
>> I just got PTP running on my second Linux machine (IBM's version of Red Hat
>> - 4.2.1.6, I think).
>> I built OpenMPI and the PTP C projects from the command line.
>> It was a lot easier this time.  Experience, I guess.
>>
>> Observations:
>> 1. The C projects (orte, sdm, proxy, utils) all show problems in the
>> Eclipse PDE in my workspace.
>>      The complaints are "No rule to make target" for 'all' and 'clean'.
>>      I built them from the command line, though, and am ignoring these.
>> Seems to be OK.

I think this happens because Eclipse tries to build all the projects
automatically, but because configure hasn't been run there are no
Makefiles yet. I create an external launch configuration to run
configure; then you can run the make targets, all from within Eclipse.
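
From the command line, the same configure-then-make order could be sketched like this. The function name is made up for illustration, and it assumes each C project directory contains a configure script that generates the Makefile; the MAKE variable is only there so a different make program can be substituted.

```shell
#!/bin/sh
# Sketch only: configure_and_make is a hypothetical helper name.
# Run ./configure in a project directory only when no Makefile exists yet,
# then build the requested make target (default: all).
configure_and_make() {
    dir=$1
    target=${2:-all}
    (
        cd "$dir" || exit 1
        if [ ! -f Makefile ]; then
            echo "no Makefile in $dir, running configure"
            ./configure || exit 1
        fi
        # MAKE may name an alternative make program; defaults to make.
        ${MAKE:-make} "$target"
    )
}
```

For example, `configure_and_make orte` followed by `configure_and_make orte clean` would cover the two targets Eclipse complains about, assuming the project directory is named orte.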

>> 2. The first time I brought it up, it complained about not finding ORTE and
>> set everything to simulated mode.
>>      However, I changed it all back to ORTE, ran my cleanupOmpi script
>> (kills orte daemons and deletes temp files),
>>      relaunched, and all was well.

Nathan, can you comment on this?

>> 3. I had to remember to set the debugger to SDM in the launch
>> configuration.  It always defaults to nothing.

Clement, I think this should default to the SDM.

>> 4. I have to keep remembering to run cleanupOmpi after each run, and then
>> relaunch my runtime workspace (the one that runs PTP).

I should check in the clean script, since it's going to be needed  
until they fix the problems...
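
Until then, one way to avoid forgetting the cleanup is a small wrapper that runs it after every launch. This is only a sketch: the function name and the CLEANUP variable are made up for illustration, and it assumes the cleanup script is saved as cleanupOmpi.sh in the current directory unless CLEANUP says otherwise.

```shell
#!/bin/sh
# Sketch only: run_then_cleanup and CLEANUP are hypothetical names.
# Run the given command, always run the cleanup script afterwards,
# and return the command's own exit status.
run_then_cleanup() {
    "$@"
    status=$?
    sh "${CLEANUP:-./cleanupOmpi.sh}"
    return "$status"
}
```

It would be used as `run_then_cleanup <launch command>`, so the cleanup happens whether or not the launch succeeded.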

>>      The user sees "Invocation Target Exception", which is the symptom of
>> needing to do this.
>>
> Could you send me the error message for the Invocation Target Exception? I
> think you can find it in the "Error Log" view. Thanks.
>
>>      We could probably catch this and give a more meaningful error
>> message.
>>      When my committer status is final (REAL SOON NOW), I could probably
>> do that.

Yes, it would be nice to catch this error and cancel the launch. I  
think Clement has put in some code to do the cancel, so it should  
just be a matter of catching this exception more gracefully.

>> 5. I get a lot of "workspace in use" errors and have to delete the .lock
>> file in my runtime workspace.

I don't have this problem unless I kill Eclipse with a kill -9. When  
are you seeing it?
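
If it does turn out to be a leftover lock from a killed Eclipse, a check like the sketch below would at least avoid deleting a lock that a running process still has open. The function name is made up for illustration, and it assumes fuser (from the same psmisc package as killall) is installed.

```shell
#!/bin/sh
# Sketch only: remove_stale_lock is a hypothetical helper name.
# Remove a lock file only when no running process has it open.
remove_stale_lock() {
    lock=$1
    if [ ! -e "$lock" ]; then
        echo "no lock file at $lock"
        return 0
    fi
    if ! command -v fuser >/dev/null 2>&1; then
        echo "fuser not found; leaving $lock alone" >&2
        return 2
    fi
    # fuser exits 0 when at least one process is using the file.
    if fuser "$lock" >/dev/null 2>&1; then
        echo "$lock is still open in a running process; not removing" >&2
        return 1
    fi
    rm -f "$lock"
    echo "removed stale lock $lock"
}
```

The argument is the full path to the .lock file; I am assuming it lives under .metadata in the runtime workspace, so check the actual location before relying on this.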

>>      I don't know if that has something to do with PTP, or is just
>> flakiness in Eclipse on Linux.
>>
>> I haven't run anything more than "hello MPI world," but I'm a happy camper
>> right now.

Cool!

Greg

Re: PTP running on my second Linux box

Nathan DeBardeleben


Greg Watson wrote:
>>> 2. The first time I brought it up, it complained about not finding
>>> ORTE and
>>> set everything to simulated mode.
>>>      However, I changed it all back to ORTE, ran my cleanupOmpi script
>>> (kills orte daemons and deletes temp files),
>>>      relaunched, and all was well.
>
> Nathan, can you comment on this?
>
When you bring PTP up in a clean workspace (such as for the first time),
it defaults to the ORTE monitoring and control systems.  However, it prompts
the user to provide the paths to their orted and orte_server binaries.
It then tries to execute them when you 'OK' that dialog box, and if that
fails it reverts to the simulation.
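
To rule out a bad path before blaming the startup code, a quick check from a shell might look like the sketch below. It is not PTP's actual logic, just the same decision written out: the function name is made up for illustration, and it only tests that both paths are executable files rather than really starting them.

```shell
#!/bin/sh
# Sketch only: choose_mode is a hypothetical name, not part of PTP.
# Print "orte" when both paths name executable files; otherwise report
# the offending path on stderr and print "simulated".
choose_mode() {
    orted_path=$1
    server_path=$2
    for bin in "$orted_path" "$server_path"; do
        if [ ! -x "$bin" ] || [ -d "$bin" ]; then
            echo "cannot execute $bin" >&2
            echo simulated
            return 0
        fi
    done
    echo orte
}
```

Running it with the two paths typed into the dialog shows which one, if either, would trigger the fallback.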

If I understand Beth correctly, she's saying that she put in the correct
paths, OKed the dialog, and it still went to the simulation.  This MAY
be a change in the way the systems start up since Clement's change but
I'd really need the stdout/console log to debug this.  Any chance you
can reproduce it and send it along, Beth?

-- Nathan
Correspondence
---------------------------------------------------------------------
Nathan DeBardeleben, Ph.D.
Los Alamos National Laboratory
Parallel Tools Team
High Performance Computing Environments
phone: 505-667-3428
email: [hidden email]
---------------------------------------------------------------------


Re: PTP running on my second Linux box

Nathan DeBardeleben


Nathan DeBardeleben wrote:

>
>
> Greg Watson wrote:
>>>> 2. The first time I brought it up, it complained about not finding
>>>> ORTE and
>>>> set everything to simulated mode.
>>>>      However, I changed it all back to ORTE, ran my cleanupOmpi script
>>>> (kills orte daemons and deletes temp files),
>>>>      relaunched, and all was well.
>>
>> Nathan, can you comment on this?
>>
> When you bring PTP up in a clean workspace (such as for the first
> time), it defaults to the ORTE monitoring and control systems.  However, it
> prompts the user to provide the paths to their orted and orte_server
> binaries.  It then tries to execute them when you 'OK' that dialog box,
> and if that fails it reverts to the simulation.
>
> If I understand Beth correctly, she's saying that she put in the
> correct paths, OKed the dialog, and it still went to the simulation.  
> This MAY be a change in the way the systems start up since Clement's
> change but I'd really need the stdout/console log to debug this.  Any
> chance you can reproduce it and send it along, Beth?
>
I just saw this error myself, Beth - or at least a new error that looks
like the one you described. :)
It's definitely something that's come up with Clement's recent change
not to load the monitoring / control system until it's really needed.  
It's just something I need to change about the way I do things under the
new methodology.  I'll get on it ASAP.

-- Nathan
Correspondence
---------------------------------------------------------------------
Nathan DeBardeleben, Ph.D.
Los Alamos National Laboratory
Parallel Tools Team
High Performance Computing Environments
phone: 505-667-3428
email: [hidden email]
---------------------------------------------------------------------
