IRemoteProcess.isCompleted occasionally fails to report process completion

David Wootton-2

I'm fixing the hangs that occur when using the LSF target configuration, and I have them mostly fixed. One remaining problem is that occasionally the remote process (bqueues -w) exits but IRemoteProcess.isCompleted() still returns false. As a result, my code loops forever waiting for process completion and the run configuration dialog is locked. I can clear the locked state by clicking the red cancel button at the bottom of the dialog.

The loop I have to wait for process completion is

for (;;) {
    if (process.isCompleted()) {
        break;
    }
    if (monitor.isCanceled()) {
        process.destroy();
        return new Status(IStatus.CANCEL, Activator.PLUGIN_ID, CANCELED, Messages.CommandCancelMessage, null);
    }
    try {
        Thread.sleep(1000);
    } catch (InterruptedException e) {
        // Do nothing, sleep just ends early
    }
}
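As a hedged aside, a loop like the one above can be given an upper bound so that a completion flag that never flips cannot lock the dialog indefinitely. The sketch below is not the code from this thread: BoundedWait, its Outcome enum, and the timeout and poll parameters are hypothetical names chosen for illustration, and the completion and cancel checks are passed in as plain suppliers so the example stays self-contained.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a completion check but gives up after a deadline. */
final class BoundedWait {
    enum Outcome { COMPLETED, CANCELED, TIMED_OUT }

    static Outcome await(BooleanSupplier completed, BooleanSupplier canceled,
                         long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (completed.getAsBoolean()) {
                return Outcome.COMPLETED;
            }
            if (canceled.getAsBoolean()) {
                return Outcome.CANCELED;
            }
            // Overflow-safe comparison against the deadline.
            if (System.nanoTime() - deadline >= 0) {
                return Outcome.TIMED_OUT;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and treat it as a cancel request.
                Thread.currentThread().interrupt();
                return Outcome.CANCELED;
            }
        }
    }
}
```

Under these assumptions a caller could write BoundedWait.await(process::isCompleted, monitor::isCanceled, 30_000, 1_000) and call process.destroy() for any outcome other than COMPLETED. A timeout only limits the damage; it does not explain why the completion flag is never set.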

I see comments in the IRemoteProcess source warning that isCompleted() and waitFor() may not work correctly if the calling thread does not read the stderr or stdout streams and the JSch process implementation is used (which appears to be my case, since I see that the process builder is a JSchProcessBuilder). However, in my case I have reads pending on both the stderr and stdout streams for at least one byte, but I am issuing those reads on different threads from the one where the remote process was created. (I'm reading on separate threads so that my code does not block if the remote process writes so much data to either stream that the stream buffer fills and the process blocks until something reads from the stream to empty the buffer. That fixes most of the hangs.)
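For readers following along, here is a minimal sketch of the "read each stream on its own thread" idea described above. It is not the reader code from this post, which is not shown in the thread; StreamDrainer and its method names are hypothetical. It reads until end-of-stream rather than for a single byte, which is one way to make sure the buffer keeps being emptied.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not from the post): drains one stream to EOF on its own thread. */
final class StreamDrainer {
    private final ByteArrayOutputStream captured = new ByteArrayOutputStream();
    private final Thread worker;

    private StreamDrainer(InputStream in, String threadName) {
        worker = new Thread(() -> copy(in), threadName);
        worker.setDaemon(true); // never keeps the JVM alive on its own
    }

    /** Creates the drainer and starts reading immediately. */
    static StreamDrainer start(InputStream in, String threadName) {
        StreamDrainer drainer = new StreamDrainer(in, threadName);
        drainer.worker.start();
        return drainer;
    }

    private void copy(InputStream in) {
        byte[] buffer = new byte[4096];
        try {
            int n;
            // Keep reading until read() reports end-of-stream (-1).
            while ((n = in.read(buffer)) != -1) {
                captured.write(buffer, 0, n);
            }
        } catch (IOException e) {
            // A closed or broken stream ends the drain; captured bytes are kept.
        }
    }

    /** Waits up to timeoutMillis (must be > 0) and reports whether EOF was reached. */
    boolean awaitEof(long timeoutMillis) {
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    /** Returns everything captured so far, decoded as UTF-8 (an assumption). */
    String text() {
        return new String(captured.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With a real remote process, one drainer would presumably be started per stream (stdout and stderr) right after the process is created. If awaitEof returns false after the process is known to have exited, that suggests the stream was never closed on the writing side.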

I'm not sure what's going on here to cause the hang. I'm wondering if my InputStream objects need a synchronized attribute because they're being used on a different thread, but that also makes no sense, since my InputStream variable is not visible to anything other than my code reading the stream.

Any thoughts or suggestions about what might be going on?

Thanks

Dave


_______________________________________________
ptp-dev mailing list
[hidden email]
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/ptp-dev


Re: IRemoteProcess.isCompleted occasionally fails to report process completion

Greg Watson-2
Hi Dave,

Off the top of my head I don't know, but JSch is a nasty piece of work. Can you see if it's stuck in the JSch code somewhere?

Regards,
Greg

In reply to David Wootton's post of Jan 23, 2018, 3:00 PM.

Re: IRemoteProcess.isCompleted occasionally fails to report process completion

David Wootton-2

Greg
I suspended each thread in the Eclipse debugger once I had a hung run configuration dialog.

Both my reader threads are waiting.

I expected these threads to have exited by this point, since the remote process was gone and the associated write-side file descriptors should have been closed, causing the pending read to end, at least on Linux. I'm running Eclipse on Windows, so maybe file descriptor behavior there is different.

The thread that looks like it might be a connection thread seems to be looping in PipedInputStream.awaitSpace, since I can single-step through it. There is a wait there, with a 1 second timeout.
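As background on that stack frame: in the OpenJDK sources, awaitSpace is the private method of java.io.PipedInputStream that the writing side ends up in when the pipe's buffer (1024 bytes by default) is full, and it waits in one-second slices until a reader makes room. The sketch below only reproduces that general behavior with plain JDK pipes. Whether the connection thread seen here is blocked for the same reason is an assumption, and PipeBackpressureDemo is a hypothetical name.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;

/** Hypothetical demo: a pipe writer stalls until someone reads from the pipe. */
final class PipeBackpressureDemo {
    /** Returns { writer was blocked before draining, writer finished after draining }. */
    static boolean[] run() {
        final int total = 4096; // larger than the default 1024-byte pipe buffer
        try {
            PipedInputStream in = new PipedInputStream();
            PipedOutputStream out = new PipedOutputStream(in);
            Thread writer = new Thread(() -> {
                try (PipedOutputStream o = out) {
                    o.write(new byte[total]); // blocks once the buffer is full
                } catch (IOException e) {
                    // Pipe broken; nothing more to do in this sketch.
                }
            }, "pipe-writer");
            writer.setDaemon(true);
            writer.start();

            // Nobody is reading yet, so the writer should still be waiting for space.
            writer.join(500);
            boolean blockedBeforeDrain = writer.isAlive();

            // Drain everything the writer wants to send.
            byte[] buffer = new byte[512];
            int read = 0;
            while (read < total) {
                int n = in.read(buffer);
                if (n < 0) {
                    break;
                }
                read += n;
            }

            writer.join(5000);
            boolean finishedAfterDrain = !writer.isAlive();
            return new boolean[] { blockedBeforeDrain, finishedAfterDrain };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

If the same mechanism is at work here, a thread sitting in awaitSpace would mean some pipe is full and is not being read, which may be worth checking against the reader threads that are reported as waiting.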

The Session class is com.jcraft.jsch.Session

I suspended a few other threads and did not see anything that looked like JSch. I avoided classes that had labels/names that looked like internal Eclipse threads or other unrelated plugins.

Dave



In reply to Greg Watson's post of 01/23/2018 10:57 PM.

Re: IRemoteProcess.isCompleted occasionally fails to report process completion

Greg Watson-2
Dave,

Is there anything still running on the remote end? For example, is there a shell process? You could try killing it to see if that terminates the session.

Another thought. Do you know if the remote process is using a PTY or not?

You might ultimately need to do something hackish, like adding 'echo FOO' to the command and checking to see when FOO comes back.

Greg


In reply to David Wootton's post of Jan 24, 2018, 7:24 AM.

Re: IRemoteProcess.isCompleted occasionally fails to report process completion

David Wootton-2

Greg
The only process left hanging around on the remote side is an sshd process that I think gets created when the profile configuration dialog is opened, after I'm prompted to connect to the remote node. I'm guessing that the Eclipse code keeps that sshd process to maintain the connection to the remote system, instead of creating a new sshd process for every interaction with the remote system. I'm further guessing that somewhere in the ssh message exchange something is missing the fact that the remote bqueues command exited, and fails to post or recognize a completion status.

When the bqueues command gets invoked I see a 'bash' process and a 'bqueues' process.

I killed the sshd process and that didn't change anything.

I'm not sure if the sshd process is using a pty. I think it isn't since 'ps -u dwootton' shows '?' in the TTY column. I have a second sshd process because I ssh to the node and that process has 'pts/40' in the TTY column.

I looked at the /proc/<pid>/cmdline file for both sshd processes. My sshd process has 'dwootton@pts/40' and the Eclipse-initiated sshd process has 'dwootton@notty'.

I tried looking at other /proc/<pid> files to see what else I could discover, but sshd apparently is setuid root, starts as root and then does a setuid to dwootton after setting everything up, so I can't view many of the useful /proc files, and I can't attach gdb to the sshd process to see what it's doing.

I tried killing just the Eclipse thread that looked like it was the sshd connection thread, but I apparently can't kill just a single thread. I clicked 'Terminate' in the popup menu I got when I right-clicked the process, but that killed the entire Eclipse runtime instance.

Doing something like sending 'bqueues -w ; echo EOF' probably isn't going to work for me. The LSF target system configuration code issues a few LSF commands, and it appears that LSF is not particularly consistent about where it sends normal messages and error messages. I see some messages I would consider normal progress messages in the stderr stream. Also, I have some normal completions where nothing is apparently written to stdout, for instance if I have no LSF reservations.

Because of this, I'm trying to determine success or failure by getting the process exit status and checking for zero or non-zero. If I'm looking for text in the stderr and/or stdout streams to determine successful completion, I suspect that's going to be a problem, since I can't tell from the returned text whether I got success or failure. Maybe if I do something like 'bqueues -w ; echo "EOF:$?"' then I get the status. Maybe I'll try that if we can't figure out what's going wrong with sshd.
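A rough sketch of how the "EOF:$?" idea could be consumed on the Java side, assuming the marker line arrives on stdout after whatever the LSF command itself printed. ExitSentinel and the exact marker format are hypothetical and would have to match whatever the remote command actually echoes.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the exit status from a trailing "EOF:<status>" line. */
final class ExitSentinel {
    // A line consisting only of EOF:<1-3 digits>, allowing trailing whitespace or CR.
    private static final Pattern MARKER = Pattern.compile("(?m)^EOF:(\\d{1,3})\\s*$");

    /** Returns the status from the last marker line, or empty if no marker was seen. */
    static OptionalInt exitStatus(String stdout) {
        Matcher matcher = MARKER.matcher(stdout);
        OptionalInt last = OptionalInt.empty();
        while (matcher.find()) {
            last = OptionalInt.of(Integer.parseInt(matcher.group(1)));
        }
        return last;
    }
}
```

Because the marker is printed by echo rather than by the LSF command, it should appear even when the command itself writes nothing to stdout. An empty result then means the command has not finished, or its output has not been fully read yet.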

Finally, I don't know if I have a consistent pattern, but I seem to get a hang the first time I issue a bqueues command in a session and then intermittently after that.

Dave


In reply to Greg Watson's post of 01/24/2018 12:10 PM.

Re: IRemoteProcess.isCompleted occasionally fails to report process completion

David Wootton-2
In reply to this post by Greg Watson-2

Greg
I tried adding an echo command to the bqueues command and I am not having any success. My original bqueues command that I was passing to the IRemoteProcessBuilder was a String array {"bqueues", "-l"}.

I changed that to {"bqueues", "-l", ";", "echo", "\"EOF:$?\""} and that failed with an LSF error message that there was no such queue as ";", so the semicolon is being passed as a command parameter to the bqueues command instead of as a command separator for bash.

I tried changing ";" to "\\;" to escape the semicolon and it was still passed as a bqueues command parameter, this time '\;'.

I was able to get the pid of the bash process started to run the bqueues command one time when my original bqueues command hung, and it looks like the command being passed across is actually /bin/bash -l -c cd /autofs/home/dwootton && bqueues -l, where "cd /autofs/home/dwootton && bqueues -l" is probably a single string parameter to the bash -c option (which tells bash to use the string as the command to run).
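One possible way around the quoting problem described above, offered only as an untested sketch: hand the whole command line, including the semicolon and the echo, to a shell as the single argument of its -c option, so that the separator is interpreted by that shell rather than passed to bqueues. How the process builder quotes its arguments is not shown in this thread, so whether this survives the outer "bash -l -c" wrapper is an assumption. SentinelCommand and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: wraps a command so a shell appends "EOF:<exit status>". */
final class SentinelCommand {
    /** Single-quotes one argument for a POSIX shell. */
    static String quote(String arg) {
        return "'" + arg.replace("'", "'\\''") + "'";
    }

    /** Builds argv of the form: /bin/sh -c "<quoted command>; echo \"EOF:$?\"". */
    static List<String> wrap(List<String> command) {
        String script = command.stream()
                .map(SentinelCommand::quote)
                .collect(Collectors.joining(" "))
                + "; echo \"EOF:$?\"";
        return Arrays.asList("/bin/sh", "-c", script);
    }
}
```

For {"bqueues", "-l"} this yields the three arguments /bin/sh, -c, and 'bqueues' '-l'; echo "EOF:$?". If the builder instead joins arguments without quoting them, the script would be split again on the remote side, so the result needs to be checked against the command line that actually shows up in the remote process list.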

So I'm not sure how I can get this hack to work. I think I have a way to deal with the return status in my Java code, but I'm stuck at getting a working command to pass across to the remote system.

Dave


Re: IRemoteProcess.isCompleted occaisionally fails to report process completion

Greg Watson-2
What happens if you try a single-quoted argument, e.g. 'bqueues -l; echo EOF:$?'?

Greg

On Jan 25, 2018, at 5:05 PM, David Wootton <[hidden email]> wrote:

Greg
I tried adding an echo command to the bqueues command and I am not having any success. My original bqueues command that I was passing to the IRemoteProcessBuilder was a String array {"bqueues", "-l"}.

I changed that to {"bqueues", "-l", ";", "echo", "\"EOF:$?\""} and that failed with an LSF error message that there was no such queue as ";", so the semicolon is being passed as a command parameter to the bqueues command instead of acting as a command separator for bash.

I tried changing ";" to "\\;" to escape the semicolon, and it was still passed as a bqueues command parameter, this time as '\;'.

I was able to get the pid of the bash process started to run the bqueues command one time when my original bqueues command hung, and it looks like the command being passed across is actually /bin/bash -l -c cd /autofs/home/dwootton && bqueues -l, where "cd /autofs/home/dwootton && bqueues -l" is probably a single string parameter to the bash -c option (which tells bash to use that string as the command to run).
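The difference between an argument array and a shell command string can be seen locally with the standard java.lang.ProcessBuilder. This sketch is an assumption-laden illustration only: it runs on a POSIX machine with sh and echo available, it uses echo in place of bqueues, and it says nothing definite about how IRemoteProcessBuilder quotes its arguments before handing them to the remote bash. ArgvVersusShell is a made-up name.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical local demo; requires a POSIX sh and echo on the PATH.
final class ArgvVersusShell {
    /** Runs argv locally and returns combined stdout/stderr as text. */
    static String run(String... argv) {
        try {
            Process p = new ProcessBuilder(argv).redirectErrorStream(true).start();
            p.getOutputStream().close(); // the child gets no stdin
            byte[] out;
            try (InputStream in = p.getInputStream()) {
                out = in.readAllBytes(); // Java 9 or later
            }
            p.waitFor();
            return new String(out, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Each array element is one literal argument: ";" is just text for echo.
        System.out.print(run("echo", "a", ";", "echo", "b"));
        // Only a shell parsing a single command string treats ";" as a separator.
        System.out.print(run("sh", "-c", "echo a; echo EOF:$?"));
    }
}
```

In the first call the semicolon reaches the program as an ordinary argument, much like the "no such queue" error above; in the second, the whole command line travels as one argument to sh -c and the shell splits it.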

So I'm not sure how I can get this hack to work. I think I have a way to deal with the return status in my Java code, but I'm stuck at getting a working command to pass across to the remote system.

Dave

On Jan 24, 2018, at 12:10 PM, Greg Watson <[hidden email]> wrote:

Dave,

Is there anything still running on the remote end? e.g. is there a shell process? You could try killing it to see if that terminates the session.

Another thought. Do you know if the remote process is using a PTY or not?

You might ultimately need to do something hackish, like adding 'echo FOO' to the command and checking to see when FOO comes back.

Greg

      On Jan 24, 2018, at 7:24 AM, David Wootton <[hidden email]> wrote:

      Greg
      I suspended each thread in the Eclipse debugger once I had a hung run configuration dialog.

      Both my reader threads are waiting.

      I expected these threads to have exited at this point, since the remote process was gone and the associated write-side file descriptors should have been closed, causing the pending reads to end, at least on Linux. I'm running Eclipse on Windows, so maybe file descriptor behavior there is different.


      The thread that looks like it might be a connection thread seems to be looping in PipedInputStream.awaitSpace, since I can single-step through it. There is a wait there, with a 1-second timeout.

      The Session class is com.jcraft.jsch.Session.
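For context on the awaitSpace observation: in the OpenJDK sources I am familiar with, awaitSpace() is the private java.io.PipedInputStream method in which the writing side waits while the pipe's buffer is full, waking about once a second to re-check. If that is what the connection thread is doing, it would suggest a pipe that nobody is emptying rather than a reader starved of data. The sketch below uses only plain java.io piped streams and does not touch JSch or IRemoteProcess; FullPipeDemo, Result and the sizes are made up for illustration.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo of a writer stalling on a full java.io pipe.
final class FullPipeDemo {
    static final class Result {
        final boolean writerBlocked; // writer still alive before anyone read
        final int bytesRead;         // bytes seen once the pipe was drained
        Result(boolean writerBlocked, int bytesRead) {
            this.writerBlocked = writerBlocked;
            this.bytesRead = bytesRead;
        }
    }

    static Result run(int pipeSize, int payloadSize) {
        try {
            PipedInputStream in = new PipedInputStream(pipeSize);
            PipedOutputStream out = new PipedOutputStream(in);
            Thread writer = new Thread(() -> {
                // Closing the output stream is what lets the reader see end of stream.
                try (PipedOutputStream o = out) {
                    o.write(new byte[payloadSize]);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }, "pipe-writer");
            writer.setDaemon(true);
            writer.start();

            // Give the writer time to fill the buffer; nobody is reading yet.
            writer.join(500);
            boolean blocked = writer.isAlive();

            // Drain the pipe; this is what releases a stalled writer.
            int total = 0;
            byte[] buf = new byte[Math.max(1, pipeSize)];
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
            writer.join();
            return new Result(blocked, total);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Result r = run(16, 64);
        System.out.println("writer blocked before drain: " + r.writerBlocked
                + ", bytes read: " + r.bytesRead);
    }
}
```

With a 16-byte pipe and a 64-byte payload the writer cannot finish until something reads; with a pipe larger than the payload it finishes immediately.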


      I suspended a few other threads and did not see anything that looked like JSch. I skipped threads whose labels/names looked like internal Eclipse threads or other unrelated plugins.


      Dave

      On Jan 23, 2018, at 10:57 PM, Greg Watson <[hidden email]> wrote:

      Hi Dave,


      Off the top of my head I don't know, but Jsch is a nasty piece of work. Can you see if it's stuck in the Jsch code somewhere?


      Regards,
      Greg

Re: IRemoteProcess.isCompleted occaisionally fails to report process completion

David Wootton-2

Greg
That doesn't work. The result is that the name of the remote command is the complete string 'bqueues -l; echo EOF:$?'

I thought I could make this work by running a wrapper script, for instance /home/dwootton/hack on the remote node, where the script is
#!/bin/sh
echo "Execute: " $*
$*
echo "EOF:$?"

And then changing the invocation command in my Eclipse code to 'private static final String bqueuesCommand[] = {"/home/dwootton/hack", "bqueues", "-l"};'

The idea is that the hack script just executes exactly what it is passed.
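On the Java side, the marker printed by such a wrapper could be picked out of the captured stdout along these lines. This is a sketch under assumptions: EofMarker is a made-up name, and it assumes the wrapper prints EOF:<status> on a line of its own, as the script above does.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for the "EOF:<status>" convention used by the wrapper script.
final class EofMarker {
    // A whole line of the form EOF:<0..999>; MULTILINE lets ^ and $ match per line.
    private static final Pattern MARKER = Pattern.compile("^EOF:(\\d{1,3})$", Pattern.MULTILINE);

    /** Returns the status from the last marker line, or empty if no marker was seen. */
    static OptionalInt exitStatus(CharSequence output) {
        Matcher m = MARKER.matcher(output);
        OptionalInt last = OptionalInt.empty();
        while (m.find()) {
            last = OptionalInt.of(Integer.parseInt(m.group(1)));
        }
        return last;
    }

    public static void main(String[] args) {
        String captured = "Execute:  bqueues -l\nQUEUE: normal\nEOF:0\n";
        System.out.println(exitStatus(captured));
    }
}
```

An empty result means the marker never arrived, which separates "the command is still running or its output was lost" from "the command finished with status N".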

This works correctly most of the time. However, when the bqueues command disappears, I still get absolutely no output to stdout, not even the text from my hack script.

It looks like something is just completely losing track of the remote command request in this case.

Dave




Re: IRemoteProcess.isCompleted occaisionally fails to report process completion

Greg Watson-2
Dave,

What do you mean by "when the bqueues command disappears"?

Greg


Re: IRemoteProcess.isCompleted occaisionally fails to report process completion

David Wootton-2

Greg
I have a console session open on the login node where the bqueues command runs. Once I click the List button in the run configuration dialog, I periodically issue a 'ps -u dwootton' command to see which of my processes are running on the login node. I see the bqueues command and my hack script running for a while, which is what I expect. But then I issue the ps command again and see that both the bqueues command and the hack script have terminated, yet no output from stdout or stderr is displayed in my Eclipse console view. When the bqueues command works correctly, I see text coming back from the remote command on stdout or stderr, sometimes both. That's why I think something is losing track of the command invocation: I should see at least the messages from my hack script, which are issued unconditionally before and after the bqueues command runs.

Dave
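
Editorial sketch: the wait loop posted at the start of this thread polls isCompleted() with no upper bound, so a completion that is never reported locks the dialog until the user cancels. One possible mitigation, not something proposed by either poster, is to give the loop a deadline. The names WaitResult, CompletionWait, and poll below are hypothetical, and the two BooleanSupplier arguments stand in for process.isCompleted() and monitor.isCanceled() so that the example runs without Eclipse on the classpath.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Outcome of a bounded wait for a remote command (hypothetical type). */
enum WaitResult { COMPLETED, CANCELED, TIMED_OUT }

final class CompletionWait {
    /**
     * Polls until isCompleted reports true, isCanceled reports true, or
     * timeoutMillis has elapsed, whichever happens first.
     */
    static WaitResult poll(BooleanSupplier isCompleted, BooleanSupplier isCanceled,
                           long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        for (;;) {
            if (isCompleted.getAsBoolean()) {
                return WaitResult.COMPLETED;
            }
            if (isCanceled.getAsBoolean()) {
                return WaitResult.CANCELED;
            }
            if (System.nanoTime() - deadline >= 0) {
                return WaitResult.TIMED_OUT;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Assumption: an interrupt is treated as a cancel request here,
                // unlike the original loop, which ignored it and kept polling.
                Thread.currentThread().interrupt();
                return WaitResult.CANCELED;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for process::isCompleted: reports completion after about 200 ms.
        BooleanSupplier done = () -> System.currentTimeMillis() - start >= 200;
        System.out.println(poll(done, () -> false, 2_000, 50));
        // Stand-in for a process whose completion is never reported.
        System.out.println(poll(() -> false, () -> false, 300, 50));
    }
}
```

In the plug-in code the call would presumably look like poll(process::isCompleted, monitor::isCanceled, timeout, 1000), with process.destroy() called on CANCELED or TIMED_OUT. A deadline only keeps the dialog from locking; it does not explain why the completion is lost.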


Re: IRemoteProcess.isCompleted occaisionally fails to report process completion

Greg Watson-2
Maybe it's a timing issue. What happens if you add 'sleep 5' to the end of the script?

Greg


Re: IRemoteProcess.isCompleted occaisionally fails to report process completion

David Wootton-2

Greg
I added a sleep just before the exit in the script and that makes no difference. I didn't expect any difference, since this execution path should be entirely synchronous code. I expect sshd issues a fork, exec, and wait to invoke the hack script, and then bash does the same when invoking the bqueues command.

The only inconsistent behavior I'm seeing is that sometimes the bqueues command itself times out because LSF daemons apparently aren't responding. But that's all internal to the bqueues command and I do get completion status reported all the way back to my Eclipse code where the return status says the bqueues command exited with rc=255.

I realized the bqueues command could be exiting with some odd return code, so I added an echo statement to my hack script to write the return code to a file on the remote system, and the return code was always zero.

Dave
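
Editorial sketch: the wrapper script described in this thread finishes by echoing "EOF:" followed by the exit status of the wrapped command. Assuming that marker arrives on stdout, the reading side could detect completion and recover the status from the stream itself instead of relying on isCompleted(). The class and method names below are hypothetical, and a StringReader stands in for the remote process's stdout so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.OptionalInt;

final class ExitMarkerParser {
    /** Marker prefix written by the wrapper script's final echo. */
    private static final String MARKER = "EOF:";

    /**
     * Reads lines until a line of the form "EOF:<number>" is seen and returns
     * that number. Returns an empty result if the stream ends without a marker.
     */
    static OptionalInt readExitStatus(Reader stdout) {
        BufferedReader reader = new BufferedReader(stdout);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.startsWith(MARKER)) {
                    continue; // ordinary command output
                }
                try {
                    String status = line.substring(MARKER.length()).trim();
                    return OptionalInt.of(Integer.parseInt(status));
                } catch (NumberFormatException e) {
                    // Starts with "EOF:" but carries no number: treat as ordinary output.
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Sample text shaped like the wrapper script's output; the queue line is made up.
        String sample = "Execute:  bqueues -l\nQUEUE: normal\nEOF:0\n";
        System.out.println(readExitStatus(new StringReader(sample)));
    }
}
```

This only helps when the marker actually reaches the client. In the failing case reported above, not even the wrapper's own messages arrive, so an empty result would still have to be combined with a timeout.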

Inactive hide details for Greg Watson ---01/29/2018 03:53:08 PM---Maybe it's a timing issue. What happens if you add 'sleep 5' Greg Watson ---01/29/2018 03:53:08 PM---Maybe it's a timing issue. What happens if you add 'sleep 5' to the end of the script? Greg

From: Greg Watson <[hidden email]>
To: Parallel Tools Platform general developers <[hidden email]>
Date: 01/29/2018 03:53 PM
Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
Sent by: [hidden email]





Maybe it's a timing issue. What happens if you add 'sleep 5' to the end of the script?

Greg
      On Jan 29, 2018, at 2:57 PM, David Wootton <[hidden email]> wrote:

      Greg
      I have a console session open on the login node where the bqueues command runs. Once I click the List button in the run configuration dialog, I periodically issue a 'ps -u dwootton' command to see what processes are running for me on the login node. I see the bqueues command and my hack script running for a while, which is what I expect. But then I issue the ps command again and see that both the bqueues command and the hack script have terminated, but no output from stdout or stderr is displayed in my Eclipse console view. When the bqueues command works correctly, I see stdout or stderr, sometimes both, getting text back from the remote command. That's why I'm thinking something is losing track of the command invocation, since I should see at least the messages from my hack script, which are issued unconditionally before and after the bqueues command runs.


      Dave


      From: Greg Watson <[hidden email]>
      To: Parallel Tools Platform general developers <[hidden email]>
      Date: 01/29/2018 12:17 PM
      Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
      Sent by: [hidden email]





      Dave,


      What do you mean "when the bqueues command disappears"?


      Greg
              On Jan 29, 2018, at 9:30 AM, David Wootton <[hidden email]> wrote:

              Greg
              That doesn't work. The result is that the name of the remote command is the complete string 'bqueues -l; echo EOF:$?'

              I thought I could make this work by running a wrapper script, for instance /home/dwootton/hack on the remote node, where the script is
              #!/bin/sh
              echo "Execute: " $*
              $*
              echo "EOF:$?"

              And then changing the invocation command in my Eclipse code to 'private static final String bqueuesCommand[] = {"/home/dwootton/hack", "bqueues", "-l"};'

              The idea is that the hack script just executes exactly what it is passed.

              This works correctly most of the time. However, when the bqueues command disappears, I still get absolutely no output to stdout, not even the text from my hack script.

              It looks like something is just completely losing track of the remote command request in this case.

              Dave



              From: Greg Watson <[hidden email]>
              To: Parallel Tools Platform general developers <[hidden email]>
              Date: 01/26/2018 11:39 AM
              Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
              Sent by: [hidden email]






              What happens if you try a single quoted argument, e.g 'bqueues -l; echo EOF:$?'

              Greg
                              On Jan 25, 2018, at 5:05 PM, David Wootton <[hidden email]> wrote:

                              Greg
                              I tried adding an echo command to the bqueues command and I am not having any success. My original bqueues command that I was passing to the IRemoteProcessBuilder was a String array {"bqueues", "-l"}.

                              I changed that to {"bqueues", "-l", ";", "echo", "\"EOF:$?\""} and that failed with an LSF error message that there was no such queue as ";". The semicolon is being passed as a command parameter to the bqueues command instead of as a command separator for bash.

                              I tried changing ";" to "\\;" to escape the semicolon and it was still passed as a bqueues command parameter, this time '\;'.

                              I was able to get the pid of the bash process started to run the bqueues command one time when my original bqueues command was hanging. It looks like the command being passed across is actually /bin/bash -l -c cd /autofs/home/dwootton && bqueues -l, where "cd /autofs/home/dwootton && bqueues -l" is probably a single string parameter to the bash -c option (which tells bash to use the string as the command).

                              So I'm not sure how I can get this hack to work. I think I have a way to deal with the return status in my Java code, but I'm stuck at getting a working command to pass across to the remote system.

                              Dave

                              From: Greg Watson <[hidden email]>
                              To: Parallel Tools Platform general developers <[hidden email]>
                              Date: 01/24/2018 12:10 PM
                              Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
                              Sent by: [hidden email]





                              Dave,

                              Is there anything still running on the remote end? e.g. is there a shell process? You could try killing it to see if that terminates the session.

                              Another thought. Do you know if the remote process is using a PTY or not?

                              You might ultimately need to do something hackish, like adding 'echo FOO' to the command and checking to see when FOO comes back.

                              Greg
                                                              On Jan 24, 2018, at 7:24 AM, David Wootton <[hidden email]> wrote:

                                                              Greg
                                                              I suspended each thread in the Eclipse debugger once I had a hung run configuration dialog.

                                                              Both my reader threads are waiting.
                                                              <17443150.gif>
                                                              I expected these threads to have exited at this point, since the remote process was gone and the associated write-side file descriptors should have been closed, causing the pending read to end, at least on Linux. I'm running Eclipse on Windows, so maybe file descriptor behavior there is different.

                                                              The thread that looks like it might be a connection thread seems to be looping in PipedInputStream.awaitSpace, since I can single-step through it. There is a wait there, with a 1 second timeout.
                                                              <17931618.gif>
                                                              The Session class is com.jcraft.jsch.Session

                                                              I suspended a few other threads and did not see anything that looked like Jsch. I avoided classes that had labels/names that looked like internal Eclipse threads or other unrelated plugins.

                                                              Dave
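
The reader threads described above follow the usual pattern of draining a stream until read() returns -1. A minimal sketch of that pattern using only JDK classes: StreamDrainer is a hypothetical name, not a PTP class, and a ByteArrayInputStream stands in for the remote process's stdout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: drains one stream on its own thread until end of stream.
final class StreamDrainer implements Runnable {
    private final InputStream in;
    // ByteArrayOutputStream's write and toByteArray methods are synchronized.
    private final ByteArrayOutputStream captured = new ByteArrayOutputStream();
    private volatile IOException failure;

    StreamDrainer(InputStream in) {
        this.in = in;
    }

    @Override
    public void run() {
        byte[] buf = new byte[4096];
        try {
            int n;
            // read() returns -1 only when the writing side signals end of stream.
            while ((n = in.read(buf)) != -1) {
                captured.write(buf, 0, n);
            }
        } catch (IOException e) {
            failure = e; // remember the error for the caller instead of dropping it
        }
    }

    String text() {
        return new String(captured.toByteArray(), StandardCharsets.UTF_8);
    }

    IOException failure() {
        return failure;
    }

    public static void main(String[] args) throws InterruptedException {
        InputStream fakeStdout =
                new ByteArrayInputStream("line 1\nline 2\n".getBytes(StandardCharsets.UTF_8));
        StreamDrainer drainer = new StreamDrainer(fakeStdout);
        Thread reader = new Thread(drainer, "stdout-drainer");
        reader.setDaemon(true); // a stuck read must not keep the JVM alive
        reader.start();
        reader.join(5000);      // bounded join, so the caller cannot hang on the reader
        System.out.print(drainer.text());
    }
}
```

If the writing side never signals end of stream, read() stays blocked, which would be consistent with the waiting reader threads seen in the debugger.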



                                                              From: Greg Watson <[hidden email]>
                                                              To: Parallel Tools Platform general developers <[hidden email]>
                                                              Date: 01/23/2018 10:57 PM
                                                              Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
                                                              Sent by: [hidden email]







                                                              Hi Dave,

                                                              Off the top of my head I don't know, but Jsch is a nasty piece of work. Can you see if it's stuck in the Jsch code somewhere?

                                                              Regards,
                                                              Greg
                                                                                                                              On Jan 23, 2018, at 3:00 PM, David Wootton <[hidden email]> wrote:

                                                                                                                              I'm fixing the hangs using the LSF target configuration and have it mostly fixed. One problem I'm running into is that occasionally the remote process (bqueues -w) exits but the IRemoteProcess.isCompleted() method still returns false. As a result, my code loops forever waiting for process completion and the run configuration dialog is locked. I can clear the locked state by clicking the red cancel button at the bottom of the dialog.

                                                                                                                              The loop I have to wait for process completion is

                                                                                                                              for (;;) {
                                                                                                                                  if (process.isCompleted()) {
                                                                                                                                      break;
                                                                                                                                  }
                                                                                                                                  if (monitor.isCanceled()) {
                                                                                                                                      process.destroy();
                                                                                                                                      return new Status(IStatus.CANCEL, Activator.PLUGIN_ID, CANCELED, Messages.CommandCancelMessage, null);
                                                                                                                                  }
                                                                                                                                  try {
                                                                                                                                      Thread.sleep(1000);
                                                                                                                                  } catch (InterruptedException e) {
                                                                                                                                      // Do nothing, sleep just ends early
                                                                                                                                  }
                                                                                                                              }

                                                                                                                              I see comments in the IRemoteProcess source that warn that isCompleted() and waitFor() may not work correctly if the calling thread does not read the stderr or stdout streams and the JSch process implementation is used (which appears to be my case, since I see that the process builder is a JSchProcessBuilder). However, in my case I have reads pending on both the stderr and stdout streams for at least one byte, but I am issuing those reads on different threads from the one where the remote process was created. (I'm reading on separate threads to avoid my code blocking if the remote process writes so much data to either stream that the stream buffers fill and the process blocks until something reads from these streams to empty the buffer. That fixes most of the hangs.)

                                                                                                                              I'm not sure what's going on here to cause the hang. I'm wondering if my InputStream objects need a synchronized attribute because they're being used on a different thread, but that also makes no sense, since my InputStream variable is not visible to anything other than my code reading the stream.

                                                                                                                              Any thoughts or suggestions about what might be going on?

                                                                                                                              Thanks

                                                                                                                              Dave
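
One way to keep the dialog from locking up indefinitely is to give the polling loop a deadline. The following is a sketch under assumptions, not a fix for the underlying problem: BoundedWait and its Outcome enum are hypothetical names, and the two BooleanSupplier arguments stand in for process.isCompleted() and monitor.isCanceled().

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper: polls for completion, but gives up after a deadline.
final class BoundedWait {
    enum Outcome { COMPLETED, CANCELED, TIMED_OUT }

    /**
     * Polls 'completed' every pollMillis until it returns true, 'canceled'
     * returns true, or timeoutMillis has elapsed, whichever happens first.
     */
    static Outcome await(BooleanSupplier completed, BooleanSupplier canceled,
                         long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (completed.getAsBoolean()) {
                return Outcome.COMPLETED;
            }
            if (canceled.getAsBoolean()) {
                return Outcome.CANCELED;
            }
            if (System.nanoTime() - deadline >= 0) {
                return Outcome.TIMED_OUT;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and treat interruption as cancellation.
                Thread.currentThread().interrupt();
                return Outcome.CANCELED;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // Stand-in process that reports completion after roughly 200 ms.
        BooleanSupplier done =
                () -> System.nanoTime() - start > TimeUnit.MILLISECONDS.toNanos(200);
        System.out.println(await(done, () -> false, 5_000, 50));
    }
}
```

On TIMED_OUT or CANCELED the caller would still need to call process.destroy() and return a suitable status, as the original loop does on cancel.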





Re: IRemoteProcess.isCompleted occaisionally fails to report process completion

Greg Watson-2
Dave,

It's quite an involved path from the thread you have reading from the input stream to the stdout of the command on the remote machine. It's possible that the command could complete on the remote machine before the thread even starts running, though I would have thought that isCompleted() would be false if that happened. Can you add something at the end of the script to check that it ran successfully (e.g., 'echo "bqueue finished with status $?" > /tmp/script.out')?

There's not really anything that can "lose track", so I want to establish that the command is actually being run each time.

Regards,
Greg

On Jan 30, 2018, at 7:36 AM, David Wootton <[hidden email]> wrote:

Greg
I added a sleep just before the exit in the script, and that makes no difference. I didn't expect any difference, since this execution path should be entirely synchronous code. I expect sshd is issuing a fork, exec, and wait to invoke the hack script, and then bash does the same when invoking the bqueues command.

The only inconsistent behavior I'm seeing is that sometimes the bqueues command itself times out because LSF daemons apparently aren't responding. But that's all internal to the bqueues command and I do get completion status reported all the way back to my Eclipse code where the return status says the bqueues command exited with rc=255.

I realize the bqueues command could be exiting with some odd return code, so I added an echo statement to my hack script to write the return code to a file on the remote system, and the return code was always zero.

Dave


From:  Greg Watson <[hidden email]>
To:  Parallel Tools Platform general developers <[hidden email]>
Date:  01/29/2018 03:53 PM
Subject:  Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
Sent by:  [hidden email]





Maybe it's a timing issue. What happens if you add 'sleep 5' to the end of the script?

Greg
      On Jan 29, 2018, at 2:57 PM, David Wootton <[hidden email]> wrote:

      Greg
      I have a console session open on the login node where the bqueues command runs. Once I click the List button in the run configuration dialog, I periodically issue a 'ps -u dwootton' command to see what processes are running for me on the login node. I see the bqueues command and my hack script running for a while, which is what I expect. But then I issue the ps command again and see that both the bqueues command and the hack script have terminated, but no output from stdout or stderr is displayed in my Eclipse console view. When the bqueues command works correctly, I see stdout or stderr, sometimes both, getting text back from the remote command. That's why I'm thinking something is losing track of the command invocation, since I should see at least the messages from my hack script, which are issued unconditionally before and after the bqueues command runs.


      Dave


      From: Greg Watson <[hidden email]>
      To: Parallel Tools Platform general developers <[hidden email]>
      Date: 01/29/2018 12:17 PM
      Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
      Sent by: [hidden email]






      Dave,


      What do you mean "when the bqueues command disappears"?


      Greg
              On Jan 29, 2018, at 9:30 AM, David Wootton <[hidden email]> wrote:

              Greg
              That doesn't work. The result is that the name of the remote command is the complete string 'bqueues -l; echo EOF:$?'

              I thought I could make this work by running a wrapper script, for instance /home/dwootton/hack on the remote node, where the script is
              #!/bin/sh
              echo "Execute: " $*
              $*
              echo "EOF:$?"

              And then changing the invocation command in my Eclipse code to 'private static final String bqueuesCommand[] = {"/home/dwootton/hack", "bqueues", "-l"};'

              The idea is that the hack script just executes exactly what it is passed.

              This works correctly most of the time. However, when the bqueues command disappears, I still get absolutely no output to stdout, not even the text from my hack script.

              It looks like something is just completely losing track of the remote command request in this case.

              Dave



              From: Greg Watson <[hidden email]>
              To: Parallel Tools Platform general developers <[hidden email]>
              Date: 01/26/2018 11:39 AM
              Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
              Sent by: [hidden email]






              What happens if you try a single quoted argument, e.g 'bqueues -l; echo EOF:$?'

              Greg
                              On Jan 25, 2018, at 5:05 PM, David Wootton <[hidden email]> wrote:

                              Greg
                              I tried adding an echo command to the bqueues command and I am not having any success. My original bqueues command that I was passing to the IRemoteProcessBuilder was a String array {"bqueues", "-l"}. 

                              I changed that to {"bqueues", "-l", ";", "echo", "\"EOF:$?\""} and that failed with an LSF error message that there was no such queue as ";". The semicolon is being passed as a command parameter to the bqueues command instead of as a command separator for bash.

                              I tried changing ";" to "\\;" to escape the semicolon and it was still passed as a bqueues command parameter, this time '\;'.

                              I was able to get the pid of the bash process started to run the bqueues command one time when my original bqueues command was hanging. It looks like the command being passed across is actually /bin/bash -l -c cd /autofs/home/dwootton && bqueues -l, where "cd /autofs/home/dwootton && bqueues -l" is probably a single string parameter to the bash -c option (which tells bash to use the string as the command).

                              So I'm not sure how I can get this hack to work. I think I have a way to deal with the return status in my Java code, but I'm stuck at getting a working command to pass across to the remote system.

                              Dave

                              From: Greg Watson <[hidden email]>
                              To: Parallel Tools Platform general developers <[hidden email]>
                              Date: 01/24/2018 12:10 PM
                              Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
                              Sent by: [hidden email]







                              Dave,

                              Is there anything still running on the remote end? e.g. is there a shell process? You could try killing it to see if that terminates the session.

                              Another thought. Do you know if the remote process is using a PTY or not?

                              You might ultimately need to do something hackish, like adding 'echo FOO' to the command and checking to see when FOO comes back.

                              Greg
                                                              On Jan 24, 2018, at 7:24 AM, David Wootton <[hidden email]> wrote:

                                                              Greg
                                                              I suspended each thread in the Eclipse debugger once I had a hung run configuration dialog.

                                                              Both my reader threads are waiting.
                                                              <17443150.gif>
                                                              I expected these threads to have exited at this point, since the remote process was gone and the associated write-side file descriptors should have been closed, causing the pending read to end, at least on Linux. I'm running Eclipse on Windows, so maybe file descriptor behavior there is different.

                                                              The thread that looks like it might be a connection thread seems to be looping in PipedInputStream.awaitSpace, since I can single-step through it. There is a wait there, with a 1 second timeout.
                                                              <17931618.gif>
                                                              The Session class is com.jcraft.jsch.Session

                                                              I suspended a few other threads and did not see anything that looked like Jsch. I avoided classes that had labels/names that looked like internal Eclipse threads or other unrelated plugins.

                                                              Dave



                                                              From: Greg Watson <[hidden email]>
                                                              To: Parallel Tools Platform general developers <[hidden email]>
                                                              Date: 01/23/2018 10:57 PM
                                                              Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
                                                              Sent by: [hidden email]







                                                              Hi Dave,

                                                              Off the top of my head I don't know, but Jsch is a nasty piece of work. Can you see if it's stuck in the Jsch code somewhere?

                                                              Regards,
                                                              Greg
                                                                                                                              On Jan 23, 2018, at 3:00 PM, David Wootton <[hidden email]> wrote:

                                                                                                                              I'm fixing the hangs using the LSF target configuration and have it mostly fixed. One problem I'm running into is that occasionally the remote process (bqueues -w) exits but the IRemoteProcess.isCompleted() method still returns false. As a result, my code loops forever waiting for process completion and the run configuration dialog is locked. I can clear the locked state by clicking the red cancel button at the bottom of the dialog.

                                                                                                                              The loop I have to wait for process completion is

                                                                                                                              for (;;) {
                                                                                                                                  if (process.isCompleted()) {
                                                                                                                                      break;
                                                                                                                                  }
                                                                                                                                  if (monitor.isCanceled()) {
                                                                                                                                      process.destroy();
                                                                                                                                      return new Status(IStatus.CANCEL, Activator.PLUGIN_ID, CANCELED, Messages.CommandCancelMessage, null);
                                                                                                                                  }
                                                                                                                                  try {
                                                                                                                                      Thread.sleep(1000);
                                                                                                                                  } catch (InterruptedException e) {
                                                                                                                                      // Do nothing, sleep just ends early
                                                                                                                                  }
                                                                                                                              }
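
A loop like this never gives up if isCompleted() keeps returning false. One possible variation, sketched here under assumptions, adds a deadline so the caller can report a timeout instead of waiting until the user cancels. BoundedWait, Outcome and await are hypothetical names, not part of PTP; the two BooleanSupplier arguments stand in for process::isCompleted and monitor::isCanceled from the loop above.

```java
import java.util.function.BooleanSupplier;

final class BoundedWait {
    enum Outcome { COMPLETED, CANCELED, TIMED_OUT, INTERRUPTED }

    /**
     * Polls isCompleted until it returns true, isCanceled returns true,
     * or timeoutMillis elapses. Hypothetical helper for illustration only.
     */
    static Outcome await(BooleanSupplier isCompleted, BooleanSupplier isCanceled,
                         long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (isCompleted.getAsBoolean()) {
                return Outcome.COMPLETED;
            }
            if (isCanceled.getAsBoolean()) {
                return Outcome.CANCELED;
            }
            if (System.nanoTime() - deadline >= 0) {
                return Outcome.TIMED_OUT;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag so callers further up can see it.
                Thread.currentThread().interrupt();
                return Outcome.INTERRUPTED;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in suppliers; real code would pass process::isCompleted and monitor::isCanceled.
        Outcome outcome = await(() -> false, () -> false, 200, 50);
        System.out.println(outcome);
    }
}
```

On TIMED_OUT the caller would still need to decide whether to destroy the process or keep waiting; the sketch only makes the hang observable.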

                                                                                                                              I see comments in the IRemoteProcess source that warn that isCompleted() and waitFor() may not work correctly if the calling thread does not read the stderr or stdout streams and the JSch process implementation is used (which appears to be my case, since I see that the process builder is a JSchProcessBuilder). However, in my case I have reads pending on both the stderr and stdout streams for at least one byte, but I am issuing those reads on different threads from the one where the remote process was created. (I'm reading on separate threads to avoid my code blocking if the remote process writes so much data to either stream that the stream buffers fill and the process blocks until something reads from these streams to empty the buffer, and that fixes most of the hangs.)
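
The separate-reader-thread approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the post: StreamDrainer and drain are hypothetical names, and in-memory streams stand in for the process's stdout and stderr.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;

final class StreamDrainer {
    /**
     * Starts a daemon thread that reads the stream until end-of-file and
     * completes the returned future with everything that was read.
     */
    static CompletableFuture<String> drain(InputStream in, String threadName) {
        CompletableFuture<String> result = new CompletableFuture<>();
        Thread reader = new Thread(() -> {
            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            try {
                int n;
                while ((n = in.read(buffer)) != -1) {
                    collected.write(buffer, 0, n);
                }
            } catch (IOException e) {
                // Treat a broken stream as end of output; keep what was read so far.
            }
            result.complete(new String(collected.toByteArray(), StandardCharsets.UTF_8));
        }, threadName);
        reader.setDaemon(true);
        reader.start();
        return result;
    }

    public static void main(String[] args) {
        // Stand-ins for the remote process's stdout and stderr streams.
        InputStream fakeStdout = new ByteArrayInputStream("QUEUE_NAME normal\n".getBytes(StandardCharsets.UTF_8));
        InputStream fakeStderr = new ByteArrayInputStream(new byte[0]);
        CompletableFuture<String> out = drain(fakeStdout, "stdout-reader");
        CompletableFuture<String> err = drain(fakeStderr, "stderr-reader");
        System.out.print("stdout: " + out.join());
        System.out.println("stderr length: " + err.join().length());
    }
}
```

Each reader only finishes when its stream reports end-of-file, so if the underlying stream is never closed the future never completes, which matches the waiting reader threads reported elsewhere in this thread.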

                                                                                                                              I'm not sure what's going on here to cause the hang. I'm wondering if my InputStream objects need a synchronized attribute because they are being used on a different thread, but that also makes no sense since my InputStream variable is not visible to anything other than my code reading the stream.

                                                                                                                              Any thoughts or suggestions about what might be going on?

                                                                                                                              Thanks

                                                                                                                              Dave




                                                                                                                              _______________________________________________
                                                                                                                              ptp-dev mailing list

                                                                                                                              [hidden email]
                                                                                                                              To change your delivery options, retrieve your password, or unsubscribe from this list, visit

                                                                                                                              https://dev.eclipse.org/mailman/listinfo/ptp-dev



Re: IRemoteProcess.isCompleted occaisionally fails to report process completion

David Wootton-2

Greg
I already had an echo in my hack wrapper script, right after the bqueues command, that writes the bqueues return code to a file in the local filesystem. In the case where everything hung, the echo reported a bqueues return code of zero, so the commands were definitely running.

Usually the hang lasted for something like 30 seconds. I logged into a second console session on the node where the bqueues command was running, repeatedly issued 'ps -u dwootton' commands, and saw the bqueues command and my wrapper until they eventually terminated, with no notification back to my Eclipse session.

Dave

From: Greg Watson <[hidden email]>
To: Parallel Tools Platform general developers <[hidden email]>
Date: 02/02/2018 03:42 PM
Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
Sent by: [hidden email]





Dave,

It's quite an involved path from the thread you have reading from the input stream to the stdout of the command on the remote machine. It's possible that the command could complete on the remote machine before the thread even starts running, though I would have thought that isCompleted() would be false if that happened. Can you add something at the end of the script to check that it ran successfully (e.g., 'echo "bqueue finished with status $?" > /tmp/script.out')?

There's not really anything that can "lose track", so I want to establish that the command is actually being run each time.

Regards,
Greg
      On Jan 30, 2018, at 7:36 AM, David Wootton <[hidden email]> wrote:

      Greg
      I added a sleep just before the exit in the script and that makes no difference. I didn't expect any difference, since this execution path should be all synchronous code. I expect sshd is issuing a fork, exec, and wait to invoke the hack script, and then bash does the same when invoking the bqueues command.


      The only inconsistent behavior I'm seeing is that sometimes the bqueues command itself times out because LSF daemons apparently aren't responding. But that's all internal to the bqueues command and I do get completion status reported all the way back to my Eclipse code where the return status says the bqueues command exited with rc=255.


      I realize the bqueues command could be exiting with some odd return code, so I added an echo statement to my hack script to write the return code to a file on the remote system, and the return code was always zero.


      Dave


      From: Greg Watson <[hidden email]>
      To: Parallel Tools Platform general developers <[hidden email]>
      Date: 01/29/2018 03:53 PM
      Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
      Sent by: [hidden email]





      Maybe it's a timing issue. What happens if you add 'sleep 5' to the end of the script?


      Greg
              On Jan 29, 2018, at 2:57 PM, David Wootton <[hidden email]> wrote:

              Greg
              I have a console session open on the login node where the bqueues command runs. Once I click the List button in the run configuration dialog, I periodically issue a 'ps -u dwootton' command to see what processes are running for me on the login node. I see the bqueues command and my hack script running for a while, which is what I expect. But then I issue the ps command again and see that both the bqueues command and the hack script have terminated, yet no output from stdout or stderr is displayed in my Eclipse console view. When the bqueues command works correctly, I see stdout or stderr, sometimes both, getting text back from the remote command. That's why I'm thinking something is losing track of the command invocation, since I should see at least the messages from my hack script, which are issued unconditionally before and after the bqueues command runs.

              Dave

              From: Greg Watson <[hidden email]>
              To: Parallel Tools Platform general developers <[hidden email]>
              Date: 01/29/2018 12:17 PM
              Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
              Sent by: [hidden email]






              Dave,

              What do you mean "when the bqueues command disappears"?

              Greg
                              On Jan 29, 2018, at 9:30 AM, David Wootton <[hidden email]> wrote:

                              Greg
                              That doesn't work. The result is that the name of the remote command is the complete string 'bqueues -l; echo EOF:$?'

                              I thought I could make this work by running a wrapper script, for instance /home/dwootton/hack on the remote node, where the script is
                              #!/bin/sh
                              echo "Execute: " $*
                              $*
                              echo "EOF:$?"

                              And then changing the invocation command in my Eclipse code to 'private static final String bqueuesCommand[] = {"/home/dwootton/hack", "bqueues", "-l"};'

                              The idea is that the hack script just executes exactly what it is passed.

                              This works correctly most of the time. However, when the bqueues command disappears, I still get absolutely no output to stdout, not even the text from my hack script.

                              It looks like something is just completely losing track of the remote command request in this case.

                              Dave



                              From: Greg Watson <[hidden email]>
                              To: Parallel Tools Platform general developers <[hidden email]>
                              Date: 01/26/2018 11:39 AM
                              Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
                              Sent by: [hidden email]





                              What happens if you try a single quoted argument, e.g 'bqueues -l; echo EOF:$?'

                              Greg
                                                              On Jan 25, 2018, at 5:05 PM, David Wootton <[hidden email]> wrote:

                                                              Greg
                                                              I tried adding an echo command to the bqueues command and I am not having any success. My original bqueues command that I was passing to the IRemoteProcessBuilder was a String array {"bqueues", "-l"}.

                                                              I changed that to {"bqueues", "-l", ";", "echo", "\"EOF:$?\""} and that failed with an LSF error message that there was no such queue as ";", because the semicolon is being passed as a command parameter to the bqueues command instead of as a command separator for bash.

                                                              I tried changing ';' to "\\;" to escape the semicolon and it was still passed as a bqueues command parameter, this time '\;'.

                                                              I was able to get the pid of the bash process started to run the bqueues command one time when my original bqueues command was hanging, and it looks like the command being passed across is actually /bin/bash -l -c cd /autofs/home/dwootton && bqueues -l, where "cd /autofs/home/dwootton && bqueues -l" is probably a string parameter to the bash -c option (which tells bash to run the string as a command).

                                                              So I'm not sure how I can get this hack to work. I think I have a way to deal with the return status in my Java code, but I'm stuck at getting a working command to pass across to the remote system.

                                                              Dave

                                                              From: Greg Watson <[hidden email]>
                                                              To: Parallel Tools Platform general developers <[hidden email]>
                                                              Date: 01/24/2018 12:10 PM
                                                              Subject: Re: [ptp-dev] IRemoteProcess.isCompleted occaisionally fails to report process completion
                                                              Sent by: [hidden email]







                                                              Dave,

                                                              Is there anything still running on the remote end? e.g. is there a shell process? You could try killing it to see if that terminates the session.

                                                              Another thought. Do you know if the remote process is using a PTY or not?

                                                              You might ultimately need to do something hackish, like adding 'echo FOO' to the command and checking to see when FOO comes back.
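
The marker idea suggested above could look roughly like this on the Java side. It is only a sketch under assumptions: SentinelReader and readUntilSentinel are hypothetical names, the `EOF:<status>` line format mirrors the echo tried elsewhere in this thread, and an in-memory stream stands in for the remote stdout.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.OptionalInt;

final class SentinelReader {
    /** Assumed marker prefix, as in 'echo "EOF:$?"'. */
    static final String PREFIX = "EOF:";

    /**
     * Copies lines into output until a line "EOF:<number>" arrives, then
     * returns that number. Returns empty if the stream ends with no marker.
     */
    static OptionalInt readUntilSentinel(InputStream in, StringBuilder output) {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith(PREFIX)) {
                    try {
                        return OptionalInt.of(Integer.parseInt(line.substring(PREFIX.length()).trim()));
                    } catch (NumberFormatException e) {
                        // Not a well-formed marker; fall through and treat it as ordinary output.
                    }
                }
                output.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        byte[] fake = "Execute: bqueues -l\nQUEUE_NAME normal\nEOF:0\n".getBytes(StandardCharsets.UTF_8);
        StringBuilder output = new StringBuilder();
        OptionalInt status = readUntilSentinel(new ByteArrayInputStream(fake), output);
        System.out.println("status present: " + status.isPresent());
        System.out.print(output);
    }
}
```

The point of the design is that completion is detected from the data stream itself, so it does not depend on isCompleted() reporting correctly; it cannot help if no output arrives at all.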

                                                              Greg
                                                                                                                              On Jan 24, 2018, at 7:24 AM, David Wootton <[hidden email]> wrote:

                                                                                                                              Greg
                                                                                                                              I suspended each thread in the Eclipse debugger once I had a hung run configuration dialog.

                                                                                                                              Both my reader threads are waiting
                                                                                                                              <17443150.gif>
                                                                                                                              I expected these threads to have exited at this point, since the remote process was gone and the associated write-side file descriptors should have been closed, causing the pending read to end, at least on Linux. I'm running Eclipse on Windows, so maybe file descriptor behavior there is different.

                                                                                                                              The thread that looks like it might be a connection thread seems to be looping in PipedInputStream.awaitSpace, since I can single-step through it. There is a wait there, with a 1 second timeout.
                                                                                                                              <17931618.gif>
                                                                                                                              The Session class is com.jcraft.jsch.Session
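
For context on the awaitSpace observation: in the standard java.io pipe classes, a writer blocks while the pipe buffer is full and resumes only after a reader takes data out. The sketch below demonstrates that behavior in isolation; PipeBackpressureDemo and its method are hypothetical names, and whether this is what happens inside the JSch session thread in this hang is an assumption, not something the stack traces here confirm.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

final class PipeBackpressureDemo {
    /**
     * Writes total bytes into a pipe holding at most capacity bytes.
     * Returns true if the writer was still blocked after waitMillis with
     * no reader, and then finished once all bytes were read.
     */
    static boolean writerBlocksUntilDrained(int capacity, int total, long waitMillis) {
        try {
            PipedInputStream in = new PipedInputStream(capacity);
            PipedOutputStream out = new PipedOutputStream(in);
            Thread writer = new Thread(() -> {
                try {
                    for (int i = 0; i < total; i++) {
                        out.write(i); // blocks while the pipe buffer is full
                    }
                    out.close();      // signals end-of-file to the reader
                } catch (IOException e) {
                    // The read side went away; nothing more to write.
                }
            }, "pipe-writer");
            writer.setDaemon(true);
            writer.start();

            writer.join(waitMillis);  // nobody is reading yet
            boolean blockedWhileUnread = writer.isAlive();

            int count = 0;
            while (in.read() != -1) { // draining frees space and unblocks the writer
                count++;
            }
            writer.join();
            return blockedWhileUnread && count == total;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writerBlocksUntilDrained(8, 64, 300));
    }
}
```

If the same mechanism applies here, a thread sitting in awaitSpace would suggest that one of the process's streams has buffered data that nothing is consuming, which may be worth checking against the two reader threads.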

                                                                                                                              I suspended a few other threads and did not see anything that looked like Jsch. I avoided classes that had labels/names that looked like internal Eclipse threads or other unrelated plugins.

                                                                                                                              Dave



