Ticket #3872: USE_PID_NAMESPACES cripples condor_ssh_to_job

In email, gthain wrote:

"Unfortunately, it looks like the ssh_to_job runs in a separate pid namespace, which makes it not very useful... as pid namespaces nest, seems like the correct solution for us is to not just create a pid namespace for the job, but also for the starter, and run the sshd in the namespace of the starter."

[Append remarks]

Remarks:

2013-Aug-26 13:28:12 by tannenba:
So one idea on how to fix this is to take advantage of the nesting property of namespaces, so perhaps the startd starts the starter in a namespace, and then the starter starts in another nested space... but, is this going to cause problems for the procd? If the starter uses a procd started by the master or startd, the two will be using different pids to refer to the same process. We don't want to have a procd per starter, as this defeats the whole idea of having one procd for scalability reasons. Perhaps if using cgroups + pid namespaces, a procd is no longer needed? (although all the cgroup implementation lives in the procd, yes?)

Another idea is having a new process join an existing namespace was recently added in the 3.8 linux kernel. The problem with going this route is it could be years until this kernel is in common production.


2013-Aug-26 13:52:47 by bbockelm:
From discussion on IRC -

There's another issue with PID namespaces as it stands -- most user jobs are not ready to be PID 1. There's a non-POSIX 'gotcha' for PID 1 in Linux; signals without signal handlers are ignored (only SIGKILL and SIGSTOP behave correctly). From "man 2 kill":

NOTES The only signals that can be sent to process ID 1, the init process, are those for which init has explicitly installed signal handlers. This is done to assure the system is not brought down accidentally.

Useful for init -- not particularly useful for a job!

Gregggg proposed to have the condor_starter be PID 1 in the namespace. In this case, we can install the appropriate signal handlers in the condor_starter to alleviate this problem and fix ssh_to_job at the same time.

Brian


2013-Aug-26 15:04:24 by bbockelm:
Further thoughts -- no good solution though --

We could have a "REGISTER_SELF" command (assuming the forked starter does the registration; otherwise, maybe REGISTER_CHILD for Create_Process) with the procd which, using SCM_CREDENTIALS, would allow the procd to have an unambiguous view of which process should be registered; then, either:

I think the first option would be the correct one if this approach is taken. Not 100% sure this is the correct approach to take though.


2013-Sep-06 16:11:36 by tannenba:
CODE REVIEW

Changing this ticket back to active until the following issues are fixed:

  1. Why was a stable series ticket resolved without first going into code review status?

  2. Why was an entire copy of a .cpp file committed into the repo as .cpp.orig? Was this on purpose?? (I doubt it)

  3. The ticket pontificates a bunch of ways to fix this long term w/ pros/cons. Then there is a patch pushed into the code and the ticket resolved. Would be nice if either the ticket or commit message stated somewhere what the patch actually did (i.e. what solution was implemented??). My guess is it simply makes sshd run in the global namespace instead of the job namespace, but I had to infer that by looking at the code.

  4. There is no documentation changes. We need a version history commit, and we perhaps need a small paragraph in the section of the manual that discusses pid names spaces about the impact of using pid namespaces + ssh to job together?


2013-Sep-09 08:08:01 by gthain:
This commit just implements the simple fix of not running the sshd in a private pid namespace, it uses the global pid (and filesystem) namespace.
[Append remarks]

Properties:

Type: defect           Last Change: 2013-Sep-09 14:15
Status: defer          Created: 2013-Aug-26 12:06
Fixed Version: v080003           Broken Version: v080002 
Priority:          Subsystem: DaemonsExecNode 
Assigned To: tannenba           Derived From:  
Creator: pfc  Rust:  
Customer Group: ligo  Visibility: public 
Notify: pfcouvar@syr.edu, anderson@ligo.caltech.edu, tannenba@cs.wisc.edu, bbockelm@cse.unl.edu, gthain@cs.wisc.edu,pcouvare@caltech.edu  Due Date:  

Related Check-ins:

2013-Sep-18 16:56   Check-in [37656]: edit of 8.0.3 version history item ===GT=== #3872 (By Karen Miller )
2013-Sep-09 08:16   Check-in [37520]: Document #3872 (By Greg Thain )
2013-Sep-06 14:02   Check-in [37499]: Remove accidentally added file in #3872 (By Greg Thain )
2013-Sep-06 14:00   Check-in [37498]: Allow condor_ssh_to_job to see the job when USE_PID_NAMESPACES is true #3872 This is the quick fix for stable. Perhaps in devel we should move the startd into the pid namespace. (By Greg Thain )