Transparent SSH host-jumping (Advanced)

In this brief article I describe how I resolved a nagging issue with accessing hosts that are not directly reachable, where the connection has to be forwarded through an intermediate host.
Previously, I was using the local SSH port-forwarding technique (configuring the hosts I connect to in the ~/.ssh/config file instead of using command-line options). However, this approach turned out to be quite inconvenient: every time I wanted to connect to a new host (possibly through a new intermediate host) I had to edit my SSH configuration file and add something like the following:
Host intermediate
        HostName 192.168.1.1
        HostKeyAlias intermediate
        LocalForward 10001 target:22

Host target
        HostName 127.0.0.1
        HostKeyAlias target
        Port 10001
The inconvenience came from two things:
  1. My ~/.ssh/config file was growing uncontrollably
  2. Each time I needed to connect to the target host through the intermediate host I had to open two sessions with one of them being idle most of the time
After a while I stumbled upon an article describing a fairly generic way to tunnel through an intermediate host and found the approach convenient for day-to-day use. So, I added the following block to my ~/.ssh/config file just before the "Host *" section:
Host */* 
        ProxyCommand ssh $(dirname %h) -W $(basename %h):%p
From now on, I could connect to the target host via the intermediate one by simply executing the following command:
$ ssh user@intermediate/target
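To make the splitting explicit, here is a minimal shell sketch of what the ProxyCommand does with the host name that ssh substitutes for %h. The split_chain helper is a name made up for this illustration and is not part of ssh; it only mirrors the dirname/basename calls from the block above.

```shell
#!/bin/sh
# split_chain is a hypothetical helper used only for illustration.
# It mirrors the $(dirname %h) / $(basename %h) calls in the ProxyCommand.
split_chain() {
    chain=$1
    jump=$(dirname "$chain")     # everything before the last "/"
    target=$(basename "$chain")  # the part after the last "/"
    printf '%s %s\n' "$jump" "$target"
}

split_chain intermediate/target
split_chain hostA/hostB/hostC
```

The first word printed is the host the nested ssh connects to, and the second is the host handed to -W.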
The ProxyCommand directive spawned two ssh processes: one connected to the intermediate host in the background, and the other, proxied through the intermediate host and connected to the target, running in the foreground. From my point of view I had just one terminal session open. The configuration allowed me to chain as many hosts as I wanted, e.g.:
$ ssh user@hostA/hostB/hostC/hostD
The above would result in three ssh processes running in the background (the first connected to hostA, the second connected to hostB proxied through hostA, and the third connected to hostC proxied through hostB) and one foreground process connected to hostD proxied via hostC. This is great and quite flexible to use. However, this approach has a number of limitations:
  • you cannot specify different ports for different hosts in the chain
  • neither can you use different login names for different hosts in the chain
  • connecting to different chains that share part of the chain does not reuse the already established connections, so every new connection pays the full setup time again
Personally, I'm using the same login name and the same ports on the hosts I am accessing, so the first two items were not an issue for me, but the last one was irritating enough that I decided to figure out whether it could be optimised. After a bit of reading the documentation and a few attempts I came up with the following configuration block in my ~/.ssh/config file (remember, this block should be placed _before_ the "Host *" one):
Host */*
        ControlMaster auto
        ControlPath   ~/.ssh/.sessions/%r@%h:%p
        ProxyCommand /bin/sh -c 'mkdir -p -m700 ~/.ssh/.sessions/"%r@$(dirname %h)" && exec ssh -o "ControlMaster auto" -o "ControlPath   ~/.ssh/.sessions/%r@$(dirname %h):%p" -o "ControlPersist 120s" -l %r -p %p $(dirname %h) -W $(basename %h):%p'
Let's review it line by line, so the logic is clear:
Host */*
This host definition block catches any host name given on the ssh command line that matches the "*/*" pattern. For "ssh hostA/hostB/hostC" the ProxyCommand below splits the name into "hostA/hostB" (everything before the last "/") and "hostC" (the part after the last "/"). Due to a recursive call to ssh (see below) this block is applied recursively to all hosts in the specified chain.
ControlMaster auto
This directive instructs ssh to try to reuse an existing control channel to communicate with the remote host. If such a channel does not exist, it is created, and further connections to the same remote host benefit from the speedup provided by the already established connection.
ControlPath ~/.ssh/.sessions/%r@%h:%p
This directive provides ssh with the location of the control channel socket file. The socket file should be unique for each remote host, and since we are reusing the existing connection and skipping the authentication, it should also be tagged with the corresponding login name. This is why we are using %r (the remote login name), %h (the remote host name), and %p (the remote port) as part of the file name. Please note that because we use "/" as the host separator in the chain, the path constructed here contains subdirectories in the middle of the %h expansion. ssh does not create those subdirectories automatically, so this is something we need to address (see below).
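As a rough illustration of why the subdirectory matters, the following shell sketch builds the socket path the same way the ControlPath pattern does and prints the directory that has to exist beforehand. control_path is a hypothetical helper, and the user name, chain and port are example values only.

```shell
#!/bin/sh
# control_path is a hypothetical helper for illustration only; it mimics the
# expansion of "~/.ssh/.sessions/%r@%h:%p" for a given user, host and port.
control_path() {
    printf '%s/.ssh/.sessions/%s@%s:%s\n' "$HOME" "$1" "$2" "$3"
}

sock=$(control_path user hostA/hostB/hostC 22)
echo "socket:    $sock"
# The "/" characters inside the host chain turn into path separators, so the
# socket ends up in a nested directory that ssh will not create by itself.
echo "directory: $(dirname "$sock")"
```

The directory printed on the second line is the one the mkdir call in the ProxyCommand is there to create.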
ProxyCommand …
This is the heart of the whole block. I am starting this proxy command with /bin/sh -c '…' because ssh runs the specified command with the shell's exec directive (the command replaces the spawned shell, which makes it impossible to conditionally chain commands); therefore I use the shell binary itself as the proxy command to get the ability to script my logic. Then I create the required directory structure for the control channels under ~/.ssh/.sessions (note the -p argument to mkdir: it creates all the missing parts of the specified tree and also silences mkdir if all of the directories already exist). It is worth mentioning that this mkdir command creates the subdirectory needed by the ControlPath defined in the enclosing "Host */*" block.
The second part of the command line runs ssh only if mkdir did not report any problems. It makes sense to exec ssh here since we do not need a redundant shell hanging around in the process tree. In this recursive ssh call we explicitly request multiplexing for the control channels created by the parent connections (they are "parent" because they are established first and enable access to the hosts further down the specified chain), and we explicitly specify the location of the control channel (since it is a parent connection, we strip the last host name from the %h macro using dirname). Finally, the third explicitly specified directive is ControlPersist, which is set to 120s. It instructs ssh to stay in the background and keep the control channel available in case we decide to reuse it; if the master has had no client connections for 2 minutes, the ssh process terminates. Without this directive, closing the master connection would also close all dependent connections: e.g. with two sessions, one to hostA/hostB and the other to hostA/hostC, closing the first one would immediately terminate the second. The rest of the ssh arguments are straightforward: we connect to the leading part of the provided host chain (extracted with dirname %h) and proxy stdin/stdout to the last host in the chain with the -W option.
Basically, the control flow when you do "ssh user@hostA/hostB/hostC" is the following:
  1. ssh matches the */* pattern against the provided host name (hostA/hostB/hostC)
  2. ssh tries to reuse the control channel by attempting to open the ~/.ssh/.sessions/user@hostA/hostB/hostC:22 socket; if this succeeds, the connection is established and the command prompt is displayed to the calling user, otherwise the execution continues
  3. ssh executes the defined ProxyCommand command
  4. the first part of the command creates ~/.ssh/.sessions/user@hostA/hostB if it is not there
  5. the second part executes 'ssh … -o "ControlPath ~/.ssh/.sessions/user@hostA/hostB:22" … hostA/hostB -W hostC:22' (this initiates another round of the above steps with a shorter chain, and the recursion continues until there is just a single host left, i.e. when we reach hostA as the host to connect to)
  6. now, with connected stdin/stdout to port 22 on hostC (in the last iteration) ssh performs the authentication against hostC
  7. if authentication is successful ssh creates the ~/.ssh/.sessions/user@hostA/hostB/hostC:22 control channel socket and becomes the master of that control channel
  8. a command prompt is displayed to the calling user
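The recursion in the steps above can be sketched with a small dry-run script. trace_chain is a hypothetical helper written for this illustration; it only prints the nested ssh call made at each level (with the -o options left out for brevity) and does not open any connection. The user name, chain and port are example values.

```shell
#!/bin/sh
# trace_chain is a hypothetical helper for illustration only. It prints, level
# by level, the nested ssh call that the ProxyCommand would make for a chain.
# Nothing is executed and no connection is opened.
trace_chain() {
    user=$1 chain=$2 port=$3
    # Loop while the chain still contains a "/" separator.
    while [ "$chain" != "${chain#*/}" ]; do
        parent=$(dirname "$chain")   # hosts we have to go through first
        last=$(basename "$chain")    # host reached via -W at this level
        printf 'ssh -l %s -p %s %s -W %s:%s\n' \
            "$user" "$port" "$parent" "$last" "$port"
        chain=$parent
    done
    # A single host no longer matches "*/*", so ssh connects to it directly.
    printf 'direct connection to %s:%s\n' "$chain" "$port"
}

trace_chain user hostA/hostB/hostC 22
```

For the three-host chain this prints two nested ssh calls (hostA/hostB with -W hostC:22, then hostA with -W hostB:22) followed by the direct connection to hostA. Note that the same port and login name are carried through every level, which is exactly the limitation mentioned earlier.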
I hope this little trick will save you some time and will make your life easier. :)

Comments

  1. Herrow.
    Why not just use a different host separator character?
