Transparent SSH host-jumping (Advanced)

In this brief article I describe how I resolved a nagging issue with setting up access to hosts that are not directly reachable, so the connection has to be forwarded through an intermediate host.

Previously, I was using the local SSH port-forwarding technique (although I was configuring hosts I connect to in the ~/.ssh/config file instead of using the command-line options). However, this approach turned out to be quite inconvenient since every time I wanted to connect to a new host (and, possibly, through a new intermediate host) I had to edit my SSH configuration file and add something like the following:

Host intermediate
    HostName 192.168.1.1
    HostKeyAlias intermediate
    LocalForward 10001 target:22

Host target
    HostName 127.0.0.1
    HostKeyAlias target
    Port 10001

Upon closer examination of my day-to-day routine I found two things that frustrated me the most:

My ~/.ssh/config file was growing uncontrollably and became hard to navigate;
Each time I needed to connect to the target host through the intermediate host I had to open two sessions with one of them being idle most of the time.

After a while I stumbled upon an article describing quite a generic way to tunnel through an intermediate host and found the approach quite convenient for the day-to-day use. So, I have added the following block into my ~/.ssh/config file just before the Host * section:

Host */*
    ProxyCommand ssh $(dirname %h) -W $(basename %h):%p

From that point on, I could connect to a target host via an intermediate one by simply executing the following command:

ssh user@intermediate/target

The configuration with the ProxyCommand directive spawned two ssh processes: one connected to the intermediate host and running in the background, the other proxied through the intermediate host, connected to the target, and running in the foreground. From my point of view I had just one terminal session open. The configuration allowed me to chain as many hosts as I wanted, e.g.:

ssh user@hostA/hostB/hostC/hostD

The above would result in three ssh processes running in the background (the first connected to hostA, the second connected to hostB proxied through hostA, and the third connected to hostC proxied through hostB) and one foreground process connected to hostD proxied via hostC. This is great and quite flexible to use. However, this approach has a number of limitations:

you cannot specify different ports for different hosts in a chain;
neither can you use different login names for different hosts in the chain;
establishing connections to different chains that share a part of the chain does not reuse already established connections, which results in slow connection times.

Personally, I am using the same login name and the same ports on hosts I am accessing, so the first two items were not an issue for me, but the last one was irritating enough and I decided to figure out whether it is possible to optimise it. After a bit of reading the documentation and a few attempts I came up with the following configuration block in my ~/.ssh/config file (remember, this block should be placed before the Host * one):

Host */*
    ControlMaster auto
    ControlPath   ~/.ssh/.sessions/%r@%h:%p
    ProxyCommand /bin/sh -c 'mkdir -p -m700 ~/.ssh/.sessions/"%r@$(dirname %h)" && exec ssh -o "ControlMaster auto" -o "ControlPath   ~/.ssh/.sessions/%r@$(dirname %h):%p" -o "ControlPersist 120s" -l %r -p %p $(dirname %h) -W $(basename %h):%p'

Let’s review it line by line, so the logic is clear:

Host */*

This host definition block catches any host specified on the ssh command line whose name matches the */* pattern, i.e. any name containing a slash. For ssh hostA/hostB/hostC, hostA/hostB (everything before the last /) is treated as the first part and hostC (everything after it) as the second part. Due to a recursive call to ssh (see below) this block will be recursively applied to all hosts in the specified chain.
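
To make the splitting concrete, here is a small sketch that walks a chain with dirname and basename in the same way the recursive ProxyCommand would. The function name split_chain and the host names are made up for illustration, and nothing here contacts a real host.

```shell
#!/bin/sh
# Walk a host chain the way the recursive ProxyCommand does:
# dirname yields the hosts to tunnel through, basename the next target.
split_chain() {
    chain=$1
    while :; do
        case $chain in
            */*)
                printf 'via=%s target=%s\n' "$(dirname "$chain")" "$(basename "$chain")"
                chain=$(dirname "$chain")
                ;;
            *)
                # No slash left: Host */* no longer matches this name.
                printf 'direct=%s\n' "$chain"
                break
                ;;
        esac
    done
}

split_chain hostA/hostB/hostC
```

Each via/target line corresponds to one level of the recursion, and the final direct line is the host that ssh reaches without any proxy.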

ControlMaster auto

This directive instructs ssh to try to reuse an existing control channel to communicate with the remote host, and if such a channel does not exist it will be created, so further connections to the same remote host would benefit from a speedup provided by the already established connection.

ControlPath ~/.ssh/.sessions/%r@%h:%p

This directive provides ssh with the location of the control channel socket file. The socket file should be unique for each remote host. Since we are reusing the existing connection and skipping the authentication, the socket file should also be tagged with the corresponding login name. This is why we are using %r (the remote login name), %h (the remote host name), and %p (the remote port) as part of the file name. Please note that because we use “/” as a host separator in the chain, the path constructed here will have a subdirectory defined in the middle of the %h expansion. ssh would not automatically create that subdirectory, so it is something we need to address (see below).
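
As a rough illustration of why the subdirectory appears, the sketch below assembles the socket path by hand from sample values. The helper name control_socket and the values user, hostA/hostB/hostC and 22 are assumptions for the example; ssh performs the real %r, %h and %p expansion itself.

```shell
#!/bin/sh
# Build the path that ControlPath ~/.ssh/.sessions/%r@%h:%p would expand to.
control_socket() {
    # $1 = login name (%r), $2 = host chain (%h), $3 = port (%p)
    printf '%s/.ssh/.sessions/%s@%s:%s\n' "$HOME" "$1" "$2" "$3"
}

sock=$(control_socket user hostA/hostB/hostC 22)
echo "socket:    $sock"
# The slashes in %h turn everything before the last one into directories.
echo "directory: $(dirname "$sock")"
```

The directory printed on the second line is the one that has to exist before ssh can create the socket file.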

ProxyCommand …

This is the heart of the whole block. ssh exec()utes the specified command (this replaces the spawned shell and makes it impossible to conditionally chain commands), so I start the proxy command with /bin/sh -c '…' and use the shell binary as the proxy command to get the ability to script my logic. Then I create the required directory structure for the control channels under ~/.ssh/.sessions (note the -p argument to mkdir: it creates all the missing parts of the specified tree and also keeps mkdir silent when all of the directories already exist). It is worth mentioning that this mkdir command creates the subdirectory for the ControlPath defined for the enclosing Host */* block.
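
The behaviour of mkdir -p that the command relies on can be checked in isolation. The sketch below uses a temporary directory instead of ~/.ssh, and the helper name ensure_session_dir is hypothetical.

```shell
#!/bin/sh
# Create the directory for a control socket; succeeds whether or not it exists.
ensure_session_dir() {
    # $1 = base directory, $2 = stand-in for "%r@$(dirname %h)"
    mkdir -p -m700 "$1/$2"
}

base=$(mktemp -d)   # throwaway directory so ~/.ssh is not touched
ensure_session_dir "$base" "user@hostA/hostB" && echo "first call ok"
ensure_session_dir "$base" "user@hostA/hostB" && echo "second call ok"
# A plain mkdir fails on the existing directory and would break the && chain.
mkdir "$base/user@hostA/hostB" 2>/dev/null || echo "plain mkdir fails"
rm -rf "$base"
```

One detail to keep in mind: with -p, the mode given by -m applies to the final directory only, while intermediate directories are created with default permissions.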

The second part of the command line conditionally executes ssh if mkdir did not report any issues. It is good to exec ssh here since we do not need a redundant shell hanging around in the process tree. In this recursive ssh call we explicitly specify that we also need multiplexing of the control channels created by the parent connections (they are “parent” since this is the connection that is established first and that enables access to the hosts further down the specified chain), and we explicitly specify the location of the control channel (note that since it is a parent connection, we strip the rest of the host names from the %h macro using dirname).

Finally, the third explicitly specified directive is ControlPersist, which is set to 120s. This directive instructs ssh to stay in the background and maintain the control channel in case we decide to reuse it, but if no activity on the control channel is detected for 2 minutes the ssh process terminates. Without this directive, closing the master connection also closes all dependent connections. For example, if you have two sessions, one to hostA/hostB and the other to hostA/hostC, closing the first one immediately terminates the second unless ControlPersist is configured.

The rest of the ssh arguments are straightforward: we connect to the first part of the provided host chain (we extract that part with dirname %h) and we proxy stdin/stdout to the last host in the supplied chain with the -W option.

Basically, the control flow when you do ssh user@hostA/hostB/hostC is the following:

ssh matches the */* pattern against the provided host name (hostA/hostB/hostC)
ssh tries to reuse the control channel by attempting to open the ~/.ssh/.sessions/user@hostA/hostB/hostC:22 socket; if this succeeds, the connection is established and the command prompt is displayed to the calling user, otherwise the execution continues
ssh executes the defined ProxyCommand command
the first part of the command creates ~/.ssh/.sessions/user@hostA/hostB if it is not there
the second part executes
```
ssh … -o "ControlPath ~/.ssh/.sessions/user@hostA/hostB:22" … hostA/hostB -W hostC:22
```
This will initiate another round of the above steps, but with a shorter chain, and it will recurse until there is just a single host left, i.e. when hostA is the host to connect to.
now, with stdin/stdout connected to port 22 on hostC (in the last iteration), ssh performs the authentication against hostC
if authentication is successful ssh creates the ~/.ssh/.sessions/user@hostA/hostB/hostC:22 control channel socket and becomes the master of that control channel
a command prompt is displayed to the calling user
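
The recursion in the steps above can also be followed with a dry run that only prints the inner ssh call for each level. This is a simplified sketch: the function name trace_chain is made up, the ControlMaster and ControlPersist options are left out for brevity, and no connection is attempted.

```shell
#!/bin/sh
# Dry run: print the ssh call each recursion level would make.
trace_chain() {
    # $1 = login name (%r), $2 = host chain (%h), $3 = port (%p)
    case $2 in
        */*)
            printf 'ssh -o "ControlPath ~/.ssh/.sessions/%s@%s:%s" -l %s -p %s %s -W %s:%s\n' \
                "$1" "$(dirname "$2")" "$3" "$1" "$3" \
                "$(dirname "$2")" "$(basename "$2")" "$3"
            # The inner ssh sees the shorter chain and matches Host */* again.
            trace_chain "$1" "$(dirname "$2")" "$3"
            ;;
        *)
            # Single host left: no ProxyCommand, ssh connects to it directly.
            printf '# %s is reached directly\n' "$2"
            ;;
    esac
}

trace_chain user hostA/hostB/hostC 22
```

For the three-host chain this prints one ssh line targeting hostA/hostB with -W hostC:22, a second one targeting hostA with -W hostB:22, and a final comment line for hostA.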

I hope this little trick will save you some time and will make your life easier. :)