
Tips for doing “shell” scripting in Python

I’ve been moving a bunch of my shell scripts to Python recently, because it handles the concept of “a process didn’t run successfully” so much better. Consider this bash script:

#!/bin/bash

# Command 1
rsync thing otherthing

# Command 2
rsync thing2 otherthing2

# Command 3
rsync thing3 otherthing3

# Send success
echo success!
#curl -v http://dms-monitor/tools/ran_bash_script

What happens here if Command 2 fails?

~/tmp[2]> ./test.bash
rsync: link_stat "/home/jtuckey/tmp/thing2" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1207) [sender=3.1.3]
success!
~/tmp>

Note that we still get the success message at the end: bash silently ignored our failing command and carried on. So let’s look at doing this in Python:

#!/usr/bin/env python3

import subprocess

# Command 1
subprocess.run(['rsync','thing','otherthing'])

# Command 2
subprocess.run(['rsync','thing2','otherthing2'])

# Command 3
subprocess.run(['rsync','thing3','otherthing3'])

# Send success
subprocess.run(['echo','success!'])
#subprocess.run(['curl','-v','http://dms-monitor/tools/ran_bash_script'])

We immediately see there is much more… stuff… just to run some commands. What happens when we run it?

~/tmp> ./test.py
rsync: link_stat "/home/jtuckey/tmp/thing2" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1207) [sender=3.1.3]
success!
~/tmp>

So no improvement – it still silently ignored our errors.
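The failure is not actually lost, it is just never looked at: subprocess.run returns a CompletedProcess object whose returncode attribute holds the exit status. Here is a minimal sketch of that. Since rsync and the example paths may not exist on your machine, it uses a child Python process that exits with status 23 as a stand-in for the failing rsync; that stand-in is my assumption, not part of the original script.

```python
import subprocess
import sys

# Hypothetical stand-in for the failing rsync call: a child Python
# process that exits with status 23.
failing_cmd = [sys.executable, '-c', 'import sys; sys.exit(23)']

# Without check=True no exception is raised on failure.
result = subprocess.run(failing_cmd)

# The exit status is only visible if we inspect the returned object.
print(result.returncode)  # prints 23
```

So the script above could test result.returncode after every call, but that is the same bookkeeping we were trying to get away from in bash.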

Tip – Use check=True

OK, so what we need in Python is the extra argument check=True, which makes subprocess.run raise an exception when the command exits with a non-zero status:

#!/usr/bin/env python3

import subprocess

# Command 1
subprocess.run(['rsync','thing','otherthing'], check=True)

# Command 2
subprocess.run(['rsync','thing2','otherthing2'], check=True)

# Command 3
subprocess.run(['rsync','thing3','otherthing3'], check=True)

# Send success
subprocess.run(['echo','success!'], check=True)
#subprocess.run(['curl','-v','http://dms-monitor/tools/ran_bash_script'])

And now let’s run it:

~/tmp> ./test.py
rsync: link_stat "/home/jtuckey/tmp/thing2" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1207) [sender=3.1.3]
Traceback (most recent call last):
  File "./test.py", line 9, in <module>
    subprocess.run(['rsync','thing2','otherthing2'], check=True)
  File "/usr/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['rsync', 'thing2', 'otherthing2']' returned non-zero exit status 23.
~/tmp[1]>

Okay, much better: the code stopped running once we hit a process that failed, and our script itself exited with a non-zero exit code, which makes it easier to chain with other tools.
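If a raw traceback is more than you want, the exception can be caught and reported more quietly. The following is only a sketch under some assumptions: run_step is a hypothetical helper name, and the failing command is again a child Python process exiting with status 23 rather than a real rsync.

```python
import subprocess
import sys

# Hypothetical stand-in for a failing rsync call.
failing_cmd = [sys.executable, '-c', 'import sys; sys.exit(23)']


def run_step(cmd):
    """Run cmd; return 0 on success, or the child's exit status on failure."""
    try:
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError as err:
        # err.cmd and err.returncode describe the command that failed.
        print(f'{err.cmd[0]} failed with exit status {err.returncode}',
              file=sys.stderr)
        return err.returncode
    return 0


status = run_step(failing_cmd)
```

In a real script you would probably call sys.exit(status) at this point so the caller still sees the failure.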

Tip – functools.partial

So how can we reduce the noise in the script? A lot of characters are repeated on every line we run. One way to approach this is to use functools.partial to partially apply our function before we use it, presetting arguments like check=True. Here we go:

#!/usr/bin/env python3

import subprocess
from functools import partial

r = partial(subprocess.run, check=True)

# Command 1
r(['rsync','thing','otherthing'])

# Command 2
r(['rsync','thing2','otherthing2'])

# Command 3
r(['rsync','thing3','otherthing3'])

# Send success
r(['echo','success!'])
#r(['curl','-v','http://dms-monitor/tools/ran_bash_script'])

Now we can just call r to call subprocess.run with the check=True argument preset. Nice: less noise, which is an improvement.

Note that we can also override a single call’s check argument. Here I explicitly allow Command 2 to fail:

#!/usr/bin/env python3

import subprocess
from functools import partial

r = partial(subprocess.run, check=True)

# Command 1
r(['rsync','thing','otherthing'])

# Command 2 - we allow this one to fail by setting check=False
r(['rsync','thing2','otherthing2'], check=False)

# Command 3
r(['rsync','thing3','otherthing3'])

# Send success
r(['echo','success!'])
#r(['curl','-v','http://dms-monitor/tools/ran_bash_script'])

Running it:

~/tmp> ./test.py
rsync: link_stat "/home/jtuckey/tmp/thing2" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1207) [sender=3.1.3]
success!
~/tmp>
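The override works because keyword arguments given at call time take precedence over the ones stored in the partial object. A small sketch that shows this without needing rsync; the failing command is a hypothetical stand-in (a child Python process exiting with status 23):

```python
import subprocess
import sys
from functools import partial

r = partial(subprocess.run, check=True)

# Hypothetical stand-in for the failing rsync call.
failing_cmd = [sys.executable, '-c', 'import sys; sys.exit(23)']

# The preset keyword arguments are stored on the partial object.
print(r.keywords)  # prints {'check': True}

# A keyword passed at call time replaces the preset value, so this does not raise.
result = r(failing_cmd, check=False)
print(result.returncode)  # prints 23

# With no override, the preset check=True applies and the failure raises.
try:
    r(failing_cmd)
    raised = False
except subprocess.CalledProcessError:
    raised = True
```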

Final Thoughts – how to clean up the list syntax? I’m not sure

The final thing I don’t really like is the list in every command:

['rsync','thing3','otherthing3']

I haven’t really found a good way to fix this, though. One way that works is to split the list each time:

# Command 1
r('rsync thing otherthing'.split())

# Command 2
r('rsync thing2 otherthing2'.split())

I personally find this a bit cleaner to read, but it’s not really that much better. Also, if you interpolate a variable into the string, it can trip you up:

# Command 2
file1 = 'this file has spaces in the name'
r(f'rsync {file1} otherthing2'.split())

Note that what we want to be a single argument now gets split into pieces:

~/tmp[2]> ./test.py
rsync: link_stat "/home/jtuckey/tmp/this" failed: No such file or directory (2)
rsync: link_stat "/home/jtuckey/tmp/file" failed: No such file or directory (2)
rsync: link_stat "/home/jtuckey/tmp/has" failed: No such file or directory (2)
rsync: link_stat "/home/jtuckey/tmp/spaces" failed: No such file or directory (2)
rsync: link_stat "/home/jtuckey/tmp/in" failed: No such file or directory (2)
rsync: link_stat "/home/jtuckey/tmp/the" failed: No such file or directory (2)
rsync: link_stat "/home/jtuckey/tmp/name" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1207) [sender=3.1.3]
Traceback (most recent call last):
  File "./test.py", line 13, in <module>
    r(f'rsync {file1} otherthing2'.split())
  File "/usr/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['rsync', 'this', 'file', 'has', 'spaces', 'in', 'the', 'name', 'otherthing2']' returned non-zero exit status 23.
~/tmp[1]>

This is probably not what we want. For now I just use the list syntax: the benefits of writing in Python and getting the full language are worth a bit of annoying syntax.
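A possible middle ground, which I have not tried in the scripts above, is the standard library’s shlex module: shlex.split splits a string using shell-like quoting rules, and shlex.quote quotes a value so that it survives that split as one piece. Treat this as a sketch rather than a recommendation; it only builds the argument lists and does not run rsync.

```python
import shlex

file1 = 'this file has spaces in the name'

# Plain str.split breaks the filename apart on every space.
naive = f'rsync {file1} otherthing2'.split()

# shlex.quote wraps the value in quotes, and shlex.split honours them,
# so the filename stays a single argument.
quoted = shlex.split(f'rsync {shlex.quote(file1)} otherthing2')

print(len(naive))  # prints 9
print(quoted)      # prints ['rsync', 'this file has spaces in the name', 'otherthing2']
```

The catch is that every interpolated value has to go through shlex.quote, which is easy to forget, so the plain list is still the harder one to get wrong.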
