
I decided to drag myself kicking-and-screaming to the 21st century and start writing my handy-dandy utility scripts in python instead of bash. All was well and good until I made them available to the rest of my team, and suddenly I'm in python dependency hell. I search the internet and there are a lot of different solutions but all have their problems and there's no standard answer.

I decided "to heck with it" and went back to bash. There's no built-in JSON parser but I can use 'grep' and 'cut' as well as anyone so the end result is the same. I push it to our repo, I tell coworkers to run it, and I wash my hands of the thing.



jq has been a lifesaver for me parsing json in bash. Of course, it's an external utility not present by default in most systems.

Another thing to consider is more of a middle-ground approach. Most systems do have a python interpreter, so you can use a lot of base python without worrying about dependency hell. I use inline python in bash all the time, e.g.

  ls | python -c 'import sys,json;lines=sys.stdin.read();print(json.dumps(list(filter(bool,lines.split("\n"))),sort_keys=True,indent=2))'
You can even use variable substitution if you surround the python code in double quotes, and even mix f-strings with bash substitution:

  python -c "print(f'Congrats, ${USER}, you are visitor number ${RANDOM}. This is {__name__}, running in $(pwd)')"


Or use a heredoc to not worry about competing quote chars:

  python << EOPYTHON
  print("Congrats, ${USER}")
  print("You are visitor ${RANDOM}")
  print(f"This is {__name__}, running in $(pwd)")
  print("It's a heredoc to allow both quote characters")
  EOPYTHON


Great trick with using the python standard lib! Thanks for posting that.

edit: You probably already know this, but for anyone reading along, piping `ls` is unsafe if you plan to use the paths for anything except printing them out. A path on Linux can contain any byte except NUL, including newlines, so when `ls` prints them out, you can get broken behavior if you try to split on newlines.
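For anyone who wants a newline-safe version of the `ls | python` trick, here is a minimal sketch. It assumes the paths arrive NUL-delimited (for example from `find . -maxdepth 1 -print0`), and the function name `paths_to_json` is just something I made up for illustration:

```python
import json
import os


def paths_to_json(raw: bytes) -> str:
    """Turn NUL-delimited path bytes into a sorted JSON array of strings."""
    # Split on NUL, the one byte a path cannot contain, and drop empty chunks.
    names = [chunk for chunk in raw.split(b"\0") if chunk]
    # os.fsdecode maps undecodable bytes to surrogate escapes instead of raising.
    decoded = sorted(os.fsdecode(name) for name in names)
    return json.dumps(decoded, indent=2)


# A file name with an embedded newline stays a single entry.
print(paths_to_json(b"b.txt\0a\nweird.txt\0"))
```

In a real script you would feed it `sys.stdin.buffer.read()` rather than the hard-coded sample bytes.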


Just a question - why do you have a dependency hell? You could restrict yourself to the Python standard library, and you would only have one dependency. The Python standard library is much nicer than bash if you need more complex data structures than what bash provides.
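As a rough illustration of what staying inside the standard library can look like, here is a sketch that pulls one field out of a JSON document, the kind of thing the grep/cut approach is standing in for. The helper name `get_field` and the dotted-key convention are my own assumptions, not anything from the parent comment:

```python
import json


def get_field(document: str, dotted_key: str):
    """Return the value at a dotted key such as 'owner.login' in a JSON document."""
    value = json.loads(document)
    for part in dotted_key.split("."):
        # Lists are indexed with plain integers, e.g. 'tags.1'.
        value = value[int(part)] if isinstance(value, list) else value[part]
    return value


sample = '{"owner": {"login": "alice"}, "tags": ["a", "b"]}'
print(get_field(sample, "owner.login"))
print(get_field(sample, "tags.1"))
```

Nothing here needs pip, so the only thing coworkers need is a Python 3 interpreter.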


"grep" and "cut" are not Bash, they are programs and have dramatically different feature sets between distributions and OSes (grep on MacOS is very different from grep on a modern Linux distribution using GNU Coreutils, and there are many incompatibilities). Many scripts that work on Linux won't work on Mac because of this.

With Bash, your best bet for portability is to run scripts in a Docker container. If you want portable code, you have to bundle your dependencies--there's no free lunch here, including Bash.


When I was at Google I had a similar problem (team wasn't using Blaze). So what I did was to have a shell wrapper around every python entrypoint that would just run that python entrypoint (e.g. foo would execute foo.py). The advantage was that the shell script would first set up a virtual environment for every entrypoint and install all the packages in the requirements.txt that was beside the entrypoint (removing any new ones). Each requirements.txt was compiled from a requirements.in file via pip-compile from pip-tools [1], which meant that devs only had to worry about declaring the packages they directly depended on. Any change to requirements.in required you to re-run pip-compile, which (by default) doesn't upgrade any packages & only locks whatever the current version is (automated unit tests would validate that every requirements.txt matched its requirements.in file).

This didn't solve the multiple versions of python on the host. That was managed by having a bootstrap script written in python2 that would set up the development environment to a consistent state (i.e. install homebrew, install required packages) that anyone wanting to run the tools would run (no "getting started guides") which also versioned itself & was idempotent (generally robust against running multiple times). We also shipped this to our external partners in the factory. Generally worked well as once you ran the necessary scripts once no further internet access was required.

It wasn't easy but eventually it worked super reliably.

[1] https://github.com/jazzband/pip-tools
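To make the wrapper idea concrete, here is a minimal sketch written in Python rather than shell. It is not the original tooling: the `.venv` location, the function names, and the use of plain `pip install -r` in place of pip-sync are all assumptions on my part, and it only installs when the venv is first created.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_python(venv_dir: Path) -> Path:
    """Path of the interpreter inside a virtual environment."""
    if os.name == "nt":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def install_command(venv_dir: Path, requirements: Path) -> list[str]:
    """Command that installs the locked requirements into the venv."""
    return [str(venv_python(venv_dir)), "-m", "pip", "install", "-r", str(requirements)]


def run_entrypoint(script: Path, args: list[str]) -> int:
    """Create the venv beside the script if needed, install, then run the script."""
    venv_dir = script.parent / ".venv"  # hypothetical location
    requirements = script.parent / "requirements.txt"
    if not venv_python(venv_dir).exists():
        venv.create(venv_dir, with_pip=True)
        subprocess.run(install_command(venv_dir, requirements), check=True)
    # Run the real entrypoint with the venv's interpreter and pass its exit code on.
    return subprocess.run([str(venv_python(venv_dir)), str(script), *args]).returncode
```

A wrapper named `foo` would then call something like `run_entrypoint(Path("foo.py"), sys.argv[1:])`.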


I actually did something very similar when my application had to execute a python script on any old box and I was strictly forbidden to make any changes on the host machine. My application refused to start if python 3 wasn't found so I didn't have to deal with that mess. It ran bash, set up the venv, did python-y stuff, and cleaned up the venv: take only pictures, leave only footprints.


The caveat is that with mine the venv wasn't destroyed at the end of execution. Instead I stored a snapshot of the sha256sum of the requirements.txt file, which I double-checked on boot. If it had changed, I ran pip-sync.
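The snapshot check is easy to sketch. This is only my guess at the shape of it, with a hypothetical stamp file holding the last-seen digest; the function names are made up for illustration:

```python
import hashlib
from pathlib import Path


def requirements_digest(requirements: Path) -> str:
    """SHA-256 hex digest of the requirements file contents."""
    return hashlib.sha256(requirements.read_bytes()).hexdigest()


def needs_sync(requirements: Path, stamp: Path) -> bool:
    """True when no stamp exists or the recorded digest no longer matches."""
    if not stamp.exists():
        return True
    return stamp.read_text().strip() != requirements_digest(requirements)


def record_sync(requirements: Path, stamp: Path) -> None:
    """Store the digest after a successful install."""
    stamp.write_text(requirements_digest(requirements) + "\n")
```

On startup you would call `needs_sync`, run the expensive install only when it returns True, and then call `record_sync`, so the common path costs one file hash.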

This was critical for devs because this was the underlying thing for all scripts devs ran (build system, terminal to device, unit tests, etc etc). Startup latency was key & I spent time optimizing that to feel as instant as a native executable unless the virtual environment changed which isolated the expensive part (& generally happened more & more rarely for any given tool as I found the dependency set to mature & freeze pretty quickly).

This had a great side benefit making it super-easy to run the scripts once on an internet-connected device & then use that as the base image for all the factory machines that could then be offline because all the virtual envs had been initialized.


This might seem like lunacy, but I really like/recommend Ammonite instead of Python/Bash.

It's Scala, runs on the JVM, and is perfect for writing scripts. (It has a great built in dependency resolver, I mean it uses Ivy, but it downloads the dep by itself, you just import it via the "maven coordinate" - http://ammonite.io/#IvyDependencies )

It gives you a lot more safety/correctness than Python, and it's a bit simpler to install too. (No need to compile extensions, just get JDK8 and it'll run.)


The solution to this (at least the one we've landed on at work) is to make sure your dependencies are packages in a yum repo you include on your systems. For us, that's a local private yum repo our systems have access to, into which we package the perl module requirements that aren't in the public repos. We also include our private libraries there. If the utility script is used commonly enough, we'll make an RPM for it as well, or stick it in one of our general purpose utils RPMs and make sure dependencies are set. If that's done, you don't have to worry about dependencies at all; if not, you might have to manually yum install a few things that are grabbed from our yum repo.

There are lots of ways to handle this problem, but if you're handling lots of systems, you presumably already have a method you use to keep them up to date and secure. You presumably are also installing Python from the system packages (if not, you probably shouldn't be writing system utils in it unless you can ensure it's the same on every system you guys maintain, in which case your dependency problem shouldn't be a problem), so tie into that mechanism. It's a lot easier to reason about when there aren't two competing systems, and presumably you aren't going to do away with the security updates the distro provides.


While I can understand your pain related to dependencies with Python, I still cannot wholeheartedly support this approach. Depending on the case, bash scripts are valuable and should be used instead of Python. However, when used for the wrong use cases, they can be painful for other developers.

I recently received a script from a partner company that used this kind of script for forwarding data to their API. It was quite long and had a few dependencies that were not visible until you (stupidly) executed it.

A few random thoughts:

- Bash scripts can be run in environments where not all of the binaries they depend on are present. In these cases the script might cause damage if it expects everything to be available.

- When someone is unexpectedly required to modify the script, it can be difficult or cause issues if this is done by an inexperienced developer (in this age I wouldn't be surprised)

- If the script uses a program that must be a certain version to produce the wanted results, it may cause issues

- The environment where the script is run is usually not a vacuum. Other scripts might change environment variables or change/remove programs in general
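One way to blunt the first point above is a preflight check that refuses to start when a required program is missing. A minimal sketch, assuming the script knows its own list of external programs (the function names are hypothetical):

```python
import shutil
import sys


def missing_programs(required: list[str]) -> list[str]:
    """Return the required executables that are not found on PATH."""
    return [name for name in required if shutil.which(name) is None]


def preflight(required: list[str]) -> None:
    """Exit before doing any work if a dependency is missing."""
    missing = missing_programs(required)
    if missing:
        sys.exit("missing required programs: " + ", ".join(missing))
```

Calling something like `preflight(["curl", "jq"])` at the top of a script surfaces hidden dependencies before anything has been modified. It does not address the version or environment-variable points.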

While dependencies with Python can cause issues in the future, the trade-off is having some sort of control, as long as you don't execute other binaries directly.


Exception handling and (unit|py)test are worth the headaches.



