I chose a form of dependency injection (DI) as the solution to a recent problem I needed to solve. This is a quick write-up of how I did it in Python, using the standard library modules.
The Problem
I’ll be illustrating the problem in a slightly different context, so I don’t derail too much into the subject, which would just be a distraction. So, if the problem feels unrealistic or stupid, that’s just a result of my unimaginative thinking.
We have an application, let’s call it the task runner, that lets users choose a task to run, and runs that task. Each such task is implemented as a separate Python script file that takes no user input, but does connect to a database and a few REST endpoints.
So this is how it works. We have a bunch of wrapper classes that provide high-level abstractions for the database and the REST endpoints. Instances of these classes are given to the task scripts, which use them to perform their task.
The task scripts are also expected to return some information to our application, such as whether the task succeeded or, if it failed, the reasons why. How this is done is covered in the Solution sections that follow.
The Legacy Solution
The current way this works (which I took the liberty of calling The Legacy Solution) is that when a user requests a specific task to be run, the application reads the relevant Python script file and calls eval on the contents. A pre-made dictionary holding all the instances of the high-level abstractions is provided as the global scope of this call to eval.
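As a rough sketch of what that could look like (not our actual code): eval on a plain string only accepts expressions, so this version assumes the source is first compiled in "exec" mode. The function name run_legacy_script and the choice to hand the resulting globals back to the caller are assumptions made for illustration.

```python
from pathlib import Path


def run_legacy_script(path, services):
    """Evaluate a task script with the service instances as its globals.

    `services` is a dict such as {"users_service": ...}; both this helper
    and its return value are hypothetical, for illustration only.
    """
    source = Path(path).read_text()
    # eval() on a raw string handles expressions only, so compile the
    # script's statements first. "<string>" mirrors what eval would report.
    code = compile(source, "<string>", "exec")
    # Copy the dict so one script cannot pollute the globals of the next.
    script_globals = dict(services)
    eval(code, script_globals)
    return script_globals
```

Every name the script uses without defining it has to be a key in that dictionary, which is exactly the implicit coupling discussed further down.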
This has been working well for several years now and, although it feels dirty in hindsight, there were probably good reasons it was done this way:
- It was very simple and easy to implement. There’s little to no magic.
- The task scripts can be updated on production without restarting the application and the changes would take effect immediately.
- The scripts’ logic can be written as module level code. Full freedom on how the code is structured and written.
Arguing about how horrible this approach is would be a great topic for a heated debate and, fortunately, that’s not what I set out to write about here. This simple method, while it worked, didn’t scale with the team. We soon decided to move to a more sophisticated approach and so started looking.
A major reason (among several) for this decision was to have the scripts not depend on implicit globals. The use of implicit globals meant that the scripts were using variables that appear undefined to static code analyzers. Additionally, since the script file was being read into a string and eval-ed, the stack traces from any errors were not very helpful.
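To make both complaints concrete, here is a small sketch. The helper last_frame_filename is made up for this post; it evaluates a string and reports which file the innermost traceback frame claims to come from.

```python
import traceback


def last_frame_filename(expression, scope):
    """Evaluate `expression` and return the filename of the failing frame.

    Returns None when evaluation succeeds. Hypothetical helper, for
    illustration only.
    """
    try:
        eval(expression, scope)
    except Exception as exc:
        # The innermost frame belongs to the evaluated string itself.
        return traceback.extract_tb(exc.__traceback__)[-1].filename
    return None


# `users_service` is not defined anywhere in the expression; it only works
# if the caller happens to inject it. Here the scope is empty, so it fails.
print(last_frame_filename("users_service.count()", {}))
```

The reported file name is the placeholder "<string>" rather than the path of a script, so the traceback alone doesn’t say which task file went wrong.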
The New DI Solution
In the newly proposed approach, we have made three critical changes:
- The Python script files will be import-ed as Python modules, and the task_main function at the module level will be called to run the task.
- Nothing is implicitly injected into the script’s global scope.
- Access to the API abstractions is done through a form of dependency injection.
In the task scripts, we have a function defined like the following:
def task_main(users_service, sales_service):
    # do something with `users_service` and `sales_service`.
    ...
Here, the task_main function is defined to accept two arguments: users_service and sales_service. In our task runner application, we use the inspect module to identify the abstractions that task_main asks for and pass them accordingly. Here’s how it works:
import importlib
import inspect
from pathlib import Path


def run_task_script(script):
    # Drop only the trailing extension, e.g. 'cleanup.py' -> 'cleanup'.
    module_name = Path(script).stem
    module = importlib.import_module(module_name)

    # The parameter names of task_main tell us which services it needs.
    args = inspect.signature(module.task_main).parameters
    kwargs = {}
    for name in args:
        kwargs[name] = get_service_instance(name)

    response = module.task_main(**kwargs)
    record_task_response(response)
In this function, we first convert the script file name into its module name (hoping it doesn’t contain any spaces or dash characters). Then, we use the importlib module to import the module of that name. Next, we call the inspect.signature function on the module’s task_main function to get its parameter names.
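If file names with spaces or dashes are a worry, one alternative is to load the script by path instead of by module name. The following is a minimal sketch using importlib.util from the standard library; load_task_module is a hypothetical helper, not part of our task runner.

```python
import importlib.util
import re
from pathlib import Path


def load_task_module(path):
    """Load a task script from a file path as a fresh module object."""
    path = Path(path)
    # The module is located by path, so the name is mostly cosmetic; we
    # still turn it into something that looks like a valid identifier.
    module_name = re.sub(r"\W", "_", path.stem)
    spec = importlib.util.spec_from_file_location(module_name, str(path))
    module = importlib.util.module_from_spec(spec)
    # Runs the script's module-level code, which defines task_main.
    spec.loader.exec_module(module)
    return module
```

Since the module is never registered in sys.modules, each call executes the file again rather than reusing a cached import, which is close to how the legacy approach behaved when a script was edited in place.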
Based on these argument names (in args), we then construct a dictionary with these names as keys and the instance of the API abstraction class as the value. We then pass this as the keyword arguments to the call to module.task_main.
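The same idea can be tried out without any script files. In the sketch below, the service classes, the SERVICES registry, the body of get_service_instance and the call_with_services helper are all stand-ins I made up for illustration; the real abstractions obviously do more.

```python
import inspect


# Hypothetical stand-ins for the real API abstraction classes.
class UsersService:
    def count(self):
        return 3


class SalesService:
    def total(self):
        return 120


# Hypothetical registry: parameter name -> service instance.
SERVICES = {
    "users_service": UsersService(),
    "sales_service": SalesService(),
}


def get_service_instance(name):
    """Look up a service by parameter name, failing loudly if unknown."""
    try:
        return SERVICES[name]
    except KeyError:
        raise LookupError(f"no service registered under {name!r}") from None


def call_with_services(func):
    """Call func, passing each declared parameter from the registry."""
    names = inspect.signature(func).parameters
    return func(**{name: get_service_instance(name) for name in names})


def task_main(users_service, sales_service):
    return {"users": users_service.count(), "sales": sales_service.total()}


def users_only(users_service):
    # Asks for a single service, so only that one is passed in.
    return users_service.count()


print(call_with_services(task_main))
print(call_with_services(users_only))
```

A task that names a parameter the registry doesn’t know about fails with a LookupError before the task body runs, which is arguably easier to debug than a NameError somewhere in the middle of a script.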
In this way, the scripts don’t assume any implicit globals, and task_main accepts the arguments that it needs and no more. This makes the code much cleaner and easier to do static analysis on. Besides, since we import the module and call a function in it, we get nicer stack traces when there’s an exception.
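A quick way to see the difference in stack traces is to import a deliberately failing task and look at the innermost traceback frame. The script name failing_task and the helper failing_frame below are invented for this demo, and the script is written to a temporary directory only so that the example is self-contained.

```python
import importlib
import sys
import tempfile
import traceback
from pathlib import Path


def failing_frame(module_name):
    """Import a task module, run task_main, return the failing frame."""
    module = importlib.import_module(module_name)
    try:
        module.task_main()
    except Exception as exc:
        return traceback.extract_tb(exc.__traceback__)[-1]
    return None


# Hypothetical task script that always fails.
script_dir = tempfile.mkdtemp()
script_path = Path(script_dir) / "failing_task.py"
script_path.write_text("def task_main():\n    raise ValueError('boom')\n")

# Make the temporary directory importable for this demo.
sys.path.insert(0, script_dir)
importlib.invalidate_caches()

frame = failing_frame("failing_task")
print(frame.filename, frame.lineno, frame.line)
```

The frame now carries the path of the script file, the line number, and the source line that raised, all of which were missing when the code came from an eval-ed string.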
Conclusion
I’m sure there are better and more involved ways of doing DI in Python, but what we’ve done above is enough for the target problem. Additionally, it’s just using the standard library, so, extra brownie points for that!