In computer science, recursion is a technique in which a subroutine calls itself. So what is recursion in Python? The same thing: a function that calls itself.
Recursion is a concept that first or second semester computer science classes frequently cover, but it can be confusing. I will admit that I generally tried to avoid using it, but it can be useful. I especially find recursion in Python useful when working with APIs, which is a common Python use case.
Uses for recursion in Python
The classic use case for recursion is a problem that you can repeatedly break into smaller pieces. The idea is that you solve part of the problem, then call the routine again to solve another part, and repeat until you reach a piece simple enough to solve directly. At that point the whole problem is solved.
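As a minimal sketch of that idea, here is a hypothetical helper that flattens nested JSON-style data of the kind an API might return. The name `flatten` and the sample record are made up for illustration. Each call handles one level of nesting and calls itself for anything nested deeper.

```python
def flatten(data, prefix=""):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in data.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recursive case: a nested dict is a smaller version of the same problem.
            flat.update(flatten(value, name))
        else:
            # Base case: a plain value needs no further work.
            flat[name] = value
    return flat


# Made-up sample record, purely for illustration.
record = {"host": "router1", "os": {"name": "linux", "version": {"major": 5}}}
flat_record = flatten(record)
```

Here `flat_record` ends up with the keys `host`, `os.name`, and `os.version.major`, however deep the nesting goes.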
The examples from my computer science classes were generally things like computing factorials. I’ve never had to compute factorials in Python though. So here’s an example that solves a couple of problems you’re much more likely to face in Python.
Even if all you do with Python is pull data from APIs and convert it into other formats, sometimes recursion is helpful in unexpected ways.
Handling intermittent errors with recursion in Python
One use case I ran into about a year ago was when a certain product I will not name decided to start randomly throwing HTTP errors half the time. The problem was intermittent, and I had a support case open, but they weren’t exactly setting any records fixing it. In the meantime, my team had end of month metrics due. Our clients were looking for any reason to be upset, so they weren’t exactly understanding.
I was struggling with a way to do retries when the guy I’d hired a month before suggested recursion. Brilliant!
What we did was put our requests call in a subroutine. The subroutine then called itself to retry after any error that wasn’t due to a botched URL or botched credentials.
It was a brute-force approach for sure, and it probably made the underlying problem worse by generating a lot of unnecessary traffic. But it allowed us to get our metrics done. And it only required us to write 12 lines of code. It was fast and efficient, both in computational time and in terms of labor.
Here’s a code snippet.
    import sys
    import time

    import requests


    def api(url, headers):
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            return response
        elif response.status_code == 404:
            print('404 error on', url)
            sys.exit(1)
        elif response.status_code == 403:
            print('403 error, check credentials')
            sys.exit(1)
        else:
            # handle errors other than botched creds or file not found with recursion
            time.sleep(60)  # put a pause in here, don't just hammer your API
            # return the retry's result so the caller gets the successful response
            return api(url, headers)
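The snippet above keeps retrying for as long as the errors continue. One possible variation, not what we ran at the time, is to cap the number of retries so the recursion has a guaranteed stopping point. This is a sketch under assumptions: `call_with_retries`, `fetch`, and `flaky_fetch` are hypothetical names, and `fetch` stands in for the real request so the example runs without a network connection.

```python
import time


def call_with_retries(fetch, retries_left=3, delay=0.0):
    """Call fetch() and retry on failure, at most retries_left more times.

    fetch is any zero-argument callable that returns (ok, result).
    """
    ok, result = fetch()
    if ok:
        return result
    if retries_left == 0:
        # Base case: out of retries, so give up instead of recursing forever.
        raise RuntimeError("still failing after all retries")
    time.sleep(delay)  # pause between attempts, as in the snippet above
    return call_with_retries(fetch, retries_left - 1, delay)


# Stand-in for a flaky API: fails twice, then succeeds.
attempts = []


def flaky_fetch():
    attempts.append(1)
    if len(attempts) < 3:
        return (False, None)
    return (True, "payload")


result = call_with_retries(flaky_fetch, retries_left=3)
```

In real use, `fetch` might wrap the `requests.get` call and report whether the status code was 200, with `delay` set to something like the 60 seconds used above.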
Handling API pagination with recursion in Python
Vulnerability scanners frequently only send you part of the data you request when you make an API call. It is much more reliable to break the data into pieces and send you part of it, and include a link to the next part. But then it’s on you to grab the links and join all the data together.
Here is an example I came up with using the VDOO scanner, a very cool scanner from JFrog that scans firmware images and other similar files and gives you a list of the vulnerable software it finds inside. If you want to objectively compare the security of two routers, VDOO is the kind of tool you use for that. It’s invaluable for embedded security. Why haven’t you heard of it? Because security is more about pay-to-play than solving problems these days. Sorry not sorry for the rant.
This code calls the appropriate API endpoint, uses recursion in Python to get all the pages, merges the JSON into a big Pandas dataframe, and returns the dataframe.
There are other ways to do the same thing, but this is short, only a few lines of code.
    import json

    import pandas as pd
    import requests


    def get_vulns(url, headers):
        response = requests.get(url, headers=headers)
        a = json.loads(response.text)
        # vuln data in a['results']
        # next page url in a['next']. a['next'] is None if it's the last page
        a1 = pd.json_normalize(a, record_path=['results'])
        if a['next'] is not None:
            # recursive case: fetch the remaining pages and append them to this one
            b1 = get_vulns(a['next'], headers)
            c1 = pd.concat([a1, b1])
            return c1
        else:
            # base case: last page, nothing more to fetch
            return a1
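For comparison, one of those other ways is a plain loop that follows the `next` links. The sketch below assumes the same response shape (`results` and `next` keys). `collect_results`, `fetch_page`, and the fake pages are hypothetical stand-ins so it runs without a network connection or Pandas.

```python
def collect_results(first_url, fetch_page):
    """Follow 'next' links with a loop and gather every page's results."""
    results = []
    url = first_url
    while url is not None:
        page = fetch_page(url)  # expected to return a dict like the API's JSON
        results.extend(page["results"])
        url = page["next"]  # None on the last page ends the loop
    return results


# Made-up pages keyed by URL, standing in for real API responses.
fake_pages = {
    "page1": {"results": [{"cve": "A"}, {"cve": "B"}], "next": "page2"},
    "page2": {"results": [{"cve": "C"}], "next": None},
}
all_results = collect_results("page1", fake_pages.get)
```

The loop takes a few more lines than the recursive version, but it builds one list instead of concatenating a dataframe per page. The combined list could then be handed to `pd.json_normalize` once at the end.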
Recursion in Python: in conclusion
Sometimes recursion seems like a solution in search of a problem. But it can be an efficient way to loop while not totally desecrating computer science best practices. I learned to program in BASIC, which had a command called GOTO. Computer scientists don’t like GOTO, because it interrupts the flow of a program and makes that flow difficult to follow. Admittedly, it can cause spaghetti code.
Python doesn’t have a goto statement. I’ve looked. But what I have found is that when I want a goto, recursion can almost always solve the problem more elegantly.