Iterate over a CSV in Python and Pandas

I frequently get data in CSV or Excel format, which I then have to use to deploy a vulnerability management solution like Tenable or Qualys. I use Pandas to process this data, which usually involves iterating over each row of the dataframe. It abuses Pandas. But it works. Yes, it’s a hack. I don’t care.

How to iterate each row of a dataframe in Pandas

If you just need the snippet of code to iterate over each row of a dataframe, here it is.

for index, row in df.iterrows():
    itemIwant = row['column']

It’s an abuse of Pandas. Purists argue you don’t need to do this because you can process entire columns of data in one fell swoop, and that’s faster. But depending on what you’re doing, like making a ton of API calls, iterating over each row may be what you need to do.
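To make the trade-off concrete, here is a minimal sketch of both styles side by side. The dataframe and its host and domain columns are made up for illustration. The column-at-a-time version is what the purists have in mind, and the loop gets the same answer one row at a time.

```python
import pandas as pd

# Hypothetical data; these column names are invented for the example.
df = pd.DataFrame({
    "host": ["web01", "db01"],
    "domain": ["example.com", "example.com"],
})

# Column-at-a-time: one expression builds the whole new column.
df["fqdn"] = df["host"] + "." + df["domain"]

# Row-at-a-time: same result, one row per pass through the loop.
fqdns = []
for index, row in df.iterrows():
    fqdns.append(row["host"] + "." + row["domain"])
```

For pure data wrangling like this, the one-liner is the better choice. The loop earns its keep when each row triggers something outside Pandas, like an API call.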

To put this to use, substitute your own dataframe for df, your own variable name for itemIwant, and the name of the column you want for column.
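Here is a rough sketch of those substitutions with a CSV as the source. The CSV text and the hostname and owner column names are hypothetical stand-ins. With a real file, you would pass its path to pd.read_csv instead of the StringIO object.

```python
import io

import pandas as pd

# Stand-in for a real CSV file; the contents are invented for the example.
csv_text = "hostname,owner\nweb01,alice\ndb01,bob\n"
df = pd.read_csv(io.StringIO(csv_text))

owners = []
for index, row in df.iterrows():
    # 'owner' takes the place of 'column'; owner_name replaces itemIwant.
    owner_name = row["owner"]
    owners.append(owner_name)
```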

The index variable is there because iterrows() hands back an (index, row) pair for each row, so the for statement needs somewhere to put the index. In most cases, I never use it for anything else. Don’t tell my Computer Science 203 instructor please. (Sorry, Dr. Saab.)
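If the unused index bothers you, there are a couple of tidier options. This is only a sketch on a throwaway dataframe: naming the variable _ is the usual Python signal that a value is being ignored on purpose, and itertuples(index=False) skips the index entirely and gives you each row as a namedtuple.

```python
import pandas as pd

# Throwaway dataframe for illustration.
df = pd.DataFrame({"column": ["a", "b", "c"]})

# Option 1: keep iterrows(), but use _ to show the index is ignored.
items = []
for _, row in df.iterrows():
    items.append(row["column"])

# Option 2: itertuples(index=False) yields namedtuples with no index field,
# so each column is available as an attribute.
items_from_tuples = [r.column for r in df.itertuples(index=False)]
```

Attribute access in the second option only works cleanly when the column names are valid Python identifiers, so iterrows() is still the simpler choice for columns with spaces in their names.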

A practical example

Iterating each row of a dataframe in Pandas makes the pandas angry. But sometimes it’s the easiest way.

Let’s say you have an Excel file containing network ranges. You have a column named IPs and a column named Description. Let’s iterate over it and make Qualys asset groups out of those rows. Because we can. Sure, you could just enter them by hand in the UI, but if you have several dozen of them, doing it via the API is faster and less error-prone.

This example uses Parag Baxi’s excellent qualysapi library for simplicity. Yes, he left Qualys years ago but he’s still maintaining it. If you don’t provide any credentials to the connect() call in line 3, it prompts you for them.

import pandas as pd
import qualysapi
qgs = qualysapi.connect()  # prompts for credentials if you don't provide any
df = pd.read_excel('network.xlsx')  # needs columns named IPs and Description
for index, row in df.iterrows():
    ret = qgs.request('/api/2.0/fo/asset/group/',
                      {'action': 'add', 'name': row['Description'], 'ips': row['IPs']})

That’s a super-simple example of a useful script in just 7 lines of code. And you have all the power of Pandas to build on to make it into something even more useful.
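As one example of building on it, here is a sketch of a dry run that uses Pandas to drop incomplete rows and then builds the request parameters without calling Qualys at all. The sample data is made up and stands in for the spreadsheet. The parameter dictionary is the same one used in the script above.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_excel('network.xlsx').
df = pd.DataFrame({
    "IPs": ["10.0.0.0/24", None, "10.0.2.0/24"],
    "Description": ["Office LAN", "No range yet", "Lab"],
})

# Drop rows missing either value so no incomplete request gets built.
clean = df.dropna(subset=["IPs", "Description"])

# Build the parameters for each request without sending anything.
payloads = []
for _, row in clean.iterrows():
    payloads.append(
        {"action": "add", "name": row["Description"], "ips": row["IPs"]}
    )

# Review what would be sent before wiring in the real API call.
for payload in payloads:
    print(payload)
```

Once the printed output looks right, each payload can be passed to qgs.request() the same way the original loop does.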
