Countif example: Data analysis with Excel

The lowly Countif function is one of the most useful tools for data analysis you’ll find in Excel and other spreadsheets. It’s also not all that well understood, I find. Knowing how to use it sure has made my job much easier since 2013, so I think there’s a chance it will help you too. Here’s a Countif example to show you how to use it.

There’s a misconception out there that you need to be a software developer in order to understand and use Countif. While the syntax does indeed look a lot like a programming language, it’s simple enough that you don’t have to be a programmer to do it. If you understand the mathematical concepts of functions and variables, you can make sense of it. Or if you’re a guy like me who’s written a few simple programs here and there over the years, that’s more than enough. You don’t have to be a good software developer to understand Countif.

Also, I’ll mention Excel a lot here. It’s what I use most frequently and there’s a good chance it’s what you use too. But other spreadsheets, such as Libre Office Calc, usually also offer the Countif function. And it works about the same way.

What Countif is

Countif example
Countif is one of the most useful functions for analysis in Excel and other spreadsheets. Here’s a Countif example so you can put it to work for yourself.

Countif is a function to count the number of occurrences of a value in a range of data. It literally means “count the number of ifs.” An if function returns a Boolean value of one for a match, and zero for a non-match. Therefore, Countif provides you a count of the number of matches.

Several years ago I was working deploying a computer system at a large company. I would get lists of computers from various data sources within the company and use Countif to see how many of those computers were reporting in. It was a quick way for me to find computers that weren’t reporting in. Then I could investigate whether they still existed or needed to be reconfigured. We’ll go through that example in a minute, but first I want to go through a few simpler examples to show how Countif works, and a couple of odd tricks with it that you may not have thought of.

The Countif formula

Countif just takes two arguments: the range of data and the value you want to check for matches. This makes it one of the easier Excel functions to use.

A super-simple Countif example

Here’s a really simple Countif example that should suffice to illustrate the concept. You may be able to adapt it as-is to solve simple problems. Here’s the starting lineup for the 2015 Kansas City Royals, including first name, last name, position and age of each player.

Position First Last Age
C Salvador Perez 25  
1B Eric Hosmer 25  
2B Omar Infante 33  
SS Alcides Escobar 28  
3B Mike Moustakas 26  
LF Alex Gordon 31  
CF Lorenzo Cain 29  
RF Alex Rios 34  
DH Kendrys Morales 32

Age is in column D. To figure out how many 25-year-olds this team had, enter the following equation in any blank cell:

=countif(D2:D10,25)

You should get two, since Eric Hosmer and Salvador Perez were both 25.

A Countif example to match text strings

The players’ first name is in column B. To figure out how many players they had with a first name of Alex, enter this equation in any blank cell. Note that to match anything other than a numeric value, you have to use quotes.

=countif(B2:B10,”Alex”)

In this case you’ll also get two, since both Alex Gordon and Alex Rios match.

If you want to match the entirety of a column, without specifying upper and lower bounds, change the formula like this:

=countif(D:D,25)

On large spreadsheets, the comparison can work faster if you specify a bounds limit. On a spreadsheet of this size, you won’t notice a difference.

A countif example of comparisons with greater than or less than

What if you want to do something other than a simple match? You can do that too. Here’s a formula to determine how many players the team had under the age of 31:

=countif(D2:D10,”<31″)

Note that you have to put the formula in quotes, like you would a text string. The result is five, since Salvador Perez, Eric Hosmer, Alcides Escobar, Mike Moustakas, and Lorenzo Cain were all under 31.

To do greater than or equal, or less than or equal, simply add an equal sign to the equation like this:

=countif(D2:D10,”<=31″)

The result increases to six, since Alex Gordon was 31 years old.

I’m not sure how useful this is, but you can compare on strings too.

=countif(C2:C10,”<G”)

This will tell you how many players’ last names start with letters before the letter “G” in the alphabet.

A countif example with multiple datasets

Let’s go back to my example from earlier in my career, when I would compare lists of computers to see what was reporting in and what wasn’t. To quickly check them, you can just copy the two columns into the same worksheet.

Here’s a list with three systems in each column. Of course this is much useful with a larger number of systems but this makes it nice and easy to illustrate.

In system New List Reporting?
STLLNX01 CINLNX01 =countif(A:A,B2)
DETLNX01 STLLNX01 1
DETWNT24 CINWNT01 0

In this case, I’ve entered an export from my system into column A, and my new list of systems in column B. My countif formula looks just a little different. I include a range, but in the second value, I provide the coordinates of a cell, rather than a hard-coded value. If I copy and paste that value down to fill up column C, Excel automatically adjusts the value.

The countif formula will display either 1 or 0. A value of 1 indicates a match. A value of 0 indicates it doesn’t match. My systems that start with “CIN” don’t appear in column A, so they get a value of 0. STLLNX01 is in column A, so it matches.

Now I can add up the value of column C to see how many systems match between the two lists. I can filter on column C on the value of 0 to find the systems that aren’t reporting.

Using Excel along with countif let me get a lot of systems reporting in a hurry, because it only took me a few minutes to find the systems that were missing.

Using countif as a substitute for pivot tables

Sometimes instead of using a pivot table, I’ll copy the row of data I’m interested in over to column B of a new worksheet, then deduplicate it. Then I’ll copy a new copy of the same row into column A. Then, in column C, I’ll paste this formula:

=countif(A:A,B2)

Then I’ll copy and paste that formula all the way down, or click the box in the lower right-hand corner of the cell to cause it to fill down.

This gives me counts of all the unique values in that column. Depending on how much data you have, this can be less CPU-intensive and memory-intensive than a pivot table. I like to zero in on the most common and least common values in the data set. In my line of work, the extremes are the most interesting.

If pivot tables are causing Excel to crash on you, or they’re taking 30 minutes to finish and driving you nuts, try this trick instead.

Using countif across worksheets or tabs

If you have a worksheet full of data and don’t have room to add extra columns of data for your criteria and your countifs, you can put them in a separate worksheet and reference it. Simply add a new worksheet to your document, then paste your values into column A. Then, in column B, you just have to write your countif statement a bit differently:

=countif(Sheet1!A:A,A2)

This formula assumes your sheet’s name is Sheet1. Just change it as appropriate if your sheet has a different name. You specify the sheet you want, followed by an exclamation point, followed by the range. This will let you accumulate statistics on a sheet separate from your data, which may be less distracting if you have a lot of data.

Using countif across different files

It’s messy, but you can issue a countif against an entirely different file, too, if you want. Here’s a cross-file countif example:

=COUNTIF(‘file:///C:/Users/dave/Documents/Feb 2018 scan.xlsx’#$Sheet1.A$1:A$1048576,A2)

In this example, I’m referencing column A in a file called Feb 2018 scan.xls stored in my Windows user profile. The formula to follow is filename#$sheetname.range. The range is a bit tricky because you have to separate the row and the line with a $ character, and you have to specify the numeric range. You can’t cheat and use A:A to reference the whole row.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this:
WordPress Appliance - Powered by TurnKey Linux