How to merge data in Excel

I understand business analysts ask each other how to merge data in Excel in job interviews. The rest of us struggle and swear a lot. If you don’t count a business analyst among your best friends, here’s the secret you need to merge data with the best of them. Including the detail everyone seems to forget.

I’ll talk about Excel here, but you may be able to adapt this to other spreadsheets too.

How to merge data in Excel using VLOOKUP

The simplest way to merge data is to use the VLOOKUP function. VLOOKUP is clunky but straightforward to set up, as long as you’re willing to re-order columns and can do simple one-for-one lookups like an employee number to a name, or a computer’s hostname to its IP address.

To set up VLOOKUP, copy your data into a separate tab. Put your index data in column A and your additional data in the adjacent columns.

Now you can use a formula like this:

=VLOOKUP(a1,sheet2!a:b,2,FALSE)

a1 is your criteria. This is the piece of data that’s common to both sheets.

sheet2!a:b is your data range. This is the table of data you’re pulling in. For simplicity’s sake, you can just specify each column of data, or you can narrow it down for speed.

The number is the column that contains the data you want to pull in.

And that’s it. For a simple quick and dirty merge, this works well enough. After you pull your data in, you may want to highlight the columns, copy, then paste values over those columns to freeze the data and remove the dependency on the second sheet. This also keeps Excel from having to pull the data in ever again, which can be a slow operation if you have large sheets.

Here are some more secrets to VLOOKUP if you are interested.

How to merge data in Excel using INDEX and MATCH

Most people try to use VLOOKUP to merge data in Excel. VLOOKUP works, but it’s clunky. You have to arrange the columns you want to merge in left-to-right order, for one. If you’re trying to use VLOOKUP and it doesn’t work, I’ll bet that’s your problem. Try moving your lookup column to column A and it will probably start working.

The other problem with VLOOKUP is it doesn’t work if you want to merge on two columns and not just one. You need two columns to do things like taking data from two different IT security tools and mashing it together, maybe because one of the tools provides invaluable insight into what to fix, but the other has better detail of why or how you have to fix it.

I’ll use an easier example than that, though.

The formula using the INDEX and MATCH functions is incredibly cryptic. There are other sites that talk about how it works, if you really want to know. I’ll just tell you how to build it, and where to make your substitutions so you can use it too, and you can get back to delivering that report you said two hours ago should only take a couple of minutes.

An example using two sheets

Let’s take two sets of sales data. Both sets of data contain a customer name and product. But one has the amount they owe and one has the customer’s address and phone number.

Let’s create two tabs, and call one Billing and one Shipping. I like to use single-word names for tabs. It makes things easier.

Here’s Billing.

how to merge data in Excel

And here’s Shipping.

how to merge data in Excel

Notice how the columns aren’t in the same order? No problem. We’re going to build a scary-looking formula to pull over the phone number and the address into the billing sheet. And the formula doesn’t care what order the data is in. So add two more columns to the billing sheet, one for the address, and one for the phone number.

Constructing INDEX and MATCH

The first thing you need to do is figure out the columns you want to match on. Database administrators would call them key fields, but I don’t care what you call them. In this case, we need to match on Customer and on Product. Make a note that Customer is in column A in both sheets, and that Product is column B in Billing and column D in Shipping.

Here’s the formula to pull in the address:

=INDEX(SHIPPING!$A$2:$D$7, MATCH(1, (A2=SHIPPING!$A$2:$A$7) * (B2=SHIPPING!$D$2:$D$7) ,0), 2)

DO NOT hit enter when you finish typing it. You’ll get an error and you’ll say I’m just like every other blogger who posted stuff that doesn’t work. Except you won’t say “stuff.” Instead of hitting enter, hit SHIFT-CTRL-ENTER. This tells Excel to handle the formula as an array, and then it will pull the value in. Someone showed me this trick in 2012 and I forgot it and only just now rediscovered it.

Let’s step through the pieces to demystify how to merge data in Excel.

Defining your array

=INDEX(SHIPPING!$A$2:$D$7, MATCH(1, (A2=SHIPPING!$A$2:$A$7) * (B2=SHIPPING!$D$2:$D$7) ,0), 2)

The portion in bold is a reference to the data you want to pull in. The data we want is on a sheet named SHIPPING, and the data starts at cell A2 and ends at cell D7. Simply adjust this for your sheet. You’ll need to change the name, and the upper bounds of the range.

Defining criteria #1

=INDEX(SHIPPING!$A$2:$D$7, MATCH(1, (A2=SHIPPING!$A$2:$A$7) * (B2=SHIPPING!$D$2:$D$7) ,0), 2)

The portion in bold tells Excel to look in column A on the Shipping tab for a match in Cell A2. You’ll need to adjust it to match the tabs in the data you’re trying to merge–both the columns and the lower bounds. In this case, both sheets have customer data in column A, making this one pretty easy.

Defining criteria #1

=INDEX(SHIPPING!$A$2:$D$7, MATCH(1, (A2=SHIPPING!$A$2:$A$7) * (B2=SHIPPING!$D$2:$D$7) ,0), 2)

The portion in bold tells Excel to look in column D on the Shipping tab for a match in Cell B2. You’ll need to adjust it to match the tabs in the data you’re trying to merge. This one’s a little trickier since the data we want to merge is in column D in the shipping tab and column b on the billing tab.

Defining the result

=INDEX(SHIPPING!$A$2:$D$7, MATCH(1, (A2=SHIPPING!$A$2:$A$7) * (B2=SHIPPING!$D$2:$D$7) ,0), 2)

This last value is simply the column number that contains the data we want. In this case, it’s column 2, the address. To pull in the phone number, simply copy the function and change the number to the column containing the phone number.

How to merge data in Excel: In conclusion

Once both formulas work, fill them down to fill the sheet. If you have a large sheet, I really recommend you save the sheet, then highlight your merged data, copy it, then do a paste special and select paste values. This will make your sheet a lot less computationally intensive.

And that’s how you merge data in Excel. Good luck on your project.

If you found this post informative or helpful, please share it!