Tutorial: How to Compare Two Excel Lists Using DiffEngineX

What's in One List but Not Another?

A common business problem often concerns finding out what names and addresses appear in one list but not another. After the new data has been identified it is useful to be able to extract it into a new Excel workbook.

DiffEngineX can do the bulk of this type of work. Knowing a few Excel tricks and what options to select in DiffEngineX can greatly improve the end results.

Consider the two lists shown below. Even though DiffEngineX has the capability to align similar rows it needs some help from you first. This is because some of the changes involve not just the vertical displacement of rows, but a reordering. In the first list the "Dobbs, Bob" row is before the "Rivers, Doreen" row. In the second list the order has been reversed.

Figure 1 - Two Lists

DiffEngineX will insert blank rows to get existing rows to match up, but it will not reorder them.

To get around this problem you should ask Excel to sort your lists before using DiffEngineX to compare them.

Below we see our two original lists after Excel has sorted them on last and first name. Alternatively we could have sorted them by their ID column.

Figure 2 - Two Lists Sorted by Excel

Without further ado we will now get down to the details.

Step-by-Step Instructions

  1. First Sort using Excel & Save:
    Use Excel to open the two workbooks you want to compare. Click on any cell in the first list. Now click Excel's Data menu (or tab in Excel 2007) and select the Sort item. The Sort dialog will now appear. Sort by Last Name and then by First Name. Hit OK. Now do the same for the second list. In your lists you can sort on any combination of columns that uniquely identifies each row.


  2. Save both your sorted workbooks (under different filenames if you prefer) before closing them.


  3. Start up DiffEngineX - Use Options, Extras & Align Rows:
    Invoke DiffEngineX and click the Options button. In our example we can see that some street addresses are in upper case and others in lower case. Here we don't want such a trivial change to been counted as a modification and so we select the Case Insensitive Comparisons checkbox. Click OK to dismiss the dialog box.


  4. Click the Extras button. In our example both our lists are small, but in real life some lists may contain tens of thousands of rows and have hundreds of differences between them. DiffEngineX uses color to highlight differences in automatically made copies of the workbooks it compares. We don't want to have to fish through thousands of rows just to see a few differences. Make sure the Yes option is selected for Hide Matching Rows. Click OK to dismiss the dialog box.


  5. Select Align Rows on the main part of DiffEngineX's user interface. Use the Browse buttons to point to your sorted Excel workbooks. Click the Start Comparison button.


  6. We now have to tell DiffEngineX what columns uniquely identify each row. As we previously sorted on Last Name & First Name we select columns B and C before clicking the Add button. Hit OK to dismiss the dialog and start the comparison.


  7. The results are shown below in figure 3. We can see DiffEngineX has correctly spotted the three new rows. However the matching rows are still in this workbook. They are only hidden. (You can see that Excel is not showing rows 1, 2, 4, 5 and 7.)


  8. The Final Step: Separate the Wheat from the Chaff:
    Select the Excel worksheet containing the color highlighted new rows. Click Excel's Edit menu and select Go To. Click the Special... button. Click Visible cells only. Hit OK. Select Edit--->Copy. You have now selected and copied just the visible, new rows.


  9. Create a new Excel workbook and use Edit--->Paste to copy across just the new rows. You now have separated the new rows from the hidden matching rows.

Note: For more complicated examples than shown here, rows may end up being colored red, green or purple by default to indicate differences. The colors red and green are used to indicate after row alignment one of two corresponding cells is blank. Purple means a cell has content in both sheets. You will have to inspect both the color highlighted sheets to find out all the differences.

Figure 3 - DiffEngineX Hides Matching Rows

Figure 4 - Use Excel's Edit--->Go To--->Special--->Visible cells only before Copy & Paste

Click here for more information on Florencesoft DiffEngineX.