Deduping with Excel

One of the big challenges with CRM is keeping it filled with high-quality data. This means taking the time to enter data for your consituents, making sure all members of your team are doing the same, and making time to periodically look at your data, cleaning out duplicates as you find them.

One of the most powerful tools for analyzing and working with data is Excel. You can export an Salesforce.com report to Excel, and you can use the power of Excel formulas to make data cleaning easier. Here are a few tips:

Put the Id fields on your report
Always bring over Contact Id and Account Id to your reports if your going to do data cleaning. These Id’s are very helpful in the process as you’ll see.

Use Excel’s sort
You can do more powerful sorting in Excel than you can in Salesforce.com. Deduping by matching first and last name is easier if you sort by those two fields.

Use a simple formula to highlight duplicates
To quickly find duplicates, start by sorting on the fields that you want to search for dupes. I’ll use email in this example.
In a new column in your exported report, add a formula that compares the email for this row with the emails on the row s above and below it.

=IF(OR(A3=A4,A3=A2),"Duplicate Email","")

This formula will print out the word Dupe if the value in cell A3 is the same as that is A4 or A2. You can now quickly scan this new column to see duplicates.

To look for duplicates that have the same first and last name, the formula is slightly more complicated:

=IF(OR(AND(C4=C5,D4=D5),AND(C4=C3,D4=D3)),"Duplicate Contact Name","")

C4 is first name, D4 is last name for the Contact in question. If the name exactly matches the row below or above, it will print the text.

Have fun with hyperlinks
Create hyperlinks back into Salesforce.com for acting on duplicates. To get to a Salesforce.com object like Contact or Account, use a formula like this:

=HYPERLINK("https://na3.salesforce.com/" & B2,"View Contact")

B2 is the Contact Id–a clickable hyperlink is generated that will take you directly to the contact’s record. This will work with any Salesforce.com Id–Contact, Opportunity, Account, etc.

Salesforce has a powerful Contact merge fuction that will combine up to 3 Contacts into one record, preserving any related items like Opportunities. It’s a great way to deal with duplicates. You can link to the wizard from Excel, just pass in the Account Id of the Contacts you want to merge:

=HYPERLINK("https://na3.salesforce.com/merge/conmergewizard.jsp?id=" & B2,"Merge Contacts")

You’ll get dropped in the Merge wizard saving a couple clicks. Just select the contact to merge and you’re off.

Excel formulas can really speed up your data duplication work. If you use these simple tips you can keep your Salesforce.com data as clean as possible, resulting in more successful CRM.

12 Responses to “Deduping with Excel”

  1. Kingsley Joseph Says:

    Don’t forget conditional formatting - combined with sorting, it can give you a bunch of easy wins.

  2. Steve Says:

    An you are the master of conditional formatting, so I believe you!

  3. Chris Says:

    And filtering :-)

  4. Dan Says:

    Speaking of filtering, Ron de Bruin from the Netherlands has a freeware EXCEL add-in named EasyFilter that I find invaluable, particularly its Unique-Duplicate capability. Plop 50K Contacts into an EXCEL workbook and let EasyFilter crank away to find uniques and dupes in a column. I usually concatenate some cells to get things like a Full Name or a Full Name + Account. It takes awhile, even on a powerful machine, but the results are worth it.

    http://www.rondebruin.nl/index.html

  5. Steve Says:

    I just took a look at this add-in. Very nice. Thanks for the tip.

  6. Rob Kingdom Says:

    With the formula above - =IF(OR(AND(C4=C5,D4=D5),AND(C4=C3,D4=D3)),”Duplicate Contact Name”,”")

    How do you apply that to a whole spreadsheet with around 7000 entries?

    Is this possible or is it really easy and im just being thick?

    I’m inexperienced with Excel as you can probably tell.

    Cheers Rob

  7. Steve Says:

    It’s easy. Just copy that formula to each of the 7000 rows and it will work.

  8. Rob Kingdom Says:

    Cheers Steve,

    I’ll have a go and see how i get on.

    Rob

  9. Rob Kingdom Says:

    Had a go at copying the whole formula over the spreadsheet but no luck so far.

    Is there something i’m doin wrong or not doin?

    Cheers,

    Rob

  10. Rob Kingdom Says:

    sorry that should read “formula over the whole spreadsheet”

  11. Kevin Fennelly Says:

    =IF(C2=C3, IF(D2=D3, “DUPE”, ” “), ” “) Is a much more concise version of the excel formula

  12. Andrew Says:

    Decent Excel tutorial.
    However Deduping can be done easier, in my opinion, by importing the data into MS access.
    Simply import the data and add an Autonumber field, create a simple “Make Table Query” with your required fields and the Autonumber field then run it. Create another “Make Table Query” and tie your two tables together using the Autonumber field, run it to make a 3rd table. Export your 3rd table and you are done.
    The beast part is you can save your queries and re-use them for future deduping projects, as long as keep your first and second table names and field names the same.
    You could als make a form and have it run dynamic queries to automate the deduping process.

Leave a Reply