This project was made for Nestle Polska S.A. from Torun. My task was to create a script for automatic correction of contractor's data with SAP, MS Access, MS Excel and VBA. I needed to analyze existing big dataset (over 300 thousands rows), to find and correct details or remove duplicated companies from it.
The problem was that every day sales representatives add tens of new companies into the database. Unfortunately while doing so they sometimes make mistakes, i.e. adding company that already exists in the database under a slightly different name, misspelling the name of a city, street, company (e.g. the Polish city Białystok was written in more than 40 different ways), do not providing all required information.
The final solution was made in two steps:
The results of such approach were more than satisfying. Sometimes system required a user attention to correct the data or create new rules in tables mentioned above. But the longer system worked, the more rules it contained and the less user attention was required.
Using the created application it was shown that ~300 000 companies written into the database were in fact around 110 000 unique companies.
In first 1 month, more than 5 000 correction rules were created resulting in just a few beging added each day and just 10% of the newly created companies required a user intervention.