×
INTELLIGENT WORK FORUMS
FOR ENGINEERING PROFESSIONALS

Log In

Come Join Us!

Are you an
Engineering professional?
Join Eng-Tips Forums!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!
  • Students Click Here

*Eng-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

Students Click Here

Jobs

Big List cleanup

Big List cleanup

Big List cleanup

(OP)
I have wondered about this and do it manually myself. If you have long lists and they have simple errors  like o instead 0 and dates in the wrong format and data with other simple error, is there tool to help remove those errors before you validate the data.  

RE: Big List cleanup

You may be able to use Ctrl 'H' for search and replace for the letters.

For the date formats, that's another problem... maybe someone here can help... You can copy one date, and then past this correct date to an offending cell... other than that, I don't know.

Dik

RE: Big List cleanup

Well, there is no magic command that can know what is supposed to be right and find everything that is wrong.

Filters and sorting can make many simple errors visually obvious.  O vs. 0 can be easily found this way if it's a single character.  the number will be on one side of the column, the letter on the other.

If there is a "rule" about what is supposed to be in a cell then you can use a formula or VBA to check for compliance to the rule.

Identifying the exact nature of a rule violation and correcting it is much harder, and usually it's faster to make the correction manually.

Better yet, use data validation to prevent erroneous entries in the first place.

RE: Big List cleanup

(OP)
I mean something where you list the possible types of errors, it autoreplaces the errors but still retains the changes without leaving in the spreadsheet so that you still review the changes after all the errors are not necessarily errors (a computer can't tell if there is a formatting error has placed the number in the place) - a very long spreadsheet is already very large and the corrections log could blow your storage capacity for the little error tracking gain.
My cases were chemical lists for chemical management where the lists come from all over the university and they fill in alphapropanol (50%) as  a-propanol, a-propanol (using the alpha symbol and alpha-propanol and lists of microsoft codes with dates in wrong formats. Equally the errors could be due a data source error like a data-logger that occasionally makes a symbol instead of a number or screws up the date so its American rather than British.    

  

RE: Big List cleanup

Use drop-down lists and data validation to restrict entry to only the "proper" spelling/format/whatever.

Regular expressions might be helpful in a VBA function to help weed things out, but any programmatic approach can only catch errors that you can foresee.

RE: Big List cleanup

With the date I would recommend to use for that field a format that displays the month in (3) letters. That will help anyhow to see if it is a date in the correct time frame.

If you would have a specific (long) list of your chemicals you could add a Conditional formatting to the cells or indeed a data validation.

However, data validation either prevents to enter certain values or will issue a warning. Afterwards you could not see if somebody did not care about the warning. Also data validation works only if you are really entering data manually. If you would copy/paste data the data validation would be lost.

Conditional formatting could also be added to fields that should contain numeric values.

A disadvantage of Conditional formatting could be that it may take more space and could somewhat slow down your Workbook if applied to many 1000s of cells.

Why an easy solution if you can make it complicated?
Greetings from the Netherlands

RE: Big List cleanup

If I'm understanding your data source, you have my sympathy.  It sounds like you get data from all sorts of sources from manual entry to datalogger entry with no control on your part of how the folks capture the data.  People that say "just put in an input screen" don't understand.

I've done that sort of thing (just this week actually) and while I'm quite comfortable writing VBA, the patterns are too distributed for a program to help much (i.e., out of a million records across 2000 sheets there are 100,000 errors, but no single error occurs more than 100 times, faster to edit it than write a program).

I always start by making a copy (because most of the time I have to abandon some non-trivial number of attempts).  Then I use "sort" liberally to try to group particular errors for repair.  The easy stuff is "o" for "0" in a field that is only numbers.  You can use "replace" for that.  The impossible stuff is knowing if "02/04/2012" is Feb 4 or March 2.  Everything else is somewhere in between.

Often it is useful to dump all the spreadsheets into Access (with all of the fields "text") and use the Access sorting and screening tools.  They are much more powerful and you don't risk sorting a field without sorting the identification data (we've all ignored the warnings in Excel).

Good luck and try to stay awake while doing this horrible, mindless task (it really sucks to introduce a bunch of new errors by doing a face plant in the keyboard).

David

Red Flag This Post

Please let us know here why this post is inappropriate. Reasons such as off-topic, duplicates, flames, illegal, vulgar, or students posting their homework.

Red Flag Submitted

Thank you for helping keep Eng-Tips Forums free from inappropriate posts.
The Eng-Tips staff will check this out and take appropriate action.

Reply To This Thread

Posting in the Eng-Tips forums is a member-only feature.

Click Here to join Eng-Tips and talk with other members!


Resources