Desktop productivity for business analysts and programmers

Replace faulty data

Reply
N/A
Posts: 0

Replace faulty data

Hi everybody,

I have a dataset which contains some faulty data values and a dataset which contains only the corrected values. The variable names are the same.

How do I replace the faulty values in my main data set with the ones in the corrected one?

Thanks in advance!
N/A
Posts: 0

Re: Replace faulty data

I don't believe there is any GUI method to do this in E.G. It strikes me as rather more problematic than helpful to try and add such an interface. So if what I suggest does not work, you should contact Tech Support.

You will need a code node and write some SAS code to perform the update. SAS provide a sample of this code on their web site which you can find by visiting support.sas.com and searching the samples area, or the online documentation. Try this page. http://support.sas.com/kb/24/784.html

Kind regards

David
N/A
Posts: 0

Re: Replace faulty data

There are a number of ways to do it, but from another perspective, there is only one.

Obviously you have to read in both datasets, and replace the incorrect values with the corrected values.

Simply, in form:

Data original;
merge original correct_values;
by var1 var2 ... ;
run;
quit;

The order of original and correct_values is critical.

So, the first thing to do is make a practice set of datasets, and experiment with them until you are absolutely certain you are getting what you want/expect.

The second thing to do, is to be sure that only the variables that have invalid data are in the correct_values dataset to be sure you don't accidentally change variables in original that are correct.

The only other way is to change the variable names in the correct_values dataset set so that you can read in both, not collide by accident, and then explicily/manually set the correct values in the original. From this place, you can do an inplace using _infile_ and _file_ buffers, or some other method.
N/A
Posts: 0

Re: Replace faulty data

for a general update, use the update statement rather than the merge statement. Then changes will replace only new information -- the change records do not need to replace all variables/columns.

PeterC
N/A
Posts: 0

Re: Replace faulty data

I referred the correspondent to the SAS website deliberately. There are a number of problems with using the merge statement, beyond those that relate to the data order and any additional columns on the update data set.

If you are replacing a cell with a missing value or if the column already exists on the source table and the data are not unique at the level of the BY statement then the results will not be as expected.

There are a number of user papers along with very good detailed explanations of MERGE and UPDATE on the SAS website and these should be read carefully and understood.

Kind regards

David
N/A
Posts: 0

Re: Replace faulty data

Hey guys,

I tried to get things right using the update statement. Unfortunately it went wrong and I couldn't figure out what the problem is. Now I did it by left joining the corrected data and successively merging the right data in a few working steps. It worked.

Anyways, thanks for your help!
Ask a Question
Discussion stats
  • 5 replies
  • 134 views
  • 0 likes
  • 1 in conversation