Solved: Re: Overwriting incorrect values

uopsouthpaw · Posted 03-05-2019 03:25 PM

Hello,

I have a data set where I know certain values are incorrect and need changing. I can't change them from my extraction program (as they are locked). I know, I know, but this is what I have to work with. The issue is that there are multiple fields that need to be corrected. If I put all the fields in the merge/overwrite doc, it changes the old values to blank. Is there a way to only take the values specified in the overwrite data set and not all of it? An example of my overwrite data set is below. The "Parent" data set has the below fields and more.

PR ID	Reportability Indicator	Complaint Code (s)	Clinical Harm(s)	Product Family	Model Number	Hospital Country:
1001						Italy
4556						United Kingdom
4557
4584
4585						Germany
4611
9212	Yes
11001	Yes
11028	Yes
12014	Yes
12289				ABCD
18315	Yes
19008	Yes
19149						France
19328	Yes
19649				XYZ
19800	Yes
20229					PN-12345

Thanks,

Wes

ballardw · Posted 03-05-2019 04:14 PM

@PGStats wrote:

Use the update statement. Default behavior doesn't overwrite missing values. To update dataset have with values from dataset overwrite do :
data updated;
update have overwrite;
by PR_ID;
run;

Which will require Have and Overwrite to both be sorted by PR_ID and the Have data set to have only one record with a given value of PR_ID.

If you have multiple records with the same value of PR_ID you would need a second (and possibly more variables) to uniquely identify the record to update and sort both sets by the combination of variables and use that on the BY statement.

If multiple records with the same PR_ID require the same updated values you may want to look into Proc SQL join on PR_ID and Coalesce or Coalescec for the variables involved.

View solution in original post

PGStats · Posted 03-05-2019 03:53 PM

Use the update statement. Default behavior doesn't overwrite missing values. To update dataset have with values from dataset overwrite do :

data updated;
update have overwrite;
by PR_ID;
run;

PG

ballardw · Posted 03-05-2019 04:14 PM

@PGStats wrote:

Use the update statement. Default behavior doesn't overwrite missing values. To update dataset have with values from dataset overwrite do :
data updated;
update have overwrite;
by PR_ID;
run;

Which will require Have and Overwrite to both be sorted by PR_ID and the Have data set to have only one record with a given value of PR_ID.

If you have multiple records with the same value of PR_ID you would need a second (and possibly more variables) to uniquely identify the record to update and sort both sets by the combination of variables and use that on the BY statement.

If multiple records with the same PR_ID require the same updated values you may want to look into Proc SQL join on PR_ID and Coalesce or Coalescec for the variables involved.

Overwriting incorrect values

Re: Overwriting incorrect values

Re: Overwriting incorrect values

Re: Overwriting incorrect values

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away