Re: How to update a sas dataset using another dataset if there are mis...

Sheeba · Posted 06-04-2018 06:08 PM

Hi All,

I have a sas dataset which has got missing values for two variables (CD_B,CD_A). I would like to update the values from another sas dataset if the values are missing for CD_B,CD_A.

Since the master dataset (one here) will be huge in actual scenario . Is there any way I can avoid sorting and update the missing values?

data one;

input file_type $ cardtl $ acctdtl $ CD_B CD_A ;

datalines;

S IS123 F 123 111

A IS023 H 224 .

B IS223 P . 333

;

run;

data three;

input cardtl $ acctdtl $ CD_B CD_A;

datalines;

IS023 H 224 222

IS223 P 524 333

;

run;

Thanks in advance,

Regards,

Sheeba

Kurt_Bremser · Posted 06-04-2018 06:18 PM

If dataset three is small enough, you could create a format and do it in one sequential step.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Sheeba · Posted 06-04-2018 06:27 PM

Hi ,

Thanks a lot for the reply.

The datasets are huge and the data in the update dataset is bound to change. so I added the default values in a spreadsheet , imported into a dataset. joining the two datasets seems to take up a lot of space and is indefinitely running.

Regards,

Sheeba

Kurt_Bremser · Posted 06-04-2018 07:00 PM

Define "huge" in terms of number of observations.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Sheeba · Posted 06-05-2018 06:45 PM

Hi,

master dataset has over 1 million records and update has less around 200 .

Regards,

Sheeba

Astounding · Posted 06-04-2018 06:56 PM

Depending on the number of updates, you could use the second data set holding the updates to write out a set of IF / THEN statements to a file. Just use FILE and PUT statements. Based on your example data, the file should contain (you can correct this if it's wrong):

if cardtl = 'IS123' and acctdtl='F' then do;

if CD_B=. then CD_B = 224;

if CD_A=. then CD_A = 222;

end;

else if cardtl = 'IS223' and acctdtl='P' then do;

if CD_B=. then CD_B = 524;

if CD_A=. then CD_A = 333;

end;

From that point it's easy:

data one_revised;

set one;

%include "path to file holding IF THEN statements";

run;

Sheeba · Posted 06-05-2018 06:46 PM

Hi Astounding,

Thanks a lot for the response,

Sure I will try this out.

Regards,

Sheeba Swaminathan

s_lassen · Posted 06-05-2018 06:48 AM

While a format can work for random access, as suggested by @Kurt_Bremser, using SET with KEY= is probably faster if you have a lot of data:

proc sql;

create index idx on three(cardtl,acctdtl);

run;

data want;

set one;

if missing(CD_B) then do;

if missing(CD_A) then

set three key=idx/unique;

else

set three(drop=CD_A) key=idx/unique;

end;

else if missing(CD_A) then

set three(drop=CD_B) key=idx/unique;

run;

I made the code a bit more complicated than absolutely necessary (by checking if both variables were missing, and then just using one SET statement), as that will probably give better performance with large tables.

Sheeba · Posted 06-05-2018 06:48 PM

Hi s_lassen,

Thanks a lot for the suggestion,

I will try this out with modified if conditions

Regards,

Sheeba Swaminathan

ruchi11dec · Posted 06-05-2018 07:01 AM

you can do this by update :

data one;

update one three;

by cardt1;

run;

ballardw · Posted 06-05-2018 10:58 AM

@ruchi11dec wrote:

you can do this by update :

data one;

update one three;

by cardt1;

run;

Update done this way would require 1) both data sets to be sorted (which the OP explicitly stated he didn't want to do) and 2) that the value of the by variable cardt1 uniquely identify each record, which we don't know.

The NOTSORTED option cannot be used with Update.

The MODIFY option might work but still relies on having BY variables to uniquely identify the records to modify.

data work.class;
   set sashelp.class;
run;

data work.update;
   name='Henry';
   height=99;
run;

data work.class;
   modify work.class
          work.update
   ;
   by notsorted name;
run;

Sheeba · Posted 06-05-2018 06:52 PM

Hi ballardw,

Thanks a lot for the suggestion.

there was no unique by variable key to identify the records. So I created a unique key and then used modified statement and it worked for me.

Thank you.

Regards,

Sheeba

ruchi11dec · Posted 06-05-2018 06:49 PM

Hi ruchi11dec ,

Thanks a lot for the response,

I tried update statement but I don't have a unique key in both the datasets..

Regards,

Sheeba Swaminathan

How to update a sas dataset using another dataset if there are missing values in some columns ?

Re: How to update a sas dataset using another dataset if there are missing values in some columns ?

Re: How to update a sas dataset using another dataset if there are missing values in some columns ?

Re: How to update a sas dataset using another dataset if there are missing values in some columns ?

Re: How to update a sas dataset using another dataset if there are missing values in some columns ?

Re: How to update a sas dataset using another dataset if there are missing values in some columns ?

Re: How to update a sas dataset using another dataset if there are missing values in some columns ?

Re: How to update a sas dataset using another dataset if there are missing values in some columns ?

Re: How to update a sas dataset using another dataset if there are missing values in some columns ?

Re: How to update a sas dataset using another dataset if there are missing values in some columns ?

Re: How to update a sas dataset using another dataset if there are missing values in some columns ?

Re: How to update a sas dataset using another dataset if there are missing values in some columns ?

Re: How to update a sas dataset using another dataset if there are missing values in some columns ?

Catch up on SAS Innovate 2026

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away