G'day!
Appreciate any help you can give.
I've two datasets with the following names and content:
I'd like to update the field 'status' in the dataset 'toupdatedataset' with the value 'N' for all of the observations in the 'idsdataset' dataset based on 'id' in both datasets.
The outcome I'm looking for is listed in the dataset 'updateddataset'.
How can I update 'status' field without needing to enter each 'id' individually? Thank you.
data idsdataset;
input id $ type$;
datalines;
1234567 Pub
1245678 Pub
;
data toupdatedataset;
input id $ type $ status $;
datalines;
1234567 Pub .
1245678 Pub .
2349878 Pri N
3948573 Priv Y
;
data updateddataset;
set toupdate;
if id in ('1234567' , '1245678') then status='N';
run;
Here is one way:
data updateddataset; merge toupdatedataset (in=Inleft) idsdataset (in=Inright) ; by id; if missing(status) and inleft and inright then Status='N'; run;
If the data sets on the Merge statement are not sorted by ID then you should sort them prior to merge.
The dataset option IN= creates a temporary variable (means that variable is not written to the data) that is a 1/0 (True/False) value. So you can test if a record is coming from both data sets. I am assuming that you want to change the missing status values if present only. If that is not actually the case then remove the "missing(status) and" part of the If.
Please make sure that you spell your data sets consistently. I could not tell if there was supposed to be a 4th data set named Toupdate that was actually different than Toupdatedataset.
Warning about the above solution: If your ISDATASET contains ID and TYPE not already in the TOUPDATEDATASET they will be added (and status will be missing). If this might occur and you do not want it to then you would subset the data with an "IF inleft;"
Here is one way:
data updateddataset; merge toupdatedataset (in=Inleft) idsdataset (in=Inright) ; by id; if missing(status) and inleft and inright then Status='N'; run;
If the data sets on the Merge statement are not sorted by ID then you should sort them prior to merge.
The dataset option IN= creates a temporary variable (means that variable is not written to the data) that is a 1/0 (True/False) value. So you can test if a record is coming from both data sets. I am assuming that you want to change the missing status values if present only. If that is not actually the case then remove the "missing(status) and" part of the If.
Please make sure that you spell your data sets consistently. I could not tell if there was supposed to be a 4th data set named Toupdate that was actually different than Toupdatedataset.
Warning about the above solution: If your ISDATASET contains ID and TYPE not already in the TOUPDATEDATASET they will be added (and status will be missing). If this might occur and you do not want it to then you would subset the data with an "IF inleft;"
Thanks for catching that I didn't name the dataset in the set statement in the 'data updateddataset' statement 'toupdatedataset' and instead named it 'toupdate'.
I like Ballard's way better (mine is the long way around) but I'll post this anyway:
*populate the missings;
proc sql;
create table populatedidsdataset as
select a.id, a.type,
case
when missing(status) then 'N'
else status
end as status
from
toupdatedataset a inner join idsdataset b
on a.id = b.id and a.type = b.type;
quit;
* keep your good ones;
data parsed;
set toupdatedataset;
if not missing(status);
run;
* smash your populated ones and good ones back together;
data want;
merge populatedidsdataset parsed;
by id;
run;
Hello @jcis7
As I understand from your post, the requirement is that if the id's in both the datasets ( idsdataset and toupdatedataset ) match then set the values of Status to N in the updated dataset . Thus I would go by what @ballardw has given with a minor change. My code would be as follows
data idsdataset;
input id $ type$;
datalines;
1234567 Pub
1245678 Pub
;
data toupdatedataset;
input id $ type $ status $;
datalines;
1234567 Pub .
1245678 Pub .
2349878 Pri N
3948573 Priv Y
;
proc sort data=idsdataset;
by id;
run;
proc sort data=toupdatedataset;
by id;
run;
data updateddataset;
merge toupdatedataset (in=Inleft)
idsdataset (in=Inright)
;
by id;
if inleft and inright then Status='N';
run;
Thank you everyone!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.