BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jcis7
Pyrite | Level 9

G'day!

 

Appreciate any help you can give.

 

I've two datasets with the following names and content:

  1.  'idsdataset' that has the variable name 'id' and 'type'
  2. 'toupdatedataset' that has the variable names 'id', 'type', and 'status'

 

I'd like to update the field 'status' in the dataset 'toupdatedataset' with the value 'N' for all of the observations in the 'idsdataset' dataset based on 'id' in both datasets.

The outcome I'm looking for is listed in the dataset 'updateddataset'.

 

How can I update 'status' field without needing to enter each 'id' individually?  Thank you.


data idsdataset;
   input id $ type$;
   datalines;
1234567 Pub
1245678 Pub
;
data toupdatedataset;
   input id $ type $ status $;
   datalines;
1234567 Pub .
1245678  Pub . 
2349878 Pri N
3948573 Priv Y
;

data updateddataset;
  set toupdate;
if id in ('1234567' , '1245678') then status='N';
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Here is one way:

data updateddataset;
  merge toupdatedataset (in=Inleft) 
        idsdataset      (in=Inright)
  ;
  by id;
  if missing(status) and inleft and inright then Status='N';
run;

If the data sets on the Merge statement are not sorted by ID then you should sort them prior to merge.

The dataset option IN= creates a temporary variable (means that variable is not written to the data) that is a 1/0 (True/False) value. So you can test if a record is coming from both data sets. I am assuming that you want to change the missing status values if present only. If that is not actually the case then remove the "missing(status) and" part of the If.

 

Please make sure that you spell your data sets consistently. I could not tell if there was supposed to be a 4th data set named Toupdate that was actually different than Toupdatedataset.

 

Warning about the above solution: If your ISDATASET contains ID and TYPE not already in the TOUPDATEDATASET they will be added (and status will be missing). If this might occur and you do not want it to then you would subset the data with an "IF inleft;"

View solution in original post

5 REPLIES 5
ballardw
Super User

Here is one way:

data updateddataset;
  merge toupdatedataset (in=Inleft) 
        idsdataset      (in=Inright)
  ;
  by id;
  if missing(status) and inleft and inright then Status='N';
run;

If the data sets on the Merge statement are not sorted by ID then you should sort them prior to merge.

The dataset option IN= creates a temporary variable (means that variable is not written to the data) that is a 1/0 (True/False) value. So you can test if a record is coming from both data sets. I am assuming that you want to change the missing status values if present only. If that is not actually the case then remove the "missing(status) and" part of the If.

 

Please make sure that you spell your data sets consistently. I could not tell if there was supposed to be a 4th data set named Toupdate that was actually different than Toupdatedataset.

 

Warning about the above solution: If your ISDATASET contains ID and TYPE not already in the TOUPDATEDATASET they will be added (and status will be missing). If this might occur and you do not want it to then you would subset the data with an "IF inleft;"

jcis7
Pyrite | Level 9

Thanks for catching that I didn't name the dataset in the set statement in the  'data updateddataset' statement 'toupdatedataset' and instead named it 'toupdate'.

HB
Barite | Level 11 HB
Barite | Level 11

I like Ballard's way better (mine is the long way around) but I'll post this anyway:

 



*populate the missings;
proc sql;
create table populatedidsdataset as 
select a.id, a.type,
case
	when missing(status) then 'N'
	else status 
end as status
from
toupdatedataset a inner join idsdataset b
on a.id = b.id and a.type = b.type;
quit; 

* keep your good ones;
data parsed;
set toupdatedataset;
if not missing(status);
run;

* smash your populated ones and good ones back together;
data want;
	merge populatedidsdataset parsed;
	by id; 
run;
Sajid01
Meteorite | Level 14

Hello @jcis7 
As I understand from your post, the requirement is that if the id's in both the datasets ( idsdataset and toupdatedataset ) match  then set the values of Status to N   in the updated dataset .  Thus I would go by what @ballardw has given with a minor change. My code would be as follows

data idsdataset;
   input id $ type$;
   datalines;
1234567 Pub
1245678 Pub
;
data toupdatedataset;
   input id $ type $ status $;
   datalines;
1234567 Pub .
1245678  Pub . 
2349878 Pri N
3948573 Priv Y
;
proc sort data=idsdataset;
by id;
run;
proc sort data=toupdatedataset;
by id;
run;
data updateddataset;
  merge toupdatedataset (in=Inleft) 
        idsdataset      (in=Inright)
  ;
  by id;
  if inleft and inright then Status='N';
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 860 views
  • 4 likes
  • 4 in conversation