- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello, Could anyone help me on how to update observations in an existing SAS dataset, using another dataset. Like, I have 2 datasets :-
DATA Data_one;
input id prod $ v1 v2 v3 v4 v5;
datalines;
100 abc 1 0 0 0 0
102 asd 1 1 0 0 0
105 0 1 0 0 0
100 pqr 1 0 0 0 0
102 sdf 0 1 0 0 0
103 rty 1 1 0 0 0
;
run;
DATA Data_two;
input id v3 v4 v5;
datalines;
100 1 0 1
102 0 1 0
103 0 0 0
105 1 0 0
;
run;
I want the fields v3,v4,v5 of Data_one to be updated according to Data_two on the basis of the ID field. I tried using merge statement, but that updated only the first observation for each group. For ex. in Data_one, I want for ID 100 the fields should get updated as 1,0,1 under 'abc' as well as under 'pqr', and not only under 'abc'. Please help.
Thank you in advance!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Here's an approach. Assuming both data sets are first sorted by ID:
data want;
merge data_one (in=in1) data_two (in=in2 rename=(v3-v5=w3-w5));
by id;
if in1;
if in2 then do;
v3=w3;
v4=w4;
v5=w5;
end;
drop w3-w5;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Have you tried an update statement?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes, I tried using that. But, it updated only the very first observation of each group of the BY variable. Like, for ID 100 it updated only the first observation and not the second one.
Please help. Thank you!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Can you have multiples in both files or only the first file?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Here's an approach. Assuming both data sets are first sorted by ID:
data want;
merge data_one (in=in1) data_two (in=in2 rename=(v3-v5=w3-w5));
by id;
if in1;
if in2 then do;
v3=w3;
v4=w4;
v5=w5;
end;
drop w3-w5;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Reeza : No, only the first dataset contains multiples.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You'll have to explicitly code it if you have multiples.
A data step or SQL will work. It can be a SQL join or an update.
Here's a SQL join example, others have posted a data step solution that will work as well.
proc sql;
create table want as
select a.id, a.prod, a.v1, a.v2, coalesce(b.v3, a.v3) as v3, coalesce(b.v4, a.v4) as v4, coalesce(b.v5, a.v5) as v5
from data_one as a
left join data_two as b
on a.id=b.id
order by 1,2;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks a lot for your quick help.
Thanks to both of you 🙂