Hi there,
I am having a data consisting of multiple transactions for a person within a same day but some of the records for the same person is missing e.g site ID. I want to fill the missing site ID by taking value from one of the site ID from the transactions of that person on the same date assuming that in a day all transaction occurs in one site only. Can some one kindly help me to move forward.
data have;
input name $ service_date site_ID ;
cards;
aaa 20140102 101
aaa 20140102
bbb 20140102 102
ccc 20140104 103
ccc 20140104
ccc 20140104 103
aaa 20140105 104
aaa 20140105
aaa 20140105 104
;
run;
data want;
input name $ service_date site_ID ;
cards;
aaa 20140102 101
aaa 20140102 101
bbb 20140102 102
ccc 20140104 103
ccc 20140104 103
ccc 20140104 103
aaa 20140105 104
aaa 20140105 104
aaa 20140105 104
;
run;
Thank you in advance for your kind reply.
Regards,
Please check the site_id2 variable with expected result
proc sort data=have;
by name service_date;
run;
data want;
set have;
retain site_id2;
by name service_date;
if first.service_date then site_id2=site_id;
run;
Please check the site_id2 variable with expected result
proc sort data=have;
by name service_date;
run;
data want;
set have;
retain site_id2;
by name service_date;
if first.service_date then site_id2=site_id;
run;
What if the first record for a given date has a missing site? This program would propagate that missing value to all records for the given date.
Hi Mkeintz,
Unfortunately I am experiencing the issue mentioned by you in your comment. Can you advice me further to overcome the impact of missing first record for a given date.
Thank you in advance for your kind reply.
Regards,
for your query, i considered the scenario you mentioned. We could avoid the same by changing the sorting order of the site_id, we need to ensure that the non missing site_id comes first and the same could be retained to other records. This time I used the update statement to achieve the expected output along with the proc sort step.
data have;
infile cards missover;
input name $ @5 service_date site_ID ;
cards;
aaa 20140102
aaa 20140102 101
bbb 20140102 102
ccc 20140104 103
ccc 20140104
ccc 20140104 103
aaa 20140105 104
aaa 20140105
aaa 20140105 104
;
run;
proc sort data=have;
by name service_date descending site_ID;
run;
data want;
update have(obs=0) have;
by name service_date descending site_ID;
output;
run;
Hi Jag,
Once again thank you for your superfast incredible reply. Just extending the "sorting part" of the solution advised by you, helped to achieve the expected solution using retain statement as before.
In order to increase my knowledge base, I am curious to know the solution based on update statement. Can you review the update statement part as following its use, I am not getting expected result.
Thank you in advance for your kind reply.
Regards,
Hi Jag,
Thank you for your kind patience in replying my queries. Your advice worked to provide me expected outcome.
Regards,
Why not keep the basic structure of the program, but read the data in two parallel streams, one for site_id missing and one for the rest?
data want;
set have (where=(site_id^=.)) have (where=(site_id=.));
retain site_id2;
by name service_date;
if first.service_date then site_id2=site_id;
run;
Hi Mkeintz,
Thanks for providing an alternative approach.
Regards,
Deepak
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.