I am pulling data from a Teradata table and creating a dataset with last 4 months of data. But now, I have to calculate the number of patients visited each month in each department in last 4 months and create a dataset to load to another table. Also, I need to have those column names as mnth1, mnth2, mnth3, mnth4 for last 4 month in sequence based on the months.
I have the sample data for have and want. Any help is highly appreciated.
Thanks
have:
dt | dep_cd | dep_type | patient ID |
12/5/2019 | A1 | LP | 65785 |
1/6/2020 | A1 | LP | 39876 |
2/3/2020 | A1 | LP | 89765 |
1/5/2020 | A1 | LP | 36543 |
1/20/2020 | A1 | LP | 33876 |
12/14/2019 | A1 | LP | 20987 |
12/14/2019 | A1 | LP | 20988 |
12/16/2019 | A1 | LP | 21876 |
11/16/2019 | A2 | RP | 91087 |
11/20/2019 | A2 | RP | 76509 |
12/15/2019 | A2 | RP | 34987 |
1/5/2020 | A2 | RP | 54763 |
12/18/2019 | A2 | RP | 45908 |
12/25/2019 | A2 | RP | 66754 |
12/29/2019 | A2 | RP | 51432 |
want
dep_cd | dep_type | mnth1 | mnth2 | mnth3 | mnth4 | today_dt |
A1 | LP | 0 | 4 | 3 | 1 | 3/13/2020 |
A2 | RP | 2 | 4 | 1 | 0 | 3/13/2020 |
data have;
input dt :mmddyy10. (dep_cd dep_type patient_ID) ($);
format dt mmddyy10.;
cards;
12/5/2019 A1 LP 65785
1/6/2020 A1 LP 39876
2/3/2020 A1 LP 89765
1/5/2020 A1 LP 36543
1/20/2020 A1 LP 33876
12/14/2019 A1 LP 20987
12/14/2019 A1 LP 20988
12/16/2019 A1 LP 21876
11/16/2019 A2 RP 91087
11/20/2019 A2 RP 76509
12/15/2019 A2 RP 34987
1/5/2020 A2 RP 54763
12/18/2019 A2 RP 45908
12/25/2019 A2 RP 66754
12/29/2019 A2 RP 51432
;
proc summary data=have nway completetypes;
class dep_type dep_cd dt;
format dt monyy7.;
output out=temp(drop=_type_);
run;
proc sql;
create table temp2 as
select *
from temp
where cats(dep_type,dep_cd) in (select cats(dep_type,dep_cd) from have);
quit;
proc transpose data=temp2 out=want(drop=_:) prefix=month;
by dep_type dep_cd notsorted;
var _freq_;
run;
dep_type | dep_cd | month1 | month2 | month3 | month4 |
---|---|---|---|---|---|
LP | A1 | 0 | 4 | 3 | 1 |
RP | A2 | 2 | 4 | 1 | 0 |
data have;
input dt :mmddyy10. (dep_cd dep_type patient_ID) ($);
format dt mmddyy10.;
cards;
12/5/2019 A1 LP 65785
1/6/2020 A1 LP 39876
2/3/2020 A1 LP 89765
1/5/2020 A1 LP 36543
1/20/2020 A1 LP 33876
12/14/2019 A1 LP 20987
12/14/2019 A1 LP 20988
12/16/2019 A1 LP 21876
11/16/2019 A2 RP 91087
11/20/2019 A2 RP 76509
12/15/2019 A2 RP 34987
1/5/2020 A2 RP 54763
12/18/2019 A2 RP 45908
12/25/2019 A2 RP 66754
12/29/2019 A2 RP 51432
;
proc summary data=have nway completetypes;
class dep_type dep_cd dt;
format dt monyy7.;
output out=temp(drop=_type_);
run;
proc sql;
create table temp2 as
select *
from temp
where cats(dep_type,dep_cd) in (select cats(dep_type,dep_cd) from have);
quit;
proc transpose data=temp2 out=want(drop=_:) prefix=month;
by dep_type dep_cd notsorted;
var _freq_;
run;
dep_type | dep_cd | month1 | month2 | month3 | month4 |
---|---|---|---|---|---|
LP | A1 | 0 | 4 | 3 | 1 |
RP | A2 | 2 | 4 | 1 | 0 |
@novinosrin This works great for me. But I got a new requirement. The have dataset looks like this. A new dep_cd A3 is added couple months back with the eff_date. I need the want dataset with the average of the counts but dep_cd A3 should only have average of the months since it was effective. Is there a way to achieve that?
dt dep_cd dep_type patient_id eff_date
12/5/2019 A1 LP 65785 1/12/2005
1/6/2020 A1 LP 39876 1/12/2005
2/3/2020 A1 LP 89765 1/12/2005
1/5/2020 A1 LP 36543 1/12/2005
1/20/2020 A1 LP 33876 1/12/2005
12/14/2019 A1 LP 20987 1/12/2005
12/14/2019 A1 LP 20988 1/12/2005
12/16/2019 A1 LP 21876 1/12/2005
11/16/2019 A2 RP 91087 5/1/2010
11/20/2019 A2 RP 76509 5/1/2010
12/15/2019 A2 RP 34987 5/1/2010
1/5/2020 A2 RP 54763 5/1/2010
12/18/2019 A2 RP 45908 5/1/2010
12/25/2019 A2 RP 66754 5/1/2010
12/29/2019 A2 RP 51432 5/1/2010
01/05/2020 A3 AP 40089 1/1/2020
01/10/2020 A3 AP 79021 1/1/2020
WANT
dep_type dep_cd month1 month2 month3 month4 EFF_DATE AVG
LP | A1 | 0 | 4 | 3 | 1 | 1/1/2005 | 2 |
RP | A2 | 2 | 4 | 1 | 0 | 1/1/2010 | 1.75 |
AP | A3 | 0 | 2 | NULL | NULL | 1/1/2020 | 1 |
Hi @nickspencer I seek a small clarification, can you please explain this result denoting to the actual month. That seems strange or I am not sure what I am missing in with respect to the logic.
AP | A3 | 0 | 2 | NULL | NULL | 1/1/2020 | 1 |
Since this AP/A3 dep code /type was added for the first time in 1/1/2020(eff_dt) and it had 2 patient in January, I don't want to include other months in the calculation of average. I just want to include Jan and Feb to calculate the average. So the average number of alerts I should have is 1 for these 2 months. I put month2 and month3 as null but we can put zero should be fine but don't want to include in the average calculation since the code was not effective those months. Hope this makes sense.
AP | A3 | 0 | 2 | NULL | NULL | 1/1/2020 | 1 |
data have;
input dt :mmddyy10. dep_cd $ dep_type $ patient_id eff_date :mmddyy10.;
format dt eff_date mmddyy10.;
cards;
12/5/2019 A1 LP 65785 1/12/2005
1/6/2020 A1 LP 39876 1/12/2005
2/3/2020 A1 LP 89765 1/12/2005
1/5/2020 A1 LP 36543 1/12/2005
1/20/2020 A1 LP 33876 1/12/2005
12/14/2019 A1 LP 20987 1/12/2005
12/14/2019 A1 LP 20988 1/12/2005
12/16/2019 A1 LP 21876 1/12/2005
11/16/2019 A2 RP 91087 5/1/2010
11/20/2019 A2 RP 76509 5/1/2010
12/15/2019 A2 RP 34987 5/1/2010
1/5/2020 A2 RP 54763 5/1/2010
12/18/2019 A2 RP 45908 5/1/2010
12/25/2019 A2 RP 66754 5/1/2010
12/29/2019 A2 RP 51432 5/1/2010
01/05/2020 A3 AP 40089 1/1/2020
01/10/2020 A3 AP 79021 1/1/2020
;
proc summary data=have nway completetypes;
class dep_type dep_cd dt ;
format dt monyy7.;
output out=temp(drop=_type_);
run;
proc sql;
create table temp2 as
select a.*,ifn(dt<eff_date,.,_freq_) as count,eff_date,mean(calculated count) as avg
from temp a ,(select distinct dep_type,dep_cd,eff_date from have) b
where a.dep_type=b.dep_type and a.dep_cd=b.dep_cd
group by a.dep_type,a.dep_cd ,eff_date
order by a.dep_type,a.dep_cd ,eff_date,dt;
quit;
proc transpose data=temp2 out=want(drop=_:) prefix=month;
by dep_type dep_cd eff_date avg;
var count;
run;
@nickspencer Please see if the above helps and I am sure you can follow the logic. Let me know if you need any clarification. Thanks
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.