At this point, I'd like to insist on you providing the data in the requested format, and not in any other format.
Then we will be able to write code to get the desired results.
Ahhh now I understand;
here it is;
data sumSMP;
input pkid visit month2 $ status $ 12.;
datalines;
1 1 -3 eligible
1 1 -3 eligible
1 1 -2 eligible
1 1 0 eligible
2 1 -1 intervention
2 1 2 intervention
2 1 3 intervention
2 1 0 intervention
3 1 2 intervention
3 1 -1 intervention
3 1 -2 intervention
3 1 0 intervention
3 1 0 intervention
3 1 3 intervention
4 1 1 eligible
4 1 0 eligible
;
run;
Hope I have understood correctly
So here is code to create the dataset implied by the list of values you posted.
data have;
input pkid visit month2 status $20.;
cards;
1 1 -3 eligible
1 1 -3 eligible
1 1 -2 eligible
1 1 0 eligible
2 1 -1 intervention
2 1 2 intervention
2 1 3 intervention
2 1 0 intervention
3 1 2 intervention
3 1 -1 intervention
3 1 -2 intervention
3 1 0 intervention
3 1 0 intervention
3 1 3 intervention
;
Please show the result that you want to get from that data.
To elaborate;
Since month2 is coded from -3 to 3 (-3, -2,-1,0,1,2,3);
I would want total visits/discrete id for each time period by status like;
-3 -2 -1 0 1 2 3
Eligible 2 1 0 0 0 0 0
Intervention 0 1 1 1.5 0 1 1
I have calculated each value (by hand) as follows;
Eligible 2/1=2 1/1=1 0 0 0 0 0
Intervention 0 1/1=1 2/2=1 3/2=1.5 0 2/2=1 2/2=1
Don't know if there is an easy way of explaining it?
Thanks for your time
For each cell numerator is the number of visits and denominator is the number of discrete ids to be precise.
Thanks
So in that report your columns are values of MONTH2 and your rows are values of STATUS.
It also sounds like don't actually want to see the raw totals. Instead you want the average number of visits per ID. Or total vists over number of ids.
You could calculate those in one SQL query if you want.
proc sql ;
create table want as
select status,month2
, count(*) as numerator
, count(distinct pkid) as denominator
, calculated numerator / calculated denominator as result
from have
group by status,month2
order by status,month2
;
quit;
You could make a report that looks like how you presented it using PROC REPORT.
proc report data=want;
columns status result,month2;
define status / group;
define month2 / across order=internal ;
define result / sum ' ';
run;
month2 status -3 -2 -1 0 2 3 eligible 2 1 . 1 . . intervention . 1 1 1.5 1 1
It is a little harder to do using PROC MEANS (summary). First count per status month2 and pkid. Then sum the counts per status and month2 and count number of pkid. You can use the MEAN statistics to get the value you wanted.
proc summary data=have nway;
class status month2 pkid ;
output out=step1 ;
run;
proc summary data=step1 nway ;
by status month2;
var _freq_;
output out=want (rename=(_freq_=denominator))
sum=numerator
mean=result
;
run;
Just to give you an idea of what I mean if it's not clear;
PKid -3 -2 -1 0 1 2 3 total visits
2 2 6 0 6 0 0 0 14
3 3 13 4 15 4 0 0 39
5 2 5 13 1 4 0 0 25
So -3, -2, -1, 0, 1,2,3 are time periods and each id had different frequency of visit during each period, this is only the intervention group (by status variable) How do I take average visits per pkid during each time period in a data step, when the data is in the following long form;
pkid visit month2 status
1 1 -3 eligible
1 1 -3 eligible
1 1 -2 eligible
1 1 0 eligible
2 1 -1 intervention
2 1 2 intervention
2 1 3 intervention
2 1 0 intervention
@sks521 wrote:
Just to give you an idea of what I mean if it's not clear;
PKid -3 -2 -1 0 1 2 3 total visits
2 2 6 0 6 0 0 0 14
3 3 13 4 15 4 0 0 39
5 2 5 13 1 4 0 0 25
So -3, -2, -1, 0, 1,2,3 are time periods and each id had different frequency of visit during each period, this is only the intervention group (by status variable) How do I take average visits per pkid during each time period in a data step, when the data is in the following long form;
pkid visit month2 status
1 1 -3 eligible
1 1 -3 eligible
1 1 -2 eligible
1 1 0 eligible
2 1 -1 intervention
2 1 2 intervention
2 1 3 intervention
2 1 0 intervention
I'm going to have to agree with @Kurt_Bremser , you need to provide (a portion of) your data via the method he specified, and not via any other method. The link is at the bottom of his message, and I also provide it here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...
You can do it with two selects in one SQL:
data have;
input ID $ status :$15. eventdate :yymmdd10. age sex $;
format eventdate yymmddd10.;
if EventDate <= '28FEB2016'd then delete;
else if '01MAR2016'd <= EventDate <= '31AUG2016'd then month2 = -3;
else if '01SEP2016'd <= EventDate <= '28FEB2017'd then month2 = -2;
else if '01MAR2017'd <= EventDate <= '31AUG2017'd then month2 = -1;
else if '01SEP2017'd <= EventDate <= '28FEB2018'd then month2 = 0;
else if '01MAR2018'd <= EventDate <= '31AUG2018'd then month2 = 1;
else if '01SEP2018'd <= EventDate <= '28FEB2019'd then month2 = 2;
else if '01MAR2019'd <= EventDate <= '31MAY2019'd then month2 = 3;
datalines;
1 Eligible 2016-04-14 73 male
1 Eligible 2017-02-12 73 male
1 Eligible 2018-09-03 73 male
1 Eligible 2016-01-15 73 male
1 Eligible 2016-11-17 73 male
2 Intervention 2018-01-15 73 male
2 Intervention 2016-01-12 73 male
2 Intervention 2017-04-15 73 male
2 Intervention 2018-09-22 73 male
2 Intervention 2018-07-26 73 male
2 Intervention 2018-03-15 73 male
3 Intervention 2017-02-13 73 male
3 Intervention 2017-01-12 73 male
;
proc sql;
create table want as
select id, avg(visitsum) as average
from (
select id, month2, count(*) as visitsum
from have
group by id, month2
)
group by id;
quit;
Well, this sql gives me following output;
id average
1 4.00
2 4.66
3 7.88
Doesn't really solve the problem. Still not the average per time period, I'm afraid!
So what should be the result for the dataset created by this step?
data have;
input ID $ status :$15. eventdate :yymmdd10. age sex $;
format eventdate yymmddd10.;
if EventDate <= '28FEB2016'd then delete;
else if '01MAR2016'd <= EventDate <= '31AUG2016'd then month2 = -3;
else if '01SEP2016'd <= EventDate <= '28FEB2017'd then month2 = -2;
else if '01MAR2017'd <= EventDate <= '31AUG2017'd then month2 = -1;
else if '01SEP2017'd <= EventDate <= '28FEB2018'd then month2 = 0;
else if '01MAR2018'd <= EventDate <= '31AUG2018'd then month2 = 1;
else if '01SEP2018'd <= EventDate <= '28FEB2019'd then month2 = 2;
else if '01MAR2019'd <= EventDate <= '31MAY2019'd then month2 = 3;
datalines;
1 Eligible 2016-04-14 73 male
1 Eligible 2017-02-12 73 male
1 Eligible 2018-09-03 73 male
1 Eligible 2016-01-15 73 male
1 Eligible 2016-11-17 73 male
2 Intervention 2018-01-15 73 male
2 Intervention 2016-01-12 73 male
2 Intervention 2017-04-15 73 male
2 Intervention 2018-09-22 73 male
2 Intervention 2018-07-26 73 male
2 Intervention 2018-03-15 73 male
3 Intervention 2017-02-13 73 male
3 Intervention 2017-01-12 73 male
;
Mathematically speaking I am looking for an output like;
Average visits per ID for pre-intervention period= total visits/total number of ids (discrete ids attending the hospital during that period)
For example;
average number of visits during pre-intervention;
Eligible= 4.44
Intervention= 5.66
Average number of visits during intervention;
eligible=5.33
Intervention= 6.66
Average number of visits post-intervention;
eligible=3.99
Intervention=6.22
Something like that.
Logic; for e.g. id '1' would have had 20 visits over the course of study- 3 during pre-intervention, 7 during intervention and 10 during post-intervention. Similarly, ID '2 ' would have had 9 visits; 1 during pre-intervention, 0 during intervention and 8 during post-intervention.
So I want to create average visits per discrete ID for each period. I hope it's clear now? Please ask if it's not clear.
Thanks for your time on this
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.