BookmarkSubscribeRSS Feed
PaigeMiller
Diamond | Level 26

At this point, I'd like to insist on you providing the data in the requested format, and not in any other format.

 

Then we will be able to write code to get the desired results.

--
Paige Miller
sks521
Quartz | Level 8

Ahhh now I understand;

 

here it is;

 

data sumSMP;
input pkid visit month2 $ status $ 12.;
datalines;

1           1        -3             eligible

1           1        -3             eligible

1           1        -2             eligible

1           1         0             eligible

2           1        -1            intervention

2           1         2            intervention

2           1         3            intervention

2           1         0            intervention

3           1         2            intervention

3           1        -1           intervention

3           1        -2           intervention

3           1         0           intervention

3           1         0           intervention

3           1         3           intervention

4           1         1           eligible

4            1        0           eligible

;

run;

 

Hope I have understood correctly

Tom
Super User Tom
Super User

So here is code to create the dataset implied by the list of values you posted.

data have;
  input pkid visit month2 status $20.;
cards;
1 1 -3 eligible
1 1 -3 eligible
1 1 -2 eligible
1 1 0 eligible
2 1 -1 intervention
2 1 2 intervention
2 1 3 intervention
2 1 0 intervention
3 1 2 intervention
3 1 -1 intervention
3 1 -2 intervention
3 1 0 intervention
3 1 0 intervention
3 1 3 intervention
;

Please show the result that you want to get from that data.

 

sks521
Quartz | Level 8

To elaborate;

 

Since month2 is coded from -3 to 3 (-3, -2,-1,0,1,2,3);

 

I would want total visits/discrete id for each time period by status like;

                        -3             -2            -1                0               1         2          3

Eligible             2              1               0                0               0        0          0

Intervention      0             1              1               1.5             0         1          1   

 

I have calculated each value (by hand) as follows;

 

Eligible             2/1=2      1/1=1       0                 0               0        0          0

Intervention      0             1/1=1       2/2=1         3/2=1.5     0       2/2=1    2/2=1 

 

Don't know if there is an easy way of explaining it?

 

Thanks for your time 

sks521
Quartz | Level 8

For each cell numerator is the number of visits and denominator is the number of discrete ids to be precise.

 

Thanks

Tom
Super User Tom
Super User

So in that report your columns are values of MONTH2 and your rows are values of STATUS.

It also sounds like don't actually want to see the raw totals.  Instead you want the average number of visits per ID.  Or total vists over number of ids.

You could calculate those in one SQL query if you want.

proc sql ;
  create table want as
  select status,month2
       , count(*) as numerator
       , count(distinct pkid) as denominator
       , calculated numerator / calculated denominator as result
  from have
  group by status,month2
  order by status,month2
  ;
quit;

You could make a report that looks like how you presented it using PROC REPORT.

proc report data=want;
  columns status result,month2;
  define status / group;
  define month2 / across order=internal ;
  define result / sum ' ';
run;
                                                     month2
  status                       -3         -2         -1          0          2          3
  eligible                      2          1          .          1          .          .
  intervention                  .          1          1        1.5          1          1

It is a little harder to do using PROC MEANS (summary). First count per status month2 and pkid.  Then sum the counts per status and month2 and count number of pkid.  You can use the MEAN statistics to get the value you wanted.

proc summary data=have nway;
 class status month2 pkid ;
 output out=step1 ;
run;
proc summary data=step1 nway ;
  by status month2;
  var _freq_;
  output out=want (rename=(_freq_=denominator))
    sum=numerator 
    mean=result
  ;
run;

 

sks521
Quartz | Level 8

Just to give you an idea of what I mean if it's not clear;

 

PKid        -3     -2   -1     0       1     2    3    total visits

2                2     6     0     6      0     0     0    14

 

3               3      13   4     15    4    0      0    39

 

5               2      5    13     1     4    0       0    25

 

So -3, -2, -1, 0, 1,2,3 are time periods and each id had different frequency of visit during each period, this is only the intervention group (by status variable) How do I take average visits per pkid during each time period in a data step, when the data is in the following long form;

 

pkid      visit     month2    status

1           1        -3             eligible

1           1        -3             eligible

1           1        -2             eligible

1           1         0             eligible

2           1        -1            intervention

2           1         2            intervention

2           1         3            intervention

2           1         0            intervention

PaigeMiller
Diamond | Level 26

@sks521 wrote:

Just to give you an idea of what I mean if it's not clear;

 

PKid        -3     -2   -1     0       1     2    3    total visits

2                2     6     0     6      0     0     0    14

 

3               3      13   4     15    4    0      0    39

 

5               2      5    13     1     4    0       0    25

 

So -3, -2, -1, 0, 1,2,3 are time periods and each id had different frequency of visit during each period, this is only the intervention group (by status variable) How do I take average visits per pkid during each time period in a data step, when the data is in the following long form;

 

pkid      visit     month2    status

1           1        -3             eligible

1           1        -3             eligible

1           1        -2             eligible

1           1         0             eligible

2           1        -1            intervention

2           1         2            intervention

2           1         3            intervention

2           1         0            intervention


I'm going to have to agree with @Kurt_Bremser , you need to provide (a portion of) your data via the method he specified, and not via any other method. The link is at the bottom of his message, and I also provide it here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

--
Paige Miller
Kurt_Bremser
Super User

You can do it with two selects in one SQL:

data have;
input ID $ status :$15. eventdate :yymmdd10. age sex $;
format eventdate yymmddd10.;
if EventDate <= '28FEB2016'd then delete;
else if '01MAR2016'd <= EventDate <= '31AUG2016'd then month2 = -3;
else if '01SEP2016'd <= EventDate <= '28FEB2017'd then month2 = -2;
else if '01MAR2017'd <= EventDate <= '31AUG2017'd then month2 = -1;
else if '01SEP2017'd <= EventDate <= '28FEB2018'd then month2 = 0;
else if '01MAR2018'd <= EventDate <= '31AUG2018'd then month2 = 1;
else if '01SEP2018'd <= EventDate <= '28FEB2019'd then month2 = 2;
else if '01MAR2019'd <= EventDate <= '31MAY2019'd then month2 = 3;
datalines;
1 Eligible 2016-04-14 73 male
1 Eligible 2017-02-12 73 male
1 Eligible 2018-09-03 73 male
1 Eligible 2016-01-15 73 male
1 Eligible 2016-11-17 73 male
2 Intervention 2018-01-15 73 male
2 Intervention 2016-01-12 73 male
2 Intervention 2017-04-15 73 male
2 Intervention 2018-09-22 73 male
2 Intervention 2018-07-26 73 male
2 Intervention 2018-03-15 73 male
3 Intervention 2017-02-13 73 male
3 Intervention 2017-01-12 73 male
;

proc sql;
create table want as
select id, avg(visitsum) as average
from (
  select id, month2, count(*) as visitsum
  from have
  group by id, month2
)
group by id;
quit;
sks521
Quartz | Level 8

Well, this sql gives me following output;

 

id    average

1     4.00

2     4.66

3     7.88

 

Doesn't really solve the problem. Still not the average per time period, I'm afraid!

Kurt_Bremser
Super User

So what should be the result for the dataset created by this step?

data have;
input ID $ status :$15. eventdate :yymmdd10. age sex $;
format eventdate yymmddd10.;
if EventDate <= '28FEB2016'd then delete;
else if '01MAR2016'd <= EventDate <= '31AUG2016'd then month2 = -3;
else if '01SEP2016'd <= EventDate <= '28FEB2017'd then month2 = -2;
else if '01MAR2017'd <= EventDate <= '31AUG2017'd then month2 = -1;
else if '01SEP2017'd <= EventDate <= '28FEB2018'd then month2 = 0;
else if '01MAR2018'd <= EventDate <= '31AUG2018'd then month2 = 1;
else if '01SEP2018'd <= EventDate <= '28FEB2019'd then month2 = 2;
else if '01MAR2019'd <= EventDate <= '31MAY2019'd then month2 = 3;
datalines;
1 Eligible 2016-04-14 73 male
1 Eligible 2017-02-12 73 male
1 Eligible 2018-09-03 73 male
1 Eligible 2016-01-15 73 male
1 Eligible 2016-11-17 73 male
2 Intervention 2018-01-15 73 male
2 Intervention 2016-01-12 73 male
2 Intervention 2017-04-15 73 male
2 Intervention 2018-09-22 73 male
2 Intervention 2018-07-26 73 male
2 Intervention 2018-03-15 73 male
3 Intervention 2017-02-13 73 male
3 Intervention 2017-01-12 73 male
;
sks521
Quartz | Level 8

Mathematically speaking I am looking for an output like;

 

Average visits per ID for pre-intervention period= total visits/total number of ids (discrete ids attending the hospital during that period)

 

For example;

 

average number of visits during pre-intervention;

Eligible= 4.44

Intervention= 5.66

 

Average number of visits during intervention;

eligible=5.33

Intervention= 6.66

 

Average number of visits post-intervention;

eligible=3.99

Intervention=6.22

 

Something like that.

 

Logic; for e.g. id '1' would have had 20 visits over the course of study- 3 during pre-intervention, 7 during intervention and 10 during post-intervention. Similarly, ID '2 ' would have had 9 visits; 1 during pre-intervention, 0 during intervention and 8 during post-intervention.

 

So I want to create average visits per discrete ID for each period. I hope it's clear now? Please ask if it's not clear.

 

Thanks for your time on this

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 26 replies
  • 1495 views
  • 0 likes
  • 4 in conversation