BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
sofia_de_garay
Fluorite | Level 6

Hello New SAS User community, I am working in a long data set with several hundred people where every participant (var id) was prescribed an antibiotic for 10 days with each line representing 1 day (var day). The dose (var dose) may change by day. I am trying to create a variable that captures the maximum duration of each dosing change As you can see in the table below, id #1 had 4 changes over time (600mg for 4 days, 0mg for 1 day, 300mg for 3 days, and then 600mg for 2 days) and id #2 had no changes. Using the data step, created a variable 'count_state' that enumerates the number of days at each dose over time, restarting when there is a change. Help: I would like to create a new variable (bold column below) that is the maximum of count_state for each dose before there is any change.

 

Thank you very much in advance! 

 

iddaydosecount_state max_count_state
1160014
1260024
1360034
1460044
15011
1630013
1730023
1830033
1960012
11060022
21600110
22600210
23600310
24600410
25600510
26600610
27600710
28600810
29600910
2106001010

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

First let's convert your LISTING into an actual DATASET.  While we are at it let's name your last two derived variables as EXPECT_COUNT and EXPECT_MAX so we have something to use to check our programs results.

Spoiler
data have;
  input id day dose expect_count expect_max;
cards;
1 1 600 1 4
1 2 600 2 4
1 3 600 3 4
1 4 600 4 4
1 5 0 1 1
1 6 300 1 3
1 7 300 2 3
1 8 300 3 3
1 9 600 1 2
1 10 600 2 2
2 1 600 1 10
2 2 600 2 10
2 3 600 3 10
2 4 600 4 10
2 5 600 5 10
2 6 600 6 10
2 7 600 7 10
2 8 600 8 10
2 9 600 9 10
2 10 600 10 10
;

To identify when the DOSE changes you can use BY statement with the NOTSORTED keyword.

data step1;
  set have;
  by id dose notsorted;
  if first.id then group=0;
  if first.dose then do;
    group+1; count=0;
  end;
  count+1;
run;

Once you have this grouping variable finding the MAX of COUNT is simple.  PROC SQL makes this easy because it will remerge aggregate results back onto the detailed observations.

proc sql;
create table want as
select *,max(count) as max_count
from step1
group by id,group
order by id,day
;
quit;

Result:

                            expect_    expect_                       max_
Obs    id    day    dose     count       max      group    count    count

  1     1      1     600        1          4        1         1        4
  2     1      2     600        2          4        1         2        4
  3     1      3     600        3          4        1         3        4
  4     1      4     600        4          4        1         4        4
  5     1      5       0        1          1        2         1        1
  6     1      6     300        1          3        3         1        3
  7     1      7     300        2          3        3         2        3
  8     1      8     300        3          3        3         3        3
  9     1      9     600        1          2        4         1        2
 10     1     10     600        2          2        4         2        2
 11     2      1     600        1         10        1         1       10
 12     2      2     600        2         10        1         2       10
 13     2      3     600        3         10        1         3       10
 14     2      4     600        4         10        1         4       10
 15     2      5     600        5         10        1         5       10
 16     2      6     600        6         10        1         6       10
 17     2      7     600        7         10        1         7       10
 18     2      8     600        8         10        1         8       10
 19     2      9     600        9         10        1         9       10
 20     2     10     600       10         10        1        10       10

Note that COUNT is not really needed since the MAX of COUNT is the same thing as the number of observations in the group.

data step1;
  set have(keep=id day dose);
  by id dose notsorted;
  if first.id then group=0;
  group+first.dose;
run;

proc sql;
create table want as
select *,count(*) as max_count
from step1
group by id,group
order by id,day
;
quit;

 

View solution in original post

4 REPLIES 4
PaigeMiller
Diamond | Level 26

Here's how I would do it. In a DATA step, identify each place where the dose changes (so count_state=1) and turn this into its own group, with a unique group number. Then determine the max value of count_state in each group. Merge the max in with the original data.

 

/* UNTESTED CODE */
data intermediate;
    set have;
    if count_state=1 then group+1;
run;
proc summary data=intermediate nway;
    class group;
    var count_state;
    output out=maxx max=max_count_state;
run;
data want;
   merge intermediate maxx;
   by group;
run;


If you want tested code, we need to see data presented as working SAS data step code (examples and instructions). Data copied and pasted from Excel, or Excel files, are not acceptable, and we cannot test code provided via copy/paste or in Excel files.

--
Paige Miller
Tom
Super User Tom
Super User

First let's convert your LISTING into an actual DATASET.  While we are at it let's name your last two derived variables as EXPECT_COUNT and EXPECT_MAX so we have something to use to check our programs results.

Spoiler
data have;
  input id day dose expect_count expect_max;
cards;
1 1 600 1 4
1 2 600 2 4
1 3 600 3 4
1 4 600 4 4
1 5 0 1 1
1 6 300 1 3
1 7 300 2 3
1 8 300 3 3
1 9 600 1 2
1 10 600 2 2
2 1 600 1 10
2 2 600 2 10
2 3 600 3 10
2 4 600 4 10
2 5 600 5 10
2 6 600 6 10
2 7 600 7 10
2 8 600 8 10
2 9 600 9 10
2 10 600 10 10
;

To identify when the DOSE changes you can use BY statement with the NOTSORTED keyword.

data step1;
  set have;
  by id dose notsorted;
  if first.id then group=0;
  if first.dose then do;
    group+1; count=0;
  end;
  count+1;
run;

Once you have this grouping variable finding the MAX of COUNT is simple.  PROC SQL makes this easy because it will remerge aggregate results back onto the detailed observations.

proc sql;
create table want as
select *,max(count) as max_count
from step1
group by id,group
order by id,day
;
quit;

Result:

                            expect_    expect_                       max_
Obs    id    day    dose     count       max      group    count    count

  1     1      1     600        1          4        1         1        4
  2     1      2     600        2          4        1         2        4
  3     1      3     600        3          4        1         3        4
  4     1      4     600        4          4        1         4        4
  5     1      5       0        1          1        2         1        1
  6     1      6     300        1          3        3         1        3
  7     1      7     300        2          3        3         2        3
  8     1      8     300        3          3        3         3        3
  9     1      9     600        1          2        4         1        2
 10     1     10     600        2          2        4         2        2
 11     2      1     600        1         10        1         1       10
 12     2      2     600        2         10        1         2       10
 13     2      3     600        3         10        1         3       10
 14     2      4     600        4         10        1         4       10
 15     2      5     600        5         10        1         5       10
 16     2      6     600        6         10        1         6       10
 17     2      7     600        7         10        1         7       10
 18     2      8     600        8         10        1         8       10
 19     2      9     600        9         10        1         9       10
 20     2     10     600       10         10        1        10       10

Note that COUNT is not really needed since the MAX of COUNT is the same thing as the number of observations in the group.

data step1;
  set have(keep=id day dose);
  by id dose notsorted;
  if first.id then group=0;
  group+first.dose;
run;

proc sql;
create table want as
select *,count(*) as max_count
from step1
group by id,group
order by id,day
;
quit;

 

sofia_de_garay
Fluorite | Level 6

Thank you for this super clear explanation. This was very educational for me! 

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 837 views
  • 3 likes
  • 3 in conversation