BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Merdock
Quartz | Level 8

I have a dataset with over 700 records, similar in format to the sample dataset below.

 

Each participant (ID) can have up to a maximum of 5 visits (visit=00 through 04); some might have only one visit, others might have more than one.

 

Status is a binary variable that denotes their disease status (0=negative, 1=positive for disease) and this status can change from visit to visit. For example, a participant might be positive at baseline (Status=1 at visit=00) and then negative at the next visit(s), another participant might have the same status across all visits and so on.

 

I would like to know how many times participants had change in Status from visit to visit but not sure how to do this programmatically. I am thinking to present counts for all the different combinations of disease status across the different visits that the participants had. For example, with a total of 5 visits, we’d have a total of 2^5=32 possible combinations, like so:

0-0-0-0-0,  N=N1 (where N1 is number of patients who had Status=0 at all 5 visits)

1-0-0-0-0,  N=N2 (N2 is number of patients who had Status=1 at Visit=00 but 0 at all subsequent visits)

0-1-0-0-0,  N=N3 (N3 is number of patients who had Status=1 at Visit=01 but 0 at all other visits)

ETC.

 

My only idea for how to approach this is to transpose the data and then manually list all the 32 different combinations with if-then statements and flag them so I can then add them up and see how many of each I have but this seems really cumbersome, inefficient and probably not the best solution since it doesn't address cases where participant had just two or three visits with a change in status somewhere along the way (For example, case of ID=002 or 008, 009 etc.). Does anybody have any suggestions or sample code for how they would approach this problem? Code of what I’ve done so far is shown below.

 

Thanks in advance!

 

data dat1;
	input ID$ Visit$ Status$;
datalines;
001	00	0
001	01	0
001	02	0
001	03	0
001	04	0
002	00	1
002	01	0
002	02	0
002	03	1
003	01	1
003	02	1
004	00	1
004	02	1
004	03	0
005	00	1
006	02	0
007	02	0
007	04	0
008	00	1
008	02	1
008	03	0
009	00	0
009	01	1
009	02	0
009	03	1
010	01	1
011	00	1
011	01	0
011	02	0
011	03	1
011	04	0
;
run;
proc print data=dat1; run;

*Convert data from long to wide with transpose;
proc transpose data=dat1 out=dat_wide prefix=Status;
    by ID;
    id visit;
    var status;
run;
proc print data=dat_wide; run;

data dat_wide2;
	set dat_wide;
	if Status00=0 and Status01=0 and Status02=0 and Status03=0 and Status04=0 then flag1=1;

	if Status00=0 and Status01=0 and Status02=0 and Status03=0 and Status04=1 then flag2=1;

	if Status00=0 and Status01=0 and Status02=0 and Status03=1 and Status04=0 then flag3=1;
	/*etc for the remaining 29 combinations*/
run;
proc print data=dat_wide2; run; 

 

1 ACCEPTED SOLUTION

Accepted Solutions
andreas_lds
Jade | Level 19

Your data is in perfect shape to solve the problem, no need to transpose it.

data want;
   set dat1;
   by Id;
   
   length num_changes 8;
   retain num_changes;
   
   last_status = lag(status);
   
   if first.id then do;
      num_changes = 0;
   end;
   else do;
      num_changes = num_changes + (last_status ^= status);
   end;
   
   if last.id then do;
      output;
   end;
   
   drop last_status status;
run;

 

View solution in original post

4 REPLIES 4
andreas_lds
Jade | Level 19

Your data is in perfect shape to solve the problem, no need to transpose it.

data want;
   set dat1;
   by Id;
   
   length num_changes 8;
   retain num_changes;
   
   last_status = lag(status);
   
   if first.id then do;
      num_changes = 0;
   end;
   else do;
      num_changes = num_changes + (last_status ^= status);
   end;
   
   if last.id then do;
      output;
   end;
   
   drop last_status status;
run;

 

Merdock
Quartz | Level 8
@andreas_lds, thank you kindly for your help, this helps solve my problem! Accepted it as solution.
Cheers!
Kurt_Bremser
Super User

Do this:

data changes;
set dat1;
by id;
change = ifn(first.id,0,status ne lag(status));
run;

You can now perform summary statistics on change as needed (overall, by id, by visit).

Merdock
Quartz | Level 8
@Kurt_Bremser, thank you so much, this works as well!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 327 views
  • 2 likes
  • 3 in conversation