em1535
Fluorite | Level 6

Hi,

tsmk is my time-dependent covariate, deathstatus_inhance is the censoring indicator, and stime_inhance is the survival time. The "Change step" worked well, but the "Count step" only worked for four observations and then stopped.

Please help me to correct my code.

 

 

 

DATA analysis.change;
    SET analysis.smk_stime2;
    ARRAY tsmk_(*) tsmk_1-tsmk_5; *call in the time-varying smoking variables;
    ARRAY chng(5); *the new indicator variables;
    t=1; *initialize the position variable for the indicator variables;
    DO i = 2 TO 5;
        IF tsmk_(i) NE tsmk_(i-1) THEN DO; *detects whether there is a change in smoking status;
            chng(t) = i-1; *assigns the last year the status remained constant;
            t=t+1;
        END;
    END;
RUN;

DATA analysis.count;
    SET analysis.change;
    ARRAY tsmk_(*) tsmk_1-tsmk_5;  /* call in the time-varying smoking variables */
    ARRAY chng(*) chng1-chng5;      /* call in the indicator variables */
    start = 0;                      /* initialize the beginning time for the study */
    censor2 = 0;                    /* initialize the new censor variable */
    t = 1;                          /* initialize the position variable for the indicator variables (chng1-chng5) */

    DO i=1 TO stime_inhance;        /* makes sure we only output the records that smoking status remains constant */
        IF (chng(t) > . and chng(t) < stime_inhance) or i = stime_inhance THEN do;
            /* assign the value of smoking status */
            IF chng(t) > . THEN smoking_status = tsmk_(chng(t));
            ELSE smoking_status = tsmk_(stime_inhance);  /* assign the end time */
            stop = min(chng(t), stime_inhance);          /* assign the value of the censor variable */
            IF i = stime_inhance THEN censor2 = deathstatus_inhance; /* assign the new start time */
            IF t > 1 THEN start = chng(t-1);            /* move the position variable */
            t = t + 1;
            OUTPUT;  /* output the record to the new dataset */
        end;
    END;
RUN;
SASJedi
Ammonite | Level 13

Please provide sample data for analysis.smk_stime2

PaigeMiller
Diamond | Level 26

In addition to providing the data as requested by @SASJedi, please show us the ENTIRE log for the DATA step that creates analysis.count. Please copy the log as text and then paste it into the window that appears when you click on the </> icon.


--
Paige Miller
em1535
Fluorite | Level 6

Here is my data frame:

idnum  deathstatus_inhance  stime_inhance  tsmk_1  tsmk_2  tsmk_3  tsmk_4  tsmk_5
1580   0                    85.8809        1       1       1       0       0
1581   1                    38.7023        0       1       1       0       0
1582   1                    1.347          1       1       1       1       1
1585   0                    85.7166        0       0       0       0       0
1586   0                    85.7166        1       1       1       1       1
1587   1                    13.0103        1       1       1       1       1
1588   1                    16.6571        0       1       1       1       1
1589   1                    2.037          0       0       0       0       0
1596   1                    0.9199         1       1       1       1       1
1601   0                    85.4209        0       0       0       0       0
1603   0                    85.3881        0       0       0       0       0
1604   1                    44.8789        1       1       1       1       1
1608   0                    85.0267        0       1       0       0       0
1612   0                    84.9281        1       1       0       0       0
1613   0                    0.1314         0       0       0       0       0
1614   1                    25.3306        1       1       1       1       1
1616   0                    84.7967        0       0       0       0       0

*Counting Process;
DATA analysis.change;
    SET analysis.smk_stime2;
    ARRAY tsmk_(*) tsmk_1-tsmk_5; *call in the time-varying smoking variables;
    ARRAY chng(5); *the new indicator variables;
    t=1; *initialize the position variable for the indicator variables;
    DO i = 2 TO 5;
        IF tsmk_(i) NE tsmk_(i-1) THEN DO; *detects whether there is a change in smoking status;
            chng(t) = i-1; *assigns the last year the status remained constant;
            t=t+1;
        END;
    END;
RUN;

After running this step, the dataset looks like this:

idnum  deathstatus_inhance  stime_inhance  tsmk_1  tsmk_2  tsmk_3  tsmk_4  tsmk_5  chng1  chng2  chng3  chng4  chng5  t  i
1580   0                    85.8809        1       1       1       0       0       3      .      .      .      .      2  6
1581   1                    38.7023        0       1       1       0       0       1      3      .      .      .      3  6
1582   1                    1.347          1       1       1       1       1       .      .      .      .      .      1  6
1585   0                    85.7166        0       0       0       0       0       .      .      .      .      .      1  6
1586   0                    85.7166        1       1       1       1       1       .      .      .      .      .      1  6
1587   1                    13.0103        1       1       1       1       1       .      .      .      .      .      1  6
1588   1                    16.6571        0       1       1       1       1       1      .      .      .      .      2  6
1589   1                    2.037          0       0       0       0       0       .      .      .      .      .      1  6
1596   1                    0.9199         1       1       1       1       1       .      .      .      .      .      1  6
1601   0                    85.4209        0       0       0       0       0       .      .      .      .      .      1  6
1603   0                    85.3881        0       0       0       0       0       .      .      .      .      .      1  6
1604   1                    44.8789        1       1       1       1       1       .      .      .      .      .      1  6
1608   0                    85.0267        0       1       0       0       0       1      2      .      .      .      3  6
1612   0                    84.9281        1       1       0       0       0       2      .      .      .      .      2  6
1613   0                    0.1314         0       0       0       0       0       .      .      .      .      .      1  6
1614   1                    25.3306        1       1       1       1       1       .      .      .      .      .      1  6
1616   0                    84.7967        0       0       0       0       0       .      .      .      .      .      1  6

Then I ran the code below, which failed:

DATA analysis.count;
    SET analysis.change;
    ARRAY tsmk_(*) tsmk_1-tsmk_5;  /* call in the time-varying smoking variables */
    ARRAY chng(*) chng1-chng5;      /* call in the indicator variables */
    start = 0;                      /* initialize the beginning time for the study */
    censor2 = 0;                    /* initialize the new censor variable */
    t = 1;                          /* initialize the position variable for the indicator variables (chng1-chng5) */

    DO i=1 TO stime_inhance;        /* makes sure we only output the records that smoking status remains constant */
        IF (chng(t) > . and chng(t) < stime_inhance) or i = stime_inhance THEN do;
            /* assign the value of smoking status */
            IF chng(t) > . THEN smoking_status = tsmk_(chng(t));
            ELSE smoking_status = tsmk_(stime_inhance);  /* assign the end time */
            stop = min(chng(t), stime_inhance);          /* assign the value of the censor variable */
            IF i = stime_inhance THEN censor2 = deathstatus_inhance; /* assign the new start time */
            IF t > 1 THEN start = chng(t-1);            /* move the position variable */
            t = t + 1;
            OUTPUT;  /* output the record to the new dataset */
        end;
    END;
RUN;

Another question: with more than one record per individual, can I use the "programming statements" approach?

proc phreg data= analysis.smk_stime;
model stime_inhance*deathstatus_inhance(0)= smoking /ties=efron rl;
	array tsmk_(*) tsmk_1-tsmk_5;
	smoking= tsmk_[stime_inhance];
	format tsmk_1-tsmk_5 tsmk.;
run;

Thanks

 

 

PaigeMiller
Diamond | Level 26

Are there errors in the log? Are there warnings in the log?

 

Repeating: 

 

Please show us the ENTIRE log for the DATA step that creates analysis.count. Please copy the log as text and then paste it into the window that appears when you click on the </> icon.


 

 

--
Paige Miller
em1535
Fluorite | Level 6

Here is the log:

50774
50775
50776  DATA analysis.count;
50777      SET analysis.change;
50778      ARRAY tsmk_(*) tsmk_1-tsmk_5;  /* call in the time-varying smoking variables */
50779      ARRAY chng(*) chng1-chng5;      /* call in the indicator variables */
50780      start = 0;                      /* initialize the beginning time for the study */
50781      censor2 = 0;                    /* initialize the new censor variable */
50782      t = 1;                          /* initialize the position variable for the indicator
50782! variables (chng1-chng5) */
50783
50784      DO i=1 TO stime_inhance;        /* makes sure we only output the records that smoking
50784! status remains constant */
50785          IF (chng(t) > . and chng(t) < stime_inhance) or i = stime_inhance THEN do;
50786              /* assign the value of smoking status */
50787              IF chng(t) > . THEN smoking_status = tsmk_(chng(t));
50788              ELSE smoking_status = tsmk_(stime_inhance);  /* assign the end time */
50789              stop = min(chng(t), stime_inhance);          /* assign the value of the censor
50789! variable */
50790              IF i = stime_inhance THEN censor2 = deathstatus_inhance; /* assign the new
50790! start time */
50791              IF t > 1 THEN start = chng(t-1);            /* move the position variable */
50792              t = t + 1;
50793              OUTPUT;  /* output the record to the new dataset */
50794          end;
50795      END;
50796  RUN;

ERROR: Array subscript out of range at line 50788 column 35.
idnum=2289 deathstatus_inhance=0 stime_inhance=48 tsmk_1=0 tsmk_2=0 tsmk_3=0 tsmk_4=0 tsmk_5=0
chng1=. chng2=. chng3=. chng4=. chng5=. t=1 i=48 start=0 censor2=0 smoking_status=. stop=.
_ERROR_=1 _N_=293
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 293 observations read from the data set ANALYSIS.CHANGE.
WARNING: The data set ANALYSIS.COUNT may be incomplete.  When this step was stopped there were
         159 observations and 19 variables.
WARNING: Data set ANALYSIS.COUNT was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.01 seconds
PaigeMiller
Diamond | Level 26
ERROR: Array subscript out of range at line 50788 column 35.
idnum=2289 deathstatus_inhance=0 stime_inhance=48 tsmk_1=0 tsmk_2=0 tsmk_3=0 tsmk_4=0 tsmk_5=0
chng1=. chng2=. chng3=. chng4=. chng5=. t=1 i=48 start=0 censor2=0 smoking_status=. stop=.
_ERROR_=1 _N_=293

 

So line 50788 is:

 

ELSE smoking_status = tsmk_(stime_inhance);  /* assign the end time */

 

The array subscript is the value of the variable STIME_INHANCE, which for this row of the data has the value 48. How do I know that? Because it is printed in the log (stime_inhance=48). The array has five elements, because that's how you defined it, so only subscripts 1 through 5 are valid. Array element 48 doesn't exist, and trying to use it causes the error.
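
To make the failure concrete, here is a minimal sketch of a guard that checks the subscript before using it. It does not fix the logic of the step; it only shows why 48 is rejected. The dataset names work.demo and work.checked, the first sample row, and the variable idx are made up for illustration. The second row reuses the values printed in the log for idnum 2289. Treating non-integer values as invalid is a choice made for this sketch.

```sas
/* Hypothetical two-row sample; the second row mimics idnum 2289 from the log */
data work.demo;
    input idnum stime_inhance tsmk_1-tsmk_5;
    datalines;
1 3 0 1 1 0 0
2289 48 0 0 0 0 0
;
run;

data work.checked;
    set work.demo;
    array tsmk_(*) tsmk_1-tsmk_5;
    idx = stime_inhance;   /* idx is an illustrative name for the intended subscript */
    /* Use the subscript only if it is an integer between 1 and the array size */
    if idx = int(idx) and 1 <= idx <= dim(tsmk_) then smoking_status = tsmk_(idx);
    else do;
        smoking_status = .;
        put "Subscript out of range: " idnum= idx=;
    end;
run;
```

In this sketch the first row gets smoking_status=1 (the value of tsmk_3), while the second row is left missing and reported in the log instead of stopping the step.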

 

I have not attempted to figure out what you are trying to do with this data step, so I cannot suggest an improvement. It is always helpful to explain what you are trying to do in sufficient detail that we can help write the code without errors.

 

Please, in the future, when you get errors in the log, you need to show us the full log for the step with the errors in your first post on the subject. We can't diagnose the problem without the log.

 

 

--
Paige Miller
FreelanceReinh
Jade | Level 19

Hi @em1535,

 

As far as I can see, your DATA step creating dataset ANALYSIS.COUNT works perfectly on survival data with a discrete time scale, such as years 1, 2, 3, 4, 5, with corresponding variables TSMK_1, ..., TSMK_5 describing the smoking status in each of those time periods. However, you apply this program to data with time measured on a continuous scale, with values ranging at least from 0.1314 to 85.8809.

 

So, the failed attempt to retrieve the smoking status at time 48 from the five-element array TSMK_ raises the question: What are the five time intervals on your continuous scale that the variables TSMK_1, ..., TSMK_5 correspond to?
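
Purely to illustrate why the answer matters, here is a sketch that maps a continuous time to an array index under an assumed layout. The assumption (not taken from your data) is that TSMK_k describes the interval ((k-1)*width, k*width] with a hypothetical width of 20 time units. The dataset name work.demo_idx and the variables width and k are also made up. The three sample times are values that appear in this thread.

```sas
/* ASSUMPTION for illustration only: tsmk_k covers ((k-1)*width, k*width] */
data work.demo_idx;
    input idnum stime_inhance;
    width = 20;                                        /* hypothetical interval width */
    k = min(max(ceil(stime_inhance / width), 1), 5);   /* interval index, clamped to 1..5 */
    datalines;
1613 0.1314
2289 48
1580 85.8809
;
run;
```

Under these assumed intervals the three rows get k = 1, 3, and 5, so tsmk_(k) would always be a valid reference. The real cutpoints have to come from how TSMK_1, ..., TSMK_5 were actually measured.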
