@CathyVI wrote:
@FreelanceReinh What is this step doing in the code? Is it counting the location of the event?
if _vn<3 then event_type=' '||event_type; /* Indentation improves sort order of report columns. */
Duration=e[_vn];
The purpose of the IF-THEN statement is to indent the character values "Event1 only" and "Event2 only" by one character (resulting in " Event1 only" and " Event2 only") so that they come first in alphabetical order. The condition _vn<3 selects exactly these two cases because Event1 and Event2 are the first and second elements of the array. A leading blank sorts before all letters. Without this trick, the column order in the final PROC FREQ output would be:
Case32 only, Event1 only, Event2 only, Event46 A only, Event46 B only, Event46 C only, Multiple
If this alphabetical column order is acceptable, remove that IF-THEN statement to simplify the code.
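To see the effect of the leading blank in isolation, here is a minimal sketch. The dataset name demo and the four hard-coded values are made up for illustration and are not part of the original program; the IN condition stands in for the _vn<3 test.

```sas
/* Hypothetical demo data: four of the event_type values from the report. */
data demo;
  length event_type $14;  /* long enough to hold the inserted blank */
  do event_type='Case32 only', 'Event1 only', 'Event2 only', 'Multiple';
    /* Same trick as in the IF-THEN statement: prefix one blank. */
    if event_type in ('Event1 only', 'Event2 only') then
      event_type=' '||event_type;
    output;
  end;
run;

proc sort data=demo;
  by event_type;
run;

/* Write the sorted values to the log; $CHAR14. preserves the leading blank. */
data _null_;
  set demo;
  put '[' event_type $char14. ']';
run;
```

After sorting, the two indented values come first, followed by "Case32 only" and "Multiple".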
The statement
Duration=e[_vn];
assigns to variable Duration the single non-zero value among Event1, Event2, Case32 and Event46. There is exactly one non-zero value because of the IF condition _n0=3 (three of the four values are zero).
Example: If Case32=2 and Event1=Event2=Event46=0, the concatenation _s equals "0020". The COUNTC function (here counting zeros in _s) then returns _n0=3, and the VERIFY function returns _vn=3 as the position of the first character in _s that is not equal to "0". Finally, e[_vn]=e[3] is the value of the third variable of the list event1 event2 case32 event46 in the ARRAY statement, i.e., the value of Case32 (which is 2 in the example).
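The example can be reproduced in a small standalone step. This is only a sketch: it assumes that _s is built with CATS(of e[*]) and that all four values are single digits, which may differ from the actual program.

```sas
data _null_;
  /* Values from the example above */
  Event1=0; Event2=0; Case32=2; Event46=0;
  array e[*] Event1 Event2 Case32 Event46;
  length _s $4;
  _s=cats(of e[*]);       /* assumed construction of _s: "0020" */
  _n0=countc(_s,'0');     /* number of zeros in _s */
  _vn=verify(_s,'0');     /* position of the first character that is not "0" */
  if _n0=3 then Duration=e[_vn];
  put _s= _n0= _vn= Duration=;
run;
```

The log line should read _s=0020 _n0=3 _vn=3 Duration=2.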
What is the difference between Duration=e[_vn]; and _v=vname(e[_vn]);?
The VNAME function retrieves the name of the variable corresponding to the array reference e[_vn], whereas e[_vn] by itself yields that variable's value. In the example above (where _vn=3), the result is _v="Case32", while Duration=2.
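A short side-by-side sketch of the two statements, with _vn set by hand to the value from the example (the LENGTH of 32 for _v is an assumption, chosen to fit any SAS variable name):

```sas
data _null_;
  Event1=0; Event2=0; Case32=2; Event46=0;
  array e[*] Event1 Event2 Case32 Event46;
  length _v $32;
  _vn=3;                  /* position of the non-zero value, as in the example */
  Duration=e[_vn];        /* value of the third array element */
  _v=vname(e[_vn]);       /* name of the third array element */
  put Duration= _v=;
run;
```

The log should show Duration=2 _v=Case32.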
Then is this code going to identify only multiples of 2?
Duration=1+(countc(_s,'2')>1); /* Duration=2 only in case of multiple occurrences of value 2. */
In the case event_type='Multiple' the question arises whether this observation should be counted in category "<=30 days" or in "31-60 days" if both categories occur, e.g., if Event1=0, Event2=1, Case32=2, Event46=0. From your description
"if you have 2 in more than one group you will be multi_60"
I concluded that this particular example should be counted in "<=30 days", whereas, e.g., an observation with Event1=2, Event2=1, Case32=2, Event46=0 would be counted in "31-60 days" (because value 2 occurs more than once). This rule is implemented in the assignment statement
Duration=1+(countc(_s,'2')>1);
Step-by-step explanation:
Event1=0, Event2=1, Case32=2, Event46=0 → _s="0120" → countc(_s,'2')=1 → 1>1 is FALSE (0) → Duration=1+0=1
Event1=2, Event2=1, Case32=2, Event46=0 → _s="2120" → countc(_s,'2')=2 → 2>1 is TRUE (1) → Duration=1+1=2
The assignment statement uses a Boolean expression (the inequality countc(_s,'2')>1), which evaluates to 1 (TRUE) or 0 (FALSE). It is shorthand for
if countc(_s,'2')>1 then Duration=2;
else Duration=1;
Note that this logic is applied only in the case _n0<3, not to observations with, e.g., Event1=0, Event2=0, Case32=2, Event46=0, where _n0=3 and hence Duration=e[3]=2, as explained further above.
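To check that the Boolean form and the IF-THEN/ELSE form agree, the two example observations can be run through both. This is a sketch only: the dataset name check and the variable Duration_long are made up for the comparison, and _s is again assumed to be built with CATS(of e[*]) from single-digit values.

```sas
data check;
  input Event1 Event2 Case32 Event46;
  array e[*] Event1 Event2 Case32 Event46;
  length _s $4;
  _s=cats(of e[*]);                   /* assumed construction of _s */
  _n0=countc(_s,'0');
  if _n0<3 then do;                   /* the "Multiple" case */
    Duration=1+(countc(_s,'2')>1);    /* Boolean shorthand */
    if countc(_s,'2')>1 then Duration_long=2;   /* long form */
    else Duration_long=1;
  end;
  put _s= Duration= Duration_long=;
datalines;
0 1 2 0
2 1 2 0
;
run;
```

The log should show Duration=1 and Duration_long=1 for _s=0120, and Duration=2 and Duration_long=2 for _s=2120.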