06-23-2017 07:39 PM
I need some guidance on the following:
My dataset contains 45 observations and 4 variables: cough, fever, sweats, flu; each variable is numerical with choices 1=<2 months, 2=3-4 months, 3=5-6 months, 4=>6 months, 9=unknown.
I need to create a new variable called 'anysymptom' where it would give me the longest duration for any of these symptoms (e.g. for observation 1 that has cough=1 fever=2 sweats=4 flu=9, the new variable would return 4, which is >6 months).
There are also missing values for some of the observations.
Is there a function I could utilize for this?
Any advice is appreciated!
06-23-2017 07:52 PM
anysymptom = max(ifn(cough eq 9, ., cough),
ifn(fever eq 9, ., fever),
ifn(sweats eq 9, ., sweats),
ifn(flu eq 9, ., flu));
If there were a lot of variables you could use an array and a loop.
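A minimal sketch of that array-and-loop approach, assuming the input dataset is named have (a placeholder, not from the original post) and that 9 is the only code that should be ignored:

```sas
data want;
    set have;                                  /* 'have' is an assumed dataset name */
    array symptoms(*) cough fever sweats flu;  /* list every symptom variable here */
    anysymptom = .;
    do i = 1 to dim(symptoms);
        /* skip missing values and the 9 = unknown code */
        if not missing(symptoms(i)) and symptoms(i) ne 9 then
            anysymptom = max(anysymptom, symptoms(i));
    end;
    drop i;
run;
```

If every symptom is 9 or missing, anysymptom stays missing. Adding more variables only means extending the array statement.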
06-23-2017 08:20 PM
If the 9 weren't there this would be much easier, since there are the LARGEST/MAX functions. This could probably be tweaked, but I suspect @WarrenKuhfeld's solution is the easiest, or some form of a basic loop where you can ignore the 9.
In this solution, I use the LARGEST() function and keep looping while the result is a 9, so you're not looping through all the variables, only until you find the largest value that's not a 9.
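data test;
  a=1; b=2; c=3; d=9; output;
  a=2; b=3; c=3; d=4; output;
  a=9; b=9; c=9; d=9; output;
run;

data want;
  set test;
  array vars(*) a b c d;
  maxv=.;
  i=1;
  /* take the 1st, 2nd, ... largest value until one is not 9 */
  /* i > dim(vars) so that the smallest value is checked too */
  do until (maxv ne 9 or i > dim(vars));
    maxv=largest(i, of vars(*));
    i+1;
  end;
  if maxv=9 then maxv=.;  /* every value was 9 */
run;

proc print data=want;
run;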
06-23-2017 08:14 PM - edited 06-23-2017 08:19 PM
Here is one way:
proc format;
  value symptom 9=0;
run;
data want (drop=i);
array symptoms(*) cough--flu;
do i=1 to dim(symptoms);
Art, CEO, AnalystFinder.com