Compare value between two different columns for entire datasets and ca...

abraham1 · Posted 07-10-2023 07:24 AM

Hello Everyone!

In my below datasets, I need to compare between two unique columns (jp1~jp2, eye1~eye2 etc) and calculate value based on condition as if both contain value of 1, then display as 1 else zero.

Once done, sum all the score from derived columns.

In my actual datasets, I have two hundred columns with different name (not end with suffix 1 or 2)

and I need to follow below approach,

Is there any alternative way to program it.

Thank you

data testing;
input pid visit jp1 jp2 eye1 eye2 ear1 ear2 nose1 nose2;
cards;
101 2 1 1 1 1 1 0 0 1
102 3 0 0 1 0 1 1 1 1
102 5 0 0 1 0 1 1 0 0
102 6 1 1 1 1 1 1 1 1
103 4 0 0 0 0 0 0 1 1
104 7 1 1 1 1 1 1 1 0
105 3 . 0 0 . . . . .
;
run;
data comp;
set testing;
by pid visit;
if jp1=1 and jp2 =1 then jp_score=1; else f_score=0;
if eye1=1 and eye2 =1 then eye_score=1 ;else eye_score=0;
if ear1=1 and ear2 =1 then ear_score=1; else ear_score=0;
if nose1=1 and nose2 =1 then nos_score=1; else nos_score=0;
sum_score=sum(f_score,eye_score,ear_score,nos_score);
run;

PaigeMiller · Posted 07-10-2023 07:35 AM

I have two hundred columns with different name (not end with suffix 1 or 2).

This is a complication. If it had not been for the above, ARRAY would work well. However, what are the real variable names? Are the two variables to compare always next to each other with relatively similar names, or are the names not similar so that one such variable is named (for example) COPN and the variable it is supposed to be compared with named FROGZ12 (which is not at all similar to the variable name COPN) and is not next to COPN? We need details here.

--
Paige Miller

abraham1 · Posted 07-10-2023 08:14 AM

Variable are totally different like SOTMIGT with TOSDIGT, DOBINEE~EDTANEE, AIKTNEE~WSTINEE, CHOSTEAD~DEENEAD etc. Most of variable last three letters are same when selected for comparison

PaigeMiller · Posted 07-10-2023 08:47 AM

@abraham1 I need to know if the variables to compare are consecutive in the data set, or can they be widely separated. Please answer.

--
Paige Miller

FreelanceReinh · Posted 07-10-2023 08:37 AM

Hello @abraham1,

As long as your sample dataset TESTING with its pairs of variables next to each other is representative for your real data, you could transpose the data from wide to long and finally back:

/* Change the data structure from wide to long */

proc transpose data=testing out=long;
by pid visit;
run;

/* Compute the índividual scores */

data complong(drop=_: col1);
set long;
by pid visit;
if first.visit then suffix=0;
_m=mod(_n_-1,2);
suffix+_m;
score = col1=1 & lag(col1)=1;
if _m;
run;

/* Create wide output dataset and compute sum score */

proc transpose data=complong out=compwide(drop=_:) prefix=score;
by pid visit;
var score;
id suffix;
run;

data want;
set compwide;
sum_score=sum(of score:);
run;

The individual scores are now named SCORE1, SCORE2, etc. for simplicity. If needed, you can rename or label them based on the names of the original variables.

(Note the inconsistency "jp_score" vs. "f_score" in your COMP dataset.)

Compare value between two different columns for entire datasets and calculate overall score

Re: Compare value between two different columns for entire datasets and calculate overall score

Re: Compare value between two different columns for entire datasets and calculate overall score

Re: Compare value between two different columns for entire datasets and calculate overall score

Re: Compare value between two different columns for entire datasets and calculate overall score

Compare value between two different columns for entire datasets and calculate overall score

Re: Compare value between two different columns for entire datasets and calculate overall score

Re: Compare value between two different columns for entire datasets and calculate overall score

Re: Compare value between two different columns for entire datasets and calculate overall score

Re: Compare value between two different columns for entire datasets and calculate overall score

Registration is open

SAS Training: Just a Click Away