BookmarkSubscribeRSS Feed
abraham1
Obsidian | Level 7

Hello Everyone!

In my below datasets, I need to compare between two unique columns (jp1~jp2, eye1~eye2 etc) and calculate value based on condition as  if both contain value of 1, then display as 1 else zero.

Once done, sum all the score from derived columns.

In my actual datasets, I have two hundred columns with different name (not end with suffix 1 or 2)

and I need to follow below approach,

Is there any alternative way to program it.

Thank you

 

data testing;
input pid visit jp1 jp2 eye1 eye2 ear1 ear2 nose1 nose2;
cards;
101 2 1 1 1 1 1 0 0 1
102 3 0 0 1 0 1 1 1 1
102 5 0 0 1 0 1 1 0 0
102 6 1 1 1 1 1 1 1 1
103 4 0 0 0 0 0 0 1 1
104 7 1 1 1 1 1 1 1 0
105 3 . 0 0 . . . . .
;
run;
data comp;
set testing;
by pid visit;
if jp1=1 and jp2 =1 then jp_score=1; else f_score=0;
if eye1=1 and eye2 =1 then eye_score=1 ;else eye_score=0;
if ear1=1 and ear2 =1 then ear_score=1; else ear_score=0;
if nose1=1 and nose2 =1 then nos_score=1; else nos_score=0;
sum_score=sum(f_score,eye_score,ear_score,nos_score);
run;
4 REPLIES 4
PaigeMiller
Diamond | Level 26

I have two hundred columns with different name (not end with suffix 1 or 2).

 

This is a complication. If it had not been for the above, ARRAY would work well. However, what are the real variable names? Are the two variables to compare always next to each other with relatively similar names, or are the names not similar so that one such variable is named (for example) COPN and the variable it is supposed to be compared with named FROGZ12 (which is not at all similar to the variable name COPN) and is not next to COPN? We need details here.

--
Paige Miller
abraham1
Obsidian | Level 7
Variable are totally different like SOTMIGT with TOSDIGT, DOBINEE~EDTANEE, AIKTNEE~WSTINEE, CHOSTEAD~DEENEAD etc. Most of variable last three letters are same when selected for comparison
PaigeMiller
Diamond | Level 26

@abraham1 I need to know if the variables to compare are consecutive in the data set, or can they be widely separated. Please answer.

--
Paige Miller
FreelanceReinh
Jade | Level 19

Hello @abraham1,

 

As long as your sample dataset TESTING with its pairs of variables next to each other is representative for your real data, you could transpose the data from wide to long and finally back:

/* Change the data structure from wide to long */

proc transpose data=testing out=long;
by pid visit;
run;

/* Compute the índividual scores */

data complong(drop=_: col1);
set long;
by pid visit;
if first.visit then suffix=0;
_m=mod(_n_-1,2);
suffix+_m;
score = col1=1 & lag(col1)=1;
if _m;
run;

/* Create wide output dataset and compute sum score */

proc transpose data=complong out=compwide(drop=_:) prefix=score;
by pid visit;
var score;
id suffix;
run;

data want;
set compwide;
sum_score=sum(of score:);
run;

 

The individual scores are now named SCORE1, SCORE2, etc. for simplicity. If needed, you can rename or label them based on the names of the original variables.

(Note the inconsistency "jp_score" vs. "f_score" in your COMP dataset.)

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1122 views
  • 0 likes
  • 3 in conversation