BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Shone
Calcite | Level 5

Hi, I have a dataset that looks like shown below. 

I want to put 1 if name of colume (data1 data2 and data3 is 4 column in the example (col1, col2, col3,col4) if column doesn't match i want to put 0 

 

Dataset I have: 

 

iddata1data2data3col1col2col3col4fnamelname
100   data1 data2data3appleasdf
200    data2  ballasdf
300    data1  catasdf
400     data2 dataasdf
500       dataasdf
600    data1 data3asdfasdasdf
700       asdfasdasdf

 

 

Result I want: 

 

iddata1data2data3col1col2col3col4fnamelname
100111data1 data2data3appleasdf
200010 data2  ballasdf
300100 data1  catasdf
400010  data2 dataasdf
500000    dataasdf
600101 data1 data3asdfasdasdf
700011  data2data3asdfasdasdf

 

How can I compare few columns with data in another columns? This is a sample so data1. data2, data3 could be in 100s and col1 and col2 can also be in 100s. I am looking into millions of rows of data.  Any help or suggestion would be really great. 

 

Thank you, 

Shone

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

I think this may get you started:

 

data have;
   input id data1 data2 data3 col1 $ col2 $ col3 $;
datalines ;
100 . . . data1 . data2
200 . . . .  data2 .
300 . . . .  data1 .
400 . . . data1 . data3
;
run;

data want;
   set have;
   array c  col1-col3;
   array d data1-data3;
   length v $ 32;
   do i= 1 to dim(d);
      v= vname(d[i]);
      d[i] = whichc(v, of c(*))>0;
   end;
   drop v i;
run;

You may have to play some games with the CASE of the variable v and the col values to ensure comparsion as if your data may have

 

"Data1" that does not match "data1" so you may have to ensure that all of your Col variables are all lower case or uppercase and that the V is set the same.

View solution in original post

5 REPLIES 5
ballardw
Super User

I think this may get you started:

 

data have;
   input id data1 data2 data3 col1 $ col2 $ col3 $;
datalines ;
100 . . . data1 . data2
200 . . . .  data2 .
300 . . . .  data1 .
400 . . . data1 . data3
;
run;

data want;
   set have;
   array c  col1-col3;
   array d data1-data3;
   length v $ 32;
   do i= 1 to dim(d);
      v= vname(d[i]);
      d[i] = whichc(v, of c(*))>0;
   end;
   drop v i;
run;

You may have to play some games with the CASE of the variable v and the col values to ensure comparsion as if your data may have

 

"Data1" that does not match "data1" so you may have to ensure that all of your Col variables are all lower case or uppercase and that the V is set the same.

Shone
Calcite | Level 5

Thank you so much. What does length v $ 32; do?

 

 

Thank you, 

Shone 

Shone
Calcite | Level 5

How can I set those 0s and 1 be set as number instead of character? Thank you!!

ballardw
Super User

@Shone wrote:

How can I set those 0s and 1 be set as number instead of character? Thank you!!


In the code I wrote and run then the data1 - date3 values are numeric. You would have to show your actual data and code to see why you are getting character values. Likely your initial variable data1-data3 are character.

 

The Length v $ 32; sets the maximum length of the variable to hold a single variable name to 32 characters, the maximum possible currently in SAS. The function VNAME would assign a length of 200 by default. Which if you keep the value, such as during debugging, can mean much wasted space in whatever tool is used to examine the values.

 

Generally it is a good idea to set lengths of character variables as the default assignments could lead to truncation. Examine the code below. Determine what you think the final value of X is going to be. Then run the code and view or print the result. Does X meet that expected value?

data example;
   y = 3;
   x = 'abc';
   if y=3 then x='Should be 3';
run;
Reeza
Super User

Fully dynamic approach that doesn't require you to know anything about the data beyond the variable names. 

 

data have;
    input id data1 data2 data3 col1 $ col2 $ col3 $;
    datalines;
100 . . . data1 . data2
200 . . . .  data2 .
300 . . . .  data1 .
400 . . . data1 . data3
;
run;

*flip to long;
proc transpose data=have (Drop=data1-data3) out=long;
    by Id;
    var col1-col3;
run;

*remove missing and add 1;
data long2;
    set long (where=(not missing(col1)));
    val=1;
run;

proc sort data=long2;
    by id col1;
run;

proc transpose data=long2 out=wide;
    by id;
    var val;
    id col1;
run;

@Shone wrote:

Hi, I have a dataset that looks like shown below. 

I want to put 1 if name of colume (data1 data2 and data3 is 4 column in the example (col1, col2, col3,col4) if column doesn't match i want to put 0 

 

Dataset I have: 

 

id data1 data2 data3 col1 col2 col3 col4 fname lname
100       data1   data2 data3 apple asdf
200         data2     ball asdf
300         data1     cat asdf
400           data2   data asdf
500               data asdf
600         data1   data3 asdfasd asdf
700               asdfasd asdf

 

 

Result I want: 

 

id data1 data2 data3 col1 col2 col3 col4 fname lname
100 1 1 1 data1   data2 data3 apple asdf
200 0 1 0   data2     ball asdf
300 1 0 0   data1     cat asdf
400 0 1 0     data2   data asdf
500 0 0 0         data asdf
600 1 0 1   data1   data3 asdfasd asdf
700 0 1 1     data2 data3 asdfasd asdf

 

How can I compare few columns with data in another columns? This is a sample so data1. data2, data3 could be in 100s and col1 and col2 can also be in 100s. I am looking into millions of rows of data.  Any help or suggestion would be really great. 

 

Thank you, 

Shone


 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 2762 views
  • 1 like
  • 3 in conversation