indox
Obsidian | Level 7

Hello,

 

I would like to compare the values of multiple character variables, let's say c1, c2, c3 and c4.

The result should be 1 if all the values are not equal.

The result shouldn't be 1 if one of the variables is empty.

 

Thanks for the help!

1 ACCEPTED SOLUTION

mkeintz's reply beginning "OK, so a blank is not meant to establish an inequality." (shown in full in the thread below)

11 REPLIES
mkeintz
PROC Star

Show us what you have tried so far.  Then we can help you help yourself.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
indox
Obsidian | Level 7
data want(drop=count i);
 set have;
 array _v{*} c1 c2 c3 c4;
 count=0;
 do i=1 to dim(_v);
  count+(_v{1}=_v{i});
 end;
flag=ifc(count=dim(_v),'Same','Diff');
run;

This is what I tried so far.

mkeintz
PROC Star

Here's an approach that is coded to avoid the loop if there are missing values, and to stop stepping through the loop once a non-equality is encountered:

 

data want (drop=I);

  set have;

  array c{*} $ c1 c2 c3 c4;

  if cmiss(of c{*})=0 then result='same';
  else result='diff';

  do I=2 to dim(c) while (result='same');
    if c{i}^=c{i-1} then result='diff';
  end;

run;
Reeza
Super User

So if C1=C2 then the output should be? 

Do you only have 4 or does this need to be expanded?

Untested but perhaps something like this. 


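Use CMISS() to check for missing and then loop through the data and check if each value is unique. The first value doesn't need to be checked, because nothing precedes it.

data want;
  set have;
  array _c(4) $ c1-c4;

  /* flag=1 when any value is missing or repeats an earlier value */
  flag = (cmiss(of _c(*)) > 0);
  do i = 2 to dim(_c) while (flag = 0);
    /* WHICHC returns the position of the first match, so a position
       below i means _c(i) repeats an earlier value */
    if whichc(_c(i), of _c(*)) < i then flag = 1;
  end;

  /* results=1 only when all values are present and distinct */
  results = (flag = 0);
run;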

 

 

indox
Obsidian | Level 7

c1   c2   c3        c4   result
a    a    a         a    same
a    a    {empty}   a    same
a    b    a         a    not same
a    b    {empty}   a    not same

 

This is what I want.

Thanks

mkeintz
PROC Star

OK, so a blank is not meant to establish an inequality.  Then:

 

data want (drop=i word);

  set have;
  array c{4} $ c1-c4;

  word=scan(catx(' ',of c{*}),1);

  result='same';
  do I=1 to dim(c) while (result='same');
    if NOT(c{i}=word or c{i}='') then result='diff';
  end;

run;

The strategy is the same as in my previous submission: default result to 'same' until proven otherwise.
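
To see the step above in action, here is a minimal sketch that builds a HAVE data set from the four example rows posted earlier in the thread and runs the same logic. The DATALINES layout, the $8 informat, and the PROC PRINT are assumptions added for illustration; the DATA WANT step is copied from the post.

```sas
/* Hypothetical test data: the four example rows from the thread.   */
/* DSD makes two consecutive commas read as a blank (empty) value.  */
data have;
  infile datalines dsd dlm=',' truncover;
  input (c1-c4) (:$8.);
  datalines;
a,a,a,a
a,a,,a
a,b,a,a
a,b,,a
;

data want (drop=i word);
  set have;
  array c{4} $ c1-c4;

  /* first non-blank value; this form assumes single-word values */
  word=scan(catx(' ',of c{*}),1);

  /* assume 'same' until a non-blank value differs from WORD */
  result='same';
  do i=1 to dim(c) while (result='same');
    if not (c{i}=word or c{i}='') then result='diff';
  end;
run;

proc print data=want;
run;
```

With this input the first two rows should come out as same and the last two as diff, which matches the "same" / "not same" pattern requested above.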

 

indox
Obsidian | Level 7

awesome! thank you.
could you please explain this:

if NOT(c{i}=word or c{i}='') then result='diff';


thanks

mkeintz
PROC Star

The logical expression  NOT(a or b) is the same as   NOT(a) and NOT(b), so

 

if NOT(c{i}=word or c{i}='') then result='diff';

is the same as

 

if c{i}^=word and c{i}^=' ' then result='diff';

Since WORD is the first non-blank among c{1}-c{4}, this IF expression tests for whether there is inequality among non-blank values in your array.
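
The equivalence can be checked with a small sketch. The data set name DEMORGAN, the variables CI, FORM1 and FORM2, and the three test values are made up for illustration and are not part of the original code.

```sas
/* Evaluate both forms of the condition for a matching value,     */
/* a different value, and a blank, with WORD fixed at 'a'.        */
data demorgan;
  length ci word $8;
  word = 'a';
  do ci = 'a', 'b', ' ';
    form1 = not (ci = word or ci = '');     /* original form       */
    form2 = (ci ^= word and ci ^= ' ');     /* rewritten form      */
    output;
  end;
run;

proc print data=demorgan;
run;
```

FORM1 and FORM2 should agree on every row, and both should be 1 only for the value 'b', the one case that is neither blank nor equal to WORD.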

 

HarryB
Obsidian | Level 7

Hi Mkeintz,

 

The following code for comparing multiple character variables was very helpful. Thank you. However, it only worked when each entry was a single word (e.g., in a race variable: White, Black, Asian, and so on); it did not pick up multi-word races such as African American/Black or Alaska Natives. Could you please help me with this? Thank you in advance!

 

data want (drop=i word);

  set have;
  array c{4} $ c1-c4;

  word=scan(catx(' ',of c{*}),1);

  result='same';
  do I=1 to dim(c) while (result='same');
    if NOT(c{i}=word or c{i}='') then result='diff';
  end;

run;

mkeintz
PROC Star

Instead of using a blank as the separator in the CATX function, use some other character that is not in any of the values, say '!', and give SCAN the same character as its only delimiter:

 

data want (drop=i word);

  set have;
  array c{4} $ c1-c4;

  /* third argument limits SCAN to '!' so blanks and slashes inside values are kept */
  word=scan(catx('!',of c{*}),1,'!');

  result='same';
  do I=1 to dim(c) while (result='same');
    if NOT(c{i}=word or c{i}='') then result='diff';
  end;

run;
HarryB
Obsidian | Level 7

Thank you Mkeintz!

 

Unfortunately, it did not work for my data. I received the following info in the log; could you please help me to figure this out. Thank you SO much!

data want (drop=i word);

  set have;
  array c{87}  $ Race_171--Race_1787;

  word=scan(catx('!',of c{*}),1);
  result='same';
  do I=1 to dim(c) while (result='same');
    if NOT(c{i}=word or c{i}='') then result='diff';
  end;

run;

INFO: Character variables have defaulted to a length of 200 at the places given by:
      (Line):(Column). Truncation can result.
      210:3   word
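
The thread ends without a confirmed answer, so the following is only a possible explanation and a sketch under stated assumptions. The INFO line is informational rather than an error: it reports that WORD was never given a length and so received the default of 200. Separately, the SCAN call in the step above has no delimiter argument, so it still splits on blanks, slashes, and other default delimiters even though CATX now joins with '!'. One way to sidestep both points is to declare WORD with LENGTH and take the first non-blank value with COALESCEC. The sample data, the four-variable array, and the $40 length are hypothetical; substitute the real variable list and the longest actual length.

```sas
/* Hypothetical sample data with multi-word values */
data have;
  infile datalines dsd dlm=',' truncover;
  input (c1-c4) (:$40.);
  datalines;
African American/Black,African American/Black,,African American/Black
Alaska Natives,White,,Alaska Natives
;

data want (drop=i word);
  set have;
  array c{4} $ c1-c4;

  /* Declaring WORD up front avoids the default-length INFO line;  */
  /* $40 is an assumed length, match it to the longest variable    */
  length word $40;

  /* COALESCEC returns the first non-missing character argument,   */
  /* so no separator or SCAN delimiter is involved                 */
  word=coalescec(of c{*});

  result='same';
  do i=1 to dim(c) while (result='same');
    if not (c{i}=word or c{i}='') then result='diff';
  end;
run;

proc print data=want;
run;
```

With this input the first row should be flagged same and the second diff. Keeping SCAN and adding '!' as its third argument would be an alternative, provided '!' never occurs in the data.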
