BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
biglerc
Obsidian | Level 7

Hello!

 

Just wanted to see if there is a straight-forward way to do this without having a ton of if-then statements.  I have a dataset like this:

 

Observation 1:

Var1 =S

Var2 = D

Var3 = S

Var4 = <missing value>

 

Observation 2:

Var1 =<missing value>

Var2 = D

Var3 = D

Var4 = <missing value>

 

In my dataset there can be every combination of 'S', 'D' or empty values among the four variables.  What I need to do, row by row, is check if the populated variables are equal and ignore the variables that are empty . 

 

I started writing code with if-then statements that checks if var1 ne '' and var2 eq '' and var3 eq '' and var4 eq ''....blah blah blah but I'm going to have to check for every possible combination of populated/empty vars and it's getting too long. 

 

Is there an easier way to do this? 

 

Thanks! 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

You can sort the character variables var1 through var4, by declaring them as part of an array, and using SORTC.  Blanks will sort low.

 

If the first non-blank is identical to the last non-blank (and you have 2 or more non-blanks) then all your non-blanks are the same.  Use that info to make your new variable, and re-read the observation to unsort the var1-var4 values:

 

data want ;
  set have;

  array v {*} $ var1-var4;
  call sortc(of v{*});
  do i=1 to 4 until(v{i}^=' ');  end;  /*Find first non-blank */
  n_nonblank=5-i;
  if n_nonblank<=1 then all_the_same='N.A.';  /*Not Applicable */
  else if v{i}=v{4} then all_the_same='Yes';  /*First non-blank equals last */
  else all_the_same='No';

  set have;   /*Re-read var1-var4 to undo the sort */
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

8 REPLIES 8
ballardw
Super User

You need to show us what the result looks like. Such as show what you want for the cases when variables are equal.

 

Also, are your values actually single character as shown or did you simplify. This may be an important difference as there is a function COUNTC that works with single characters:

data example;
   string= 'abdssdeabd';
   numa = countc(string,'a');
   numb = countc(string,'b');
   numd = countc(string,'d');
   nums = countc(string,'s');
run;

and it is easy to make a string that doesn't contain spaces to worry about: cats(var1,var2,var3,var4);

biglerc
Obsidian | Level 7

Thank you for the reply!  I'm sorry my example didn't provide enough information - I've attached an Excel file with a portion of my dataset so you can see exactly what I'm working with.  

 

My task if to check if all values populated in var1-var4 match (ignoring blanks).  If populated values match then the Result is 'Reviewer'.  If any of them are different then the Result is 'Chair'.

 

Thanks again!

Tom
Super User Tom
Super User

Are you literally look for those two character values? Or are S and D short had for some longer string?  Or perhaps the variables are actually numeric?

 

And what exactly are trying to detect?

 

If you want to find the observations where there are DIFFERENT values.  Like your first observations then test if both values appear.

different = indexc(cats(of var1-var4),'S') and indexc(cats(of var1-var4),'D');

Or if S and D are really something longer like 'SAM' and "DOUG' then perhaps use WHICHC()?

different = whichc('SAM',of var1-var4) and whichc('DOUG', of var1-var4);

If you want to find cases where S appear more than once you could use WHICHC() for that also.  Look for cases where you find a different variable when searching from left than from the right.

multiple_S = whichc('S', of  var1-var4) ne 5-whichc('S', of var4-var1);

Do the same thing to detect multiple values of 'D'.

biglerc
Obsidian | Level 7

Hi Tom,

 

Thanks for your reply!  My initial example of 'S' and 'D' were just me trying to give a simplified example to see if someone could just help me with how the logic would go.  Attached is some actual data and what my result needs to be, if this helps.  My task is to check if all populated variables in var1-var4 are equal or not.  If they are equal then my result should be 'Reviewer'.  If Any differ than my result should be 'Chair'.  

 

Thanks!

biglerc
Obsidian | Level 7

PS it is also important I note that var1 may be empty as well

 

Thanks for your help, this is saving me!

mkeintz
PROC Star

You can sort the character variables var1 through var4, by declaring them as part of an array, and using SORTC.  Blanks will sort low.

 

If the first non-blank is identical to the last non-blank (and you have 2 or more non-blanks) then all your non-blanks are the same.  Use that info to make your new variable, and re-read the observation to unsort the var1-var4 values:

 

data want ;
  set have;

  array v {*} $ var1-var4;
  call sortc(of v{*});
  do i=1 to 4 until(v{i}^=' ');  end;  /*Find first non-blank */
  n_nonblank=5-i;
  if n_nonblank<=1 then all_the_same='N.A.';  /*Not Applicable */
  else if v{i}=v{4} then all_the_same='Yes';  /*First non-blank equals last */
  else all_the_same='No';

  set have;   /*Re-read var1-var4 to undo the sort */
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
biglerc
Obsidian | Level 7

@mkeintz This worked perfectly, THANK YOU, I appreciate your help and I learned something today!!!

Patrick
Opal | Level 21

Below another coding option that doesn't use call sortc() and though won't change the values in your source variables.

data have;
  infile datalines truncover dsd;
  input (var1 var2 var3 var4 RESULT) (:$50.);
  datalines;
SD - 03-AUG-2018,SD - 03-AUG-2018,SD - 03-AUG-2018,,Reviewer
NE - 07-SEP-2018,,NE - 07-SEP-2018,,Reviewer
SD - 08-AUG-2019,SD - 08-AUG-2019,SD - 08-AUG-2019,,Reviewer
CR - 12-SEP-2019,SD - 12-SEP-2019,SD - 12-SEP-2019,,Chair
,,,,
,CR - 12-SEP-2019,,NE - 07-SEP-2018,Chair
,NE - 07-SEP-2018,,NE - 07-SEP-2018,Reviewer
,,NE - 07-SEP-2018,,Reviewer,,
CR - 12-SEP-2019,,,NE - 07-SEP-2018,Chair
;

data want(drop=_:);
  set have;
  length result_derived $8 _first_nonblank $50;
  array _a_vars {*} var1 - var4;

  _first_nonblank=coalescec(of _a_vars[*]);

  if not missing(_first_nonblank) then
    do;
      result_derived='Reviewer';
      do _i=2 to dim(_a_vars);
        if not missing(_a_vars[_i]) and _first_nonblank ne _a_vars[_i] then
          do;
            result_derived='Chair';
            leave;
          end;
      end;
    end;

run;

proc print data=want;
run;

Or below a variant of above logic that avoids comparing a value with itself in case the  first array element is missing.

data want(drop=_:);
  set have;
  length result_derived $8;
  array _a_vars {*} var1 - var4;

  do _i=2 to dim(_a_vars);
    if missing(_k) and not missing(_a_vars[_i-1]) then _k=_i-1;
    else
    if not missing(_k) then 
      if _a_vars[_k] ne _a_vars[_i] then
      do;
        result_derived='Chair';
        leave;
      end;
  end;
  if missing(result_derived) and not missing(_k) then result_derived='Reviewer';
run;

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 2430 views
  • 0 likes
  • 5 in conversation