BookmarkSubscribeRSS Feed
s100
Calcite | Level 5

Hello all,

Could anyone help me to figure out this problem? Thanks in advance!

My original dataset is like this:

IDABC
1234
25.3
3951
45..
5...

I'd like to create a new variable that indicates the order pattern from the minimum to the maximum. So the new dataset should look like this:

IDABCPattern
1234A_B_C
25.3C_A
3951C_B_A
45..A
5....

Any suggestion is appreciated! Thanks!

9 REPLIES 9
ballardw
Super User

A couple of details may need some clarification. Is this for only 3 variables or do you need something that works for an abitrary number of values? Are the actual variable names A B C or are you using a placeholder names for example? This is somewhat important as the length of Pattern needs to be set large enough to hold the longest possible string.

Also, rules for tie breaking? What is the result if A and C are the same value? What if all variables have the same value? And do you want an actual period '.' or a missing for pattern when all values are missing?

s100
Calcite | Level 5

Ballardw, thanks for bring these to my attention! You are absolutely right. I just made this as a simple example. In my real dataset, I have 8 variables and the actual names are 2-4 letters. To make the 'pattern' variable short, I could use numbers to replace the letters, say if pattern is A_B_C then I could use 1_2_3 instead.

And for the equal value, if A, B, and C are 1, 1, and 2 respectively, then the pattern could be AB_C.

Altal
Calcite | Level 5

ballardw is right; more info is helpful... sounds like you want to retain names of nonmissing variables; maybe  you can build on this...

data have;

input id a b c d;

cards ;

1 2 3 4 .

2 . 5 2 5

3 . . . .

4 . 3 2 .

5 . . 1 7

;

run;

data want;

     set have;

     array _vars a b c d;

     array _varN $ x1-x4;

     do over _varN;

       if _vars ne . then _varN= VNAME (_vars);    

     end;

     pattern=catx ("_",OF X1-x4);

run;

proc print ;

var id a b c d pattern;

run;

s100
Calcite | Level 5

Hello Altal,

Thanks for your answer. It is helpful. The new variable 'pattern' you make indicates the pattern of present/absent. But how to make the 'pattern' variable reflect the order?

Altal
Calcite | Level 5

There may be a smarter way, but you can use something like this... still some coded is needed for tied values..

data have;

     input id a b c d;

cards ;

1 2 3 4 .

2 . 5 2 5

3 . . . .

4 . 3 2 .

5 . . 1 7

6 1 1 . .

7 6 5 4 1

; run;

data want;  

     set have;  

     array _vars a b c d;  

     array _varN $ x1-x4;  

     do over _varN;     

          if _vars ne . then _varN= VNAME (_vars);           

     end;

run;

proc transpose data=want out=temp1;

     var a b c d;

     by id;

run;

proc transpose data=want out=temp2;

     var X1-X4;

by id;

run;

data a;

     merge temp1 temp2 (rename=(col1=col2));

     by id;

run;

proc sort data=a;

     by id col1;

run;

proc transpose data=a out=temp3 (drop=_NAME_) prefix=col;

     var col2; by id;

run;

data temp3;

     set temp3; pattern=catx ("_",OF col1-col4);

run;

proc print ;run;

s100
Calcite | Level 5

Hi Altal,

Your codes are very helpful! I still have one question about the equal value, for instance, id=2 the pattern should be 'c_bd' or 'c_db', because b=d=5. Is there an easier way to deal with that? Thanks!

data_null__
Jade | Level 19

I would probably take the obvious approach.

data have;
  input id a b c d;
cards;
1 2 3 4 .
2 . 5 2 5
3 . . . .
4 . 3 2 .
5 . . 1 7
6 1 1 . .
7 6 5 4 1
;;;;
   run;
proc transpose data=have out=have2;
   by id;
   run;
proc rank data=have2 out=ranks ties=low;
   by id;
   var col1;
   run;
proc sort data=ranks;
   by id col1;
   where not missing(col1);
   run;
data pattern;
   do until(last.id);
      do until(last.col1);
         set ranks;
         by id col1;
         length p pattern $128;
         p = cats(p,_name_);
        
end;
     
pattern = catx('_',pattern,p);
      call missing(p);
      end;
  
drop col1 _name_ p;
   run;
data pattern;
   merge have pattern;
   by id;
   run;
proc print;
  
run;

1234.a_b_c
2.525c_bd
3....
4.32.c_b
5..17c_d
611..ab
76541d_c_b_a


Patrick
Opal | Level 21

Just another approach with a single pass through the data:

data have;
  input id a b c d;
cards;
1 2 3 4 .
2 . 5 2 5
3 . . . .
4 . 3 2 .
5 . . 1 7
6 1 1 . .
7 6 5 4 1
;;;;
run;


data want(drop= _:);
  set have;

  length pattern $32.;
  array myarr {*} a b c d;

  if _n_=1 then
    do;
      length _value 8 _varname $32;
      dcl hash h (multidata:'y', ordered:'y', hashexp:3);
      dcl hiter hit ('h');
      _rc=h.defineKey('_value');
      _rc=h.defineData('_varname','_value');
      _rc=h.defineDone();
    end;

  _rc=h.clear();

  do _i=1 to dim(myarr);
    _value=myarr[_i];
    if missing(_value) then continue;
    _varname=vname(myarr[_i]);
    _rc=h.add();
  end;

  _rc = hit.first();
  _value_old=_value;

  do while (_rc = 0);
    if  _value_old=_value then
      do;
        pattern=cats(pattern,_varname);
      end;
    else 
      pattern=cats(pattern,'_',_varname);

    _value_old=_value;
    _rc = hit.next();
  end;

run;

yaswanthj
Calcite | Level 5

Hi ..

Small update to the @data_null_; code..

data pattern;
   do until(last.id);
      do until(last.col1);
         set ranks;
         by id col1;
         length p pattern $128;
         p = catX('_',p,_name_);/*Updated cats(p,_name_) to
catX('_',p,_name_)*/


        
end;
     
pattern = catx('_',pattern,p);
      call missing(p);
      end;
  
drop col1 _name_ p;
   run;


Hope this will helps..

Thanks,Yash.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1322 views
  • 3 likes
  • 6 in conversation