DATA Step, Macro, Functions and more

Array while combining two datasets?

Reply
Frequent Contributor
Posts: 138

Array while combining two datasets?

Hi all,

I have two datasets. Dataset 1 is at the person level, and each person between 1 and 15 variables containing values of a categorical variable. Basically, it's like this:

ID     Var1     Var2............Var15

A       3          40                 9

B     14           8                  .

The other dataset is essentially a map connecting values of variables 1-15 to a description.

Value      Desc

1             abc

2             def

3             ghi

...

100         zzz

How can I combine the datasets such that for each person's variables 1-15, there will be a description next to it of what that value means?

EG:

ID     Var1     Var1_desc     Var2     Var2_desc..........Var15     Var15_desc

A        3          ghi               40          xyz                    9          jkl

...and so on for each person.

Super User
Posts: 19,768

Re: Array while combining two datasets?

Posted in reply to Walternate

1. Create a format from your second data set.

2. Apply the format using put statements to get the new variables.

3. Consider just using the formatted values perhaps?

Frequent Contributor
Posts: 138

Re: Array while combining two datasets?

Would I have to do that by hand? The second dataset has over 1,000 variable descriptions.

PROC Star
Posts: 7,467

Re: Array while combining two datasets?

Posted in reply to Walternate

No, you wouldn't have to create it by hand and, like Fareeza said, you may not have to or even want to create the extra 15 variables.  The following is one way to create and apply a format:

data codes;

  informat value 12.;

  informat Desc $50.;

  input Value Desc &;

  cards;

1             abc

2             def

3             ghi

8             something or other

9             ijk

14            something else

40            something else altogether

100           zzz

;

data ctrl;                                                                                                                            

  set codes (rename=(value=start

  Desc=label)) end=last;

  retain fmtname 'codes' type 'n';

  output;

  if last then do;

    hlo='O';

    label='***ERROR***';

    output;

  end;

run;

proc format library=work cntlin=ctrl;

run;

data have;

  input ID $ Var1-Var15;

  cards;

A  3 40 9  2 . . . . . . . . . . .

B 14  8 . 14 . . . . . . . . . . .

;

data want;

  set have;

  array vars(*) var1-var15;

  array var_desc(15) $50.;

  do _n_=1 to dim(vars);

    if not missing(vars{_n_}) then

     var_desc{_n_}=put(vars{_n_},codes.);

  end;

run;

data want;

  retain

  var1 var_desc1

  var2 var_desc2

  var3 var_desc3

  var4 var_desc4

  var5 var_desc5

  var6 var_desc6

  var7 var_desc7

  var8 var_desc8

  var9 var_desc9

  var10 var_desc10

  var11 var_desc11

  var12 var_desc12

  var13 var_desc13

  var14 var_desc14

  var15 var_desc16

  ;

  set want;

run;

Super User
Posts: 10,018

Re: Array while combining two datasets?

Posted in reply to Walternate

or Hash Table . and reorder variables is also a problem.

data codes;
  informat value 12.;
  informat Desc $50.;
  input Value Desc &;
  cards;
1             abc
2             def
3             ghi
8             something or other
9             ijk
14            something else
40            something else altogether
100           zzz
;
data have;
  input ID $ Var1-Var15;
  cards;
A  3 40 9  2 . . . . . . . . . . .
B 14  8 . 14 . . . . . . . . . . .
;
run;
data x;
 set have;
if _n_ eq 1  then do;
  if 0 then set codes;
  declare hash h(dataset:'codes');
   h.definekey('value');
   h.definedata('desc');
   h.definedone();
end;
 array v{*} var1-var15;
 n=0;
 do i=1 to dim(v);
  n+1;value=v{i};call missing(desc);rc=h.find();output;     
 end;
 drop var: i rc;
 run;
proc sql noprint;
 select distinct catt('x(where=(n=',put(n,best8. -l),') rename=(value=value_',put(n,best8. -l),' desc=desc_',put(n,best8. -l),'))') into : list separated by ' ' 
  from x
   order by n;
quit;
%put &list ;
data want;
 merge &list ;
 by id ;
 drop n;
run;

Xia Keshan

Ask a Question
Discussion stats
  • 4 replies
  • 246 views
  • 0 likes
  • 4 in conversation