Help using Base SAS procedures

Selecting Alternating Sets of Variables

Reply
Frequent Contributor
Posts: 142

Selecting Alternating Sets of Variables

Hi all--

A have a series of datasets with variables that start with a prefix such as  Q30_ , Q41_ , Q52_.  The variables all have a value tagged onto the end. Example: Q30_ 1, Q30_ 2, Q30_ 3 and so on. The amount of varibles will be different in each dataset -- so each dataset can be different depending on how many variables with a certain prefix are generated.

The dataset could look like this:

Q30_1Q30_2Q30_3Q30_4Q30_5Q41_1Q41_2Q41_3








Or the dataset could look like:

Q30_1 Q30_2Q30_3Q41_1Q41_2Q41_3Q41_4Q41_5Q41_6Q41_7










I need to merge these variables with prefixes into other datasets.

Example: For Dataset A, I need to bring in all the variables which start with Q52_, Q3_ and Q17_ from Dataset B which still need to be in numerical order by the value they have tagged on to the prefix. And then I need to bring in the variables  LastName, Agency and Status from Dataset C.

So the Final Dataset could look like this:

FirstNameLastNameQ52_1Q52_2 Q52_3Agency Q3_1Q3_2Q3_3Q3_4Status Q17_1Q17_2Q17_3Q17_4

The amount of variables that can be brought in could be different every time. So in another situation  Q52_ goes up to Q52_14 and Q17_ only goes up to Q17_2

Can a create an array using each variable Prefix (im not sure how to do that)?  Is there a way to select only the variables that start with "Q30_" and then select only the variables which start with "Q55_" and so on ?

Any assistance is greatly appreciated. Thanks.

Super Contributor
Posts: 1,636

Selecting Alternating Sets of Variables

Yes, you can.

example:

data have;

input q30_1 q30_2 q30_3 aa bb;

cards;

1 2 3 4 5

;

data want;

  set have (keep=q30_Smiley Happy;

run;

proc print;run;

Frequent Contributor
Posts: 142

Selecting Alternating Sets of Variables

Thanks.  I have tried that and it does not seem to work with multiple questions.

Super Contributor
Posts: 1,636

Re: Selecting Alternating Sets of Variables

Using the sample data created by Patrick:

data have_A;

  id=1;

  array  Q3_ {5} 8. (5*2);

  array  Q17_ {9} 8. (9*20);

  array  Q41_ {10} 8. (10*1);

  array  Q52_ {10} 8. (10*1);

  someOtherVar=0;

  output;

run;

data have_B;

  id=1;

  FirstName='fn';

  LastName='ln';

  Agency='A';

  Status='1';

run;

data want;

merge have_a have_b;

by id;

run;

proc sql noprint;

  select name into :q3 separated by ' '

  from dictionary.columns

    where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q3_'

      order by name;

  select name into :q17 separated by ' '

    from dictionary.columns

      where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q17'

      order by name;

   select name into :q41 separated by ' '

    from dictionary.columns

      where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q41'

       order by name;

    select name into :q52 separated by ' '

  from dictionary.columns

    where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q52'

       order by name;

  quit;

  data want;

     retain firstname lastname &q52 agency &q3 status &q52 &q41;

          set want(drop=id);

  run;

Respected Advisor
Posts: 3,908

Selecting Alternating Sets of Variables

Not the most efficient way from a processing point of view - but easy to code.

For the code to work there must always be at least one variabe Q3_..., Q17_... and Q52_... If this is not the case then an approach querying the dictionary tables and creating a attrib statement (or a SQL Select) would be an option. There are already heaps of such examples in the forums as you are not the first who wants variables in a specific order in a dataset.

data have_A;
  id=1;
  array  Q3_ {5} 8. (5*2);
  array  Q17_ {9} 8. (9*20);
  array  Q41_ {10} 8. (10*1);
  array  Q52_ {10} 8. (10*1);
  someOtherVar=0;
  output; output;
run;

data have_B;
  id=1;
  FirstName='fn';
  LastName='ln';
  Agency='A';
  Status='1';
run;

data want;
  merge
    have_B (keep=ID FirstName LastName)
    have_A (keep=ID Q52_Smiley Happy
    have_B (keep=ID Agency)
    have_A (keep=ID Q3_Smiley Happy
    have_B (keep=ID Status)
    have_A (keep=ID Q17_Smiley Happy
    ;
  by id;
run;

proc print data=want;
run;

Frequent Contributor
Posts: 142

Selecting Alternating Sets of Variables

Thanks you.

Question: 

I'm not sure what the parts in the code ive bolded mean.....

  array  Q3_ {5} 8. (5*2);

  array  Q17_ {9} 8. (9*20);

  array  Q41_ {10} 8. (10*1);

  array  Q52_ {10} 8. (10*1);

Respected Advisor
Posts: 3,908

Selecting Alternating Sets of Variables

That's just the part creating some sample data.

If you look up in the docu the array statement you'll find there that this is a way to create variables and assign them a value (5*2: 5 array elements getting value of 2 assigned).

Ask a Question
Discussion stats
  • 6 replies
  • 131 views
  • 3 likes
  • 3 in conversation