BookmarkSubscribeRSS Feed
Mgarret
Obsidian | Level 7

Hi all--

A have a series of datasets with variables that start with a prefix such as  Q30_ , Q41_ , Q52_.  The variables all have a value tagged onto the end. Example: Q30_ 1, Q30_ 2, Q30_ 3 and so on. The amount of varibles will be different in each dataset -- so each dataset can be different depending on how many variables with a certain prefix are generated.

The dataset could look like this:

Q30_1Q30_2Q30_3Q30_4Q30_5Q41_1Q41_2Q41_3








Or the dataset could look like:

Q30_1 Q30_2Q30_3Q41_1Q41_2Q41_3Q41_4Q41_5Q41_6Q41_7










I need to merge these variables with prefixes into other datasets.

Example: For Dataset A, I need to bring in all the variables which start with Q52_, Q3_ and Q17_ from Dataset B which still need to be in numerical order by the value they have tagged on to the prefix. And then I need to bring in the variables  LastName, Agency and Status from Dataset C.

So the Final Dataset could look like this:

FirstNameLastNameQ52_1Q52_2 Q52_3Agency Q3_1Q3_2Q3_3Q3_4Status Q17_1Q17_2Q17_3Q17_4

The amount of variables that can be brought in could be different every time. So in another situation  Q52_ goes up to Q52_14 and Q17_ only goes up to Q17_2

Can a create an array using each variable Prefix (im not sure how to do that)?  Is there a way to select only the variables that start with "Q30_" and then select only the variables which start with "Q55_" and so on ?

Any assistance is greatly appreciated. Thanks.

6 REPLIES 6
Linlin
Lapis Lazuli | Level 10

Yes, you can.

example:

data have;

input q30_1 q30_2 q30_3 aa bb;

cards;

1 2 3 4 5

;

data want;

  set have (keep=q30_:);

run;

proc print;run;

Mgarret
Obsidian | Level 7

Thanks.  I have tried that and it does not seem to work with multiple questions.

Linlin
Lapis Lazuli | Level 10

Using the sample data created by Patrick:

data have_A;

  id=1;

  array  Q3_ {5} 8. (5*2);

  array  Q17_ {9} 8. (9*20);

  array  Q41_ {10} 8. (10*1);

  array  Q52_ {10} 8. (10*1);

  someOtherVar=0;

  output;

run;

data have_B;

  id=1;

  FirstName='fn';

  LastName='ln';

  Agency='A';

  Status='1';

run;

data want;

merge have_a have_b;

by id;

run;

proc sql noprint;

  select name into :q3 separated by ' '

  from dictionary.columns

    where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q3_'

      order by name;

  select name into :q17 separated by ' '

    from dictionary.columns

      where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q17'

      order by name;

   select name into :q41 separated by ' '

    from dictionary.columns

      where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q41'

       order by name;

    select name into :q52 separated by ' '

  from dictionary.columns

    where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q52'

       order by name;

  quit;

  data want;

     retain firstname lastname &q52 agency &q3 status &q52 &q41;

          set want(drop=id);

  run;

Patrick
Opal | Level 21

Not the most efficient way from a processing point of view - but easy to code.

For the code to work there must always be at least one variabe Q3_..., Q17_... and Q52_... If this is not the case then an approach querying the dictionary tables and creating a attrib statement (or a SQL Select) would be an option. There are already heaps of such examples in the forums as you are not the first who wants variables in a specific order in a dataset.

data have_A;
  id=1;
  array  Q3_ {5} 8. (5*2);
  array  Q17_ {9} 8. (9*20);
  array  Q41_ {10} 8. (10*1);
  array  Q52_ {10} 8. (10*1);
  someOtherVar=0;
  output; output;
run;

data have_B;
  id=1;
  FirstName='fn';
  LastName='ln';
  Agency='A';
  Status='1';
run;

data want;
  merge
    have_B (keep=ID FirstName LastName)
    have_A (keep=ID Q52_:)
    have_B (keep=ID Agency)
    have_A (keep=ID Q3_:)
    have_B (keep=ID Status)
    have_A (keep=ID Q17_:)
    ;
  by id;
run;

proc print data=want;
run;

Mgarret
Obsidian | Level 7

Thanks you.

Question: 

I'm not sure what the parts in the code ive bolded mean.....

  array  Q3_ {5} 8. (5*2);

  array  Q17_ {9} 8. (9*20);

  array  Q41_ {10} 8. (10*1);

  array  Q52_ {10} 8. (10*1);

Patrick
Opal | Level 21

That's just the part creating some sample data.

If you look up in the docu the array statement you'll find there that this is a way to create variables and assign them a value (5*2: 5 array elements getting value of 2 assigned).

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 852 views
  • 3 likes
  • 3 in conversation