Hi all--
A have a series of datasets with variables that start with a prefix such as Q30_ , Q41_ , Q52_. The variables all have a value tagged onto the end. Example: Q30_ 1, Q30_ 2, Q30_ 3 and so on. The amount of varibles will be different in each dataset -- so each dataset can be different depending on how many variables with a certain prefix are generated.
The dataset could look like this:
Q30_1 | Q30_2 | Q30_3 | Q30_4 | Q30_5 | Q41_1 | Q41_2 | Q41_3 |
---|---|---|---|---|---|---|---|
Or the dataset could look like:
Q30_1 | Q30_2 | Q30_3 | Q41_1 | Q41_2 | Q41_3 | Q41_4 | Q41_5 | Q41_6 | Q41_7 |
---|---|---|---|---|---|---|---|---|---|
I need to merge these variables with prefixes into other datasets.
Example: For Dataset A, I need to bring in all the variables which start with Q52_, Q3_ and Q17_ from Dataset B which still need to be in numerical order by the value they have tagged on to the prefix. And then I need to bring in the variables LastName, Agency and Status from Dataset C.
So the Final Dataset could look like this:
FirstName | LastName | Q52_1 | Q52_2 | Q52_3 | Agency | Q3_1 | Q3_2 | Q3_3 | Q3_4 | Status | Q17_1 | Q17_2 | Q17_3 | Q17_4 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
The amount of variables that can be brought in could be different every time. So in another situation Q52_ goes up to Q52_14 and Q17_ only goes up to Q17_2
Can a create an array using each variable Prefix (im not sure how to do that)? Is there a way to select only the variables that start with "Q30_" and then select only the variables which start with "Q55_" and so on ?
Any assistance is greatly appreciated. Thanks.
Yes, you can.
example:
data have;
input q30_1 q30_2 q30_3 aa bb;
cards;
1 2 3 4 5
;
data want;
set have (keep=q30_:);
run;
proc print;run;
Thanks. I have tried that and it does not seem to work with multiple questions.
Using the sample data created by Patrick:
data have_A;
id=1;
array Q3_ {5} 8. (5*2);
array Q17_ {9} 8. (9*20);
array Q41_ {10} 8. (10*1);
array Q52_ {10} 8. (10*1);
someOtherVar=0;
output;
run;
data have_B;
id=1;
FirstName='fn';
LastName='ln';
Agency='A';
Status='1';
run;
data want;
merge have_a have_b;
by id;
run;
proc sql noprint;
select name into :q3 separated by ' '
from dictionary.columns
where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q3_'
order by name;
select name into :q17 separated by ' '
from dictionary.columns
where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q17'
order by name;
select name into :q41 separated by ' '
from dictionary.columns
where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q41'
order by name;
select name into :q52 separated by ' '
from dictionary.columns
where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q52'
order by name;
quit;
data want;
retain firstname lastname &q52 agency &q3 status &q52 &q41;
set want(drop=id);
run;
Not the most efficient way from a processing point of view - but easy to code.
For the code to work there must always be at least one variabe Q3_..., Q17_... and Q52_... If this is not the case then an approach querying the dictionary tables and creating a attrib statement (or a SQL Select) would be an option. There are already heaps of such examples in the forums as you are not the first who wants variables in a specific order in a dataset.
data have_A;
id=1;
array Q3_ {5} 8. (5*2);
array Q17_ {9} 8. (9*20);
array Q41_ {10} 8. (10*1);
array Q52_ {10} 8. (10*1);
someOtherVar=0;
output; output;
run;
data have_B;
id=1;
FirstName='fn';
LastName='ln';
Agency='A';
Status='1';
run;
data want;
merge
have_B (keep=ID FirstName LastName)
have_A (keep=ID Q52_:)
have_B (keep=ID Agency)
have_A (keep=ID Q3_:)
have_B (keep=ID Status)
have_A (keep=ID Q17_:)
;
by id;
run;
proc print data=want;
run;
Thanks you.
Question:
I'm not sure what the parts in the code ive bolded mean.....
array Q3_ {5} 8. (5*2);
array Q17_ {9} 8. (9*20);
array Q41_ {10} 8. (10*1);
array Q52_ {10} 8. (10*1);
That's just the part creating some sample data.
If you look up in the docu the array statement you'll find there that this is a way to create variables and assign them a value (5*2: 5 array elements getting value of 2 assigned).
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.