Hi all--
A have a series of datasets with variables that start with a prefix such as Q30_ , Q41_ , Q52_. The variables all have a value tagged onto the end. Example: Q30_ 1, Q30_ 2, Q30_ 3 and so on. The amount of varibles will be different in each dataset -- so each dataset can be different depending on how many variables with a certain prefix are generated.
The dataset could look like this:
Q30_1 | Q30_2 | Q30_3 | Q30_4 | Q30_5 | Q41_1 | Q41_2 | Q41_3 |
---|---|---|---|---|---|---|---|
Or the dataset could look like:
Q30_1 | Q30_2 | Q30_3 | Q41_1 | Q41_2 | Q41_3 | Q41_4 | Q41_5 | Q41_6 | Q41_7 |
---|---|---|---|---|---|---|---|---|---|
I need to merge these variables with prefixes into other datasets.
Example: For Dataset A, I need to bring in all the variables which start with Q52_, Q3_ and Q17_ from Dataset B which still need to be in numerical order by the value they have tagged on to the prefix. And then I need to bring in the variables LastName, Agency and Status from Dataset C.
So the Final Dataset could look like this:
FirstName | LastName | Q52_1 | Q52_2 | Q52_3 | Agency | Q3_1 | Q3_2 | Q3_3 | Q3_4 | Status | Q17_1 | Q17_2 | Q17_3 | Q17_4 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
The amount of variables that can be brought in could be different every time. So in another situation Q52_ goes up to Q52_14 and Q17_ only goes up to Q17_2
Can a create an array using each variable Prefix (im not sure how to do that)? Is there a way to select only the variables that start with "Q30_" and then select only the variables which start with "Q55_" and so on ?
Any assistance is greatly appreciated. Thanks.
Yes, you can.
example:
data have;
input q30_1 q30_2 q30_3 aa bb;
cards;
1 2 3 4 5
;
data want;
set have (keep=q30_:);
run;
proc print;run;
Thanks. I have tried that and it does not seem to work with multiple questions.
Using the sample data created by Patrick:
data have_A;
id=1;
array Q3_ {5} 8. (5*2);
array Q17_ {9} 8. (9*20);
array Q41_ {10} 8. (10*1);
array Q52_ {10} 8. (10*1);
someOtherVar=0;
output;
run;
data have_B;
id=1;
FirstName='fn';
LastName='ln';
Agency='A';
Status='1';
run;
data want;
merge have_a have_b;
by id;
run;
proc sql noprint;
select name into :q3 separated by ' '
from dictionary.columns
where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q3_'
order by name;
select name into :q17 separated by ' '
from dictionary.columns
where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q17'
order by name;
select name into :q41 separated by ' '
from dictionary.columns
where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q41'
order by name;
select name into :q52 separated by ' '
from dictionary.columns
where libname='WORK' and memname='WANT' and upcase(substr(name,1,3))='Q52'
order by name;
quit;
data want;
retain firstname lastname &q52 agency &q3 status &q52 &q41;
set want(drop=id);
run;
Not the most efficient way from a processing point of view - but easy to code.
For the code to work there must always be at least one variabe Q3_..., Q17_... and Q52_... If this is not the case then an approach querying the dictionary tables and creating a attrib statement (or a SQL Select) would be an option. There are already heaps of such examples in the forums as you are not the first who wants variables in a specific order in a dataset.
data have_A;
id=1;
array Q3_ {5} 8. (5*2);
array Q17_ {9} 8. (9*20);
array Q41_ {10} 8. (10*1);
array Q52_ {10} 8. (10*1);
someOtherVar=0;
output; output;
run;
data have_B;
id=1;
FirstName='fn';
LastName='ln';
Agency='A';
Status='1';
run;
data want;
merge
have_B (keep=ID FirstName LastName)
have_A (keep=ID Q52_:)
have_B (keep=ID Agency)
have_A (keep=ID Q3_:)
have_B (keep=ID Status)
have_A (keep=ID Q17_:)
;
by id;
run;
proc print data=want;
run;
Thanks you.
Question:
I'm not sure what the parts in the code ive bolded mean.....
array Q3_ {5} 8. (5*2);
array Q17_ {9} 8. (9*20);
array Q41_ {10} 8. (10*1);
array Q52_ {10} 8. (10*1);
That's just the part creating some sample data.
If you look up in the docu the array statement you'll find there that this is a way to create variables and assign them a value (5*2: 5 array elements getting value of 2 assigned).
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.