Hi all—
I have two issues. Any assistance on either will be greatly appreciated
#1
I've posted regarding this issue a once already but I’m still having some trouble.
I have to merge two datasets together and the variables need to go into a certain order for presentation purposes. There are variables which I have to bring in using the wildcard symbol “:” because the number of variables with various question prefixes will vary from dataset to dataset (Q52a_1 Q52a_2 Q52a_3 Q52a_4……)
Below was one suggestion. I have been playing on variations of this and I can’t get it to work
data Want;
merge
Have1 (keep= Case_Number)
Have2 (keep=ID )
Have1 (keep=FirstName)
Have1 (keep=LastName)
Have2 (keep=Q52a:)
Have1 (keep=Agency_Name)
Have2 (keep=Q57a:)
Have2 (keep=Q5)
Have1 (keep=Program_Name)
Have1 (keep=Q4)
Have2 (keep=Q64:)
;
by ID ;
run;
#2
In addition, I need to define the variables that are brought into the dataset Want by the value of a variable (in this case Age). I know the example below is TOTALLY wrong but I’m just trying illustrate the issue.
data want;
merge Have1 Have2;
if Have1 age=1 then do;
Have1 (keep= Case_Number)
Have2 (keep=ID )
Have1 (keep=FirstName)
Have1 (keep=LastName)
Have2 (keep=Q52a:)
Have1 (keep=Agency_Name)
Have2 (keep=Q57a:)
Have2 (keep=Q5)
Have1 (keep=Program_Name)
Have1 (keep=Q4)
Have2 (keep=Q64:)
Have1 (keep=Assessent1);
Output want;
if Have1 age=2 then do;
Have1 (keep= Case_Number)
Have2 (keep=ID )
Have1 (keep=FirstName)
Have1 (keep=LastName)
Have2 (keep=Q52a:)
Have1 (keep=Agency_Name)
Have2 (keep=Q57a:)
Have2 (keep=Q5)
Have1 (keep=Program_Name)
Have1 (keep=Q4)
Have2 (keep=Q64:)
Have1 (keep=Assessent1)
Have1 (keep=Assessent2);
Output want;
if Have1 age=3 then do;
Have1 (keep= Case_Number)
Have2 (keep=ID )
Have1 (keep=FirstName)
Have1 (keep=LastName)
Have2 (keep=Q52a:)
Have1 (keep=Agency_Name)
Have2 (keep=Q57a:)
Have2 (keep=Q5)
Have1 (keep=Program_Name)
Have1 (keep=Q4)
Have2 (keep=Q64:)
Have1 (keep=Assessent1)
Have1 (keep=Assessent2)
Have1 (keep=Assessent3);
Output want;
if Have1 age=4 then do;
Have1 (keep= Case_Number)
Have2 (keep=ID )
Have1 (keep=FirstName)
Have1 (keep=LastName)
Have2 (keep=Q52a:)
Have1 (keep=Agency_Name)
Have2 (keep=Q57a:)
Have2 (keep=Q5)
Have1 (keep=Program_Name)
Have1 (keep=Q4)
Have2 (keep=Q64:)
Have1 (keep=Assessent1)
Have1 (keep=Assessent2)
Have1 (keep=Assessent3)
Have1 (keep=Assessent4);
Output want;
by ID ;
run;
Again, thanks!
Another way is changing the order of variables after you merging them. Like (not tested code)
data want; merge have1 have2; by id; run; proc sql; select name into : list1 separated by ' ' from dictionary.columns where name like 'Q52a%' and libname='WORK' and memname='WANT'; select name into : list2 separated by ' ' from dictionary.columns where name like 'Q57a%' and libname='WORK' and memname='WANT' ; select name into : list2 separated by ' ' from dictionary.columns where name like 'Q64%' and libname='WORK' and memname='WANT' ; quit; data want; retain Case_Number id FirstName LastName &list1 Agency_Name &list2 Q5 program_name Q4 &list3; set want; run;
Ksharp
Matthew,
I don't like to be blunt, but I really don't know what you are asking.
However, that said, I'll offer the following advice anyhow.
1. Forget about the order until you have all of your data in your final file. Then, at that point, run the following step to put the data in the precise order you want:
data want;
set want;
retain (desiredvar1 desiredvar2 etc);
run;
2. Since you only want one resulting file, all of the variables will be in it. I think that the best you can do is set the variables you don't want (for a particular file) to missing. There is a datastep option you can use to identify which file data comes from, and the call missing function to set any value within a record to missing.
Maybe I just don't understand.
You're not being blunt at all. I'm new to SAS and I REALLY appreciate your help.
I see, yes, I need to Merge Have1 and Have2 first to produce the FinalData set.
Then I can order the variables.
data Agency1
set FinalData
retain( Case_Number
ID
FirstName
LastName
Q52a:
Agency_Name
Q57a:
Q5
Program_Name
Q4
Q64:
Age);
run;
2.
Yes, this is a little confusing (sorry).
The data I use comes from excel spreadsheets generated from another database, so I don't have control over how the raw data organized when it comes to me.
I have a variable Age which can go up to a value of 10. This variable determines how many extra variables will be included in the dataset.
If the max value of Age is 10, 10 Assessment variables will be in the dataset: Assessment1 to Assessment10.
If the max value of Age is 4, 4 Assessment variables will be in the dataset: Assessment1 to Assessment4.
When i recive the dataset if the max value of Age is 4 there won't be Assessment5, Assessment6, Assessment7, Assessment9, Assessment10 in that particular set.
So, there are 10 different possible versions.
I wanted to try and write the program so i would not have to change the code to reflect how many Assessment Variables are brought in.
Does this make more sense?
Thanks so much.
Why can't you just do the same thing with ASSESSMENT variables than you already do with all the Q variables, so: retain ..... ASSESSMENT: ....; ?
Another way is changing the order of variables after you merging them. Like (not tested code)
data want; merge have1 have2; by id; run; proc sql; select name into : list1 separated by ' ' from dictionary.columns where name like 'Q52a%' and libname='WORK' and memname='WANT'; select name into : list2 separated by ' ' from dictionary.columns where name like 'Q57a%' and libname='WORK' and memname='WANT' ; select name into : list2 separated by ' ' from dictionary.columns where name like 'Q64%' and libname='WORK' and memname='WANT' ; quit; data want; retain Case_Number id FirstName LastName &list1 Agency_Name &list2 Q5 program_name Q4 &list3; set want; run;
Ksharp
This looks closer to what I'm trying to accomplish. i will try this tomorrow and let you know. Thanks!
Hi Ksharp--
After this step:
data want;
retain Case_Number id FirstName LastName
&list1 Agency_Name &list2 Q5 program_name Q4 &list3;
set want;
run;
I got these errors:
WARNING: Apparent symbolic reference LIST1 not resolved.
WARNING: Apparent symbolic reference LIST2 not resolved.
WARNING: Apparent symbolic reference LIST3 not resolved.
I am not sure how to define the Macro in this case. I am fimiliar with %let but I'm not sure how to utilize it here.
Matt
Hold on I see what I did. This works perfectly!!!! Thanks so much!
Hi KSharp--
Thanks again. I do have one more question.
I want to try and use Proc SQL as oppssed to the Data Step to make the final organized data steps. Its easier when outputting to spreadsheets.
I got an error message I don't understand.
Proc SQL;
Create table FINAL as
select
Quest_,
Case_Number,
Case_FirstName,
Case_FirstName,
&list1,
Agency_Name,
&list2,
cra5,
program_name,
cra14,
&list3
from work.want;
quit;
NOTE: Line generated by the macro variable "LIST1".
1 cra52a1 cra52a_1 cra52a_2 cra52a_3 cra52a_4 cra52a_5 cra52a_6 cra52a_7 cra52a_8
--------
22
76
ERROR 22-322: Syntax error, expecting one of the following: a quoted string, !, !!, &, (, *, **, +, ',', -, '.', /, <, <=, <>, =, >, >=, ?, AND, AS,
BETWEEN, CONTAINS, EQ, EQT, FORMAT, FROM, GE, GET, GT, GTT, IN, INFORMAT, INTO, IS, LABEL, LE, LEN, LENGTH, LET, LIKE, LT, LTT, NE,
NET, NOT, NOTIN, OR, TRANSCODE, ^, ^=, |, ||, ~, ~=.
When you created your list variables you separated the strings with spaces when, for sql, they would have to be separated by commas
Hi Art--
I'm not sure how I would add ',' to this statment.
select name into : list1 separated by ' '
from dictionary.columns
where name like 'cra52a_%' and libname='WORK' and memname='WANT';
replace the separated by ' ' with separated by ','
GREAT THANKS!
If you know the intended variables then define them before the merge statement. This is useful when reading data from Excel spreadsheets as Excel is NOT a database and so SAS has to guess at the intended type and length of the variables (columns) in the Excel file. If the list of variables varies in the source data you will probably want to just define ALL variables for the maximum possible case. That way you will have a consistent structure that you can use for all cases. The other annoying thing that SAS does when importing data from Excel (and other databases) is to attach formats and informats to the character variables. You should remove those as you will see truncation in your listings of the variables when the format is shorter than the variable length.
data want ;
length
Case_Number 8
ID 8
FirstName LastName $50
Q52a1 - Q52a10 8
Agency_Name $50
Q57a1 - Q57a10 8
Q5 8
Program_Name $50
;
merge have1 have2 ;
by id;
format _character_;
informat _character_;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.