Hi all--
I am trying to merge two datasets. When the datasets are merged, the variables need to go in a certain order. Dataset Have2 contains series of questions that all start with the same prefix, like Q15_1 Q15_2 Q15_3... or Q55_1 Q55_2 Q55_3 Q55_4 Q55_5... The _# suffix on the end of the variable name can change. To take this into account, I'm using the colon wildcard (":") to generate lists of variables with these prefixes.
Please see my code below.
Here is my issue: variables that start with Q55 or Q15 and are also contained in Have1 are being pulled into Want, which they should not be. These variables need to be pulled in only from Have2. Again, the variables brought in from Have1 and Have2 must fall into a certain order in Want.
data Want;
merge Have1 Have2;
keep
CASE_NO /*from Have1 */
Agency_Name /*from Have1 */
Program_Name /*from Have1 */
Q55: /*from Have2 */
CONNX_CaseID /*from Have1 */
Q15: /*from Have2 */
Case_Name /*from Have1 */
Case_Age /*from Have1 */;
by CASE_NO;
run;
Any help is greatly appreciated. Thank you.
Change your code to:
data Want;
merge Have1(keep=case_: Agency_Name Program_Name CONNX_CaseID)
Have2(keep=case_no q55: q15:);
by CASE_NO;
run;
Linlin
Ok. I see... and then use another Data step to organize the variables in their correct order, right?
Why? As long as you aren't doing any data manipulation other than the merge, why not include a retain statement within the same datastep as the merge .. right after the initial data statement?
Ok. Thanks. I am not too familiar with the retain statement, so it might be easier for me to just make another data step.
You will still need to use a retain, length, or other statement that can affect variable order. The only time a separate data step is needed is when you are doing other things like computes, if-then computes, etc.
Its use is simply the word retain, followed by a space, followed by all of the variables you want to put at the left most side of the record, in the order that you want them to appear, separated by spaces, and ending the statement with a semicolon.
The only peculiarity, but this goes with any other statement you might use to reorder your data, is that the statement must appear BEFORE the set (or in your case merge) statement.
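A minimal sketch of that pattern, reusing the dataset and variable names from the question (the choice of which three variables to put first is only an illustration):

```sas
data Want;
    /* RETAIN appears before MERGE, so these names are placed first, in this order. */
    /* No initial values are given, so type and length still come from the merged datasets. */
    retain CASE_NO Agency_Name Program_Name;
    merge Have1(drop=q55: q15:) Have2;
    by CASE_NO;
run;
```

The remaining variables follow in the order the step encounters them in Have1 and then Have2. One caveat, stated as an assumption rather than something tested here: a prefix list such as Q55: in the retain statement should only expand to variables already known to the step at that point, so it may not position the Have2 question variables when it is written ahead of the merge.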
Why do you need the variables in a certain order? Just because it's nice?
Any time you query the data, you can specify the desired variable order, or you can put a view (or information map) on top of the table.
/Linus
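As one possible illustration of choosing the order at query time rather than in the stored table, a reporting step can list the variables in whatever order is needed. This sketch assumes Want has already been created and uses PROC PRINT only as an example consumer:

```sas
/* The VAR statement controls the column order of the output, */
/* regardless of how the variables are stored in Want.        */
proc print data=Want;
    var CASE_NO Agency_Name Program_Name Q55: CONNX_CaseID Q15: Case_Name Case_Age;
run;
```

Because Want already exists when this step runs, the Q55: and Q15: prefix lists expand to all matching variables in the table.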
Why not simply drop them in your merge statement? I.e.,
data Want;
merge Have1 (drop=q55: q15:) Have2;
by CASE_NO;
run;