BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Mgarret
Obsidian | Level 7

Hi all—

I have two issues. Any assistance on either will be greatly appreciated

#1

I've posted regarding this issue a once already but  I’m still having some trouble.

I have to merge two datasets together and the variables need to go into a certain order for presentation purposes. There are variables which I have to bring in using the wildcard symbol “:” because the number of variables with various question prefixes will vary from dataset to dataset (Q52a_1 Q52a_2 Q52a_3 Q52a_4……)

Below was one suggestion. I have been playing on variations of this and I can’t get it to work

data Want;                                                                                                                       

merge                                                                                                                                 

Have1 (keep= Case_Number)                                                                   

Have2 (keep=ID )

Have1 (keep=FirstName)

Have1 (keep=LastName)

Have2 (keep=Q52a:)

Have1 (keep=Agency_Name)

Have2 (keep=Q57a:)

Have2 (keep=Q5)

Have1 (keep=Program_Name)

Have1 (keep=Q4)

Have2 (keep=Q64:)

;                                                                             

by ID ;                                                                                                                           

run;

#2

In addition, I need to define the variables that are brought into the dataset Want by the value of a variable (in this case Age). I know the example below is TOTALLY wrong but I’m just trying illustrate the issue. 

data want;

merge Have1 Have2;

if Have1 age=1 then do;

Have1 (keep= Case_Number)                                                                   

Have2 (keep=ID )

Have1 (keep=FirstName)

Have1 (keep=LastName)

Have2 (keep=Q52a:)

Have1 (keep=Agency_Name)

Have2 (keep=Q57a:)

Have2 (keep=Q5)

Have1 (keep=Program_Name)

Have1 (keep=Q4)

Have2 (keep=Q64:)

Have1 (keep=Assessent1);

Output want;

if Have1 age=2 then do;

Have1 (keep= Case_Number)                                                                   

Have2 (keep=ID )

Have1 (keep=FirstName)

Have1 (keep=LastName)

Have2 (keep=Q52a:)

Have1 (keep=Agency_Name)

Have2 (keep=Q57a:)

Have2 (keep=Q5)

Have1 (keep=Program_Name)

Have1 (keep=Q4)

Have2 (keep=Q64:)

Have1 (keep=Assessent1)

Have1 (keep=Assessent2);

Output want;

if Have1 age=3 then do;

Have1 (keep= Case_Number)                                                                   

Have2 (keep=ID )

Have1 (keep=FirstName)

Have1 (keep=LastName)

Have2 (keep=Q52a:)

Have1 (keep=Agency_Name)

Have2 (keep=Q57a:)

Have2 (keep=Q5)

Have1 (keep=Program_Name)

Have1 (keep=Q4)

Have2 (keep=Q64:)

Have1 (keep=Assessent1)

Have1 (keep=Assessent2)

Have1 (keep=Assessent3);

Output want;

if Have1 age=4 then do;

Have1 (keep= Case_Number)                                                                   

Have2 (keep=ID )

Have1 (keep=FirstName)

Have1 (keep=LastName)

Have2 (keep=Q52a:)

Have1 (keep=Agency_Name)

Have2 (keep=Q57a:)

Have2 (keep=Q5)

Have1 (keep=Program_Name)

Have1 (keep=Q4)

Have2 (keep=Q64:)

Have1 (keep=Assessent1)

Have1 (keep=Assessent2)

Have1 (keep=Assessent3)

Have1 (keep=Assessent4);

Output want;

by ID ;                                                                                                                           

run;

Again, thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

Another way is changing  the order of variables after you merging them. Like (not tested code)

data want;
 merge have1 have2;
 by id;
run;
proc sql;
 select name into : list1 separated by ' '
  from dictionary.columns
   where name like 'Q52a%' and libname='WORK' and memname='WANT';

 select name into : list2 separated by ' '
  from dictionary.columns
   where name like 'Q57a%' and libname='WORK' and memname='WANT' ;

 select name into : list2 separated by ' '
  from dictionary.columns
   where name like 'Q64%' and libname='WORK' and memname='WANT' ;
quit;

data want;
 retain Case_Number id FirstName LastName
 &list1 Agency_Name &list2 Q5 program_name Q4 &list3;
 set want;
run;



Ksharp

View solution in original post

13 REPLIES 13
art297
Opal | Level 21

Matthew,

I don't like to be blunt, but I really don't know what you are asking.

However, that said, I'll offer the following advice anyhow.

1. Forget about the order until you have all of your data in your final file.  Then, at that point, run the following step to put the data in the precise order you want:

data want;

   set want;

   retain (desiredvar1 desiredvar2 etc);

run;

2. Since you only want one resulting file, all of the variables will be in it.  I think that the best you can do is set the variables you don't want (for a particular file) to missing.  There is a datastep option you can use to identify which file data comes from, and the call missing function to set any value within a record to missing.

Maybe I just don't understand. 

Mgarret
Obsidian | Level 7

You're not being blunt at all. I'm new to SAS and I REALLY appreciate your help.

I see, yes, I need to Merge Have1 and Have2 first to produce the FinalData set.

Then I can order the variables.

data Agency1

set FinalData

retain( Case_Number                                                                 

ID 

FirstName

LastName

Q52a:

Agency_Name

Q57a:

Q5

Program_Name

Q4

Q64:

Age);

run;

2.

Yes, this is a little confusing (sorry).

The data I use comes from excel spreadsheets generated from another database, so I don't have control over how the raw data organized when it comes to me.

I have a variable Age which can go up to a value of 10. This variable determines how many extra variables will be included in the dataset.

If the max value of Age is 10, 10 Assessment variables will be in the dataset: Assessment1 to Assessment10.

If the max value of Age is 4, 4 Assessment variables will be in the dataset: Assessment1 to Assessment4.

When i recive the dataset if the max value of Age is 4 there won't be Assessment5, Assessment6, Assessment7, Assessment9, Assessment10 in that particular set.

So, there are 10 different possible versions.

I wanted to try and write the program so i would not have to change the code to reflect how many Assessment Variables are brought in.

Does this make more sense?

Thanks so much.

Patrick
Opal | Level 21

Why can't you just do the same thing with ASSESSMENT variables than you already do with all the Q variables, so: retain ..... ASSESSMENT: ....;   ?

Ksharp
Super User

Another way is changing  the order of variables after you merging them. Like (not tested code)

data want;
 merge have1 have2;
 by id;
run;
proc sql;
 select name into : list1 separated by ' '
  from dictionary.columns
   where name like 'Q52a%' and libname='WORK' and memname='WANT';

 select name into : list2 separated by ' '
  from dictionary.columns
   where name like 'Q57a%' and libname='WORK' and memname='WANT' ;

 select name into : list2 separated by ' '
  from dictionary.columns
   where name like 'Q64%' and libname='WORK' and memname='WANT' ;
quit;

data want;
 retain Case_Number id FirstName LastName
 &list1 Agency_Name &list2 Q5 program_name Q4 &list3;
 set want;
run;



Ksharp

Mgarret
Obsidian | Level 7

This  looks closer to what I'm trying to accomplish. i will try this tomorrow and let you know. Thanks!

Mgarret
Obsidian | Level 7

Hi Ksharp--

After this step:

data want;

retain Case_Number id FirstName LastName

&list1 Agency_Name &list2 Q5 program_name Q4 &list3;

set want;

run;

I got these errors:

WARNING: Apparent symbolic reference LIST1 not resolved.

WARNING: Apparent symbolic reference LIST2 not resolved.

WARNING: Apparent symbolic reference LIST3 not resolved.

I am not sure how to define the Macro in this case. I am fimiliar with %let but I'm not sure how to utilize it here.

Matt

Mgarret
Obsidian | Level 7

Hold on I see what I did. This works perfectly!!!! Thanks so much!

Mgarret
Obsidian | Level 7

Hi KSharp--

Thanks again. I do have one more question.

I want to try and use Proc SQL as oppssed to the Data Step to make the final organized data steps.  Its easier when outputting to spreadsheets.

I got an error message I don't understand.

Proc SQL;

Create table FINAL as

select

Quest_,

Case_Number, 

Case_FirstName,

Case_FirstName,

&list1,

Agency_Name,

&list2,

cra5,

program_name,

cra14,

&list3

from work.want;

quit;

NOTE: Line generated by the macro variable "LIST1".

1      cra52a1 cra52a_1 cra52a_2 cra52a_3 cra52a_4 cra52a_5 cra52a_6 cra52a_7 cra52a_8

               --------

               22

               76

ERROR 22-322: Syntax error, expecting one of the following: a quoted string, !, !!, &, (, *, **, +, ',', -, '.', /, <, <=, <>, =, >, >=, ?, AND, AS,

              BETWEEN, CONTAINS, EQ, EQT, FORMAT, FROM, GE, GET, GT, GTT, IN, INFORMAT, INTO, IS, LABEL, LE, LEN, LENGTH, LET, LIKE, LT, LTT, NE,

              NET, NOT, NOTIN, OR, TRANSCODE, ^, ^=, |, ||, ~, ~=.

art297
Opal | Level 21

When you created your list variables you separated the strings with spaces when, for sql, they would have to be separated by commas

Mgarret
Obsidian | Level 7

Hi Art--

I'm not sure how I would add ',' to this statment.

select name  into : list1 separated by ' '

  from dictionary.columns

   where name like 'cra52a_%' and libname='WORK' and memname='WANT';

art297
Opal | Level 21

replace the separated by ' '   with   separated by ','

Mgarret
Obsidian | Level 7

GREAT THANKS!

Tom
Super User Tom
Super User

If you know the intended variables then define them before the merge statement.  This is useful when reading data from Excel spreadsheets as Excel is NOT a database and so SAS has to guess at the intended type and length of the variables (columns) in the Excel file. If the list of variables varies in the source data you will probably want to just define ALL variables for the maximum possible case.  That way you will have a consistent structure that you can use for all cases.  The other annoying thing that SAS does when importing data from Excel (and other databases) is to attach formats and informats to the character variables.  You should remove those as you will see truncation in your listings of the variables when the format is shorter than the variable length.

data want ;

   length

    Case_Number 8

    ID  8

    FirstName LastName $50

    Q52a1 - Q52a10 8

    Agency_Name $50

    Q57a1 - Q57a10 8

    Q5 8

    Program_Name $50

   ;

   merge have1 have2 ;

   by id;

   format _character_;

   informat _character_;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1029 views
  • 6 likes
  • 5 in conversation