Solved: Forming Datasets Conditionally

Mgarret · Posted 01-10-2012 10:20 PM

Hi all—

I have two issues. Any assistance on either will be greatly appreciated

#1

I've posted regarding this issue a once already but I’m still having some trouble.

I have to merge two datasets together and the variables need to go into a certain order for presentation purposes. There are variables which I have to bring in using the wildcard symbol “:” because the number of variables with various question prefixes will vary from dataset to dataset (Q52a_1 Q52a_2 Q52a_3 Q52a_4……)

Below was one suggestion. I have been playing on variations of this and I can’t get it to work

data Want;

merge

Have1 (keep= Case_Number)

Have2 (keep=ID )

Have1 (keep=FirstName)

Have1 (keep=LastName)

Have2 (keep=Q52a:)

Have1 (keep=Agency_Name)

Have2 (keep=Q57a:)

Have2 (keep=Q5)

Have1 (keep=Program_Name)

Have1 (keep=Q4)

Have2 (keep=Q64:)

;

by ID ;

run;

#2

In addition, I need to define the variables that are brought into the dataset Want by the value of a variable (in this case Age). I know the example below is TOTALLY wrong but I’m just trying illustrate the issue.

data want;

merge Have1 Have2;

if Have1 age=1 then do;

Have1 (keep= Case_Number)

Have2 (keep=ID )

Have1 (keep=FirstName)

Have1 (keep=LastName)

Have2 (keep=Q52a:)

Have1 (keep=Agency_Name)

Have2 (keep=Q57a:)

Have2 (keep=Q5)

Have1 (keep=Program_Name)

Have1 (keep=Q4)

Have2 (keep=Q64:)

Have1 (keep=Assessent1);

Output want;

if Have1 age=2 then do;

Have1 (keep= Case_Number)

Have2 (keep=ID )

Have1 (keep=FirstName)

Have1 (keep=LastName)

Have2 (keep=Q52a:)

Have1 (keep=Agency_Name)

Have2 (keep=Q57a:)

Have2 (keep=Q5)

Have1 (keep=Program_Name)

Have1 (keep=Q4)

Have2 (keep=Q64:)

Have1 (keep=Assessent1)

Have1 (keep=Assessent2);

Output want;

if Have1 age=3 then do;

Have1 (keep= Case_Number)

Have2 (keep=ID )

Have1 (keep=FirstName)

Have1 (keep=LastName)

Have2 (keep=Q52a:)

Have1 (keep=Agency_Name)

Have2 (keep=Q57a:)

Have2 (keep=Q5)

Have1 (keep=Program_Name)

Have1 (keep=Q4)

Have2 (keep=Q64:)

Have1 (keep=Assessent1)

Have1 (keep=Assessent2)

Have1 (keep=Assessent3);

Output want;

if Have1 age=4 then do;

Have1 (keep= Case_Number)

Have2 (keep=ID )

Have1 (keep=FirstName)

Have1 (keep=LastName)

Have2 (keep=Q52a:)

Have1 (keep=Agency_Name)

Have2 (keep=Q57a:)

Have2 (keep=Q5)

Have1 (keep=Program_Name)

Have1 (keep=Q4)

Have2 (keep=Q64:)

Have1 (keep=Assessent1)

Have1 (keep=Assessent2)

Have1 (keep=Assessent3)

Have1 (keep=Assessent4);

Output want;

by ID ;

run;

Again, thanks!

Ksharp · Posted 01-11-2012 12:18 AM

Another way is changing the order of variables after you merging them. Like (not tested code)

data want;
 merge have1 have2;
 by id;
run;
proc sql;
 select name into : list1 separated by ' '
  from dictionary.columns
   where name like 'Q52a%' and libname='WORK' and memname='WANT';

 select name into : list2 separated by ' '
  from dictionary.columns
   where name like 'Q57a%' and libname='WORK' and memname='WANT' ;

 select name into : list2 separated by ' '
  from dictionary.columns
   where name like 'Q64%' and libname='WORK' and memname='WANT' ;
quit;

data want;
 retain Case_Number id FirstName LastName
 &list1 Agency_Name &list2 Q5 program_name Q4 &list3;
 set want;
run;

Ksharp

View solution in original post

art297 · Posted 01-10-2012 11:06 PM

Matthew,

I don't like to be blunt, but I really don't know what you are asking.

However, that said, I'll offer the following advice anyhow.

1. Forget about the order until you have all of your data in your final file. Then, at that point, run the following step to put the data in the precise order you want:

data want;

set want;

retain (desiredvar1 desiredvar2 etc);

run;

2. Since you only want one resulting file, all of the variables will be in it. I think that the best you can do is set the variables you don't want (for a particular file) to missing. There is a datastep option you can use to identify which file data comes from, and the call missing function to set any value within a record to missing.

Maybe I just don't understand.

Mgarret · Posted 01-10-2012 11:49 PM

You're not being blunt at all. I'm new to SAS and I REALLY appreciate your help.

I see, yes, I need to Merge Have1 and Have2 first to produce the FinalData set.

Then I can order the variables.

data Agency1

set FinalData

retain( Case_Number

ID

FirstName

LastName

Q52a:

Agency_Name

Q57a:

Q5

Program_Name

Q4

Q64:

Age);

run;

2.

Yes, this is a little confusing (sorry).

The data I use comes from excel spreadsheets generated from another database, so I don't have control over how the raw data organized when it comes to me.

I have a variable Age which can go up to a value of 10. This variable determines how many extra variables will be included in the dataset.

If the max value of Age is 10, 10 Assessment variables will be in the dataset: Assessment1 to Assessment10.

If the max value of Age is 4, 4 Assessment variables will be in the dataset: Assessment1 to Assessment4.

When i recive the dataset if the max value of Age is 4 there won't be Assessment5, Assessment6, Assessment7, Assessment9, Assessment10 in that particular set.

So, there are 10 different possible versions.

I wanted to try and write the program so i would not have to change the code to reflect how many Assessment Variables are brought in.

Does this make more sense?

Thanks so much.

Patrick · Posted 01-11-2012 12:02 AM

Why can't you just do the same thing with ASSESSMENT variables than you already do with all the Q variables, so: retain ..... ASSESSMENT: ....; ?

Ksharp · Posted 01-11-2012 12:18 AM

Another way is changing the order of variables after you merging them. Like (not tested code)

data want;
 merge have1 have2;
 by id;
run;
proc sql;
 select name into : list1 separated by ' '
  from dictionary.columns
   where name like 'Q52a%' and libname='WORK' and memname='WANT';

 select name into : list2 separated by ' '
  from dictionary.columns
   where name like 'Q57a%' and libname='WORK' and memname='WANT' ;

 select name into : list2 separated by ' '
  from dictionary.columns
   where name like 'Q64%' and libname='WORK' and memname='WANT' ;
quit;

data want;
 retain Case_Number id FirstName LastName
 &list1 Agency_Name &list2 Q5 program_name Q4 &list3;
 set want;
run;

Ksharp

Mgarret · Posted 01-11-2012 12:31 AM

This looks closer to what I'm trying to accomplish. i will try this tomorrow and let you know. Thanks!

Mgarret · Posted 01-11-2012 09:32 AM

Hi Ksharp--

After this step:

data want;

retain Case_Number id FirstName LastName

&list1 Agency_Name &list2 Q5 program_name Q4 &list3;

set want;

run;

I got these errors:

WARNING: Apparent symbolic reference LIST1 not resolved.

WARNING: Apparent symbolic reference LIST2 not resolved.

WARNING: Apparent symbolic reference LIST3 not resolved.

I am not sure how to define the Macro in this case. I am fimiliar with %let but I'm not sure how to utilize it here.

Matt

Mgarret · Posted 01-11-2012 09:38 AM

Hold on I see what I did. This works perfectly!!!! Thanks so much!

Mgarret · Posted 01-11-2012 10:37 AM

Hi KSharp--

Thanks again. I do have one more question.

I want to try and use Proc SQL as oppssed to the Data Step to make the final organized data steps. Its easier when outputting to spreadsheets.

I got an error message I don't understand.

Proc SQL;

Create table FINAL as

select

Quest_,

Case_Number,

Case_FirstName,

&list1,

Agency_Name,

&list2,

cra5,

program_name,

cra14,

&list3

from work.want;

quit;

NOTE: Line generated by the macro variable "LIST1".

1 cra52a1 cra52a_1 cra52a_2 cra52a_3 cra52a_4 cra52a_5 cra52a_6 cra52a_7 cra52a_8

--------

22

76

ERROR 22-322: Syntax error, expecting one of the following: a quoted string, !, !!, &, (, *, **, +, ',', -, '.', /, <, <=, <>, =, >, >=, ?, AND, AS,

BETWEEN, CONTAINS, EQ, EQT, FORMAT, FROM, GE, GET, GT, GTT, IN, INFORMAT, INTO, IS, LABEL, LE, LEN, LENGTH, LET, LIKE, LT, LTT, NE,

NET, NOT, NOTIN, OR, TRANSCODE, ^, ^=, |, ||, ~, ~=.

art297 · Posted 01-11-2012 10:41 AM

When you created your list variables you separated the strings with spaces when, for sql, they would have to be separated by commas

Mgarret · Posted 01-11-2012 11:10 AM

Hi Art--

I'm not sure how I would add ',' to this statment.

select name into : list1 separated by ' '

from dictionary.columns

where name like 'cra52a_%' and libname='WORK' and memname='WANT';

art297 · Posted 01-11-2012 11:15 AM

replace the separated by ' ' with separated by ','

Mgarret · Posted 01-11-2012 11:24 AM

GREAT THANKS!

Tom · Posted 01-11-2012 09:00 AM

If you know the intended variables then define them before the merge statement. This is useful when reading data from Excel spreadsheets as Excel is NOT a database and so SAS has to guess at the intended type and length of the variables (columns) in the Excel file. If the list of variables varies in the source data you will probably want to just define ALL variables for the maximum possible case. That way you will have a consistent structure that you can use for all cases. The other annoying thing that SAS does when importing data from Excel (and other databases) is to attach formats and informats to the character variables. You should remove those as you will see truncation in your listings of the variables when the format is shorter than the variable length.

data want ;

length

Case_Number 8

ID 8

FirstName LastName $50

Q52a1 - Q52a10 8

Agency_Name $50

Q57a1 - Q57a10 8

Q5 8

Program_Name $50

;

merge have1 have2 ;

by id;

format _character_;

informat _character_;

run;

Forming Datasets Conditionally

Re: Forming Datasets Conditionally

Forming Datasets Conditionally

Forming Datasets Conditionally

Forming Datasets Conditionally

Re: Forming Datasets Conditionally

Forming Datasets Conditionally

Forming Datasets Conditionally

Forming Datasets Conditionally

Forming Datasets Conditionally

Forming Datasets Conditionally

Forming Datasets Conditionally

Forming Datasets Conditionally

Forming Datasets Conditionally

Re: Forming Datasets Conditionally

Register Today!

SAS Training: Just a Click Away