BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
katblack
Fluorite | Level 6

My data is from a survey where members of a household were entered into blocks of variables. For example, if someone meets X criteria they are entered into the variables 1-3, starting with variable 1; if someone meets Y criteria then they are entered into the variables 4-6, starting with variable 4; if someone meets Z criteria they are entered into variables 7-9, starting with variable 7. I am now tasked with creating one set of variables to hold all the information, but I am not sure how to restructure the data. 

 

Example of current data structure:  

Fname_1Lname_1Age_1Gender_1Fname_2Lname_2Age_2Gender_2Fname_3Lname_3Age_3Gender_3Fname_4Lname_4Age_4Gender_4Fname_5Lname_5Age_5Gender_5Fname_6Lname_6Age_6Gender_6Fname_7Lname_7Age_7Gender_7Fname_8Lname_8Age_8Gender_8Fname_9Lname_9Age_9Gender_9
KateBlack20F........LouisWhite35MKittyWhite36F................
LucyRed18FTomRed17.................PaulRed76MSarahRed77F....
............PatriciaYellow60FFrankYellow62M....KateYellow11F........
Jeff Blue15M....................LillianBlue25F........

 

This leaves a lot of missing data. I want to be able to make one household roster, removing the missing data. So that it looks like this: 

Fname_1Lname_1Age_1Gender_1Fname_2Lname_2Age_2Gender_2Fname_3Lname_3Age_3Gender_3Fname_4Lname_4Age_4Gender_4Fname_5Lname_5Age_5Gender_5Fname_6Lname_6Age_6Gender_6Fname_7Lname_7Age_7Gender_7Fname_8Lname_8Age_8Gender_8Fname_9Lname_9Age_9Gender_9
KateBlack20FLouisWhite35MKittyWhite36F........................
LucyRed18FTomRed17PaulRed76MSarahRed77F.....................
PatriciaYellow60FFrankYellow62MKateYellow11F........................
Jeff Blue15MLillianBlue25F............................

 

My question is: how do I pull out the data that exists and put it into variables as per my example. 

 

I am familiar with arrays and do-loops, and have tried some manipulations with these, without success. 

 

Any guidance, advice, or suggestions are much appreciated!

Thank you!

 

1 ACCEPTED SOLUTION

Accepted Solutions
learsaas
Quartz | Level 8
data work.a;
input 
Fname_1 $ Lname_1 $ Age_1 Gender_1 $
Fname_2 $ Lname_2 $ Age_2 Gender_2 $ 
Fname_3 $ Lname_3 $ Age_3 Gender_3 $
Fname_4 $ Lname_4 $ Age_4 Gender_4 $
Fname_5 $ Lname_5 $ Age_5 Gender_5 $
Fname_6 $ Lname_6 $ Age_6 Gender_6 $
Fname_7 $ Lname_7 $ Age_7 Gender_7 $
Fname_8 $ Lname_8 $ Age_8 Gender_8 $
Fname_9 $ Lname_9 $ Age_9 Gender_9 $
;
cards;
Kate Black 20 F . . . . . . . . Louis White 35 M Kitty White 36 F . . . . . . . . . . . . . . . . 
Lucy Red 18 F Tom Red 17 . . . . . . . . . . . . . . . . . Paul Red 76 M Sarah Red 77 F . . . . 
. . . . . . . . . . . . Patricia Yellow 60 F Frank Yellow 62 M . . . . Kate Yellow 11 F . . . . . . . . 
Jeff  Blue 15 M Lillian Blue 25 F . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
;
run;

data work.result;
	set work.a;
	array array_Fname  Fname:;
	array array_Lname  Lname:;
	array array_Age    Age:;
	array array_Gender Gender:;
	do i=1 to dim(array_Fname);
		if missing(array_Fname(i)) then do j=i+1 to dim(array_Fname);
			if not missing(array_Fname(j)) then do;
				array_Fname(i)=array_Fname(j);array_Lname(i)=array_Lname(j);array_Age(i)=array_Age(j);array_Gender(i)=array_Gender(j);
				call missing(array_Fname(j),array_Lname(j),array_Age(j),array_Gender(j));
				leave;
			end;
		end;
	end;
	drop i j;
run;

View solution in original post

5 REPLIES 5
Tom
Super User Tom
Super User

Why? 

Looks like a report to me. Try the PANELS= option on PROC REPORT to display the records in multiple columns on the page.

Astounding
PROC Star

You would be well advised to change the plan.  Create a data set with one observation per person:  

 

fname, lname, age, gender

 

And add two variables:  

 

family, criteria

 

That would be much more flexible and would support both analysis and reporting.

ballardw
Super User

One way to start implementing @Astounding's suggestion:

data roster;
   set have;
   array fn fname_: ;/* or fname_1 - fname_9*/
   array ln lname_: ;
   array af age_:  ;
   /* similar for the repeated variables*/
   do i= 1 to dim(fn);
      if not missing(fn[i] ) then do;
         firstname=fn[i];
         lastname =ln[i];
         age      =ag[i];
         /*continue with other arrays*/
         output;
      end;
   end;
   keep firstname lastname age ;
run;

the keep list should include any variables you want. Note the location of the OUTPUT. Any other modified/created/ manipulated variables you need for each individual need to be set at that point.

 

learsaas
Quartz | Level 8
data work.a;
input 
Fname_1 $ Lname_1 $ Age_1 Gender_1 $
Fname_2 $ Lname_2 $ Age_2 Gender_2 $ 
Fname_3 $ Lname_3 $ Age_3 Gender_3 $
Fname_4 $ Lname_4 $ Age_4 Gender_4 $
Fname_5 $ Lname_5 $ Age_5 Gender_5 $
Fname_6 $ Lname_6 $ Age_6 Gender_6 $
Fname_7 $ Lname_7 $ Age_7 Gender_7 $
Fname_8 $ Lname_8 $ Age_8 Gender_8 $
Fname_9 $ Lname_9 $ Age_9 Gender_9 $
;
cards;
Kate Black 20 F . . . . . . . . Louis White 35 M Kitty White 36 F . . . . . . . . . . . . . . . . 
Lucy Red 18 F Tom Red 17 . . . . . . . . . . . . . . . . . Paul Red 76 M Sarah Red 77 F . . . . 
. . . . . . . . . . . . Patricia Yellow 60 F Frank Yellow 62 M . . . . Kate Yellow 11 F . . . . . . . . 
Jeff  Blue 15 M Lillian Blue 25 F . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
;
run;

data work.result;
	set work.a;
	array array_Fname  Fname:;
	array array_Lname  Lname:;
	array array_Age    Age:;
	array array_Gender Gender:;
	do i=1 to dim(array_Fname);
		if missing(array_Fname(i)) then do j=i+1 to dim(array_Fname);
			if not missing(array_Fname(j)) then do;
				array_Fname(i)=array_Fname(j);array_Lname(i)=array_Lname(j);array_Age(i)=array_Age(j);array_Gender(i)=array_Gender(j);
				call missing(array_Fname(j),array_Lname(j),array_Age(j),array_Gender(j));
				leave;
			end;
		end;
	end;
	drop i j;
run;
katblack
Fluorite | Level 6

Thanks learsaas this is exactly what I was looking for. This solution works perfectly!!