Hi,
I have a dataset containing individual data that looks like this. The variable HHID identify household, while Indiv_ID is the unique identifier for individuals:
HHID | Indiv_ID | reln_HH | sex | age |
Q1V1501011101 | 1 | Married child | Male | 25 |
Q1V1501011101 | 2 | Spouse of married child | Female | 28 |
Q1V1501011101 | 3 | Self | Male | 52 |
Q1V1501011102 | 4 | Self | Female | 63 |
Q1V1501011102 | 5 | Grandchild | Female | 2 |
I want to transpose it into a household dataset in which each line is a household. It should look like this.
HHID | Indiv_ID1 | reln_HH1 | sex1 | age1 | Indiv_ID2 | reln_HH2 | sex2 | age2 | Indiv_ID3 | reln_HH3 | sex3 | age3 |
Q1V1501011101 | 1 | Married child | Male | 25 | 2 | Spouse of married child | Female | 28 | 3 | Self | Male | 52 |
Q1V1501011102 | 4 | Self | Female | 63 | 5 | Grandchild | Female | 2 |
I did some attempts with proc tranpose but it always gives something very different from what I want.
I don't think proc transpose is well suited to the "block transpose" task you want to do.
This program will likely do what you want.
data have;
input HHID $13. Indiv_ID reln_HH &$24. sex :$6. age;
datalines;
Q1V1501011101 1 Married child Male 25
Q1V1501011101 2 Spouse of married child Female 28
Q1V1501011101 3 Self Male 52
Q1V1501011102 4 Self Female 23
Q1V1501011102 5 Grandchild Female 2
run;
filename tmp temp ;
data need (drop=_:) ;
do seq=1 by 1 until (last.hhid);
set have end=end_of_have;
by hhid;
output;
end;
retain _maxseq .;
_maxseq=max(_maxseq,seq);
if end_of_have;
file tmp;
do _s=1 to _maxseq;
put 'need (where=(seq=' _s ') rename=(indiv_id=indiv_id' _s 'reln_hh=reln_hh' _s 'sex=sex' _s 'age=age' _s '))' ;
end;
run;
data want;
merge
%include tmp / source2 ;
;
by hhid;
run;
Take a look at dataset NEED to see how the DATA WANT step works.
I don't think proc transpose is well suited to the "block transpose" task you want to do.
This program will likely do what you want.
data have;
input HHID $13. Indiv_ID reln_HH &$24. sex :$6. age;
datalines;
Q1V1501011101 1 Married child Male 25
Q1V1501011101 2 Spouse of married child Female 28
Q1V1501011101 3 Self Male 52
Q1V1501011102 4 Self Female 23
Q1V1501011102 5 Grandchild Female 2
run;
filename tmp temp ;
data need (drop=_:) ;
do seq=1 by 1 until (last.hhid);
set have end=end_of_have;
by hhid;
output;
end;
retain _maxseq .;
_maxseq=max(_maxseq,seq);
if end_of_have;
file tmp;
do _s=1 to _maxseq;
put 'need (where=(seq=' _s ') rename=(indiv_id=indiv_id' _s 'reln_hh=reln_hh' _s 'sex=sex' _s 'age=age' _s '))' ;
end;
run;
data want;
merge
%include tmp / source2 ;
;
by hhid;
run;
Take a look at dataset NEED to see how the DATA WANT step works.
Posting data in usable form and the code you have makes it easier to provide a useful answer.
Why do you need to transpose the data from long to wide format? You may have noticed, that in almost all other discussion in the community using data in a long format is recommended. Or do you want a report?
Since family sizes vary so much you are going to have some fun using sex1, sex2, sex3, sex4 ... sex25 in any form of analysis. When it comes time to do anything with "adults" or "children" you will spend a LOT of time futzing around with which is which for each family.
Please, in some detail, describe what you expect to do with that data set once you have made it "wide" like that. Specific analysis or report examples would be nice.
If you want something for people to look at then a report such as
proc report data=have; columns hhid indiv_id,(rein_hh sex age); define hhid / group; define indiv_id/ across; define rein_hh /display; define sex/display; define age/display max; run;
might be helpful.
The way you have shown the output for HHID Q1V1501011101 and Q1V1501011102 makes me suspect that you actually want two variables, one with Q1V15010111 and a second with 01 or 02 (and similar).
data have;
input HHID $13. Indiv_ID reln_HH &$24. sex :$6. age;
datalines;
Q1V1501011101 1 Married child Male 25
Q1V1501011101 2 Spouse of married child Female 28
Q1V1501011101 3 Self Male 52
Q1V1501011102 4 Self Female 23
Q1V1501011102 5 Grandchild Female 2
;
run;
data temp;
set have;
by HHID ;
if first.HHID then n=0;
n+1;
run;
proc sql noprint;
select distinct catt('temp(where=(n=',n,') rename=(
Indiv_ID=Indiv_ID_',n,' reln_HH=reln_HH_',n,' sex=sex_',n,' age=age_',n,' ))') into :merge separated by ' '
from temp ;
quit;
%put &merge ;
data want;
merge &merge;
by HHID;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.