We can easily create flat files using proc transpose or by using arrays.
What I am trying to do is to "unflatten" a file that is currently flat.
Is it possible?
So what I have looks like this....
participant ID | contact 1 | contact 2 | type of relationship 1 | type of relationship 2 | gender of contact 1 | gender of contact 2 |
1 | KS | RS | 1 | 1 | M | F |
2 | LB | GM | 2 | 1 | F | F |
3 | JR | DD | 2 | 2 | M | M |
and I want to make it look like this...
Participant ID | Contact | type of relationship | gender |
1 | KS | 1 | M |
1 | RS | 1 | F |
2 | LB | 2 | F |
2 | GM | 1 | F |
3 | JR | 2 | M |
3 | DD | 2 | M |
Also note that PROC TRANSPOSE can work in both directions: wide to narrow, and narrow to wide.
yes, that is probably the way to go. I posted what I have and what I want to end up with in reply to another poster. Would be great if you could look at it and suggest a code to use.
Thanks in advance 🙂
In that post, is the sample data your actual variables, or just representative of your actual variables? If it's just representative, how is the actual data different? There would be "many more" of something, but it's difficult to tell what.
Nope, not my actual variables or data (REB wouldn't be happy about that!).
My actual dataset looks just like the example I provided, but it is 'wider'.
I have ~ 250 participants and then I have 10 contacts per participant and all kinds of data about the relationship between the participants and each one of their 10 contacts. Does this help?
Having only two contacts in a row can be held by next code.
If the are more contacts in a row you can use either arrays or a macro like demostrated:
Without a macro:
data have;
input part_ID
contact_1 $ contact_2 $
type_of_rel_1 type_of_rel_2
gender_1 $ gender_2 $ ;
datalines;
1 KS RS 1 1 M F
2 LB GM 2 1 F F
3 JR DD 2 2 M M
; run;
data want(keep=part_id contact type_of_rel gender);
set have;
contact = contact_1;
type_of_rel = type_of_rel_1;
gender = gender_1;
output;
contact = contact_2;
type_of_rel = type_of_rel_2;
gender = gender_2;
output;
run;
Using a macro:
%macro out(n); contact = contact_&n; type_of_rel = type_of_rel_&n; gender = gender_&n; output; %mend out;
data want(keep=part_id contact type_of_rel gender); set have;
%out(1);
%out(2);
run;
You can even do last step dynamical:
%macro doit(loop);
data want(keep=part_id contact type_of_rel gender);
set have;
%do i=1 %to &loop;
%out(&i);
%end;
run;
%mend doit;
%doit(2); /* 2 for two contact per row */
You have several good suggestions so far. I would lean toward a DATA step as well, but using arrays:
data want;
set have;
array con {10} contact1-contact10;
array typ {10} type_of_relationship1- type_of_relationship10;
array gen {10} gender1-gender10;
do _n_=1 to 10;
contact = con{_n_};
type_of_relationship = typ{_n_};
gender = gen{_n_};
output;
end;
keep id contact type_of_relationship gender;
run;
You will need more array statements, one for each set of variables in your data.
hmmm...it didn't work...
It did flip the dataset, but all the values were missing...
That's strange, since the technique is very straightforward. Can you share the log? It may be a simple thing like changing the spelling of existing variable names.
(If possible, it wouldn't hurt to share the contents of a single observation from your incoming data.)
there were no error messages in the log.
Thanks very much for the code though. You gave me enough breadcrumbs to find my way, I think.
I'll spend some time thinking about it and if I still can't figure it out, I'll let you know 😉
pay attention - contact and gender are character type.
using array need add $ sign:
array con {10} $ contact1-contact10;
array typ {10} type_of_relationship1- type_of_relationship10;
array gen {10} $ gender1-gender10;
the macros worked beautifully!!! Thank you!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.