I am trying to merge two data sets in what I think is a one to many merge. It's a bit complex but I am trying to keep as much of Have1 because I have other calculations that depend on that dataset format. My goal is to get Height1 Weight1 Head1 Feet1 BMI in the long format.
data have1;
input Patient $ Variable $ Value Visit $;
cards;
1 Height 1 a
1 Weight 2 a
1 Head 1 a
1 Height 3 b
1 Weight 4 b
1 Head 5 b
1 Feet 6 b
2 Height 3 c
2 Weight 2 c
2 Head 6 c
;<code></code><code></code>
data have2;
input Patient $ Visit $ Height1 Weight1 Head1 Feet1 BMI;
datalines;
1 a 1 2 1 . 5
1 b 3 4 5 6 5
2 c 3 2 6 . 4
;
data want;
input Patient $ Variable $ Value Visit $ Height1 Weight1 Head1 Feet1 BMI;
cards;
1 Height 1 a 1 . . . .
1 Weight 2 a . 2 . . .
1 Head 1 a . . 1 . .
1 BMI . a . . . . 5
1 Height 3 b 3 . . . .
1 Weight 4 b . 4 . . .
1 Head 5 b . . 5 . .
1 Feet 6 b . . . 6 .
1 BMI . b . . . . 5
2 Height 3 c 3 . . . .
2 Weight 2 c . 2 . . .
2 Head 6 c . . . 6 .
2 BMI . c . . . . 4
;
In theory my attempt would be the following but I'm not sure how to get the BMI added to the Variable list:
data want;
merge have1 have2;
by Patient Visit;
run;
Any insight would help! Thanks!
Hi @serena13lee Try this
data have1;
input Patient $ Variable $ Value Visit $;
cards;
1 Height 1 a
1 Weight 2 a
1 Head 1 a
1 Height 3 b
1 Weight 4 b
1 Head 5 b
1 Feet 6 b
2 Height 3 c
2 Weight 2 c
2 Head 6 c
;
data have2;
input Patient $ Visit $ Height1 Weight1 Head1 Feet1 BMI;
datalines;
1 a 1 2 1 . 5
1 b 3 4 5 6 5
2 c 3 2 6 . 4
;
data temp;
merge have1 have2;
by Patient Visit;
run;
data bmi;
set have2;
keep patient visit bmi;
run;
data want;
set temp(drop=bmi) bmi(in=_bmi);
by patient visit;
array t(*) Height1--Feet1;
do i=1 to dim(t);
k=strip(vname(t(i))) ;
if not index(k,strip(variable)) then call missing(t(I));
end;
if _bmi then variable='BMI';
drop k i;
run;
Edit your original post with self-contained data steps. Don't make us do your work by forcing us to convert your post into usable code.
It's nothing personal, there are just a lot of new users here that just whack any old data into the editor, then expect the respondent (eg. me) to get their data working before I can even work on a solution. I'm just over it, and won't contribute to those posts where I have to undertake more effort in getting the data working than the original poster bothered with.
As to your edits...nope, they don't work. Cut-and-paste your code into SAS and get them to the point where the cut-and-paste works without error. Do this for both your source dataset(s) ("have(s)") and target results ("want").
The "want" results allow us to run a proc compare on our results to easily tell if our results match your desired results.
If you can't get that to work then you have bigger problems than how to format your question on these forums, and likely need to spend more time with the SAS documentation.
Also post your code as a SAS code block (the "running man" icon in the editor - it says "Insert SAS Code" when you hover over it).
P.S.: Read the documentation on PROC TRANSPOSE.
Hi! Totally understand. I've updated the data so that it is readable. Please let me know if there's anything else I should edit.
data Have1 ;
input Patient Variable $ Value Visit $;
cards;
1 Height 1 a
1 Weight 2 a
1 Head 1 a
1 Height 3 b
1 Weight 4 b
1 Head 5 b
1 Feet 6 b
2 Height 3 c
2 Weight 2 c
2 Head 6 c
;
data Have2;
input Patient Visit $ Height1 Weight1 Head1 Feet1 BMI;
cards;
1 a 1 2 1 . 5
1 b 3 4 5 6 5
2 c 3 2 6 . 4
;
data temp;
merge have1 have2;
by Patient Visit;
run;
data want;
set temp;
array t(*) Height1--bmi;
do i=1 to dim(t);
k=strip(vname(t(i))) ;
if not index(k,strip(variable)) then call missing(t(I));
end;
drop k i;
run;
Hi! Thanks so much for your quick input. I was looking at the want output and I was wondering how I could get the BMI values for both the rows and columns? Should I add a blank BMI column to and then merge and use your method? Thanks in advance!
@serena13lee wrote:It's a bit complex but I am trying to keep as much of Have1 because I have other calculations that depend on that dataset format.
I assume your 2nd "have1" is really your "want"? (Minor: also clean up your "want" proposed code).
IMO that's a really strange format that I rarely see. Can you give details on your "...other calculations that depend on that dataset format"?
Edit: My "gut feeling" is you want to store and process your data like this. But without knowing the details of your "other calculations" I can't be sure.
data have;
input Patient $ Variable $ Value Visit $;
cards;
1 Height 1 a
1 Weight 2 a
1 Head 1 a
1 BMI 5 a <<<
1 Height 3 b
1 Weight 4 b
1 Head 5 b
1 Feet 6 b
1 BMI 5 b <<<
2 Height 3 c
2 Weight 2 c
2 Head 6 c
2 BMI 4 c <<<
;
run;
proc transpose data=have out=want;
by patient visit;
var value;
id variable;
run;
I'm also curious how you are creating your "have2" table?
I use them to create summary tables. For example, using Variable and Value, I can pull it into a macro and get a sum by each variable
Are you familiar with proc summary and BY group processing?
Post your macro and desired output from that macro.
And yes I have edited it so that my 'want' output is more clear.
data want;<br />merge have1 have2;<br />by Patient Visit;<br />run;<code></code><code></code>
Really? (This is minor, just attention to detail.)
Hi @serena13lee Try this
data have1;
input Patient $ Variable $ Value Visit $;
cards;
1 Height 1 a
1 Weight 2 a
1 Head 1 a
1 Height 3 b
1 Weight 4 b
1 Head 5 b
1 Feet 6 b
2 Height 3 c
2 Weight 2 c
2 Head 6 c
;
data have2;
input Patient $ Visit $ Height1 Weight1 Head1 Feet1 BMI;
datalines;
1 a 1 2 1 . 5
1 b 3 4 5 6 5
2 c 3 2 6 . 4
;
data temp;
merge have1 have2;
by Patient Visit;
run;
data bmi;
set have2;
keep patient visit bmi;
run;
data want;
set temp(drop=bmi) bmi(in=_bmi);
by patient visit;
array t(*) Height1--Feet1;
do i=1 to dim(t);
k=strip(vname(t(i))) ;
if not index(k,strip(variable)) then call missing(t(I));
end;
if _bmi then variable='BMI';
drop k i;
run;
You are welcome!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.