I am combining five datasets:
HypTabs.Src_Utah_Vitals_2010
HypTabs.Src_Utah_Vitals_2011
HypTabs.Src_Utah_Vitals_2012
HypTabs.Src_Utah_Vitals_2013
HypTabs.Src_Utah_Vitals_2014
All the datasets have variables in this format:
I am trying to create the four variables HtIn, WtLb, SBP, DBP from this data set. In other words I need to for example combine all the systolic BP into one column.
I guess I have no idea even how to get started with combining all the variables this way. Any ideas?
But...you should also look at the ID statement within PROC TRANSPOSE to ID your variables.
Try adding the following. I don't think the names are exactly waht you want though, so you could either create a new variable earlier - the preferred approach in my opinion, or rename after. The rename after, relies on knowing the order but there's a small chance that could break.
ID Measure;
idlabel measure;
PROC TRANSPOSE.
Is it possible to do it with a MERGE statement? We just haven't learned about PROC TRANSPOSE in class yet, so I think we are expected to use MERGE, but I'm not super good at it yet.
Not easily....you somehow have to reformat your data first. You could split it multiple times and merge but that seems silly when there's a proc that will do it in one step.
It may be I'm not interpreting your question properly though. I assume you want something like the following:
Year Height Weight BP_Systolic BP_Diastolic
2010
2011
2012
2013
2010
....
This it the format I ultimately want to end up with. We have learned arrays, so that could be an option. I just remember hearing something about using MERGE, so I think that's why I thought of that.
MERGE is for combining datasets with different variables (but maybe sharing one or more key variables) side-by-side.
If you have identically structured datasets with data from different ranges, you use a simple SET statement to "stack" them.
If you want to convert to a wide format, use a data step with by processing and retain. The details depend on the structure of the initial dataset. Right now I'm not sure at all what your dataset looks like.
Post your existing dataset in a data step for testing. Use the macro from https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to provide the datastep.
Merge is not the method to use for this. This can be done very simply by:
/* Step 1 - set all data together, and put year in as a data item */ data inter; set hyptabs.src_utah_vitals: indsname=tmp; year=scan(tmp,4,"_"); run; /* Transpose the data */ proc transpose data=inter out=want; by year; var value; idlabel measure; run;
This shows a couple of things:
1) Why putting data into dataset names/variables is not a good idea. If you keep data in the body of the table you can use it, manipulate it etc. From a programmer point of view, data in names is really bad as it makes all programming far more difficult.
2) Using : means all items with a prefix of
3) Transpose procedure - this is designed to move data from long to wide or from wide to long. Unless your data has multiple variables, it is the ideal solution for this (for multiple variables array approach is ideal).
Ok, that makes sense, but I still can't get it to work. Just to give you a better idea, here are all the columns of my data. Even though the SSNs are fake, I just grayed them out so no one gets worried.
I guess I am confused by the fact that we are using "by year".
Here is the code I tried, but maybe this is not what you were thinking.
I was eventually able to get the right format with this code:
data work.Vitals_UtPre;
set HypTabs.Src_Utah_Vitals_2010
HypTabs.Src_Utah_Vitals_2011
HypTabs.Src_Utah_Vitals_2012
HypTabs.Src_Utah_Vitals_2013
HypTabs.Src_Utah_Vitals_2014;
run;
proc sort;
by SSN ApptDate;
run;
proc transpose
data = WORK.Vitals_UtPre
out= WORK.Vitals_Ut;
by SSN ApptDate;
var Value;
run;
My only question now is where would I put my rename statement to get these renames done. I have tried some options and it didn't work.
(RENAME (ApptDate = VisitDt
SSN = SSN1
COL1 = HtIn
COL2 = WtLb
COL3 = SBP
COL4 = DBP));
That's in a the format of a data set option.
Data set options go after the data set you want to modify, so place it after the name of the output data set.
proc transpose
data = WORK.Vitals_UtPre
out= WORK.Vitals_Ut (RENAME (ApptDate = VisitDt
SSN = SSN1
COL1 = HtIn
COL2 = WtLb
COL3 = SBP
COL4 = DBP));
by SSN ApptDate;
var Value;
run;
But...you should also look at the ID statement within PROC TRANSPOSE to ID your variables.
Try adding the following. I don't think the names are exactly waht you want though, so you could either create a new variable earlier - the preferred approach in my opinion, or rename after. The rename after, relies on knowing the order but there's a small chance that could break.
ID Measure;
idlabel measure;
DATA WORK.Vitals_UtPre; SET HypTabs.Src_Utah_Vitals_2010 HypTabs.Src_Utah_Vitals_2011 HypTabs.Src_Utah_Vitals_2012 HypTabs.Src_Utah_Vitals_2013 HypTabs.Src_Utah_Vitals_2014; RUN; PROC SORT; BY SSN ApptDate; RUN; PROC TRANSPOSE DATA = WORK.Vitals_UtPre OUT= WORK.Vitals_UT; ID Measure; IDLABEL Measure; BY SSN ApptDate; VAR Value; RUN; DATA WORK.Vitals_UT (RENAME (ApptDate = VisitDt SSN = SSN1 Height__In_ = HtIn Weight__Lb_ = WtLb Systolic_BP = SBP Dystolic_BP = DBP)); RETAIN SSN VisitDt HtIn WtLb SBP DBP; SSN2 = INPUT(SSN1, 12.); SSN = PUT(SSN1, SSN11.); KEEP SSN VisitDt HtIn WtLb SBP DBP; LABEL SSN = "Social Security Number" VisitDt = "Visit Date" HtIn = "Height (In)" WtLb = "Weight (Lb)" SBP = "Systolic BP (mmHg)" DBP = "Diastolic BP (mmHg)"; KEEP SSN VisitDt HtIn WtLb SBP DBP; PROC SORT; BY SSN VisitDt; RUN;
This is my whole code right now. I am just not able to get the rename statement to work. The ID statement works, but ultimately does not make the variable names I want, so I still included the rename statement.
Your RENAME is in the wrong place. It's on the output data set but you likely want it in your code.
See the two different ways to have rename in the code below. If you use it as a data set option, it only applies at the end, but you're trying to rename it before...so either include it on the missing(?) SET statement or as a statement itself.
data class (rename=age=age_years);
set sashelp.class;
run;
data class;
set sashelp.class;
rename age=age_years;
run;
Got it to work. Just had to add a SET statement and it was fine. Thanks so much for your help!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.