BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
marianhabesland
Calcite | Level 5

I am combining five datasets: 

HypTabs.Src_Utah_Vitals_2010
HypTabs.Src_Utah_Vitals_2011
HypTabs.Src_Utah_Vitals_2012
HypTabs.Src_Utah_Vitals_2013
HypTabs.Src_Utah_Vitals_2014

 

All the datasets have variables in this format:

Screen Shot 2017-11-08 at 20.34.34.png

 

I am trying to create the four variables HtIn, WtLb, SBP, DBP from this data set. In other words I need to for example combine all the systolic BP into one column. 

 

I guess I have no idea even how to get started with combining all the variables this way. Any ideas?

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

But...you should also look at the ID statement within PROC TRANSPOSE to ID your variables.

 

Try adding the following. I don't think the names are exactly waht you want though, so you could either create a new variable earlier - the preferred approach in my opinion, or rename after. The rename after, relies on knowing the order but there's a small chance that could break.

 

ID Measure;
idlabel measure;

View solution in original post

15 REPLIES 15
Reeza
Super User

PROC TRANSPOSE.

marianhabesland
Calcite | Level 5

Is it possible to do it with a MERGE statement? We just haven't learned about PROC TRANSPOSE in class yet, so I think we are expected to use MERGE, but I'm not super good at it yet.

Reeza
Super User

Not easily....you somehow have to reformat your data first. You could split it multiple times and merge but that seems silly when there's a proc that will do it in one step. 

 

It may be I'm not interpreting your question properly though. I assume you want something like the following:

 

Year Height Weight BP_Systolic BP_Diastolic 

2010

2011

2012

2013

2010

....

Reeza
Super User
This is the other way to 'transpose' without using PROC TRANSPOSE, if you've been learning arrays maybe this works for you: https://stats.idre.ucla.edu/sas/modules/reshaping-data-long-to-wide-using-the-data-step/
marianhabesland
Calcite | Level 5

 Screen Shot 2017-11-08 at 20.58.27.png

This it the format I ultimately want to end up with. We have learned arrays, so that could be an option. I just remember hearing something about using MERGE, so I think that's why I thought of that.

Kurt_Bremser
Super User

MERGE is for combining datasets with different variables (but maybe sharing one or more key variables) side-by-side.

If you have identically structured datasets with data from different ranges, you use a simple SET statement to "stack" them.

If you want to convert to a wide format, use a data step with by processing and retain. The details depend on the structure of the initial dataset. Right now I'm not sure at all what your dataset looks like.

Post your existing dataset in a data step for testing. Use the macro from https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to provide the datastep.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Merge is not the method to use for this. This can be done very simply by:

/* Step 1 - set all data together, and put year in as a data item */
data inter;
  set hyptabs.src_utah_vitals: indsname=tmp;
  year=scan(tmp,4,"_");
run;

/* Transpose the data */
proc transpose data=inter out=want;
  by year;
  var value;
  idlabel measure;
run;

This shows a couple of things:
1) Why putting data into dataset names/variables is not a good idea.  If you keep data in the body of the table you can use it, manipulate it etc.  From a programmer point of view, data in names is really bad as it makes all programming far more difficult.

2) Using : means all items with a prefix of

3) Transpose procedure - this is designed to move data from long to wide or from wide to long.  Unless your data has multiple variables, it is the ideal solution for this (for multiple variables array approach is ideal).

 

marianhabesland
Calcite | Level 5

Ok, that makes sense, but I still can't get it to work. Just to give you a better idea, here are all the columns of my data. Even though the SSNs are fake, I just grayed them out so no one gets worried.

 

 

Screen Shot 2017-11-09 at 08.45.47.png

 

I guess I am confused by the fact that we are using "by year".

 

Here is the code I tried, but maybe this is not what you were thinking.

 
data WORK.Vitals_Utah;
set HypTabs.Src_Utah_Vitals_2010
HypTabs.Src_Utah_Vitals_2011
HypTabs.Src_Utah_Vitals_2012
HypTabs.Src_Utah_Vitals_2013
HypTabs.Src_Utah_Vitals_2014: indsname=tmp;
year=scan(tmp,4,"_");
run;
 
proc transpose data=WORK.Vitals_Utah out=WORK.Vitals_Ut;
by year;
var value;
idlabel measure;
run;
marianhabesland
Calcite | Level 5

I was eventually able to get the right format with this code:

 

data work.Vitals_UtPre;
set HypTabs.Src_Utah_Vitals_2010
HypTabs.Src_Utah_Vitals_2011
HypTabs.Src_Utah_Vitals_2012
HypTabs.Src_Utah_Vitals_2013
HypTabs.Src_Utah_Vitals_2014;
run;

 

proc sort;
by SSN ApptDate;
run;

 

proc transpose
data = WORK.Vitals_UtPre
out= WORK.Vitals_Ut;
by SSN ApptDate;
var Value;
run;

 

My only question now is where would I put my rename statement to get these renames done. I have tried some options and it didn't work.

 

(RENAME (ApptDate = VisitDt
SSN = SSN1
COL1 = HtIn
COL2 = WtLb
COL3 = SBP
COL4 = DBP));

 

 

Reeza
Super User

That's in a the format of a data set option. 

Data set options go after the data set you want to modify, so place it after the name of the output data set. 

 

proc transpose
data = WORK.Vitals_UtPre
out= WORK.Vitals_Ut (RENAME (ApptDate = VisitDt
SSN = SSN1
COL1 = HtIn
COL2 = WtLb
COL3 = SBP
COL4 = DBP));
by SSN ApptDate; var Value; run;
Reeza
Super User

But...you should also look at the ID statement within PROC TRANSPOSE to ID your variables.

 

Try adding the following. I don't think the names are exactly waht you want though, so you could either create a new variable earlier - the preferred approach in my opinion, or rename after. The rename after, relies on knowing the order but there's a small chance that could break.

 

ID Measure;
idlabel measure;
marianhabesland
Calcite | Level 5

 

DATA WORK.Vitals_UtPre;
  SET HypTabs.Src_Utah_Vitals_2010
      HypTabs.Src_Utah_Vitals_2011
	  HypTabs.Src_Utah_Vitals_2012
	  HypTabs.Src_Utah_Vitals_2013
	  HypTabs.Src_Utah_Vitals_2014;
RUN;

PROC SORT;
	BY SSN ApptDate;
RUN;

PROC TRANSPOSE
	DATA = WORK.Vitals_UtPre
	OUT= WORK.Vitals_UT;
  	ID Measure;
	IDLABEL Measure;
  	BY SSN ApptDate;
  	VAR Value;
  
RUN;

DATA	WORK.Vitals_UT (RENAME	(ApptDate = VisitDt
		    SSN = SSN1
		    Height__In_ = HtIn
		    Weight__Lb_ = WtLb
		    Systolic_BP = SBP
		   	Dystolic_BP = DBP));
	RETAIN SSN VisitDt HtIn WtLb SBP DBP;
	
	SSN2	= INPUT(SSN1, 12.);
	SSN	= PUT(SSN1, SSN11.);
	
	KEEP SSN VisitDt HtIn WtLb SBP DBP;

	LABEL SSN = "Social Security Number"
		  VisitDt = "Visit Date"
		  HtIn = "Height (In)"
		  WtLb = "Weight (Lb)"
		  SBP = "Systolic BP (mmHg)"
		  DBP = "Diastolic BP (mmHg)";
	
	KEEP SSN VisitDt HtIn WtLb SBP DBP;

PROC SORT;
	BY SSN VisitDt;
RUN;

This is my whole code right now. I am just not able to get the rename statement to work. The ID statement works, but ultimately does not make the variable names I want, so I still included the rename statement.

 

Reeza
Super User

Your RENAME is in the wrong place. It's on the output data set but you likely want it in your code.

 

See the two different ways to have rename in the code below. If you use it as a data set option, it only applies at the end, but you're trying to rename it before...so either include it on the missing(?) SET statement or as a statement itself.

 

data class (rename=age=age_years);

set sashelp.class;
run;


data class;
set sashelp.class;

rename age=age_years;
run;
marianhabesland
Calcite | Level 5

Got it to work. Just had to add a SET statement and it was fine. Thanks so much for your help!

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 15 replies
  • 2656 views
  • 0 likes
  • 4 in conversation