BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
joesmama
Obsidian | Level 7

Hello,

 

I am trying to use Proc Transpose to restructure my dataset. I've been given an Excel file that contains dates, blood pressures and notes (fictitious data). The data read the long way (across) but I need to get the file to read down. The same date appears three times to because each blood pressure is recorded as systolic, diastolic and notes. 

Name 22-Apr 22-Apr 22-Apr 22-May 22-May May-22
  SBP DBP TI SBP DBP TI
AAA 120 60        
BBB 130 70   140 80 ACT
CCC 165 90 start ACE      
DDD 115 70        
EEE 135 75       BCT

 

This is how I'd like the dataset to look. 

 

Name Date SBP
Name Date DBP
Name Date TI
Name Date SBP
Name Date DBP
Name Date TI

 

I'm a bit stuck because of the data headers being in two rows. I've been looking at numerous examples but can't figure out how to do this. And help would be appreciated. Thank you. 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
Declare an array with the list of variables and loop through the columns to transpose the data.

data want;
set have;
array _dates(*) startVar--endVar; /*start of dates and end of dates*/
length measure $15.;
do i=1 to dim(_dates) by 3;
date = vname(_dates(i));
measure = "Systolic";
value = _dates(i); output;
measure = "Diastolic";
value = _dates(i+1); output;
measure = "Notes"; values=_dates(I+2); output;
end;
run;

I would probably actually recommend this structure myself so you're storing numeric and character data in different columns and can do calculations within a measurement.

data want;
set have;
array _dates(*) startVar -- endVar;
do i=1 to dim(_dates) by 3;
Date = vname(_dates(I));
Systolic = _dates(i);
Diastolic = _dates(i+1);
Notes = _dates(i+2);
output;
end;
run;

View solution in original post

6 REPLIES 6
Reeza
Super User
Take your first two rows into a separate data set. Transpose it to a long format. Create new variable names that have the date and measure, ie SBP_2022_04
Rename the variables in the file. Then transpose it.

If you can trust that it's always SBP, DBP, TI in that order you can simplify this a bit but I'm not a trusting person when it comes to badly formatted data.
joesmama
Obsidian | Level 7

Thanks for you response. Assuming the SBP, DBP and TI are in the same order, how can I simplify this? 

Reeza
Super User
Declare an array with the list of variables and loop through the columns to transpose the data.

data want;
set have;
array _dates(*) startVar--endVar; /*start of dates and end of dates*/
length measure $15.;
do i=1 to dim(_dates) by 3;
date = vname(_dates(i));
measure = "Systolic";
value = _dates(i); output;
measure = "Diastolic";
value = _dates(i+1); output;
measure = "Notes"; values=_dates(I+2); output;
end;
run;

I would probably actually recommend this structure myself so you're storing numeric and character data in different columns and can do calculations within a measurement.

data want;
set have;
array _dates(*) startVar -- endVar;
do i=1 to dim(_dates) by 3;
Date = vname(_dates(I));
Systolic = _dates(i);
Diastolic = _dates(i+1);
Notes = _dates(i+2);
output;
end;
run;
joesmama
Obsidian | Level 7

Thanks Reeza. I used your second suggestion and it worked nicely. Appreciate the help. 

 

 

Tom
Super User Tom
Super User

An actual XLSX file?  Or a text file like a CSV file?  If the later it is much easier as you can then just read it in the way you want to start with.

 

With an XLSX file read the file twice.

Once to get the first row with the dates.  If you know how many columns (or can at least put an upper bound on how many you will accept) you can use the RANGE option to limit it to reading just the first row.  For example to get the first 26 columns use

proc import file="&fname" dbms=xlsx out=dates_wide replace;
  range='$A1:Z1' ;
  getnames=NO;
run;

To read the first 702 columns use $A1:ZZ1. To get the first 18,278 columns use $A1:ZZZ1.

The second time to get the actual data.

proc import file="&fname" dbms=xlsx out=wide replace;
  range='$A2:' ;
  getnames=YES;
run;

Now transpose both of them:

proc transpose data=dates_wide out=dates(keep=col1 rename=(col1=date));
  var _all_;
run;
proc transpose data=wide out=tall(rename=(col1=value)) ;
   by A ;
   var _all_;
run;

Now you can merge them together copying the DATE value from the first row onto every observation in the TALL dataset by using the POINT= option of the SET statement.

data want;
  set tall;
  by A ;
  point+1;
  if first.A then point=1;
  set dates point=point;
run;

 

joesmama
Obsidian | Level 7

Thank you Tom. I'm going to give this one a try as well. 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1009 views
  • 3 likes
  • 3 in conversation