Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- Programming
- /
- Proc transpose for Excel file with two rows for headers

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

☑ This topic is **solved**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 11-08-2022 12:21 PM
(874 views)

Hello,

I am trying to use Proc Transpose to restructure my dataset. I've been given an Excel file that contains dates, blood pressures and notes (fictitious data). The data read the long way (across) but I need to get the file to read down. The same date appears three times to because each blood pressure is recorded as systolic, diastolic and notes.

Name | 22-Apr | 22-Apr | 22-Apr | 22-May | 22-May | May-22 |

SBP | DBP | TI | SBP | DBP | TI | |

AAA | 120 | 60 | ||||

BBB | 130 | 70 | 140 | 80 | ACT | |

CCC | 165 | 90 | start ACE | |||

DDD | 115 | 70 | ||||

EEE | 135 | 75 | BCT |

This is how I'd like the dataset to look.

Name | Date | SBP |

Name | Date | DBP |

Name | Date | TI |

Name | Date | SBP |

Name | Date | DBP |

Name | Date | TI |

I'm a bit stuck because of the data headers being in two rows. I've been looking at numerous examples but can't figure out how to do this. And help would be appreciated. Thank you.

- Tags:
- excel
- proc transpose

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Declare an array with the list of variables and loop through the columns to transpose the data.

data want;

set have;

array _dates(*) startVar--endVar; /*start of dates and end of dates*/

length measure $15.;

do i=1 to dim(_dates) by 3;

date = vname(_dates(i));

measure = "Systolic";

value = _dates(i); output;

measure = "Diastolic";

value = _dates(i+1); output;

measure = "Notes"; values=_dates(I+2); output;

end;

run;

I would probably actually recommend this structure myself so you're storing numeric and character data in different columns and can do calculations within a measurement.

data want;

set have;

array _dates(*) startVar -- endVar;

do i=1 to dim(_dates) by 3;

Date = vname(_dates(I));

Systolic = _dates(i);

Diastolic = _dates(i+1);

Notes = _dates(i+2);

output;

end;

run;

data want;

set have;

array _dates(*) startVar--endVar; /*start of dates and end of dates*/

length measure $15.;

do i=1 to dim(_dates) by 3;

date = vname(_dates(i));

measure = "Systolic";

value = _dates(i); output;

measure = "Diastolic";

value = _dates(i+1); output;

measure = "Notes"; values=_dates(I+2); output;

end;

run;

I would probably actually recommend this structure myself so you're storing numeric and character data in different columns and can do calculations within a measurement.

data want;

set have;

array _dates(*) startVar -- endVar;

do i=1 to dim(_dates) by 3;

Date = vname(_dates(I));

Systolic = _dates(i);

Diastolic = _dates(i+1);

Notes = _dates(i+2);

output;

end;

run;

6 REPLIES 6

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Take your first two rows into a separate data set. Transpose it to a long format. Create new variable names that have the date and measure, ie SBP_2022_04

Rename the variables in the file. Then transpose it.

If you can trust that it's always SBP, DBP, TI in that order you can simplify this a bit but I'm not a trusting person when it comes to badly formatted data.

Rename the variables in the file. Then transpose it.

If you can trust that it's always SBP, DBP, TI in that order you can simplify this a bit but I'm not a trusting person when it comes to badly formatted data.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

data want;

set have;

array _dates(*) startVar--endVar; /*start of dates and end of dates*/

length measure $15.;

do i=1 to dim(_dates) by 3;

date = vname(_dates(i));

measure = "Systolic";

value = _dates(i); output;

measure = "Diastolic";

value = _dates(i+1); output;

measure = "Notes"; values=_dates(I+2); output;

end;

run;

I would probably actually recommend this structure myself so you're storing numeric and character data in different columns and can do calculations within a measurement.

data want;

set have;

array _dates(*) startVar -- endVar;

do i=1 to dim(_dates) by 3;

Date = vname(_dates(I));

Systolic = _dates(i);

Diastolic = _dates(i+1);

Notes = _dates(i+2);

output;

end;

run;

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thanks Reeza. I used your second suggestion and it worked nicely. Appreciate the help.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

An actual XLSX file? Or a text file like a CSV file? If the later it is much easier as you can then just read it in the way you want to start with.

With an XLSX file read the file twice.

Once to get the first row with the dates. If you know how many columns (or can at least put an upper bound on how many you will accept) you can use the RANGE option to limit it to reading just the first row. For example to get the first 26 columns use

```
proc import file="&fname" dbms=xlsx out=dates_wide replace;
range='$A1:Z1' ;
getnames=NO;
run;
```

To read the first 702 columns use $A1:ZZ1. To get the first 18,278 columns use $A1:ZZZ1.

The second time to get the actual data.

```
proc import file="&fname" dbms=xlsx out=wide replace;
range='$A2:' ;
getnames=YES;
run;
```

Now transpose both of them:

```
proc transpose data=dates_wide out=dates(keep=col1 rename=(col1=date));
var _all_;
run;
proc transpose data=wide out=tall(rename=(col1=value)) ;
by A ;
var _all_;
run;
```

Now you can merge them together copying the DATE value from the first row onto every observation in the TALL dataset by using the POINT= option of the SET statement.

```
data want;
set tall;
by A ;
point+1;
if first.A then point=1;
set dates point=point;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you Tom. I'm going to give this one a try as well.

**Available on demand!**

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.