Hello, I am trying to transpose some data, which, on its face sounds easy. Though please before you grumble at me, know that I have searched around and in the SAS manual and I cant seem to find exactly what I am looking for; the process of discovery is much more important to learning and long-term retention, but in this case I am stumped.
The data are currently in the following format:
StudyID Variable database_value
00001 STE_TEST 12.1
00002 PAC_RES Yes
00003 NST_RES 15.6
00003 STE_TEST 91.2
00004 PAC_RES No
I need the data to look like this:
StudyID STE_TEST PAC_RES NST_RES
00001 12.1
00002 Yes
00003 91.2 15.6
00004 No
I have several (~30) datasets that need transforming all in the same way. An example of one dataset (named albuminresult) shows how the data are currently formatted, which I have converted to a data step using the SAS macro here.
data WORK.ALBUMINRESULT;
infile datalines dsd truncover;
input StudyID:$5. VisitType:$1. Variable:$17. database_value:$30.;
label StudyID="StudyID" VisitType="VisitType" Variable="Variable" database_value="database_value";
datalines4;
K6002,,LabNo,PED0225
K6004,,ReqdDilution,2
K6048,,ReqdDilution,0
K6073,,ReasDilutNotReqd,n/a
K6076,,ResultmgL,12
K6086,,ReasDilutNotReqd,n/a
K6163,,ReasDilutNotReqd,n/a
K6165,,ReasDilutNotReqd,n/a
K6467,,ResultDate,04MAR2015:0:00:00.00
K6548,,Result,3
;;;;
As you can see from the above examples, values in the column "variable" need to be turned into variables, and the values of the variable "database_value" needs to be the value of the newly created variable. An example of what I need the data to look like (called "need") if the above albumin result dataset were properly formatted is below (data stepped):
data WORK.NEED;
infile datalines dsd truncover;
input StudyID:$5. LabNo:$7. ReqdDilution:32. ReasDilutNotReqd:$3. ResultmgL:32. ResultDate:$20. Result:$1.;
label StudyID="StudyID" LabNo="LabNo" ReqdDilution="ReqdDilution" ReasDilutNotReqd="ReasDilutNotReqd" ResultmgL="ResultmgL" ResultDate="ResultDate" Result="Result";
datalines4;
K6002,PED0225,,,,,
K6004,,2,,,,
K6048,,0,,,,
K6073,,,n/a,,,
K6076,,,,12,,
K6086,,,n/a,,,
K6163,,,n/a,,,
K6165,,,n/a,,,
K6467,,,,,04MAR2015:0:00:00.00,
K6548,,,,,,3
;;;;
Obviously a PROC TRANSPOSE can put the variable values up in the columns, but (1) repeated values also get transposed and (2) I am unsure how to correctly match/tie in the values from the "database_value" variable.
Any help with even this one dataset would be amazing. But as I mentioned there are ~30 "datasets." Though to be more specific, they are all tables in an Access database and I pull them in using the "libname x Access 'C://example.accdb'; Proc XYZ data=x.tablename;..." route. Moreover, the variable values will be different in each table. Ultimately, the goal is to compare an access database with the incorrectly formatted data with a master database that has the correctly formatted data. So my plan of action is to correctly reformat all of the Access tables into individual SAS datasets, then create a new Access database using PROC COPY, which I have successfully done before. Of course, I am open to suggestions.
Thank you for your help!
I apologize if this me simply missing something elementary.
Jonathan Smith
PROC TRANSPOSE does contain a statement to place the data in the columns that you want. Try:
proc transpose data=have out=want;
var database_value;
id variable;
by StudyId;
run;
PROC TRANSPOSE does contain a statement to place the data in the columns that you want. Try:
proc transpose data=have out=want;
var database_value;
id variable;
by StudyId;
run;
I may have marked this completed too soon - it appears that for that one table it worked, but for many of the tables being converted, errors pop up (I thought I had tried this, and now remember that it didn't work, at least for me).
Since it is longer, I am attaching another macro-created datastep to provide an example of when it didnt work, and the error log that goes with it when I run it (note: I had to change the extension from ".log" to ".txt").
Any suggestions?
You'll have to actually show the results. Few people will trust an attachment.
The most common problem with this type of program is a data issue, not a programming issue. What should the result be, if there are two observations that both have the same value for StudyID and Variable? You need to decide how to handle that situation when there is only one spot available to hold two values.
Thank you so much for your quick response - it does seem to be a data issue, your suggested code works perfectly. I had gotten close on my own so I cant thank you enough for helping me figure this out!
Just to note, after looking into the SAS documentation a bit more, adding a LET statement in the proc transpose line solved the specific issue of duplicates I was referring to in my follow up. It may not solve other people's specific situation, but a good tool to have in the toolbar. Thanks again for your help!
Jonathan
If others are interested, more info on the LET statement and dealing with duplicate values in proc transpose:
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.