Dear SAS community,
I am a new user of SAS and I guess this is a fairly easy question but I have been searching the web and I can't solve it.
I receive a file on a weekly basis containing trainings that employees have to take and I need to pivot some of its columns. The first 4 variables are always called the same (they are basic employees info), but the rest varies (the variables names change and I have a different number of columns each time, as each week different trainings have to be taken). Here is my code:
PROC IMPORT OUT= IMPORT.Mandatory_Trainings
DATAFILE= "Excel file path"
DBMS=EXCEL REPLACE;
Range="A1:AAA100000";
GETNAMES=Yes;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
proc transpose Data=IMPORT.Mandatory_Trainings OUT=Base.Mandatory_Trainings_Pivot;
BY ID Employee_Name Overall__Training_Status General_Exclusion__Status;
VAR XXXXX;
run;
The line "VAR XXXXX;" is where I need to change my code.
I put the name of the first and last training and separated them with "--" ("First_Training_Name -- Last_Training_Name_") and it worked perfectly fine but I want a more automated way where SAS select and pivot all variables that are between column 5 and the last one. Any advice on how to do this?
Thanks in advance for the help.
You could put this DATA step between the import and the transpose - in which you insert 2 sentinel variables (_PRE and _POST) in location 5 and location "last".
data need;
set IMPORT.Mandatory_Trainings (keep=ID Employee_Name Overall__Training_Status General_Exclusion__Status);
retain _PRE .;
set IMPORT.Mandatory_Trainings;
retain _POST .;
run;
The "trick" here is to understand how SAS compiler builds the program data vector (PDV) in a data step.
Then you can modify the proc transpose to always have "var _PRE--_POST" - no changing varname awareness required. And just insert a "(where=(not(name in ("_PRE","_POST"))))" dataset-name-parameter in the "out=" clause:
proc transpose Data=need
OUT=Base.Mandatory_Trainings_Pivot (where=(not(_name_ in ('_PRE','_POST'))));
BY ID Employee_Name Overall__Training_Status General_Exclusion__Status;
var _pre -- _post;
run;
Well, your best options would be to drop Excel as a data transfer medium - this would fix all your problems, make a far more robust data transfer system, and make your life easier. However you will say that you can't and so your process will always be sub-par. You can get a list of varaible names via:
proc sql; select NAME into :VLIST separated by " " from DICTIONARY.COLUMNS where LIBNAME="<thelibname>" and MEMNAME="<dataste name>"; quit; ... var &varlist.; ...
I can think of at least a few issues with the process - transposing, what happens when there are mixed number/character and if you let SAS auto convert it will it be right. How will you validate such a process? What happens when the spreadsheet changes - i.e. your four always there variables - you don't have a signed import agreement so they could send anything they like. Etc.
Select excel variables based on their column number to piviot
I assume variables to transform are after 'ID' variables.
HAVE (Two workbooks)
d:/xls/class.xlsx
+-------------------------+------+-----------+------------
| | A | B | B | C | D |
+------+-----------+------+------+-----------+-----------+
| | | | | | |
| 1 | NAME | AGE| SEX | HEIGHT | WEIGHT |
| 2 | Alfred | 14 | M | 55 | 96 |
| 3 | Alice | 13 | F | 44 | 87 |
| 4 | Barbara | 13 | F | 48 | 88 |
| 5 | Carol | 14 | F | 56 | 99 |
| 6 | Henry | 14 | M | 52 | 84 |
| ... | ... | ... | ... | .. | ... |
+------------------+------+------+-----------+-----------+
[CLASS}
d:/xls/class.xlsx
+-------------------------+------+-----------+-----------+-----------+
| | A | B | B | C | D | E |
+------+-----------+------+------+-----------+-----------+-----------+
| | | | | | | |
| 1 | NAME | AGE| SEX | HEIGHT | WEIGHT | PREDICT |
| 2 | Alfred | 14 | M | 55 | 96 | 91 |
| 3 | Alice | 13 | F | 44 | 87 | 82 |
| 4 | Barbara | 13 | F | 48 | 88 | 83 |
| 5 | Carol | 14 | F | 56 | 99 | 94 |
| 6 | Henry | 14 | M | 52 | 84 | 85 |
| ... | ... | ... | ... | .. | ... | ... |
+------------------+------+------+-----------+-----------+-----------+
[CLASSFIT}
WANT
Up to 40 obs from WORK.CLASSXPO total obs=95
Obs NAME _NAME_ COL1
1 Alfred NAME Alfred
2 Alfred SEX M
3 Alfred AGE 14
4 Alfred HEIGHT 69
5 Alfred WEIGHT 112.5
6 Alice NAME Alice
7 Alice SEX F
8 Alice AGE 13
9 Alice HEIGHT 56.5
10 Alice WEIGHT 84
...
Obs NAME _NAME_ COL1
1 Alfred NAME Alfred
2 Alfred SEX M
3 Alfred AGE 14
4 Alfred HEIGHT 69
5 Alfred WEIGHT 112.5
6 Alfred PREDICT 126
11 Alice NAME Alice
12 Alice SEX F
13 Alice AGE 13
14 Alice HEIGHT 56.5
15 Alice WEIGHT 84
16 Alice PREDICT 77
....
WORKING CODE
============
proc transpose data=xel.&bok out=&bok.xpo(drop=_label_);
by name;
var _all_;
FULL SOLUTION
=============
* create some code;
data _null_;
do wrkbok="classfit","class";
call symputx('bok',wrkbok);
rc=dosubl('
%*utlfkil(d:/xls/&bok..xlsx);
libname xel "d:/xls/&bok..xlsx";
data xel.&bok;
set sashelp.&bok;
run;quit;
libname xel clear;
');
end;
run;quit;
* transpose;
data _null_;
do wrkbok="classfit","class";
call symputx('bok',wrkbok);
rc=dosubl('
libname xel "d:/xls/&bok..xlsx";
proc transpose data=xel.&bok out=&bok.xpo(drop=_label_);
by name;
var _all_;
run;quit;
libname xel clear;
');
end;
run;quit;
You could put this DATA step between the import and the transpose - in which you insert 2 sentinel variables (_PRE and _POST) in location 5 and location "last".
data need;
set IMPORT.Mandatory_Trainings (keep=ID Employee_Name Overall__Training_Status General_Exclusion__Status);
retain _PRE .;
set IMPORT.Mandatory_Trainings;
retain _POST .;
run;
The "trick" here is to understand how SAS compiler builds the program data vector (PDV) in a data step.
Then you can modify the proc transpose to always have "var _PRE--_POST" - no changing varname awareness required. And just insert a "(where=(not(name in ("_PRE","_POST"))))" dataset-name-parameter in the "out=" clause:
proc transpose Data=need
OUT=Base.Mandatory_Trainings_Pivot (where=(not(_name_ in ('_PRE','_POST'))));
BY ID Employee_Name Overall__Training_Status General_Exclusion__Status;
var _pre -- _post;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.