Hi all,
Is there a simple way to split my columns so that I can get File 2 starting from File 1? Please, see the attachment.
Thank you in advance,
Steve.
FILE 1: Initial | ||||||||||||||
Employee ID | Appointment date | Classification | Level | Salary | Number of appointments | |||||||||
1 | 1 | AS | 2 | 40 000 | 2 | |||||||||
1 | 2 | AS | 3 | 50 000 | 2 | |||||||||
2 | 1 | EC | 6 | 100 000 | 1 | |||||||||
3 | 1 | CS | 1 | 50 000 | 3 | |||||||||
3 | 2 | CS | 2 | 60 000 | 3 | |||||||||
3 | 3 | CS | 3 | 80 000 | 3 | |||||||||
4 | 1 | EX | 3 | 150 000 | 1 | |||||||||
FILE 2: Expected | ||||||||||||||
Employee ID | Appointment date_1 | Class_1 | Level_1 | Salary_1 | Appointment date_2 | Class_2 | Level_2 | Salary_2 | Appointment date_3 | Class_3 | Level_3 | Salary_3 | Number of appointments | |
1 | 1 | AS | 2 | 40 000 | 2 | AS | 3 | 55 000 | 2 | |||||
2 | 1 | EC | 6 | 100 000 | 1 | |||||||||
3 | 1 | CS | 1 | 50 000 | 2 | CS | 2 | 60 000 | 3 | CS | 3 | 80 000 | 3 | |
4 | 1 | EX | 1 | 150 000 | 1 | |||||||||
Sorry, please post test data as a datastep in the body of the post. Most people will not open files from the web.
To split columns you can use the split() function and supply a delimiter.
What you want can be done with a combination of a data step and a couple of proc transpose steps, OR, if you download and run the macro at http://www.sascommunity.org/wiki/A_Better_Way_to_Flip_(Transpose)_a_SAS_Dataset , you can do it with the following call to the macro:
%transpose(data=file1, out=file2, by=employee_id, guessingrows=1000, delimiter=_, var=appointment_date classification level salary, copy=number_of_appointments)
OR, if your data isn't already in employee_id order:
%transpose(data=file1, out=file2, by=employee_id, guessingrows=1000, delimiter=_, var=appointment_date classification level salary, copy=number_of_appointments, sort=yes)
HTH,
Art, CEO, AnalystFinder.com
You're welcome! But take another look at my post. I just modified my post to include the parameter to presort your data in case you need it.
Art, CEO, AnalystFinder.com
Why would you want to do this? Except for the calculated column (generally you should not keep calculated data like that) the first table is effectively normalized and a good data store. It will be easy to work with.
The second table is not normalized, not a good data store and not easy to work with. Is it just for presentation?
I just want to be able to extract carreer paths (longitudinal database) with arrays after. Might be easier to work with columns than rows as everone can have a different number of positions over their carreer.
"I just want to be able to extract carreer paths (longitudinal database) with arrays after. Might be easier to work with columns than rows as everone can have a different number of positions over their carreer."
So
proc sql; create table career_path as select * from your_first_table where employee_id = 1
order by appointment_date;
quit;
is not the career path of employee 1?
SAS Forum: Splitting Columns
this message
https://goo.gl/Zp9ntH
https://communities.sas.com/t5/Base-SAS-Programming/Splitting-columns/m-p/348756
HAVE
====
Up to 40 obs WORK.HAVE total obs=7
APPOINTMENT_ NUMBER_
Obs EMPLOYEEID DATE CLASS LEVEL SALARY APPOINTMENTS
1 1 1 AS 2 40000 2
2 1 2 AS 3 50000 2
3 2 1 EC 6 100000 1
4 3 1 CS 1 50000 3
5 3 2 CS 2 60000 3
6 3 3 CS 3 80000 3
7 4 1 EX 3 150000 1
WANT
====
Up to 40 obs WORK.WANT total obs=4
APPOINTMENT
EMPLOYEEID DATE_1 DATE_2 DATE_3 CLASS_1 CLASS_2 CLASS_3 LEVEL_1 LEVEL_2 LEVEL_3 SALARY_1 SALARY_2 SALARY_3
1 1 2 . AS AS 2 3 . 40000 50000 .
2 1 . . EC 6 . . 100000 . .
3 1 2 3 CS CS CS 1 2 3 50000 60000 80000
4 1 . . EX 3 . . 150000 . .
WORKING CODE
============
If uncomfortable or if this is production you can move the DOSUBL out
DOSUBL
proc sql;
select max(grpcnt) into :grpCnt separated by "" from
(select count(employeeid) as grpcnt from have group by employeeID)
;quit;
array cls[&grpcnt] $ Class_1 - Class_&grpCnt;
* _ _ _
_ __ ___ __ _| | _____ __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| | < __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___| \__,_|\__,_|\__\__,_|
;
data have;
input
EmployeeID Appointment_date Class $ Level Salary Number_appointments;
cards4;
1 1 AS 2 40000 2
1 2 AS 3 50000 2
2 1 EC 6 100000 1
3 1 CS 1 50000 3
3 2 CS 2 60000 3
3 3 CS 3 80000 3
4 1 EX 3 150000 1
;;;;
run;quit;
* _ _ _
___ ___ | |_ _| |_(_) ___ _ __
/ __|/ _ \| | | | | __| |/ _ \| '_ \
\__ \ (_) | | |_| | |_| | (_) | | | |
|___/\___/|_|\__,_|\__|_|\___/|_| |_|
;
%symdel rc grpcnt / nowarn;
data want;
* meta data;
if _n_ =0 then do;
%let rc=%sysfunc(dosubl('
proc sql;
select max(grpcnt) into :grpCnt separated by "" from
(select count(employeeid) as grpcnt from have group by employeeID)
;quit;
'));
retain rc &rc;
end;
* if sql fails;
if rc ne 0 then stop;
retain
cnt 0
employeeid
Appointment_date_1-Appointment_date_&grpCnt
Class_1 - Class_&grpCnt
Level_1 - Level_&grpCnt
Salary_1 - Salary_&grpCnt;
;
set have;
by employeeid;
array dte[&grpcnt] Appointment_date_1-Appointment_date_&grpCnt;
array cls[&grpcnt] $ Class_1 - Class_&grpCnt;
array lvl[&grpcnt] Level_1 - Level_&grpCnt;
array pay[&grpcnt] Salary_1 - Salary_&grpCnt;
cnt=cnt+1;
dte[cnt]=Appointment_date;
cls[cnt]=Class;
lvl[cnt]=Level;
pay[cnt]=Salary;
if last.employeeid then do;
output;
call missing(of _all_);
cnt=0;
rc=0;
end;
drop rc cnt Appointment_date Class Level Salary Number_appointments;
run;quit;
SAS Forum: Splitting Columns
formatting problem above
this message
https://goo.gl/Zp9ntH
https://communities.sas.com/t5/Base-SAS-Programming/Splitting-columns/m-p/348756
HAVE
====
Up to 40 obs WORK.HAVE total obs=7
APPOINTMENT_ NUMBER_
Obs EMPLOYEEID DATE CLASS LEVEL SALARY APPOINTMENTS
1 1 1 AS 2 40000 2
2 1 2 AS 3 50000 2
3 2 1 EC 6 100000 1
4 3 1 CS 1 50000 3
5 3 2 CS 2 60000 3
6 3 3 CS 3 80000 3
7 4 1 EX 3 150000 1
WANT
====
Up to 40 obs WORK.WANT total obs=4
APPOINTMENT
EMPLOYEEID DATE_1 DATE_2 DATE_3 CLASS_1 CLASS_2 CLASS_3 LEVEL_1 LEVEL_2 LEVEL_3 SALARY_1 SALARY_2 SALARY_3
1 1 2 . AS AS 2 3 . 40000 50000 .
2 1 . . EC 6 . . 100000 . .
3 1 2 3 CS CS CS 1 2 3 50000 60000 80000
4 1 . . EX 3 . . 150000 . .
WORKING CODE
============
If uncomfortable or if this is production you can move the DOSUBL out
DOSUBL
proc sql;
select max(grpcnt) into :grpCnt separated by "" from
(select count(employeeid) as grpcnt from have group by employeeID)
;quit;
array cls[&grpcnt] $ Class_1 - Class_&grpCnt;
* _ _ _
_ __ ___ __ _| | _____ __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| | < __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___| \__,_|\__,_|\__\__,_|
;
data have;
input
EmployeeID Appointment_date Class $ Level Salary Number_appointments;
cards4;
1 1 AS 2 40000 2
1 2 AS 3 50000 2
2 1 EC 6 100000 1
3 1 CS 1 50000 3
3 2 CS 2 60000 3
3 3 CS 3 80000 3
4 1 EX 3 150000 1
;;;;
run;quit;
* _ _ _
___ ___ | |_ _| |_(_) ___ _ __
/ __|/ _ \| | | | | __| |/ _ \| '_ \
\__ \ (_) | | |_| | |_| | (_) | | | |
|___/\___/|_|\__,_|\__|_|\___/|_| |_|
;
%symdel rc grpcnt / nowarn;
data want;
* meta data;
if _n_ =0 then do;
%let rc=%sysfunc(dosubl('
proc sql;
select max(grpcnt) into :grpCnt separated by "" from
(select count(employeeid) as grpcnt from have group by employeeID)
;quit;
'));
retain rc &rc;
end;
* if sql fails;
if rc ne 0 then stop;
retain
cnt 0
employeeid
Appointment_date_1-Appointment_date_&grpCnt
Class_1 - Class_&grpCnt
Level_1 - Level_&grpCnt
Salary_1 - Salary_&grpCnt;
;
set have;
by employeeid;
array dte[&grpcnt] Appointment_date_1-Appointment_date_&grpCnt;
array cls[&grpcnt] $ Class_1 - Class_&grpCnt;
array lvl[&grpcnt] Level_1 - Level_&grpCnt;
array pay[&grpcnt] Salary_1 - Salary_&grpCnt;
cnt=cnt+1;
dte[cnt]=Appointment_date;
cls[cnt]=Class;
lvl[cnt]=Level;
pay[cnt]=Salary;
if last.employeeid then do;
output;
call missing(of _all_);
cnt=0;
rc=0;
end;
drop rc cnt Appointment_date Class Level Salary Number_appointments;
run;quit;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.