DATA Step, Macro, Functions and more

Splitting columns

Reply
New Contributor
Posts: 4

Splitting columns

[ Edited ]

Hi all,

Is there a simple way to split my columns so that I can get File 2 starting from File 1? Please, see the attachment.

Thank you in advance,

Steve.

FILE 1: Initial              
               
Employee IDAppointment dateClassificationLevelSalaryNumber of appointments         
11AS240 0002         
12AS350 0002         
21EC6100 0001         
31CS150 0003         
32CS260 0003         
33CS380 0003         
41EX3150 0001         
               
FILE 2: Expected             
               
Employee IDAppointment date_1Class_1Level_1Salary_1Appointment date_2Class_2Level_2Salary_2Appointment date_3Class_3Level_3Salary_3Number of appointments 
11AS240 0002AS355 000    2 
21EC6100 000        1 
31CS150 0002CS260 0003CS380 0003 
41EX1150 000        1 
               
               
Super User
Super User
Posts: 7,401

Re: Splitting columns

Sorry, please post test data as a datastep in the body of the post.  Most people will not open files from the web.

To split columns you can use the split() function and supply a delimiter.

PROC Star
Posts: 7,363

Re: Splitting columns

[ Edited ]

What you want can be done with a combination of a data step and a couple of proc transpose steps, OR, if you download and run the macro at http://www.sascommunity.org/wiki/A_Better_Way_to_Flip_(Transpose)_a_SAS_Dataset , you can do it with the following call to the macro:

%transpose(data=file1,
           out=file2, by=employee_id, guessingrows=1000,
           delimiter=_,
           var=appointment_date classification level salary,
           copy=number_of_appointments)

OR, if your data isn't already in employee_id order:

%transpose(data=file1,
           out=file2, by=employee_id, guessingrows=1000,
           delimiter=_,
           var=appointment_date classification level salary,
           copy=number_of_appointments,
           sort=yes)

 

HTH,

Art, CEO, AnalystFinder.com

 

New Contributor
Posts: 4

Re: Splitting columns

Thank you!
PROC Star
Posts: 7,363

Re: Splitting columns

You're welcome! But take another look at my post. I just modified my post to include the parameter to presort your data in case you need it.

 

Art, CEO, AnalystFinder.com

 

Frequent Contributor
Frequent Contributor
Posts: 89

Re: Splitting columns

Why would you want to do this?  Except for the calculated column (generally you should not keep calculated data like that) the first table is effectively normalized and a good data store. It will be easy to work with.

 

The second table is not normalized, not a good data store and not easy to work with.  Is it just for presentation? 

New Contributor
Posts: 4

Re: Splitting columns

I just want to be able to extract carreer paths (longitudinal database) with arrays after. Might be easier to work with columns than rows as everone can have a different number of positions over their carreer. 

Frequent Contributor
Frequent Contributor
Posts: 89

Re: Splitting columns

"I just want to be able to extract carreer paths (longitudinal database) with arrays after. Might be easier to work with columns than rows as everone can have a different number of positions over their carreer."

 

So

 

proc sql;
   create table career_path as 
        select *
        from your_first_table
        where employee_id = 1
order by appointment_date;
quit;

 

is not the career path of employee 1?

Valued Guide
Posts: 505

Re: Splitting columns

SAS Forum: Splitting Columns

this message
https://goo.gl/Zp9ntH
https://communities.sas.com/t5/Base-SAS-Programming/Splitting-columns/m-p/348756

HAVE
====

Up to 40 obs WORK.HAVE total obs=7

APPOINTMENT_ NUMBER_
Obs EMPLOYEEID DATE CLASS LEVEL SALARY APPOINTMENTS

1 1 1 AS 2 40000 2
2 1 2 AS 3 50000 2
3 2 1 EC 6 100000 1
4 3 1 CS 1 50000 3
5 3 2 CS 2 60000 3
6 3 3 CS 3 80000 3
7 4 1 EX 3 150000 1

WANT
====

Up to 40 obs WORK.WANT total obs=4

APPOINTMENT
EMPLOYEEID DATE_1 DATE_2 DATE_3 CLASS_1 CLASS_2 CLASS_3 LEVEL_1 LEVEL_2 LEVEL_3 SALARY_1 SALARY_2 SALARY_3

1 1 2 . AS AS 2 3 . 40000 50000 .
2 1 . . EC 6 . . 100000 . .
3 1 2 3 CS CS CS 1 2 3 50000 60000 80000
4 1 . . EX 3 . . 150000 . .

WORKING CODE
============

If uncomfortable or if this is production you can move the DOSUBL out

DOSUBL
proc sql;
select max(grpcnt) into :grpCnt separated by "" from
(select count(employeeid) as grpcnt from have group by employeeID)
;quit;

array cls[&grpcnt] $ Class_1 - Class_&grpCnt;

* _ _ _
_ __ ___ __ _| | _____ __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| | < __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___| \__,_|\__,_|\__\__,_|

;

data have;
input
EmployeeID Appointment_date Class $ Level Salary Number_appointments;
cards4;
1 1 AS 2 40000 2
1 2 AS 3 50000 2
2 1 EC 6 100000 1
3 1 CS 1 50000 3
3 2 CS 2 60000 3
3 3 CS 3 80000 3
4 1 EX 3 150000 1
;;;;
run;quit;

* _ _ _
___ ___ | |_ _| |_(_) ___ _ __
/ __|/ _ \| | | | | __| |/ _ \| '_ \
\__ \ (_) | | |_| | |_| | (_) | | | |
|___/\___/|_|\__,_|\__|_|\___/|_| |_|

;

%symdel rc grpcnt / nowarn;
data want;

* meta data;
if _n_ =0 then do;
%let rc=%sysfunc(dosubl('
proc sql;
select max(grpcnt) into :grpCnt separated by "" from
(select count(employeeid) as grpcnt from have group by employeeID)
;quit;
'));
retain rc &rc;
end;

* if sql fails;
if rc ne 0 then stop;

retain
cnt 0
employeeid
Appointment_date_1-Appointment_date_&grpCnt
Class_1 - Class_&grpCnt
Level_1 - Level_&grpCnt
Salary_1 - Salary_&grpCnt;
;

set have;

by employeeid;
array dte[&grpcnt] Appointment_date_1-Appointment_date_&grpCnt;
array cls[&grpcnt] $ Class_1 - Class_&grpCnt;
array lvl[&grpcnt] Level_1 - Level_&grpCnt;
array pay[&grpcnt] Salary_1 - Salary_&grpCnt;
cnt=cnt+1;
dte[cnt]=Appointment_date;
cls[cnt]=Class;
lvl[cnt]=Level;
pay[cnt]=Salary;
if last.employeeid then do;
output;
call missing(of _all_);
cnt=0;
rc=0;
end;
drop rc cnt Appointment_date Class Level Salary Number_appointments;

run;quit;






Valued Guide
Posts: 505

Re: Splitting columns

SAS Forum: Splitting Columns

formatting problem above

this message
https://goo.gl/Zp9ntH
https://communities.sas.com/t5/Base-SAS-Programming/Splitting-columns/m-p/348756

HAVE
====

Up to 40 obs WORK.HAVE total obs=7

                     APPOINTMENT_                                   NUMBER_
Obs    EMPLOYEEID        DATE        CLASS    LEVEL    SALARY    APPOINTMENTS

 1          1              1          AS        2       40000          2
 2          1              2          AS        3       50000          2
 3          2              1          EC        6      100000          1
 4          3              1          CS        1       50000          3
 5          3              2          CS        2       60000          3
 6          3              3          CS        3       80000          3
 7          4              1          EX        3      150000          1

WANT
====

Up to 40 obs WORK.WANT total obs=4

               APPOINTMENT
EMPLOYEEID DATE_1 DATE_2 DATE_3 CLASS_1 CLASS_2 CLASS_3 LEVEL_1 LEVEL_2 LEVEL_3 SALARY_1 SALARY_2 SALARY_3

     1        1      2      .     AS      AS               2       3       .      40000    50000        .
     2        1      .      .     EC                       6       .       .     100000        .        .
     3        1      2      3     CS      CS      CS       1       2       3      50000    60000    80000
     4        1      .      .     EX                       3       .       .     150000        .        .

WORKING CODE
============

  If uncomfortable or if this is production you can move the DOSUBL out

  DOSUBL
    proc sql;
      select max(grpcnt) into :grpCnt separated by "" from
      (select count(employeeid) as grpcnt from have group by employeeID)
    ;quit;

    array cls[&grpcnt] $ Class_1  -  Class_&grpCnt;

*                _                  _       _
 _ __ ___   __ _| | _____        __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| |   <  __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___|      \__,_|\__,_|\__\__,_|

;

data have;
input
EmployeeID Appointment_date Class $ Level Salary Number_appointments;
cards4;
1 1 AS 2 40000 2
1 2 AS 3 50000 2
2 1 EC 6 100000 1
3 1 CS 1 50000 3
3 2 CS 2 60000 3
3 3 CS 3 80000 3
4 1 EX 3 150000 1
;;;;
run;quit;

*          _       _   _
 ___  ___ | |_   _| |_(_) ___  _ __
/ __|/ _ \| | | | | __| |/ _ \| '_ \
\__ \ (_) | | |_| | |_| | (_) | | | |
|___/\___/|_|\__,_|\__|_|\___/|_| |_|

;

%symdel rc grpcnt / nowarn;
data want;

  * meta data;
  if _n_ =0 then do;
   %let rc=%sysfunc(dosubl('
       proc sql;
         select max(grpcnt) into :grpCnt separated by "" from
         (select count(employeeid) as grpcnt from have group by employeeID)
       ;quit;
     '));
     retain rc &rc;
  end;

  * if sql fails;
  if rc ne 0 then stop;

  retain
    cnt 0
    employeeid
    Appointment_date_1-Appointment_date_&grpCnt
    Class_1  -  Class_&grpCnt
    Level_1  -  Level_&grpCnt
    Salary_1 -  Salary_&grpCnt;
  ;

  set have;

  by employeeid;
  array dte[&grpcnt]   Appointment_date_1-Appointment_date_&grpCnt;
  array cls[&grpcnt] $ Class_1  -  Class_&grpCnt;
  array lvl[&grpcnt]   Level_1  -  Level_&grpCnt;
  array pay[&grpcnt]   Salary_1 -  Salary_&grpCnt;
  cnt=cnt+1;
  dte[cnt]=Appointment_date;
  cls[cnt]=Class;
  lvl[cnt]=Level;
  pay[cnt]=Salary;
  if last.employeeid then do;
         output;
         call missing(of _all_);
         cnt=0;
         rc=0;
  end;
  drop rc cnt  Appointment_date Class  Level Salary Number_appointments;

run;quit;






New Contributor
Posts: 4

Re: Splitting columns

Thank you very much!
Ask a Question
Discussion stats
  • 10 replies
  • 227 views
  • 0 likes
  • 5 in conversation