Hi All,
I want to create a table that looks like the below from the following. Can I do this from a Proc Tabulate? Or is there a better way of doing it?
Data Work.Toowoomba_Transfers;
set Data.Dataset1;
Keep WEEK_END_DT Logon NAME ORIGINATING_SKILL TARGET_SKILL_NM TRANSFER_CNT;
Where WEEK_END_DT >= '26Jun2017.'d
and logon_id in ('ADB860' 'ACW710' 'LCM387' 'SGU331' 'CGJ575' 'MMY391' 'JML872' 'CJP922' 'AHN860' 'LSA902' 'TSM178' 'ASO417' 'OCW711' 'RKV507' 'RLP137');
Run;
Data work.ADB860;
Set work.toowoomba_Transfers;
Where Logon_ID = 'ADB860';
RUN;
Proc Sort data=work.adb860;
By WEEK_END_DT ORIGINATING_SKILL TARGET_SKILL_NM;
RUN;
Proc Summary data=work.adb860 sum;
By WEEK_END_DT ORIGINATING_SKILL TARGET_SKILL_NM;
VAR TRANSFER_CNT;
Output out=work.adb860_SUM (DROP=_FREQ_ _TYPE_) SUM=;
RUN;
Proc Tabulate data=work.adb860_sum format=comma9.0;
class week_end_dt ORIGINATING_SKILL target_skill_nm;
VAR TRANSFER_CNT;
Table
target_skill_nm = ""
,week_end_dt = " " *Transfer_cnt=""
/ box= "Transfers"
;
run;
data looks like this
WEEK_END_DT | ORIGINATING_SKILL | TARGET_SKILL_NM | TRANSFER_CNT |
2017-07-16 | ABC_DSC | ABC_D13 | 7 |
2017-07-16 | ABC_DSC | ABC_E13 | 1 |
2017-07-16 | ABC_EMS | ABC_D13 | 3 |
2017-07-16 | ABC_EMS | ABC_E13 | 3 |
2017-07-16 | ABC_EMS | ABC_R16 | 1 |
2017-07-16 | ABC_FAO | ABC_E13 | 1 |
2017-07-16 | ABC_FAO | ABC_F24 | 1 |
2017-07-16 | ABC_FAO | ABC_F43 | 1 |
2017-07-16 | ABC_FAO | ABC_F67 | 2 |
2017-07-16 | ABC_FAO | ABC_R18 | 1 |
2017-07-16 | ABC_FAO | ABC_Y13 | 1 |
2017-07-16 | ABC_PAZ | ABC_E13 | 1 |
2017-07-16 | ABC_RET | ABC_R18 | 1 |
2017-07-16 | ABC_TSE | ABC_E13 | 1 |
2017-07-16 | ABC_UPD | ABC_D13 | 1 |
2017-07-16 | ABC_UPE | ABC_E13 | 1 |
2017-07-16 | ABC_YAS | ABC_ALY | 1 |
2017-07-16 | ABC_YAS | ABC_E13 | 1 |
2017-07-16 | ABC_YAS | ABC_Y11 | 1 |
2017-07-16 | ABC_YAS | ABC_Y13 | 3 |
2017-07-23 | ABC_DRU | ABC_E13 | 1 |
2017-07-23 | ABC_DSC | ABC_D13 | 2 |
2017-07-23 | ABC_EMS | ABC_E13 | 2 |
2017-07-23 | ABC_FAO | ABC_D13 | 1 |
2017-07-23 | ABC_FAO | ABC_F24 | 1 |
2017-07-23 | ABC_TSR | ABC_R16 | 1 |
2017-07-23 | ABC_UPD | ABC_D13 | 2 |
2017-07-23 | ABC_UPD | ABC_E13 | 1 |
2017-07-23 | ABC_UPE | ABC_E13 | 1 |
2017-07-23 | ABC_YAS | ABC_E13 | 1 |
2017-07-23 | ABC_YAS | ABC_Y13 | 2 |
2017-07-30 | ABC_CID | ABC_CID | 1 |
2017-07-30 | ABC_TSP | ABC_TSP | 1 |
2017-08-06 | ABC_DSC | ABC_D13 | 1 |
2017-08-06 | ABC_TSP | ABC_TSP | 1 |
2017-08-20 | ABC_CID | ABC_CID | 1 |
2017-08-20 | ABC_CID | ABC_R16 | 1 |
2017-08-20 | ABC_FAO | ABC_EAF | 1 |
2017-08-20 | ABC_RET | ABC_RET | 1 |
final output to look like this.
ORIGINATING_SKILL | TARGET_SKILL_NM | 2017-07-16 | 2017-07-23 | 2017-07-30 | 2017-08-06 | 2017-08-20 |
ABC_DSC | ABC_D13 | 7 | 2 | 0 | 1 | 0 |
ABC_DSC | ABC_E13 | 1 | 0 | 0 | 0 | 0 |
ABC_EMS | ABC_D13 | 3 | 0 | 0 | 0 | 0 |
ABC_EMS | ABC_E13 | 3 | 2 | 0 | 0 | 0 |
ABC_EMS | ABC_R16 | 1 | 0 | 0 | 0 | 0 |
ABC_FAO | ABC_E13 | 1 | 0 | 0 | 0 | 0 |
ABC_FAO | ABC_F24 | 1 | 1 | 0 | 0 | 0 |
ABC_FAO | ABC_F43 | 1 | 0 | 0 | 0 | 0 |
ABC_FAO | ABC_F67 | 2 | 0 | 0 | 0 | 0 |
ABC_FAO | ABC_R18 | 1 | 0 | 0 | 0 | 0 |
Yes, you can likely simplify this code.
Some ideas:
1. Combine the data steps to one. If it's only filtering and keeping variables (WHERE/KEEP) you can do those as a data set option rather than as a separate step. Why does the IN have all int he first data step and then a subset in the second, are you attempting to automate this for many different values?
2. You don't need a SORT if you don't have a BY statement and I don't see a BY statement in your procs.
3. You may be able to combine the summary and tabulate into one step but I'm not 100% sure of that without trying it, but I'd consider it worth an attempt at least.
Note the sum in the proc tabulate and the reference to your original dataset not the sum data set. I doubt this is going to work as is, but hopefully it gives you an idea.
Proc Tabulate data=work.toowomba_transfers format=comma9.0;
where week_end_dt>= '26Jun017'd and login_id in ('ADB860');
class week_end_dt ORIGINATING_SKILL target_skill_nm;
VAR TRANSFER_CNT;
Table
target_skill_nm = ""
,week_end_dt = " " *Transfer_cnt="" * sum
/ box= "Transfers"
;
run;
Yes, you can likely simplify this code.
Some ideas:
1. Combine the data steps to one. If it's only filtering and keeping variables (WHERE/KEEP) you can do those as a data set option rather than as a separate step. Why does the IN have all int he first data step and then a subset in the second, are you attempting to automate this for many different values?
2. You don't need a SORT if you don't have a BY statement and I don't see a BY statement in your procs.
3. You may be able to combine the summary and tabulate into one step but I'm not 100% sure of that without trying it, but I'd consider it worth an attempt at least.
Note the sum in the proc tabulate and the reference to your original dataset not the sum data set. I doubt this is going to work as is, but hopefully it gives you an idea.
Proc Tabulate data=work.toowomba_transfers format=comma9.0;
where week_end_dt>= '26Jun017'd and login_id in ('ADB860');
class week_end_dt ORIGINATING_SKILL target_skill_nm;
VAR TRANSFER_CNT;
Table
target_skill_nm = ""
,week_end_dt = " " *Transfer_cnt="" * sum
/ box= "Transfers"
;
run;
Thanks Reeza,
I used the following:
Data Work.Toowoomba_Transfers_test;
set work.data;
Keep WEEK_END_DT logon_id EMPLOYEE_NAME ORIGINATING_SKILL target_skill TRANSFER_CNT;
Where WEEK_END_DT >= '26Jun2017.'d
and logon_id in ('ADB860' 'ACW710' 'LCM387' 'SGU331' 'CGJ575' 'MMY391' 'JML872' 'CJP922' 'AHN860' 'LSA902' 'TSM178' 'ASO417' 'OCW711' 'RKV507' 'RLP137');
Run;
Proc Tabulate data=Work.Toowoomba_Transfers_test format=comma9.0;
where week_end_dt>= '26Jun017'd and logon_id in ('ADB860');
class week_end_dt ORIGINATING_SKILL target_skill;
VAR TRANSFER_CNT;
Table
ORIGINATING_SKILL = "" * target_skill = ""
,week_end_dt = " " *Transfer_cnt="" * sum
/ box= "Transfers"
;
run;
abnd it produced this - Can I get rid of the 'Sum' from displaying?
Transfers | 16/07/2017 | 23/07/2017 | 30/07/2017 | 6/08/2017 | 20/08/2017 | |
Sum | Sum | Sum | Sum | Sum | ||
CLK_CID | CLK_CID | . | . | 1 | . | 1 |
CLK_R16 | . | . | . | . | 1 | |
CLK_DRU | CLK_E13 | . | 1 | . | . | . |
CLK_DSC | CLK_D13 | 7 | 2 | . | 1 | . |
CLK_E13 | 1 | . | . | . | . | |
CLK_EMS | CLK_D13 | 3 | . | . | . | . |
CLK_E13 | 3 | 2 | . | . | . | |
CLK_R16 | 1 | . | . | . | . | |
CLK_FAO | CLK_D13 | . | 1 | . | . | . |
CLK_E13 | 1 | . | . | . | . | |
CLK_EAF | . | . | . | . | 1 | |
CLK_F24 | 1 | 1 | . | . | . | |
CLK_F43 | 1 | . | . | . | . | |
CLK_F67 | 2 | . | . | . | . | |
CLK_R18 | 1 | . | . | . | . | |
CLK_Y13 | 1 | . | . | . | . | |
CLK_PAZ | CLK_E13 | 1 | . | . | . | . |
CLK_RET | CLK_R18 | 1 | . | . | . | . |
CLK_RET | . | . | . | . | 1 | |
CLK_TSE | CLK_E13 | 1 | . | . | . | . |
CLK_TSP | CLK_TSP | . | . | 1 | 1 | . |
CLK_TSR | CLK_R16 | . | 1 | . | . | . |
CLK_UPD | CLK_D13 | 1 | 2 | . | . | . |
CLK_E13 | . | 1 | . | . | . | |
CLK_UPE | CLK_E13 | 1 | 1 | . | . | . |
CLK_YAS | CLK_ALY | 1 | . | . | . | . |
CLK_E13 | 1 | 1 | . | . | . | |
CLK_Y11 | 1 | . | . | . | . | |
CLK_Y13 | 3 | 2 | . | . | . |
Cheers
Dean
Yes, use the same syntax you did for the other variables (sum=' ').
You may also want to look into the MISSTEXT option to get the 0 rather than missing.
And your first data step is unnecessary. The WHERE in PROC TABULATE is the same, so at best it's redundant - you should remove one or the other.
Hi Reeza,
So the Proc Tabulate doesn't change the original dataset?
Can I put an OUTPUT OUT= in a Proc Tabulate?
Cheers
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.