Hi All,
I want to create a table that looks like the below from the following. Can I do this from a Proc Tabulate? Or is there a better way of doing it?
Data Work.Toowoomba_Transfers;
set Data.Dataset1;
Keep WEEK_END_DT Logon NAME ORIGINATING_SKILL TARGET_SKILL_NM TRANSFER_CNT;
Where WEEK_END_DT >= '26Jun2017.'d
and logon_id in ('ADB860' 'ACW710' 'LCM387' 'SGU331' 'CGJ575' 'MMY391' 'JML872' 'CJP922' 'AHN860' 'LSA902' 'TSM178' 'ASO417' 'OCW711' 'RKV507' 'RLP137');
Run;
Data work.ADB860;
Set work.toowoomba_Transfers;
Where Logon_ID = 'ADB860';
RUN;
Proc Sort data=work.adb860;
By WEEK_END_DT ORIGINATING_SKILL TARGET_SKILL_NM;
RUN;
Proc Summary data=work.adb860 sum;
By WEEK_END_DT ORIGINATING_SKILL TARGET_SKILL_NM;
VAR TRANSFER_CNT;
Output out=work.adb860_SUM (DROP=_FREQ_ _TYPE_) SUM=;
RUN;
Proc Tabulate data=work.adb860_sum format=comma9.0;
class week_end_dt ORIGINATING_SKILL target_skill_nm;
VAR TRANSFER_CNT;
Table
target_skill_nm = ""
,week_end_dt = " " *Transfer_cnt=""
/ box= "Transfers"
;
run;
data looks like this
WEEK_END_DT | ORIGINATING_SKILL | TARGET_SKILL_NM | TRANSFER_CNT |
2017-07-16 | ABC_DSC | ABC_D13 | 7 |
2017-07-16 | ABC_DSC | ABC_E13 | 1 |
2017-07-16 | ABC_EMS | ABC_D13 | 3 |
2017-07-16 | ABC_EMS | ABC_E13 | 3 |
2017-07-16 | ABC_EMS | ABC_R16 | 1 |
2017-07-16 | ABC_FAO | ABC_E13 | 1 |
2017-07-16 | ABC_FAO | ABC_F24 | 1 |
2017-07-16 | ABC_FAO | ABC_F43 | 1 |
2017-07-16 | ABC_FAO | ABC_F67 | 2 |
2017-07-16 | ABC_FAO | ABC_R18 | 1 |
2017-07-16 | ABC_FAO | ABC_Y13 | 1 |
2017-07-16 | ABC_PAZ | ABC_E13 | 1 |
2017-07-16 | ABC_RET | ABC_R18 | 1 |
2017-07-16 | ABC_TSE | ABC_E13 | 1 |
2017-07-16 | ABC_UPD | ABC_D13 | 1 |
2017-07-16 | ABC_UPE | ABC_E13 | 1 |
2017-07-16 | ABC_YAS | ABC_ALY | 1 |
2017-07-16 | ABC_YAS | ABC_E13 | 1 |
2017-07-16 | ABC_YAS | ABC_Y11 | 1 |
2017-07-16 | ABC_YAS | ABC_Y13 | 3 |
2017-07-23 | ABC_DRU | ABC_E13 | 1 |
2017-07-23 | ABC_DSC | ABC_D13 | 2 |
2017-07-23 | ABC_EMS | ABC_E13 | 2 |
2017-07-23 | ABC_FAO | ABC_D13 | 1 |
2017-07-23 | ABC_FAO | ABC_F24 | 1 |
2017-07-23 | ABC_TSR | ABC_R16 | 1 |
2017-07-23 | ABC_UPD | ABC_D13 | 2 |
2017-07-23 | ABC_UPD | ABC_E13 | 1 |
2017-07-23 | ABC_UPE | ABC_E13 | 1 |
2017-07-23 | ABC_YAS | ABC_E13 | 1 |
2017-07-23 | ABC_YAS | ABC_Y13 | 2 |
2017-07-30 | ABC_CID | ABC_CID | 1 |
2017-07-30 | ABC_TSP | ABC_TSP | 1 |
2017-08-06 | ABC_DSC | ABC_D13 | 1 |
2017-08-06 | ABC_TSP | ABC_TSP | 1 |
2017-08-20 | ABC_CID | ABC_CID | 1 |
2017-08-20 | ABC_CID | ABC_R16 | 1 |
2017-08-20 | ABC_FAO | ABC_EAF | 1 |
2017-08-20 | ABC_RET | ABC_RET | 1 |
final output to look like this.
ORIGINATING_SKILL | TARGET_SKILL_NM | 2017-07-16 | 2017-07-23 | 2017-07-30 | 2017-08-06 | 2017-08-20 |
ABC_DSC | ABC_D13 | 7 | 2 | 0 | 1 | 0 |
ABC_DSC | ABC_E13 | 1 | 0 | 0 | 0 | 0 |
ABC_EMS | ABC_D13 | 3 | 0 | 0 | 0 | 0 |
ABC_EMS | ABC_E13 | 3 | 2 | 0 | 0 | 0 |
ABC_EMS | ABC_R16 | 1 | 0 | 0 | 0 | 0 |
ABC_FAO | ABC_E13 | 1 | 0 | 0 | 0 | 0 |
ABC_FAO | ABC_F24 | 1 | 1 | 0 | 0 | 0 |
ABC_FAO | ABC_F43 | 1 | 0 | 0 | 0 | 0 |
ABC_FAO | ABC_F67 | 2 | 0 | 0 | 0 | 0 |
ABC_FAO | ABC_R18 | 1 | 0 | 0 | 0 | 0 |
Yes, you can likely simplify this code.
Some ideas:
1. Combine the data steps to one. If it's only filtering and keeping variables (WHERE/KEEP) you can do those as a data set option rather than as a separate step. Why does the IN have all int he first data step and then a subset in the second, are you attempting to automate this for many different values?
2. You don't need a SORT if you don't have a BY statement and I don't see a BY statement in your procs.
3. You may be able to combine the summary and tabulate into one step but I'm not 100% sure of that without trying it, but I'd consider it worth an attempt at least.
Note the sum in the proc tabulate and the reference to your original dataset not the sum data set. I doubt this is going to work as is, but hopefully it gives you an idea.
Proc Tabulate data=work.toowomba_transfers format=comma9.0;
where week_end_dt>= '26Jun017'd and login_id in ('ADB860');
class week_end_dt ORIGINATING_SKILL target_skill_nm;
VAR TRANSFER_CNT;
Table
target_skill_nm = ""
,week_end_dt = " " *Transfer_cnt="" * sum
/ box= "Transfers"
;
run;
Yes, you can likely simplify this code.
Some ideas:
1. Combine the data steps to one. If it's only filtering and keeping variables (WHERE/KEEP) you can do those as a data set option rather than as a separate step. Why does the IN have all int he first data step and then a subset in the second, are you attempting to automate this for many different values?
2. You don't need a SORT if you don't have a BY statement and I don't see a BY statement in your procs.
3. You may be able to combine the summary and tabulate into one step but I'm not 100% sure of that without trying it, but I'd consider it worth an attempt at least.
Note the sum in the proc tabulate and the reference to your original dataset not the sum data set. I doubt this is going to work as is, but hopefully it gives you an idea.
Proc Tabulate data=work.toowomba_transfers format=comma9.0;
where week_end_dt>= '26Jun017'd and login_id in ('ADB860');
class week_end_dt ORIGINATING_SKILL target_skill_nm;
VAR TRANSFER_CNT;
Table
target_skill_nm = ""
,week_end_dt = " " *Transfer_cnt="" * sum
/ box= "Transfers"
;
run;
Thanks Reeza,
I used the following:
Data Work.Toowoomba_Transfers_test;
set work.data;
Keep WEEK_END_DT logon_id EMPLOYEE_NAME ORIGINATING_SKILL target_skill TRANSFER_CNT;
Where WEEK_END_DT >= '26Jun2017.'d
and logon_id in ('ADB860' 'ACW710' 'LCM387' 'SGU331' 'CGJ575' 'MMY391' 'JML872' 'CJP922' 'AHN860' 'LSA902' 'TSM178' 'ASO417' 'OCW711' 'RKV507' 'RLP137');
Run;
Proc Tabulate data=Work.Toowoomba_Transfers_test format=comma9.0;
where week_end_dt>= '26Jun017'd and logon_id in ('ADB860');
class week_end_dt ORIGINATING_SKILL target_skill;
VAR TRANSFER_CNT;
Table
ORIGINATING_SKILL = "" * target_skill = ""
,week_end_dt = " " *Transfer_cnt="" * sum
/ box= "Transfers"
;
run;
abnd it produced this - Can I get rid of the 'Sum' from displaying?
Transfers | 16/07/2017 | 23/07/2017 | 30/07/2017 | 6/08/2017 | 20/08/2017 | |
Sum | Sum | Sum | Sum | Sum | ||
CLK_CID | CLK_CID | . | . | 1 | . | 1 |
CLK_R16 | . | . | . | . | 1 | |
CLK_DRU | CLK_E13 | . | 1 | . | . | . |
CLK_DSC | CLK_D13 | 7 | 2 | . | 1 | . |
CLK_E13 | 1 | . | . | . | . | |
CLK_EMS | CLK_D13 | 3 | . | . | . | . |
CLK_E13 | 3 | 2 | . | . | . | |
CLK_R16 | 1 | . | . | . | . | |
CLK_FAO | CLK_D13 | . | 1 | . | . | . |
CLK_E13 | 1 | . | . | . | . | |
CLK_EAF | . | . | . | . | 1 | |
CLK_F24 | 1 | 1 | . | . | . | |
CLK_F43 | 1 | . | . | . | . | |
CLK_F67 | 2 | . | . | . | . | |
CLK_R18 | 1 | . | . | . | . | |
CLK_Y13 | 1 | . | . | . | . | |
CLK_PAZ | CLK_E13 | 1 | . | . | . | . |
CLK_RET | CLK_R18 | 1 | . | . | . | . |
CLK_RET | . | . | . | . | 1 | |
CLK_TSE | CLK_E13 | 1 | . | . | . | . |
CLK_TSP | CLK_TSP | . | . | 1 | 1 | . |
CLK_TSR | CLK_R16 | . | 1 | . | . | . |
CLK_UPD | CLK_D13 | 1 | 2 | . | . | . |
CLK_E13 | . | 1 | . | . | . | |
CLK_UPE | CLK_E13 | 1 | 1 | . | . | . |
CLK_YAS | CLK_ALY | 1 | . | . | . | . |
CLK_E13 | 1 | 1 | . | . | . | |
CLK_Y11 | 1 | . | . | . | . | |
CLK_Y13 | 3 | 2 | . | . | . |
Cheers
Dean
Yes, use the same syntax you did for the other variables (sum=' ').
You may also want to look into the MISSTEXT option to get the 0 rather than missing.
And your first data step is unnecessary. The WHERE in PROC TABULATE is the same, so at best it's redundant - you should remove one or the other.
Hi Reeza,
So the Proc Tabulate doesn't change the original dataset?
Can I put an OUTPUT OUT= in a Proc Tabulate?
Cheers
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.