Hi,
i have to write sas code that simulate an sql over partition.
the following code is what i would like to convert in SAS:
select * ,
COUNT(*) OVER (PARTITION BY CUST_ID) AS COUNT_overPart,
AVG(dayDiff) OVER (PARTITION BY CUST_ID) AS meanDayContact_overPart,
SUM(InStore) OVER (PARTITION BY CUST_ID) AS countInStore_overPart,
AVG(case when InStore = 1 then dayDiff else 0 end) OVER (PARTITION BY CUST_ID)end AS meanDay_InStore_overPart
from HAVE
Here is my HAVE table
ID | CUST_ID | DATE | Source | OBJ | ord | prev_date | dayDiff | prev_obj | prev_Source | InStore |
1 | 1 | 03AUG2018 | A | 1 | 03AUG2018 | 0 | A | 0 | ||
2 | 1 | 06AUG2018 | A | 2 | 03AUG2018 | 3 | A | 0 | ||
3 | 1 | 06AUG2018 | Fax | A | 3 | 06AUG2018 | 0 | A | 0 | |
4 | 2 | 06AUG2018 | Tel | E | 1 | 06AUG2018 | 0 | E | Tel | 0 |
5 | 2 | 06AUG2018 | Store | F | 2 | 06AUG2018 | 0 | E | Tel | 1 |
6 | 3 | 06AUG2018 | Tel | B | 1 | 06AUG2018 | 0 | B | Tel | 0 |
7 | 3 | 06AUG2018 | Tel | B | 2 | 06AUG2018 | 0 | B | Tel | 0 |
8 | 4 | 06AUG2018 | Tel | C | 1 | 06AUG2018 | 0 | C | Tel | 0 |
9 | 4 | 06AUG2018 | Store | D | 2 | 06AUG2018 | 0 | C | Tel | 1 |
and i want this table:
ID | CUST_ID | DATE | Source | OBJ | ord | prev_date | dayDiff | prev_obj | prev_Source | InStore | COUNT_overPart | meanDayContact_overPart | countInStore_overPart | meanDay_InStore_overPart |
1 | 1 | 03AUG2018 | A | 1 | 03AUG2018 | 0 | A | 0 | 3 | 1 | 0 | 0 | ||
2 | 1 | 06AUG2018 | A | 2 | 03AUG2018 | 3 | A | 0 | 3 | 1 | 0 | 0 | ||
3 | 1 | 06AUG2018 | Fax | A | 3 | 06AUG2018 | 0 | A | 0 | 3 | 1 | 0 | 0 | |
4 | 2 | 06AUG2018 | Tel | E | 1 | 06AUG2018 | 0 | E | Tel | 0 | 2 | 0 | 1 | 0 |
5 | 2 | 06AUG2018 | Store | F | 2 | 06AUG2018 | 0 | E | Tel | 1 | 2 | 0 | 1 | 0 |
6 | 3 | 06AUG2018 | Tel | B | 1 | 06AUG2018 | 0 | B | Tel | 0 | 2 | 0 | 0 | 0 |
7 | 3 | 06AUG2018 | Tel | B | 2 | 06AUG2018 | 0 | B | Tel | 0 | 2 | 0 | 0 | 0 |
8 | 4 | 06AUG2018 | Tel | C | 1 | 06AUG2018 | 0 | C | Tel | 0 | 2 | 0 | 1 | 0 |
9 | 4 | 06AUG2018 | Store | D | 2 | 06AUG2018 | 0 | C | Tel | 1 | 2 | 0 | 1 | 0 |
@Ksharp i saw a post where you used the hash table (or another solution if he had not sufficient memory) but i couldn't use your code to accomplish my task, can you help me?
Here the post:
Thank you
SAS SQL can do it too . No need Hash Table.
data have;
infile cards expandtabs truncover;
input ID CUST_ID DATE : $20. Source $ OBJ $ ord prev_date :$20. dayDiff prev_obj $ prev_Source $ InStore;
cards;
1 1 03AUG2018 Mail A 1 03AUG2018 0 A Mail 0
2 1 06AUG2018 Mail A 2 03AUG2018 3 A Mail 0
3 1 06AUG2018 Fax A 3 06AUG2018 0 A Mail 0
4 2 06AUG2018 Tel E 1 06AUG2018 0 E Tel 0
5 2 06AUG2018 Store F 2 06AUG2018 0 E Tel 1
6 3 06AUG2018 Tel B 1 06AUG2018 0 B Tel 0
7 3 06AUG2018 Tel B 2 06AUG2018 0 B Tel 0
8 4 06AUG2018 Tel C 1 06AUG2018 0 C Tel 0
9 4 06AUG2018 Store D 2 06AUG2018 0 C Tel 1
;
run;
proc sql;
create table want as
select * ,
COUNT(*) AS COUNT_overPart,
AVG(dayDiff) AS meanDayContact_overPart,
SUM(InStore) AS countInStore_overPart,
AVG(case when InStore = 1 then dayDiff else 0 end) AS meanDay_InStore_overPart
from HAVE
group BY CUST_ID;
quit;
There is no partition by operator in SAS (vote if you want it).
@LinusH's suggestion is the way to go.
You may (or not) speed up the query by deriving the GROUPed BY values separately and joining, and/or by loading the table in memory using SASFILE if it's not too large.
SAS SQL can do it too . No need Hash Table.
data have;
infile cards expandtabs truncover;
input ID CUST_ID DATE : $20. Source $ OBJ $ ord prev_date :$20. dayDiff prev_obj $ prev_Source $ InStore;
cards;
1 1 03AUG2018 Mail A 1 03AUG2018 0 A Mail 0
2 1 06AUG2018 Mail A 2 03AUG2018 3 A Mail 0
3 1 06AUG2018 Fax A 3 06AUG2018 0 A Mail 0
4 2 06AUG2018 Tel E 1 06AUG2018 0 E Tel 0
5 2 06AUG2018 Store F 2 06AUG2018 0 E Tel 1
6 3 06AUG2018 Tel B 1 06AUG2018 0 B Tel 0
7 3 06AUG2018 Tel B 2 06AUG2018 0 B Tel 0
8 4 06AUG2018 Tel C 1 06AUG2018 0 C Tel 0
9 4 06AUG2018 Store D 2 06AUG2018 0 C Tel 1
;
run;
proc sql;
create table want as
select * ,
COUNT(*) AS COUNT_overPart,
AVG(dayDiff) AS meanDayContact_overPart,
SUM(InStore) AS countInStore_overPart,
AVG(case when InStore = 1 then dayDiff else 0 end) AS meanDay_InStore_overPart
from HAVE
group BY CUST_ID;
quit;
Hi @Ksharp
thank you for the reply. It works.
If you have time, can you post a solution using hash table ?
Thank you very much
Why ? It is not right scenario to Hash Table.
But I would like to write some IML code if you want .
Yes, of course.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.