Hi,
i have to write sas code that simulate an sql over partition.
the following code is what i would like to convert in SAS:
select * ,
COUNT(*) OVER (PARTITION BY CUST_ID) AS COUNT_overPart,
AVG(dayDiff) OVER (PARTITION BY CUST_ID) AS meanDayContact_overPart,
SUM(InStore) OVER (PARTITION BY CUST_ID) AS countInStore_overPart,
AVG(case when InStore = 1 then dayDiff else 0 end) OVER (PARTITION BY CUST_ID)end AS meanDay_InStore_overPart
from HAVE
Here is my HAVE table
ID | CUST_ID | DATE | Source | OBJ | ord | prev_date | dayDiff | prev_obj | prev_Source | InStore |
1 | 1 | 03AUG2018 | A | 1 | 03AUG2018 | 0 | A | 0 | ||
2 | 1 | 06AUG2018 | A | 2 | 03AUG2018 | 3 | A | 0 | ||
3 | 1 | 06AUG2018 | Fax | A | 3 | 06AUG2018 | 0 | A | 0 | |
4 | 2 | 06AUG2018 | Tel | E | 1 | 06AUG2018 | 0 | E | Tel | 0 |
5 | 2 | 06AUG2018 | Store | F | 2 | 06AUG2018 | 0 | E | Tel | 1 |
6 | 3 | 06AUG2018 | Tel | B | 1 | 06AUG2018 | 0 | B | Tel | 0 |
7 | 3 | 06AUG2018 | Tel | B | 2 | 06AUG2018 | 0 | B | Tel | 0 |
8 | 4 | 06AUG2018 | Tel | C | 1 | 06AUG2018 | 0 | C | Tel | 0 |
9 | 4 | 06AUG2018 | Store | D | 2 | 06AUG2018 | 0 | C | Tel | 1 |
and i want this table:
ID | CUST_ID | DATE | Source | OBJ | ord | prev_date | dayDiff | prev_obj | prev_Source | InStore | COUNT_overPart | meanDayContact_overPart | countInStore_overPart | meanDay_InStore_overPart |
1 | 1 | 03AUG2018 | A | 1 | 03AUG2018 | 0 | A | 0 | 3 | 1 | 0 | 0 | ||
2 | 1 | 06AUG2018 | A | 2 | 03AUG2018 | 3 | A | 0 | 3 | 1 | 0 | 0 | ||
3 | 1 | 06AUG2018 | Fax | A | 3 | 06AUG2018 | 0 | A | 0 | 3 | 1 | 0 | 0 | |
4 | 2 | 06AUG2018 | Tel | E | 1 | 06AUG2018 | 0 | E | Tel | 0 | 2 | 0 | 1 | 0 |
5 | 2 | 06AUG2018 | Store | F | 2 | 06AUG2018 | 0 | E | Tel | 1 | 2 | 0 | 1 | 0 |
6 | 3 | 06AUG2018 | Tel | B | 1 | 06AUG2018 | 0 | B | Tel | 0 | 2 | 0 | 0 | 0 |
7 | 3 | 06AUG2018 | Tel | B | 2 | 06AUG2018 | 0 | B | Tel | 0 | 2 | 0 | 0 | 0 |
8 | 4 | 06AUG2018 | Tel | C | 1 | 06AUG2018 | 0 | C | Tel | 0 | 2 | 0 | 1 | 0 |
9 | 4 | 06AUG2018 | Store | D | 2 | 06AUG2018 | 0 | C | Tel | 1 | 2 | 0 | 1 | 0 |
@Ksharp i saw a post where you used the hash table (or another solution if he had not sufficient memory) but i couldn't use your code to accomplish my task, can you help me?
Here the post:
Thank you
SAS SQL can do it too . No need Hash Table.
data have;
infile cards expandtabs truncover;
input ID CUST_ID DATE : $20. Source $ OBJ $ ord prev_date :$20. dayDiff prev_obj $ prev_Source $ InStore;
cards;
1 1 03AUG2018 Mail A 1 03AUG2018 0 A Mail 0
2 1 06AUG2018 Mail A 2 03AUG2018 3 A Mail 0
3 1 06AUG2018 Fax A 3 06AUG2018 0 A Mail 0
4 2 06AUG2018 Tel E 1 06AUG2018 0 E Tel 0
5 2 06AUG2018 Store F 2 06AUG2018 0 E Tel 1
6 3 06AUG2018 Tel B 1 06AUG2018 0 B Tel 0
7 3 06AUG2018 Tel B 2 06AUG2018 0 B Tel 0
8 4 06AUG2018 Tel C 1 06AUG2018 0 C Tel 0
9 4 06AUG2018 Store D 2 06AUG2018 0 C Tel 1
;
run;
proc sql;
create table want as
select * ,
COUNT(*) AS COUNT_overPart,
AVG(dayDiff) AS meanDayContact_overPart,
SUM(InStore) AS countInStore_overPart,
AVG(case when InStore = 1 then dayDiff else 0 end) AS meanDay_InStore_overPart
from HAVE
group BY CUST_ID;
quit;
There is no partition by operator in SAS (vote if you want it).
@LinusH's suggestion is the way to go.
You may (or not) speed up the query by deriving the GROUPed BY values separately and joining, and/or by loading the table in memory using SASFILE if it's not too large.
SAS SQL can do it too . No need Hash Table.
data have;
infile cards expandtabs truncover;
input ID CUST_ID DATE : $20. Source $ OBJ $ ord prev_date :$20. dayDiff prev_obj $ prev_Source $ InStore;
cards;
1 1 03AUG2018 Mail A 1 03AUG2018 0 A Mail 0
2 1 06AUG2018 Mail A 2 03AUG2018 3 A Mail 0
3 1 06AUG2018 Fax A 3 06AUG2018 0 A Mail 0
4 2 06AUG2018 Tel E 1 06AUG2018 0 E Tel 0
5 2 06AUG2018 Store F 2 06AUG2018 0 E Tel 1
6 3 06AUG2018 Tel B 1 06AUG2018 0 B Tel 0
7 3 06AUG2018 Tel B 2 06AUG2018 0 B Tel 0
8 4 06AUG2018 Tel C 1 06AUG2018 0 C Tel 0
9 4 06AUG2018 Store D 2 06AUG2018 0 C Tel 1
;
run;
proc sql;
create table want as
select * ,
COUNT(*) AS COUNT_overPart,
AVG(dayDiff) AS meanDayContact_overPart,
SUM(InStore) AS countInStore_overPart,
AVG(case when InStore = 1 then dayDiff else 0 end) AS meanDay_InStore_overPart
from HAVE
group BY CUST_ID;
quit;
Hi @Ksharp
thank you for the reply. It works.
If you have time, can you post a solution using hash table ?
Thank you very much
Why ? It is not right scenario to Hash Table.
But I would like to write some IML code if you want .
Yes, of course.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.