BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
gabras
Pyrite | Level 9

Hi,

 

i have to write sas code that simulate an sql over partition.

 

the following code is what i would like to convert in SAS:

 

select * ,
COUNT(*) OVER (PARTITION BY CUST_ID) AS COUNT_overPart,
AVG(dayDiff) OVER (PARTITION BY CUST_ID) AS meanDayContact_overPart,
SUM(InStore) OVER (PARTITION BY CUST_ID) AS countInStore_overPart,
AVG(case when InStore = 1 then dayDiff else 0 end) OVER (PARTITION BY CUST_ID)end AS meanDay_InStore_overPart
from HAVE

 

Here is my HAVE table

 

IDCUST_IDDATESourceOBJordprev_datedayDiffprev_objprev_SourceInStore
1103AUG2018MailA103AUG20180AMail0
2106AUG2018MailA203AUG20183AMail0
3106AUG2018FaxA306AUG20180AMail0
4206AUG2018TelE106AUG20180ETel0
5206AUG2018StoreF206AUG20180ETel1
6306AUG2018TelB106AUG20180BTel0
7306AUG2018TelB206AUG20180BTel0
8406AUG2018TelC106AUG20180CTel0
9406AUG2018StoreD206AUG20180CTel1

 

and i want this table:

 

IDCUST_IDDATESourceOBJordprev_datedayDiffprev_objprev_SourceInStoreCOUNT_overPartmeanDayContact_overPartcountInStore_overPartmeanDay_InStore_overPart
1103AUG2018MailA103AUG20180AMail03100
2106AUG2018MailA203AUG20183AMail03100
3106AUG2018FaxA306AUG20180AMail03100
4206AUG2018TelE106AUG20180ETel02010
5206AUG2018StoreF206AUG20180ETel12010
6306AUG2018TelB106AUG20180BTel02000
7306AUG2018TelB206AUG20180BTel02000
8406AUG2018TelC106AUG20180CTel02010
9406AUG2018StoreD206AUG20180CTel12010

 

 

@Ksharp i saw a post where you used the hash table (or another solution if he had not sufficient memory) but i couldn't use your code to accomplish my task, can you help me?

Here the post:

https://communities.sas.com/t5/General-SAS-Programming/Partition-by-equivalent-in-SAS-base-or-proc-s...

 

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

SAS SQL can do it too . No need Hash Table.

 

data have;
infile cards expandtabs truncover;
input ID	CUST_ID	DATE : $20.	Source	$ OBJ	$ ord 	prev_date :$20.	dayDiff	prev_obj $	prev_Source $	InStore;
cards;
1	1	03AUG2018	Mail	A	1	03AUG2018	0	A	Mail	0
2	1	06AUG2018	Mail	A	2	03AUG2018	3	A	Mail	0
3	1	06AUG2018	Fax	A	3	06AUG2018	0	A	Mail	0
4	2	06AUG2018	Tel	E	1	06AUG2018	0	E	Tel	0
5	2	06AUG2018	Store	F	2	06AUG2018	0	E	Tel	1
6	3	06AUG2018	Tel	B	1	06AUG2018	0	B	Tel	0
7	3	06AUG2018	Tel	B	2	06AUG2018	0	B	Tel	0
8	4	06AUG2018	Tel	C	1	06AUG2018	0	C	Tel	0
9	4	06AUG2018	Store	D	2	06AUG2018	0	C	Tel	1
;
run;

proc sql;
create table want as
select * ,
COUNT(*) AS COUNT_overPart,
AVG(dayDiff)  AS meanDayContact_overPart,
SUM(InStore)  AS countInStore_overPart,
AVG(case when InStore = 1 then dayDiff else 0 end)  AS meanDay_InStore_overPart
from HAVE
group BY CUST_ID;
quit;

View solution in original post

8 REPLIES 8
LinusH
Tourmaline | Level 20
I'm not 100% sure but I think can solved by just specifying cust_id in the GROUP BY clause, then SAS will remerge stats with detail data.
Data never sleeps
gabras
Pyrite | Level 9
Hi @LinusH

I think you are referring to the group by clause of an sql statement.
I tried to use sql statement, calculating the over partition variabile as (select count() from have group by custId) as count_overpart but it takes a lot of time.
ChrisNZ
Tourmaline | Level 20

There is no partition by operator in SAS (vote if you want it).

@LinusH's suggestion is the way to go.

You may (or not) speed up the query by deriving the GROUPed BY values separately and joining, and/or by loading the table in memory using SASFILE if it's not too large.

 

gabras
Pyrite | Level 9
Hi @ChrisNZ
Thank you for your reply.

I hope this is not THE solution, but a workaround.
I think using hash table is a better solution.
@Ksharp posted a very cool way to use it (see at the bottom of my first post) but since i have never used hash table i wasn’t able to re engineering the code he wrote
Ksharp
Super User

SAS SQL can do it too . No need Hash Table.

 

data have;
infile cards expandtabs truncover;
input ID	CUST_ID	DATE : $20.	Source	$ OBJ	$ ord 	prev_date :$20.	dayDiff	prev_obj $	prev_Source $	InStore;
cards;
1	1	03AUG2018	Mail	A	1	03AUG2018	0	A	Mail	0
2	1	06AUG2018	Mail	A	2	03AUG2018	3	A	Mail	0
3	1	06AUG2018	Fax	A	3	06AUG2018	0	A	Mail	0
4	2	06AUG2018	Tel	E	1	06AUG2018	0	E	Tel	0
5	2	06AUG2018	Store	F	2	06AUG2018	0	E	Tel	1
6	3	06AUG2018	Tel	B	1	06AUG2018	0	B	Tel	0
7	3	06AUG2018	Tel	B	2	06AUG2018	0	B	Tel	0
8	4	06AUG2018	Tel	C	1	06AUG2018	0	C	Tel	0
9	4	06AUG2018	Store	D	2	06AUG2018	0	C	Tel	1
;
run;

proc sql;
create table want as
select * ,
COUNT(*) AS COUNT_overPart,
AVG(dayDiff)  AS meanDayContact_overPart,
SUM(InStore)  AS countInStore_overPart,
AVG(case when InStore = 1 then dayDiff else 0 end)  AS meanDay_InStore_overPart
from HAVE
group BY CUST_ID;
quit;
gabras
Pyrite | Level 9

Hi @Ksharp 

thank you for the reply. It works.

 

If you have time, can you post a solution using hash table ? 

 

Thank you very much

Ksharp
Super User

Why ? It is not right scenario to Hash Table.

But I would like to write some IML code if you want .

gabras
Pyrite | Level 9

Yes, of course. 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 18679 views
  • 0 likes
  • 4 in conversation