BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
gabras
Pyrite | Level 9

Hi,

 

i have to write sas code that simulate an sql over partition.

 

the following code is what i would like to convert in SAS:

 

select * ,
COUNT(*) OVER (PARTITION BY CUST_ID) AS COUNT_overPart,
AVG(dayDiff) OVER (PARTITION BY CUST_ID) AS meanDayContact_overPart,
SUM(InStore) OVER (PARTITION BY CUST_ID) AS countInStore_overPart,
AVG(case when InStore = 1 then dayDiff else 0 end) OVER (PARTITION BY CUST_ID)end AS meanDay_InStore_overPart
from HAVE

 

Here is my HAVE table

 

IDCUST_IDDATESourceOBJordprev_datedayDiffprev_objprev_SourceInStore
1103AUG2018MailA103AUG20180AMail0
2106AUG2018MailA203AUG20183AMail0
3106AUG2018FaxA306AUG20180AMail0
4206AUG2018TelE106AUG20180ETel0
5206AUG2018StoreF206AUG20180ETel1
6306AUG2018TelB106AUG20180BTel0
7306AUG2018TelB206AUG20180BTel0
8406AUG2018TelC106AUG20180CTel0
9406AUG2018StoreD206AUG20180CTel1

 

and i want this table:

 

IDCUST_IDDATESourceOBJordprev_datedayDiffprev_objprev_SourceInStoreCOUNT_overPartmeanDayContact_overPartcountInStore_overPartmeanDay_InStore_overPart
1103AUG2018MailA103AUG20180AMail03100
2106AUG2018MailA203AUG20183AMail03100
3106AUG2018FaxA306AUG20180AMail03100
4206AUG2018TelE106AUG20180ETel02010
5206AUG2018StoreF206AUG20180ETel12010
6306AUG2018TelB106AUG20180BTel02000
7306AUG2018TelB206AUG20180BTel02000
8406AUG2018TelC106AUG20180CTel02010
9406AUG2018StoreD206AUG20180CTel12010

 

 

@Ksharp i saw a post where you used the hash table (or another solution if he had not sufficient memory) but i couldn't use your code to accomplish my task, can you help me?

Here the post:

https://communities.sas.com/t5/General-SAS-Programming/Partition-by-equivalent-in-SAS-base-or-proc-s...

 

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

SAS SQL can do it too . No need Hash Table.

 

data have;
infile cards expandtabs truncover;
input ID	CUST_ID	DATE : $20.	Source	$ OBJ	$ ord 	prev_date :$20.	dayDiff	prev_obj $	prev_Source $	InStore;
cards;
1	1	03AUG2018	Mail	A	1	03AUG2018	0	A	Mail	0
2	1	06AUG2018	Mail	A	2	03AUG2018	3	A	Mail	0
3	1	06AUG2018	Fax	A	3	06AUG2018	0	A	Mail	0
4	2	06AUG2018	Tel	E	1	06AUG2018	0	E	Tel	0
5	2	06AUG2018	Store	F	2	06AUG2018	0	E	Tel	1
6	3	06AUG2018	Tel	B	1	06AUG2018	0	B	Tel	0
7	3	06AUG2018	Tel	B	2	06AUG2018	0	B	Tel	0
8	4	06AUG2018	Tel	C	1	06AUG2018	0	C	Tel	0
9	4	06AUG2018	Store	D	2	06AUG2018	0	C	Tel	1
;
run;

proc sql;
create table want as
select * ,
COUNT(*) AS COUNT_overPart,
AVG(dayDiff)  AS meanDayContact_overPart,
SUM(InStore)  AS countInStore_overPart,
AVG(case when InStore = 1 then dayDiff else 0 end)  AS meanDay_InStore_overPart
from HAVE
group BY CUST_ID;
quit;

View solution in original post

8 REPLIES 8
LinusH
Tourmaline | Level 20
I'm not 100% sure but I think can solved by just specifying cust_id in the GROUP BY clause, then SAS will remerge stats with detail data.
Data never sleeps
gabras
Pyrite | Level 9
Hi @LinusH

I think you are referring to the group by clause of an sql statement.
I tried to use sql statement, calculating the over partition variabile as (select count() from have group by custId) as count_overpart but it takes a lot of time.
ChrisNZ
Tourmaline | Level 20

There is no partition by operator in SAS (vote if you want it).

@LinusH's suggestion is the way to go.

You may (or not) speed up the query by deriving the GROUPed BY values separately and joining, and/or by loading the table in memory using SASFILE if it's not too large.

 

gabras
Pyrite | Level 9
Hi @ChrisNZ
Thank you for your reply.

I hope this is not THE solution, but a workaround.
I think using hash table is a better solution.
@Ksharp posted a very cool way to use it (see at the bottom of my first post) but since i have never used hash table i wasn’t able to re engineering the code he wrote
Ksharp
Super User

SAS SQL can do it too . No need Hash Table.

 

data have;
infile cards expandtabs truncover;
input ID	CUST_ID	DATE : $20.	Source	$ OBJ	$ ord 	prev_date :$20.	dayDiff	prev_obj $	prev_Source $	InStore;
cards;
1	1	03AUG2018	Mail	A	1	03AUG2018	0	A	Mail	0
2	1	06AUG2018	Mail	A	2	03AUG2018	3	A	Mail	0
3	1	06AUG2018	Fax	A	3	06AUG2018	0	A	Mail	0
4	2	06AUG2018	Tel	E	1	06AUG2018	0	E	Tel	0
5	2	06AUG2018	Store	F	2	06AUG2018	0	E	Tel	1
6	3	06AUG2018	Tel	B	1	06AUG2018	0	B	Tel	0
7	3	06AUG2018	Tel	B	2	06AUG2018	0	B	Tel	0
8	4	06AUG2018	Tel	C	1	06AUG2018	0	C	Tel	0
9	4	06AUG2018	Store	D	2	06AUG2018	0	C	Tel	1
;
run;

proc sql;
create table want as
select * ,
COUNT(*) AS COUNT_overPart,
AVG(dayDiff)  AS meanDayContact_overPart,
SUM(InStore)  AS countInStore_overPart,
AVG(case when InStore = 1 then dayDiff else 0 end)  AS meanDay_InStore_overPart
from HAVE
group BY CUST_ID;
quit;
gabras
Pyrite | Level 9

Hi @Ksharp 

thank you for the reply. It works.

 

If you have time, can you post a solution using hash table ? 

 

Thank you very much

Ksharp
Super User

Why ? It is not right scenario to Hash Table.

But I would like to write some IML code if you want .

gabras
Pyrite | Level 9

Yes, of course. 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 18933 views
  • 0 likes
  • 4 in conversation