Hi all,
I have a email campaign activity table and I want to calculate the top ten links people have clicked on grouped by each individual campaign ID. I cannot simply do a order by function because I think it will take the overall order across the whole table. I need the top ten unique to table a, the top 10 unique to table b, etc...
I would post an example worksheet but I have sensitive data that I am unable to present. I hope I explained my need well enough but feel free to ask for further clarification.
Thanks for any help!
Use PROC FREQ to calculate the counts.
proc freq data=have noprint;
table campaign*link out=link_freq;
run;
proc sort data=link_freq;
by descending count;
run;
data want;
set link_freq;
by campaign;
if first.campaign then count=0;
else count+1;
if count <= 10;
run;
Without being able to see your data, let me suggest you use PROC RANK, and use a BY statement to separate the RANKS by table.
Use PROC FREQ to calculate the counts.
proc freq data=have noprint;
table campaign*link out=link_freq;
run;
proc sort data=link_freq;
by descending count;
run;
data want;
set link_freq;
by campaign;
if first.campaign then count=0;
else count+1;
if count <= 10;
run;
@Reeza has the right idea. There are just a few spots where you need to clean up the code:
proc freq data=have noprint;
table campaign*link / out=link_freq;
run;
proc sort data=link_freq;
by campaign descending count;
run;
data want;
set link_freq;
by campaign;
if first.campaign then cnt=1;
else cnt+1;
if cnt <= 10;
run;
Thanks @Astounding! Without data it's easy to skip steps sometimes 🙂
Hi @Astounding,
This seems to work. However, the last part seems to only bring back the top 10 rows from the link_freq table. I am looking to get the top 10 per campaign_id.
Regards
Can you post the exact code you used?
By Campaign should return results for every campaign.
@Dogo23 wrote:
Hi @Astounding,
This seems to work. However, the last part seems to only bring back the top 10 rows from the link_freq table. I am looking to get the top 10 per campaign_id.
Regards
proc freq
data=tableau.activity_match noprint;
table marketing_history_id*link_name / out=link_freq;
run;
proc sort data=link_freq;
by marketing_history_id descending count;
run;
data want;
set link_freq;
by marketing_history_id;
if marketing_history_id then cnt=1;
else cnt+1;
if cnt <= 10;
run;
It looks correct to me, and it seems like the answer/output is correct if that's the screenshot. The first entry has only 8, so maybe that campaign only had 8 entries for some reason?
You're correct that the first entry only had 8 records. However, there are are 500+ of those campaign_ids (marketing_history_id) that are not coming through.
I was thinking this should have the top ten for each one. If I change cnt from <=10 to say something like 20, it just brings in the next 20 rows from that table instead. It's almost like saying limit to first 10 rows from the table when I really need the top 10 by ID. Hope that makes sense.
@ReezaHere's the data I pulling from in that second step. As you can see, there are more ID's that should be pulled into that final output.
Include the log from the code please, the code appears correct...
1 ;*';*";*/;quit;run;
2 OPTIONS PAGENO=MIN;
3 %LET _CLIENTTASKLABEL='Program';
4 %LET _CLIENTPROJECTPATH='C:\Users\1364503\Desktop\Dataset For Tableau (002).egp';
5 %LET _CLIENTPROJECTNAME='Dataset For Tableau (002).egp';
6 %LET _SASPROGRAMFILE=;
7
8 ODS _ALL_ CLOSE;
9 OPTIONS DEV=ACTIVEX;
10 GOPTIONS XPIXELS=0 YPIXELS=0;
11 FILENAME EGSR TEMP;
12 ODS tagsets.sasreport13(ID=EGSR) FILE=EGSR
13 STYLE=HtmlBlue
14 STYLESHEET=(URL="file:///C:/Program%20Files/SASHome/SASEnterpriseGuide/7.1/Styles/HtmlBlue.css")
15 NOGTITLE
16 NOGFOOTNOTE
17 GPATH=&sasworklocation
18 ENCODING=UTF8
19 options(rolap="on")
20 ;
NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR
21
22 GOPTIONS ACCESSIBLE;
23 proc freq
24 data=tableau.activity_match noprint;
25 table marketing_history_id*link_name / out=link_freq;
26 run;
NOTE: There were 25567356 observations read from the data set TABLEAU.ACTIVITY_MATCH.
NOTE: The data set WORK.LINK_FREQ has 154424 observations and 4 variables.
NOTE: PROCEDURE FREQ used (Total process time):
real time 20.51 seconds
cpu time 20.52 seconds
27
28 proc sort data=link_freq;
29 by marketing_history_id descending count;
30 run;
NOTE: There were 154424 observations read from the data set WORK.LINK_FREQ.
NOTE: The data set WORK.LINK_FREQ has 154424 observations and 4 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.20 seconds
cpu time 0.20 seconds
31
32 data want;
33 set link_freq;
34 by marketing_history_id;
35 if marketing_history_id then cnt=1;
36 else cnt+1;
37 if cnt <= 10;
38 run;
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
35:4
2 The SAS System 10:02 Monday, March 5, 2018
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146825EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146825EN LINK_NAME=MAINCTA COUNT=16 PERCENT=0.0000625798 FIRST.MARKETING_HISTORY_ID=1
LAST.MARKETING_HISTORY_ID=0 cnt=1 _ERROR_=1 _N_=1
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146825EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146825EN LINK_NAME=TRTMYACCCT COUNT=16 PERCENT=0.0000625798 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=0 cnt=2 _ERROR_=1 _N_=2
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146825EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146825EN LINK_NAME=TRTRDMPTS COUNT=16 PERCENT=0.0000625798 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=0 cnt=3 _ERROR_=1 _N_=3
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146825EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146825EN LINK_NAME=TRTRATES COUNT=12 PERCENT=0.0000469348 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=0 cnt=4 _ERROR_=1 _N_=4
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146825EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146825EN LINK_NAME=TRTSPECIAL COUNT=12 PERCENT=0.0000469348 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=0 cnt=5 _ERROR_=1 _N_=5
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146825EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146825EN LINK_NAME=TRTS0105AB COUNT=4 PERCENT=0.0000156449 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=0 cnt=6 _ERROR_=1 _N_=6
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146825EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146825EN LINK_NAME=TRTS0105AM COUNT=4 PERCENT=0.0000156449 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=0 cnt=7 _ERROR_=1 _N_=7
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146825EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146825EN LINK_NAME=TRTS0401AM COUNT=4 PERCENT=0.0000156449 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=1 cnt=8 _ERROR_=1 _N_=8
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146826EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146826EN LINK_NAME=TRTS0106AM COUNT=4 PERCENT=0.0000156449 FIRST.MARKETING_HISTORY_ID=1
LAST.MARKETING_HISTORY_ID=0 cnt=9 _ERROR_=1 _N_=9
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146826EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146826EN
LINK_NAME=LINK 01 - /CONTENT/GP/EN/HOME.HTML?src={(META:EMAIL_TYPE)}_EMAIL_{(META:CAMPAIGNNAME)}_{(TERRITORY_EMAIL_VERSION)}_{(TRANS
LATION_LANG)}_{(META:MAIL_DATE)}_SLOT1PR3_{(MARKETING_HISTORY_ID)}_{(LP_MBR_ACCT)} COUNT=2 PERCENT=7.8224749E-6
FIRST.MARKETING_HISTORY_ID=0 LAST.MARKETING_HISTORY_ID=0 cnt=10 _ERROR_=1 _N_=10
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146826EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146826EN LINK_NAME=MAINCTA COUNT=2 PERCENT=7.8224749E-6 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=0 cnt=11 _ERROR_=1 _N_=11
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146826EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146826EN LINK_NAME=TRTMANAGES COUNT=2 PERCENT=7.8224749E-6 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=0 cnt=12 _ERROR_=1 _N_=12
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146826EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146826EN LINK_NAME=TRTMYACCCT COUNT=2 PERCENT=7.8224749E-6 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=0 cnt=13 _ERROR_=1 _N_=13
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146826EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146826EN LINK_NAME=TRTS0106AB COUNT=2 PERCENT=7.8224749E-6 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=0 cnt=14 _ERROR_=1 _N_=14
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146826EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146826EN LINK_NAME=TRTS0401AB COUNT=2 PERCENT=7.8224749E-6 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=0 cnt=15 _ERROR_=1 _N_=15
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592146826EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592146826EN LINK_NAME=TRTSPECIAL COUNT=2 PERCENT=7.8224749E-6 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=1 cnt=16 _ERROR_=1 _N_=16
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592148156EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592148156EN LINK_NAME=TRTMYACCCT COUNT=28 PERCENT=0.0001095146 FIRST.MARKETING_HISTORY_ID=1
LAST.MARKETING_HISTORY_ID=0 cnt=17 _ERROR_=1 _N_=17
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592148156EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592148156EN LINK_NAME=TRTS0105AB COUNT=12 PERCENT=0.0000469348 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=0 cnt=18 _ERROR_=1 _N_=18
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592148156EN' , at line 35 column 4.
MARKETING_HISTORY_ID=A00487002592148156EN LINK_NAME=TRTS0502AM COUNT=8 PERCENT=0.0000312899 FIRST.MARKETING_HISTORY_ID=0
3 The SAS System 10:02 Monday, March 5, 2018
LAST.MARKETING_HISTORY_ID=0 cnt=19 _ERROR_=1 _N_=19
NOTE: Invalid numeric data, MARKETING_HISTORY_ID='A00487002592148156EN' , at line 35 column 4.
WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.
MARKETING_HISTORY_ID=A00487002592148156EN LINK_NAME=TRTSPECIAL COUNT=8 PERCENT=0.0000312899 FIRST.MARKETING_HISTORY_ID=0
LAST.MARKETING_HISTORY_ID=0 cnt=20 _ERROR_=1 _N_=20
NOTE: There were 154424 observations read from the data set WORK.LINK_FREQ.
NOTE: The data set WORK.WANT has 10 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.07 seconds
39
40 GOPTIONS NOACCESSIBLE;
41 %LET _CLIENTTASKLABEL=;
42 %LET _CLIENTPROJECTPATH=;
43 %LET _CLIENTPROJECTNAME=;
44 %LET _SASPROGRAMFILE=;
45
46 ;*';*";*/;quit;run;
47 ODS _ALL_ CLOSE;
48
49
50 QUIT; RUN;
51
huh...anything in there look like it's flagging something for you?
NOTE: Character values have been converted to numeric values at the places given by: (Line)Column).
35:4
Finding line 35/4
35 if marketing_history_id then cnt=1;
You forgot the first. portion of that code. Fixing that will fix the issue. Check the original code provided for reference.
Hi @Reeza
I didn't have the fist (bold in code below) in the step. I didn't realize that had to be in the code, but that brought in all the other records.
Thanks for your help!
data want;
set link_freq;
by marketing_history_id;
if first.marketing_history_id then cnt=1;
else cnt+1;
if cnt <= 10;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.