I have 700k distinct group-by combinations.
If I use the inputs={} parameter, the action correctly generates the _frequency_ and _position_ columns, but _cumfreq_ is not what I want. I would like to accumulate the frequency within a contract (numero_operacion). My first groupByInfo run below is this attempt; it at least runs quickly and gives correct frequency and position counts.
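To make the target concrete, here is a minimal sketch of the cumulative count I am after, written as a plain DATA step rather than with groupByInfo. The table work.counts, its toy rows (including the JourneyName values) and the column cumfreq_in_contract are made up for illustration; only Numero_operacion, JourneyName and the frequency column correspond to my data and the action's output.

```sas
/* Hypothetical toy data: one row per (contract, journey) with its frequency. */
data work.counts;
   input Numero_operacion :$12. JourneyName :$10. _Frequency_;
   datalines;
01RN27000014 Welcome 3
01RN27000014 Renewal 2
01RN40000930 Welcome 1
01RN40000930 Renewal 4
;
run;

/* Rows are already ordered by contract, so BY-group processing works as is. */
data work.want;
   set work.counts;
   by Numero_operacion;
   /* Restart the running total at the first row of each contract. */
   if first.Numero_operacion then cumfreq_in_contract = 0;
   /* Sum statement: the accumulator is retained across rows automatically. */
   cumfreq_in_contract + _Frequency_;
run;

proc print data=work.want noobs;
run;
```

With these four rows the running total is 3 and 5 for the first contract, then restarts at 1 and reaches 5 for the second. That per-contract restart is the behaviour I would like _cumfreq_ to have.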
When I instead use the groupby={} option inside the table={} parameter, the action takes an eternity to finish.
The table is not big enough to justify that run time, so I am probably missing something.
I don't know how to use and combine vars{}, inputs{} and groupby{} wisely, and I am lost when it comes to fine-tuning with algorithm2, groupbylimit, groupbyorder, ...
Can someone help out and provide a more complex example? The examples on the SAS site should cover more sophisticated cases.
By the way, I know how to resolve my problem with alternative approaches.
But I want to truly understand the groupByInfo action.
Thanks a lot.
proc cas;
   session mysession;
   simple.groupByInfo / /* 1 */
      includeDuplicates=true,
      minFrequency=1,
      generatedcolumns={'frequency', 'position', 'cumfreq'},
      groupByLimit=100M,
      nworkerthreads=8,
      noVars=true,
      algorithm2=true,
      journaltrace=true,
      inputs={'Numero_operacion', 'JourneyName'},
      casOut={name="testa_dup", replace=true, CASLIB="mkt"},
      table={
         /* vars={'Numero_operacion', 'JourneyName'}, groupBy={'_date'}, */
         groupByMode="redistribute", orderBy='fecha_envio',
         /* orderBy="fecha_envio", */
         name="FUNNEL_REN", CASLIB="mkt",
         computedVars={name="_date"},
         computedVarsProgram="_date=put(datepart(fecha_envio), monyy.);"
      };
run;
proc cas;
   table.fetch /
      format=True, maxrows=100,
      /* fetchVars={ */
      /* "_score_", 'Numero_operacion', 'JourneyName'}, */
      table={name="testa_dup", caslib="mkt", where="Numero_operacion in ('01RN40000930' '01RN27000014')"};
run;
proc cas;
   session mysession;
   simple.groupByInfo / /* 1 */
      includeDuplicates=true,
      minFrequency=1,
      generatedcolumns={'frequency', 'position', 'cumfreq'},
      groupByLimit=100M,
      details=true,
      nworkerthreads=8,
      noVars=true,
      algorithm2=true,
      journaltrace=true,
      inputs={'Numero_operacion', 'JourneyName'},
      casOut={name="testa_dup", replace=true, CASLIB="mkt"},
      table={
         groupby={'Numero_operacion', 'JourneyName'}, groupByMode="redistribute",
         /* orderBy='fecha_envio', */
         /* orderBy="fecha_envio", */
         name="FUNNEL_RE1", CASLIB="mkt",
         computedVars={name="_date"},
         computedVarsProgram="_date=put(datepart(fecha_envio), monyy.);"
      };
run;