BookmarkSubscribeRSS Feed
ClemH
Calcite | Level 5

I got a piece of code on SAS that predicts consumer behavior. So far I did 50 samples with 50 logistic regression by hand, but I'd like to automate this process. Steps are as follows:

  • Create a table with all client having value "1"
  • Create a table with all client having value "0"
  • (below code starts here) Start a loop that:
    • Get a sample of 3000 people from client having value "1"
    • Get a sample of 3000 people from client having value "0"
    • Join those two tables
    • Logistic regression which should get as output (ROC value, and Maximum Likelihood Estimates)
  • Get a all probabilities in the same file MODELE_RESULTS

You'll find below a piece of the code. Can you advice me on how to loop this logistic regression 50 times please? So far I can't make it work... I'm beginner in SQL

 

%macro RunReg (DSName, NumVars) ;
%do i=1 %to &NumVars

/* Create a 3000 people sample called TOP_1*/
PROC SURVEYSELECT DATA= TOP_1
    OUT= ALEA_1
    METHOD=SRS
    N=3000;QUIT;


/* Create a 3000 people sample called TOP_0*/
PROC SURVEYSELECT DATA= TOP_0
    OUT= ALEA_0
    METHOD=SRS
    N=3000;QUIT;


/*Append both tables */
PROC SQL;
    CREATE TABLE BOTH_SAMPLES As
    SELECT * FROM TOP_1
      OUTER UNION CORR
    SELECT * FROM TOP_0;QUIT;


/* Logistic regression*/DATA WORK.&DSName noprint
        Outset=PE(rename(x&i=Value));    Model Y = x&I;
    SET WORK.APPEND_TABLE(IN=__ORIG) WORK.BASE_PREDICT_2;    __FLAG=__ORIG;    __DEP=TOP_CREDIT_HABITAT_2017;
    if not __FLAG then TOP_CREDIT_HABITAT_2017=.;RUN;

PROC SQL;
    CREATE VIEW WORK.SORTTempTableSorted AS
        SELECT *
    FROM WORK.TMP0TempTableAddtnlPredictData;QUIT;TITLE;TITLE1 "Résultats de la régression logistique";FOOTNOTE;FOOTNOTE1 "Généré par le Système SAS (&_SASSERVERNAME, &SYSSCPL) le %TRIM(%QSYSFUNC(DATE(), NLDATE20.)) à %TRIM(%SYSFUNC(TIME(), TIMEAMPM12.))";
PROC LOGISTIC DATA=WORK.SORTTempTableSorted
        PLOTS(ONLY)=NONE
    ;    CLASS age_classe    (PARAM=EFFECT) Flag_bq_principale   (PARAM=EFFECT) flag_univers_detenus     (PARAM=EFFECT) csp_1    (PARAM=EFFECT) SGMT_FIDELITE    (PARAM=EFFECT) situ_fam_1   (PARAM=EFFECT);    MODEL TOP_CREDIT_HABITAT_2017 (Event = '1')=top_situ_particuliere top_chgt_csp_6M top_produit_monetaire_bloque top_CREDIT top_chgt_contrat_travail_6M top_credit_CONSO top_credit_HABITAT top_produit_monetaire_dispo top_VM_autres top_Sicav top_produit_epargne_logement top_Predica top_ferm_prod_6M top_ouv_prod_6M top_produit_Assurance top_produit_Cartes top_produit_Credit "moy_surface_financière_6M"n moy_surf_financiere_ecart_6M moy_encours_dav_6M moy_encours_dav_ecart_6M moy_monetaire_dispo_6M moy_monetaire_dispo_ecart_6M moy_emprunts_6M moy_emprunts_ecarts_6M moy_sicav_6M moy_sicav_ecart_6M moy_vm_autres_6M moy_vm_autres_ecart_6M moy_predica_6M moy_predica_ecart_6M moy_bgpi_6M moy_bgpi_ecart_6M moy_epargne_logement_6M moy_epargne_logement_ecart_6M "moy.an_mt_flux_cred_norme_B2"n "moy.an_mt_op_cred_ep_a_terme"n "moy.an_mt_op_debit_ep_a_terme"n "moy.an_mt_ope_credit_depot"n "moy.an_mt_ope_credit_ep_a_vue"n "moy.an_mt_ope_debit_depot"n "moy.an_mt_ope_debit_ep_a_vue"n "moy.an_mt_pmts_carte_etr"n "moy.an_mt_remise_chq"n "moy.an_mt_paie_carte"n "moy.an_mt_paie_chq"n "moy.an_nb_paie_carte"n "moy.an_nb_paie_chq"n "moy.an_mt_ret_carte_Aut_bq"n "moy.an_mt_ret_carte_CRCA"n "moy.an_mt_ret_carte_etr"n "moy.an_nb_flux_cred_normeB2"n "moy.an_nb_ope_credit_ep_a_terme"n "moy.an_nb_ope_debit_ep_a_terme"n "moy.an_nb_ope_credit_depot"n "moy.an_nb_ope_credit_ep_a_vue"n "moy.an_nb_ope_debit_depot"n "moy.an_nb_ope_debit_ep_a_vue"n "moy.an_nb_pmts_carte_etr"n "moy.an_nb_remise_chq"n "moy.an_nb_ret_carte_Aut_bq"n "moy.an_nb_ret_carte_CRCA"n "moy.an_nb_ret_carte_etr"n "moy.an_nb_ret_carte"n "moy.an_mt_factu_ttc"n "moy.an_mt_reduc_ttc"n "moy.an_mt_rist_ttc"n "moy.an_mt_mvt_domicilie_mktg"n "moy.an_nb_mvt_M_domicilie_mktg"n top_produit_Epargne top_ouverture_reclam age_classe Flag_bq_principale flag_univers_detenus csp_1 SGMT_FIDELITE situ_fam_1     /        SELECTION=STEPWISE
        SLE=0.05        SLS=0.05        INCLUDE=0        LINK=LOGIT
        OUTROC=_PROB_
        ALPHA=95        EXPEST
        PARMLABEL
        CORRB
        NOPRINT
    ;    OUTPUT OUT=WORK.PREDLogRegPredictions(LABEL="Statistiques et prédictions de régression logistique pour WORK.APPEND_TABLE" WHERE=(NOT __FLAG))        PREDPROBS=INDIVIDUAL;RUN;QUIT;
%end;
%mend;DATA WORK.PREDLogRegPredictions; 
    set WORK.PREDLogRegPredictions; 
    TOP_CREDIT_HABITAT_2017=__DEP; 
    _FROM_=__DEP;
    DROP __DEP; 
    DROP __FLAG;RUN ;QUIT ;
5 REPLIES 5
Rick_SAS
SAS Super FREQ

You can restructure your code to make it more efficient by replacing the macro loop with BY-group processing. See 

- Simulation in SAS: The slow way or the BY way

- An easy way to run thousands of regressions in SAS

 

ClemH
Calcite | Level 5

I got a piece of code on SAS that predicts consumer behavior. So far I did 50 samples with 50 logistic regression by hand, but I'd like to automate this process. Steps are as follows:

  • Create a table with all client having value "1"
  • Create a table with all client having value "0"
  • (below code starts here) Start a loop that:
    • Get a sample of 3000 people from client having value "1"
    • Get a sample of 3000 people from client having value "0"
    • Join those two tables
    • Logistic regression which should get as output (ROC value, and Maximum Likelihood Estimates)
  • Get a all probabilities in the same file MODELE_RESULTS

You'll find below a piece of the code. Can you advice me on how to loop this logistic regression 50 times please? So far I can't make it work... I'm beginner in SQL

 

%macro RunReg (DSName, NumVars) ;
%do i=1 %to &NumVars

/* Create a 3000 people sample called TOP_1*/
PROC SURVEYSELECT DATA= TOP_1
    OUT= ALEA_1
    METHOD=SRS
    N=3000;QUIT;


/* Create a 3000 people sample called TOP_0*/
PROC SURVEYSELECT DATA= TOP_0
    OUT= ALEA_0
    METHOD=SRS
    N=3000;QUIT;


/*Append both tables */
PROC SQL;
    CREATE TABLE BOTH_SAMPLES As
    SELECT * FROM TOP_1
      OUTER UNION CORR
    SELECT * FROM TOP_0;QUIT;


/* Logistic regression*/DATA WORK.&DSName noprint
        Outset=PE(rename(x&i=Value));    Model Y = x&I;
    SET WORK.APPEND_TABLE(IN=__ORIG) WORK.BASE_PREDICT_2;    __FLAG=__ORIG;    __DEP=TOP_CREDIT_HABITAT_2017;
    if not __FLAG then TOP_CREDIT_HABITAT_2017=.;RUN;

PROC SQL;
    CREATE VIEW WORK.SORTTempTableSorted AS
        SELECT *
    FROM WORK.TMP0TempTableAddtnlPredictData;QUIT;TITLE;TITLE1 "Résultats de la régression logistique";FOOTNOTE;FOOTNOTE1 "Généré par le Système SAS (&_SASSERVERNAME, &SYSSCPL) le %TRIM(%QSYSFUNC(DATE(), NLDATE20.)) à %TRIM(%SYSFUNC(TIME(), TIMEAMPM12.))";
PROC LOGISTIC DATA=WORK.SORTTempTableSorted
        PLOTS(ONLY)=NONE
    ;    CLASS age_classe    (PARAM=EFFECT) Flag_bq_principale   (PARAM=EFFECT) flag_univers_detenus     (PARAM=EFFECT) csp_1    (PARAM=EFFECT) SGMT_FIDELITE    (PARAM=EFFECT) situ_fam_1   (PARAM=EFFECT);    MODEL TOP_CREDIT_HABITAT_2017 (Event = '1')=top_situ_particuliere top_chgt_csp_6M top_produit_monetaire_bloque top_CREDIT top_chgt_contrat_travail_6M top_credit_CONSO top_credit_HABITAT top_produit_monetaire_dispo top_VM_autres top_Sicav top_produit_epargne_logement top_Predica top_ferm_prod_6M top_ouv_prod_6M top_produit_Assurance top_produit_Cartes top_produit_Credit "moy_surface_financière_6M"n moy_surf_financiere_ecart_6M moy_encours_dav_6M moy_encours_dav_ecart_6M moy_monetaire_dispo_6M moy_monetaire_dispo_ecart_6M moy_emprunts_6M moy_emprunts_ecarts_6M moy_sicav_6M moy_sicav_ecart_6M moy_vm_autres_6M moy_vm_autres_ecart_6M moy_predica_6M moy_predica_ecart_6M moy_bgpi_6M moy_bgpi_ecart_6M moy_epargne_logement_6M moy_epargne_logement_ecart_6M "moy.an_mt_flux_cred_norme_B2"n "moy.an_mt_op_cred_ep_a_terme"n "moy.an_mt_op_debit_ep_a_terme"n "moy.an_mt_ope_credit_depot"n "moy.an_mt_ope_credit_ep_a_vue"n "moy.an_mt_ope_debit_depot"n "moy.an_mt_ope_debit_ep_a_vue"n "moy.an_mt_pmts_carte_etr"n "moy.an_mt_remise_chq"n "moy.an_mt_paie_carte"n "moy.an_mt_paie_chq"n "moy.an_nb_paie_carte"n "moy.an_nb_paie_chq"n "moy.an_mt_ret_carte_Aut_bq"n "moy.an_mt_ret_carte_CRCA"n "moy.an_mt_ret_carte_etr"n "moy.an_nb_flux_cred_normeB2"n "moy.an_nb_ope_credit_ep_a_terme"n "moy.an_nb_ope_debit_ep_a_terme"n "moy.an_nb_ope_credit_depot"n "moy.an_nb_ope_credit_ep_a_vue"n "moy.an_nb_ope_debit_depot"n "moy.an_nb_ope_debit_ep_a_vue"n "moy.an_nb_pmts_carte_etr"n "moy.an_nb_remise_chq"n "moy.an_nb_ret_carte_Aut_bq"n "moy.an_nb_ret_carte_CRCA"n "moy.an_nb_ret_carte_etr"n "moy.an_nb_ret_carte"n "moy.an_mt_factu_ttc"n "moy.an_mt_reduc_ttc"n "moy.an_mt_rist_ttc"n "moy.an_mt_mvt_domicilie_mktg"n "moy.an_nb_mvt_M_domicilie_mktg"n top_produit_Epargne top_ouverture_reclam age_classe Flag_bq_principale flag_univers_detenus csp_1 SGMT_FIDELITE situ_fam_1     /        SELECTION=STEPWISE
        SLE=0.05        SLS=0.05        INCLUDE=0        LINK=LOGIT
        OUTROC=_PROB_
        ALPHA=95        EXPEST
        PARMLABEL
        CORRB
        NOPRINT
    ;    OUTPUT OUT=WORK.PREDLogRegPredictions(LABEL="Statistiques et prédictions de régression logistique pour WORK.APPEND_TABLE" WHERE=(NOT __FLAG))        PREDPROBS=INDIVIDUAL;RUN;QUIT;
%end;
%mend;DATA WORK.PREDLogRegPredictions; 
    set WORK.PREDLogRegPredictions; 
    TOP_CREDIT_HABITAT_2017=__DEP; 
    _FROM_=__DEP;
    DROP __DEP; 
    DROP __FLAG;RUN ;QUIT ;
RW9
Diamond | Level 26 RW9
Diamond | Level 26

That code is very hard to read.  Some formatting goes a long way in reading code, for instance, new line for each command, not using upper case etc.

From your text, you could just wrap the macro call such:

data _null_;
  do i=1 to 50;
    call execute('%macro RunReg (DSName, NumVars) ;');
  end;
run;

This will run the code 50 times.  Am sure you some other change to it though?

PaigeMiller
Diamond | Level 26

Hey @ClemH, there's no reason to post the same question multiple times. You already asked this at

https://communities.sas.com/t5/SAS-Statistical-Procedures/How-to-loop-a-logistic-regression-n-number...

where a very good answer has been provided by @Rick_SAS, that doesn't need macros.

 

Hey, Moderators, can you combine these threads?

--
Paige Miller
Kurt_Bremser
Super User

@PaigeMiller wrote:

Hey @ClemH, there's no reason to post the same question multiple times. You already asked this at

https://communities.sas.com/t5/SAS-Statistical-Procedures/How-to-loop-a-logistic-regression-n-number...

where a very good answer has been provided by @Rick_SAS, that doesn't need macros.

 

Hey, Moderators, can you combine these threads?


I merged it back into the original thread.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 2210 views
  • 0 likes
  • 5 in conversation