ayapow
Calcite | Level 5

Hello all,

 

I am having difficulty figuring out the reason for this error message. I am currently trying to run a multivariable model to predict vaping status. My issue lies in my reference levels. I have found that GRAD_T will pull all levels (8th, 10th, and 12th grade) into the model without difficulty, but only when I exclude AGE_T from the model entirely. I have tried variations on the AGE_T reference, including 'first', but that just seems to mess up GRAD_T. When I run a cross tabulation of GRAD_T with AGE_T I do have zeros in the rows, but each variable has values present when tabulated on its own.

 

I have gone back through my code and eliminated missing values for both GRAD_T and AGE_T. I have even tried different strings in the quotes, including 'first' and '1', without success.

*BEGINNING OF CODE*;

libname MTFGR '/home/u52765837/MTFDATA';

*INPUT ALL DATA STARTING WITH GRADE 12, 2019*;

data school12_2019;
    set mtfgr.grade1219;
    year=2019;
    grade=12;
    keep GRADE v2150 v2151 v2102 v2106 v2117 v2582 v2102d v2106d v2117d v13 archive_wt RESPONDENT_AGE V2163
 V2164 V2179 SEX RACE GRADAVG DAD_EDU MOM_EDU CIG30 MJ30 ALC30 VAPE30 AGE SCHOOL_REGION;

*CREATING SEX LABELS*;
IF V2150= 1 THEN SEX= 'MALE';
ELSE IF V2150= 2 THEN SEX= 'FEMALE';

*CREATING RACE LABELS*;
IF V2151= 1 THEN RACE= 'BLACK/AFRICAN AMERICAN';
ELSE IF V2151= 2 THEN RACE= 'WHITE/CAUCASIAN';
ELSE IF V2151= 3 THEN RACE= 'HISPANIC';

*CREATING AGE LABELS*;
IF RESPONDENT_AGE= 1 THEN AGE= 'UNDER 18';
ELSE IF RESPONDENT_AGE= 2 THEN AGE= '18 OR OLDER';
if RESPONDENT_AGE= -9  THEN delete;

*CREATING LABEL FOR SCHOOL REGION*;
IF V13= 1 THEN SCHOOL_REGION= 'NORTHEAST';
ELSE IF V13= 2 THEN SCHOOL_REGION= 'MIDWEST';
ELSE IF V13= 3 THEN SCHOOL_REGION= 'SOUTH';
ELSE IF V13= 4 THEN SCHOOL_REGION= 'WEST';

*CREATING GRADE AVERAGE INTO NEW VARIABLE*;
IF V2179 IN (9,8) THEN GRADAVG= 'A';
ELSE IF V2179 IN (7,6,5)  THEN GRADAVG= 'B';
ELSE IF V2179 IN (4,3,2) THEN GRADAVG= 'C';
ELSE IF V2179= 1 THEN GRADAVG= 'D';

*CREATING FATHER EDUCATION INTO NEW VARIABLE*;
IF V2163 IN (1,2,3) THEN DAD_EDU= 'UP TO HIGH SCHOOL';
ELSE IF V2163 IN (4,5) THEN DAD_EDU= 'UP TO COLLEGE';
ELSE IF V2163 = 6 THEN DAD_EDU= 'GRAD SCHOOL';
ELSE IF V2163 = 9 THEN DAD_EDU= 'DONT KNOW';

*CREATING MOTHER EDUCATION INTO NEW VARIABLE*;
IF V2164 IN (1,2,3) THEN MOM_EDU= 'UP TO HIGH SCHOOL';
ELSE IF V2164 IN (4,5) THEN MOM_EDU= 'UP TO COLLEGE';
ELSE IF V2164 = 6 THEN MOM_EDU= 'GRAD SCHOOL';
ELSE IF V2164= 9 THEN MOM_EDU= 'DONT KNOW';

/* CREATING CIG 30 DAY DICHOTOMOUS VARIABLE */
IF V2102D= 0 THEN CIG30= 'NO ';
ELSE IF V2102D = 1 THEN CIG30= 'YES';

/* CREATING ALCOHOL 30 DAY DICHOTOMOUS VARIABLE */
IF V2106D= 0 THEN ALC30= 'NO ';
ELSE IF V2106D= 1 THEN ALC30= 'YES';

/* CREATING MARIJUANA 30 DAY DICHOTOMOUS VARIABLE */
IF V2117D= 0 THEN MJ30= 'NO ';
ELSE IF V2117D= 1 THEN MJ30= 'YES';

/* CREATING VAPE 30 DAY DICHOTOMOUS VARIABLE */
IF V2582= 1 THEN VAPE30= 'NO ';
ELSE IF V2582 IN (2,3,4,5,6,7) THEN VAPE30= 'YES';

*labeling variables FOR MISSING VALUES*; 
if v2150= -9 then v2150= .;
if v2151= -9 then v2151= .;
if v2102= -9 then v2102= .;
if v2106= -9 then v2106= .;
if v2117= -9 then v2117= .;
if v2582= -9 then v2582= .;
if v2102d= -9 then v2102d= .;
if v2106d= -9 then v2106d= .;
if v2117d= -9 then v2117d= .;
if V2163= -9 then V2163= .;
if v2164= -9 then v2164= .;
if v2179= -9 then v2179= .;

label SEX= 'SEX'
      RACE= 'RACE'
      V2582= '30 DAYS VAPE NICOTINE'  
      CIG30= '30 DAY CIGARETTE USE- DICHOTOMOUS'
      ALC30= '30 DAY ALCOHOL USE- DICHOTOMOUS'
      MJ30= '30 DAY MARIJUANA USE- DICHOTOMOUS'
      SCHOOL_REGION= 'SCHOOL REGION'
      AGE= 'AGE CATEGORIES'
      DAD_EDU= 'FATHER EDUCATION LEVEL'
      MOM_EDU= 'MOTHER EDUCATION LEVEL'
      GRADAVG= 'AVERAGE GRADE'
      VAPE30= '30 DAY VAPE USE- DICHOTOMOUS';
      
*GRADE 12, 2020*;

data school12_2020;
    set mtfgr.grade1220;
    year=2020;
    grade=12;
    keep GRADE v2150 v2151 v2102 v2106 v2117 v2582 v2102d v2106d v2117d v13 archive_wt RESPONDENT_AGE V2163
 V2164 V2179 SEX RACE GRADAVG DAD_EDU MOM_EDU CIG30 MJ30 ALC30 VAPE30 AGE SCHOOL_REGION;
    
*CREATING SEX LABELS*;
IF V2150= 1 THEN SEX= 'MALE';
ELSE IF V2150= 2 THEN SEX= 'FEMALE';

*CREATING RACE LABELS*;
IF V2151= 1 THEN RACE= 'BLACK/AFRICAN AMERICAN';
ELSE IF V2151= 2 THEN RACE= 'WHITE/CAUCASIAN';
ELSE IF V2151= 3 THEN RACE= 'HISPANIC';

*CREATING AGE LABELS*;
IF RESPONDENT_AGE= 1 THEN AGE= 'UNDER 18';
ELSE IF RESPONDENT_AGE= 2 THEN AGE= '18 OR OLDER';
if RESPONDENT_AGE= -9 THEN delete;

*CREATING LABEL FOR SCHOOL REGION*;
IF V13= 1 THEN SCHOOL_REGION= 'NORTHEAST';
ELSE IF V13= 2 THEN SCHOOL_REGION= 'MIDWEST';
ELSE IF V13= 3 THEN SCHOOL_REGION= 'SOUTH';
ELSE IF V13= 4 THEN SCHOOL_REGION= 'WEST';

*CREATING GRADE AVERAGE INTO NEW VARIABLE*;
IF V2179 IN (9,8) THEN GRADAVG= 'A';
ELSE IF V2179 IN (7,6,5)  THEN GRADAVG= 'B';
ELSE IF V2179 IN (4,3,2) THEN GRADAVG= 'C';
ELSE IF V2179= 1 THEN GRADAVG= 'D';

*CREATING FATHER EDUCATION INTO NEW VARIABLE*;
IF V2163 IN (1,2,3) THEN DAD_EDU= 'UP TO HIGH SCHOOL';
ELSE IF V2163 IN (4,5) THEN DAD_EDU= 'UP TO COLLEGE';
ELSE IF V2163 = 6 THEN DAD_EDU= 'GRAD SCHOOL';
ELSE IF V2163 = 9 THEN DAD_EDU= 'DONT KNOW';

*CREATING MOTHER EDUCATION INTO NEW VARIABLE*;
IF V2164 IN (1,2,3) THEN MOM_EDU= 'UP TO HIGH SCHOOL';
ELSE IF V2164 IN (4,5) THEN MOM_EDU= 'UP TO COLLEGE';
ELSE IF V2164 = 6 THEN MOM_EDU= 'GRAD SCHOOL';
ELSE IF V2164= 9 THEN MOM_EDU= 'DONT KNOW';

/* CREATING CIG 30 DAY DICHOTOMOUS VARIABLE */
IF V2102D= 0 THEN CIG30= 'NO ';
ELSE IF V2102D = 1 THEN CIG30= 'YES';

/* CREATING ALCOHOL 30 DAY DICHOTOMOUS VARIABLE */
IF V2106D= 0 THEN ALC30= 'NO ';
ELSE IF V2106D= 1 THEN ALC30= 'YES';

/* CREATING MARIJUANA 30 DAY DICHOTOMOUS VARIABLE */
IF V2117D= 0 THEN MJ30= 'NO ';
ELSE IF V2117D= 1 THEN MJ30= 'YES';

/* CREATING VAPE 30 DAY DICHOTOMOUS VARIABLE */
IF V2582= 1 THEN VAPE30= 'NO ';
ELSE IF V2582 IN (2,3,4,5,6,7) THEN VAPE30= 'YES';

*labeling variables FOR MISSING VALUES*; 
if v2150= -9 then v2150= .;
if v2151= -9 then v2151= .;
if v2102= -9 then v2102= .;
if v2106= -9 then v2106= .;
if v2117= -9 then v2117= .;
if v2582= -9 then v2582= .;
if v2102d= -9 then v2102d= .;
if v2106d= -9 then v2106d= .;
if v2117d= -9 then v2117d= .;
if V2163= -9 then V2163= .;
if v2164= -9 then v2164= .;
if v2179= -9 then v2179= .;

label SEX= 'SEX'
      RACE= 'RACE'
      V2582= '30 DAYS VAPE NICOTINE'  
      CIG30= '30 DAY CIGARETTE USE- DICHOTOMOUS'
      ALC30= '30 DAY ALCOHOL USE- DICHOTOMOUS'
      MJ30= '30 DAY MARIJUANA USE- DICHOTOMOUS'
      SCHOOL_REGION= 'SCHOOL REGION'
      AGE= 'AGE CATEGORIES'
      DAD_EDU= 'FATHER EDUCATION LEVEL'
      MOM_EDU= 'MOTHER EDUCATION LEVEL'
      GRADAVG= 'AVERAGE GRADE'
      VAPE30= '30 DAY VAPE USE- DICHOTOMOUS';
      
    
/* 2021 */
data school12_2021;
    set mtfgr.grade1221;
    year=2021;
    grade=12;
    keep GRADE v2150 v2151 v2102 v2106 v2117 v7782 v2102d v2106d v2117d v13 archive_wt RESPONDENT_AGE V2163
 V2164 V2179 SEX RACE GRADAVG DAD_EDU MOM_EDU CIG30 MJ30 ALC30 VAPE30 AGE SCHOOL_REGION;
    
*CREATING SEX LABELS*;
IF V2150= 1 THEN SEX= 'MALE';
ELSE IF V2150= 2 THEN SEX= 'FEMALE';

*CREATING RACE LABELS*;
IF V2151= 1 THEN RACE= 'BLACK/AFRICAN AMERICAN';
ELSE IF V2151= 2 THEN RACE= 'WHITE/CAUCASIAN';
ELSE IF V2151= 3 THEN RACE= 'HISPANIC';

*CREATING AGE LABELS*;
IF RESPONDENT_AGE= 1 THEN AGE= 'UNDER 18';
ELSE IF RESPONDENT_AGE= 2 THEN AGE= '18 OR OLDER';
if RESPONDENT_AGE= -9 THEN delete;

*CREATING LABEL FOR SCHOOL REGION*;
IF V13= 1 THEN SCHOOL_REGION= 'NORTHEAST';
ELSE IF V13= 2 THEN SCHOOL_REGION= 'MIDWEST';
ELSE IF V13= 3 THEN SCHOOL_REGION= 'SOUTH';
ELSE IF V13= 4 THEN SCHOOL_REGION= 'WEST';

*CREATING GRADE AVERAGE INTO NEW VARIABLE*;
IF V2179 IN (9,8) THEN GRADAVG= 'A';
ELSE IF V2179 IN (7,6,5)  THEN GRADAVG= 'B';
ELSE IF V2179 IN (4,3,2) THEN GRADAVG= 'C';
ELSE IF V2179= 1 THEN GRADAVG= 'D';

*CREATING FATHER EDUCATION INTO NEW VARIABLE*;
IF V2163 IN (1,2,3) THEN DAD_EDU= 'UP TO HIGH SCHOOL';
ELSE IF V2163 IN (4,5) THEN DAD_EDU= 'UP TO COLLEGE';
ELSE IF V2163 = 6 THEN DAD_EDU= 'GRAD SCHOOL';
ELSE IF V2163 = 9 THEN DAD_EDU= 'DONT KNOW';

*CREATING MOTHER EDUCATION INTO NEW VARIABLE*;
IF V2164 IN (1,2,3) THEN MOM_EDU= 'UP TO HIGH SCHOOL';
ELSE IF V2164 IN (4,5) THEN MOM_EDU= 'UP TO COLLEGE';
ELSE IF V2164 = 6 THEN MOM_EDU= 'GRAD SCHOOL';
ELSE IF V2164= 9 THEN MOM_EDU= 'DONT KNOW';

/* CREATING CIG 30 DAY DICHOTOMOUS VARIABLE */
IF V2102D= 0 THEN CIG30= 'NO ';
ELSE IF V2102D = 1 THEN CIG30= 'YES';

/* CREATING ALCOHOL 30 DAY DICHOTOMOUS VARIABLE */
IF V2106D= 0 THEN ALC30= 'NO ';
ELSE IF V2106D= 1 THEN ALC30= 'YES';

/* CREATING MARIJUANA 30 DAY DICHOTOMOUS VARIABLE */
IF V2117D= 0 THEN MJ30= 'NO ';
ELSE IF V2117D= 1 THEN MJ30= 'YES';

/* CREATING VAPE 30 DAY DICHOTOMOUS VARIABLE */
IF V7782= 1 THEN VAPE30= 'NO ';
ELSE IF V7782 IN (2,3,4,5,6,7) THEN VAPE30= 'YES';

*labeling variables FOR MISSING VALUES*; 
if v2150= -9 then v2150= .;
if v2151= -9 then v2151= .;
if v2102= -9 then v2102= .;
if v2106= -9 then v2106= .;
if v2117= -9 then v2117= .;
if v2102d= -9 then v2102d= .;
if v2106d= -9 then v2106d= .;
if v2117d= -9 then v2117d= .;
if V2163= -9 then V2163= .;
if v2164= -9 then v2164= .;
if v2179= -9 then v2179= .;

label SEX= 'SEX'
      RACE= 'RACE'
      V7782= '30 DAYS VAPE NICOTINE'  
      CIG30= '30 DAY CIGARETTE USE- DICHOTOMOUS'
      ALC30= '30 DAY ALCOHOL USE- DICHOTOMOUS'
      MJ30= '30 DAY MARIJUANA USE- DICHOTOMOUS'
      SCHOOL_REGION= 'SCHOOL REGION'
      AGE= 'AGE CATEGORIES'
      DAD_EDU= 'FATHER EDUCATION LEVEL'
      MOM_EDU= 'MOTHER EDUCATION LEVEL'
      GRADAVG= 'AVERAGE GRADE'
      VAPE30= '30 DAY VAPE USE- DICHOTOMOUS';

*CONCATENATED DATASET FOR 12 GRADE ALL YEARS*;

DATA SCHOOL123;
    SET SCHOOL12_2019 SCHOOL12_2020 SCHOOL12_2021;
    rename archive_wt=weight;
RUN;


*INPUT DATA STARTING WITH GRADE 8, 2019-2021*; 
data school8_2019;
    set mtfgr.grade819;
    year=2019;
    grade=8;
       keep v7301 v7202 V1252 v1070 v7102 v7107 v7114 V7763 v7102d v7107d v7114d v507 v5 V7215 V7216 V7221 GRADLVL SEX RACE GRADAVG DAD_EDU MOM_EDU CIG30 MJ30 ALC30 VAPE30 AGE SCHOOL_REGION;

*CREATING GRADE LEVEL DISTINCTION*;
IF V7301 = 2 THEN GRADLVL= '8TH ';
ELSE IF V7301= 4 THEN GRADLVL= '10TH';
ELSE IF V7301 IN (1,3,5,6, -9) THEN DELETE;

*CREATING SEX LABELS*;
IF V7202= 1 THEN SEX= 'MALE';
ELSE IF V7202= 2 THEN SEX= 'FEMALE';

*CREATING RACE LABELS*;
IF V1070= 1 THEN RACE= 'BLACK/AFRICAN AMERICAN';
ELSE IF V1070= 2 THEN RACE= 'WHITE/CAUCASIAN';
ELSE IF V1070= 3 THEN RACE= 'HISPANIC';

*CREATING AGE LABELS*;
IF V1252= 1 THEN AGE= 'UNDER 16';
ELSE IF V1252= 2 THEN AGE= '16 OR OLDER';
IF V1252 IN (-9, -8) THEN delete;

*CREATING LABEL FOR SCHOOL REGION*;
IF V507= 1 THEN SCHOOL_REGION= 'NORTHEAST';
ELSE IF V507= 2 THEN SCHOOL_REGION= 'MIDWEST';
ELSE IF V507= 3 THEN SCHOOL_REGION= 'SOUTH';
ELSE IF V507= 4 THEN SCHOOL_REGION= 'WEST';

*CREATING GRADE AVERAGE INTO NEW VARIABLE*;
IF V7221 IN (9,8) THEN GRADAVG= 'A';
ELSE IF V7221 IN (7,6,5)  THEN GRADAVG= 'B';
ELSE IF V7221   IN (4,3,2) THEN GRADAVG= 'C';
ELSE IF V7221= 1 THEN GRADAVG= 'D';

*CREATING FATHER EDUCATION INTO NEW VARIABLE*;
IF V7215 IN (1,2,3) THEN DAD_EDU= 'UP TO HIGH SCHOOL';
ELSE IF V7215 IN (4,5) THEN DAD_EDU= 'UP TO COLLEGE';
ELSE IF V7215= 6 THEN DAD_EDU= 'GRAD SCHOOL';
ELSE IF V7215= 7 THEN DAD_EDU= 'DONT KNOW';

*CREATING MOTHER EDUCATION INTO NEW VARIABLE*;
IF V7216 IN (1,2,3) THEN MOM_EDU= 'UP TO HIGH SCHOOL';
ELSE IF V7216 IN (4,5) THEN MOM_EDU= 'UP TO COLLEGE';
ELSE IF V7216 = 6 THEN MOM_EDU= 'GRAD SCHOOL';
ELSE IF V7216= 7 THEN MOM_EDU= 'DONT KNOW';

/* CREATING CIG 30 DAY DICHOTOMOUS VARIABLE */
IF V7102D= 0 THEN CIG30= 'NO ';
ELSE IF V7102D = 1 THEN CIG30= 'YES';

/* CREATING ALCOHOL 30 DAY DICHOTOMOUS VARIABLE */
IF V7107D= 0 THEN ALC30= 'NO ';
ELSE IF V7107D= 1 THEN ALC30= 'YES';

/* CREATING MARIJUANA 30 DAY DICHOTOMOUS VARIABLE */
IF V7114D= 0 THEN MJ30= 'NO ';
ELSE IF V7114D= 1 THEN MJ30= 'YES';

/* CREATING VAPE 30 DAY DICHOTOMOUS VARIABLE */
IF V7763= 1 THEN VAPE30= 'NO ';
ELSE IF V7763 IN (2,3,4,5,6,7) THEN VAPE30= 'YES';

*labeling variables FOR MISSING VALUES*; 
if v7202= -9 then v7202= .;
if v1070= -9 then v1070= .;
/* if v2102= -9 then v2102= .; */
/* if v2106= -9 then v2106= .; */
/* if v2117= -9 then v2117= .; */
if v7221= -9 then v7221= .;
IF V7215= -9 THEN V7215= .;
IF V7216= -9 THEN V7216= .; 
if V7102D= -9 then V7102D= .;
if v7107D= -9 then v7107D= .;
if V7114D= -9 then V7114D= .;

label SEX= 'SEX'
      RACE= 'RACE' 
      CIG30= '30 DAY CIGARETTE USE- DICHOTOMOUS'
      ALC30= '30 DAY ALCOHOL USE- DICHOTOMOUS'
      MJ30= '30 DAY MARIJUANA USE- DICHOTOMOUS'
      SCHOOL_REGION= 'SCHOOL REGION'
      AGE= 'AGE CATEGORIES'
      DAD_EDU= 'FATHER EDUCATION LEVEL'
      MOM_EDU= 'MOTHER EDUCATION LEVEL'
      GRADAVG= 'AVERAGE GRADE'
      VAPE30= '30 DAY VAPE USE- DICHOTOMOUS'
      GRADLVL= 'GRADE LEVEL';

*2020*;
data school8_2020;
    set mtfgr.grade820;
    year=2020;
    grade=8;
       keep V7301 v7202 V1252 v1070 v7102 v7107 v7114 V7763 v7102d v7107d v7114d v507 v5 V7215 V7216 V7221 GRADLVL SEX RACE GRADAVG DAD_EDU MOM_EDU CIG30 MJ30 ALC30 VAPE30 AGE SCHOOL_REGION;

*CREATING GRADE LEVEL DISTINCTION*;
IF V7301 = 2 THEN GRADLVL= '8TH ';
ELSE IF V7301= 4 THEN GRADLVL= '10TH';
ELSE IF V7301 IN (1,3,5,6, -9) THEN DELETE;

*CREATING SEX LABELS*;
IF V7202= 1 THEN SEX= 'MALE';
ELSE IF V7202= 2 THEN SEX= 'FEMALE';

*CREATING RACE LABELS*;
IF V1070= 1 THEN RACE= 'BLACK/AFRICAN AMERICAN';
ELSE IF V1070= 2 THEN RACE= 'WHITE/CAUCASIAN';
ELSE IF V1070= 3 THEN RACE= 'HISPANIC';

*CREATING AGE LABELS*;
IF V1252= 1 THEN AGE= 'UNDER 16';
ELSE IF V1252= 2 THEN AGE= '16 OR OLDER';
ELSE IF V1252 IN (-8, -9) THEN delete;

*CREATING LABEL FOR SCHOOL REGION*;
IF V507= 1 THEN SCHOOL_REGION= 'NORTHEAST';
ELSE IF V507= 2 THEN SCHOOL_REGION= 'MIDWEST';
ELSE IF V507= 3 THEN SCHOOL_REGION= 'SOUTH';
ELSE IF V507= 4 THEN SCHOOL_REGION= 'WEST';

*CREATING GRADE AVERAGE INTO NEW VARIABLE*;
IF V7221 IN (9,8) THEN GRADAVG= 'A';
ELSE IF V7221 IN (7,6,5)  THEN GRADAVG= 'B';
ELSE IF V7221   IN (4,3,2) THEN GRADAVG= 'C';
ELSE IF V7221= 1 THEN GRADAVG= 'D';

*CREATING FATHER EDUCATION INTO NEW VARIABLE*;
IF V7215 IN (1,2,3) THEN DAD_EDU= 'UP TO HIGH SCHOOL';
ELSE IF V7215 IN (4,5) THEN DAD_EDU= 'UP TO COLLEGE';
ELSE IF V7215= 6 THEN DAD_EDU= 'GRAD SCHOOL';
ELSE IF V7215= 7 THEN DAD_EDU= 'DONT KNOW';

*CREATING MOTHER EDUCATION INTO NEW VARIABLE*;
IF V7216 IN (1,2,3) THEN MOM_EDU= 'UP TO HIGH SCHOOL';
ELSE IF V7216 IN (4,5) THEN MOM_EDU= 'UP TO COLLEGE';
ELSE IF V7216 = 6 THEN MOM_EDU= 'GRAD SCHOOL';
ELSE IF V7216= 7 THEN MOM_EDU= 'DONT KNOW';

/* CREATING CIG 30 DAY DICHOTOMOUS VARIABLE */
IF V7102D= 0 THEN CIG30= 'NO ';
ELSE IF V7102D = 1 THEN CIG30= 'YES';

/* CREATING ALCOHOL 30 DAY DICHOTOMOUS VARIABLE */
IF V7107D= 0 THEN ALC30= 'NO ';
ELSE IF V7107D= 1 THEN ALC30= 'YES';

/* CREATING MARIJUANA 30 DAY DICHOTOMOUS VARIABLE */
IF V7114D= 0 THEN MJ30= 'NO ';
ELSE IF V7114D= 1 THEN MJ30= 'YES';

/* CREATING VAPE 30 DAY DICHOTOMOUS VARIABLE */
IF V7763= 1 THEN VAPE30= 'NO ';
ELSE IF V7763 IN (2,3,4,5,6,7) THEN VAPE30= 'YES';

*labeling variables FOR MISSING VALUES*; 
if v7202= -9 then v7202= .;
if v1070= -9 then v1070= .;
/* if v2102= -9 then v2102= .; */
/* if v2106= -9 then v2106= .; */
/* if v2117= -9 then v2117= .; */
if v7221= -9 then v7221= .;
IF V7215= -9 THEN V7215= .;
IF V7216= -9 THEN V7216= .; 
if V7102D= -9 then V7102D= .;
if v7107D= -9 then v7107D= .;
if V7114D= -9 then V7114D= .;

label SEX= 'SEX'
      RACE= 'RACE'  
      CIG30= '30 DAY CIGARETTE USE- DICHOTOMOUS'
      ALC30= '30 DAY ALCOHOL USE- DICHOTOMOUS'
      MJ30= '30 DAY MARIJUANA USE- DICHOTOMOUS'
      SCHOOL_REGION= 'SCHOOL REGION'
      AGE= 'AGE CATEGORIES'
      DAD_EDU= 'FATHER EDUCATION LEVEL'
      MOM_EDU= 'MOTHER EDUCATION LEVEL'
      GRADAVG= 'AVERAGE GRADE'
      VAPE30= '30 DAY VAPE USE- DICHOTOMOUS';

*2021*;
data school8_2021;
    set mtfgr.grade821;
    year=2021;
    grade=8;
       keep V7301 v7202 V1252 v1070 v7102 v7107 v7114 V7782 v7102d v7107d v7114d v507 v5 V7215 V7216 V7221 GRADLVL SEX RACE GRADAVG DAD_EDU MOM_EDU CIG30 MJ30 ALC30 VAPE30 AGE SCHOOL_REGION;

*CREATING GRADE LEVEL DISTINCTION*;
IF V7301 = 2 THEN GRADLVL= '8TH ';
ELSE IF V7301= 4 THEN GRADLVL= '10TH';
ELSE IF V7301 IN (1,3,5,6, -9) THEN DELETE;

*CREATING SEX LABELS*;
IF V7202= 1 THEN SEX= 'MALE';
ELSE IF V7202= 2 THEN SEX= 'FEMALE';
/* ELSE IF V7202= 3 THEN SEX= 'OTHER'; */

*CREATING RACE LABELS*;
IF V1070= 1 THEN RACE= 'BLACK/AFRICAN AMERICAN';
ELSE IF V1070= 2 THEN RACE= 'WHITE/CAUCASIAN';
ELSE IF V1070= 3 THEN RACE= 'HISPANIC';

*CREATING AGE LABELS*;
IF V1252= 1 THEN AGE= 'UNDER 16';
ELSE IF V1252= 2 THEN AGE= '16 OR OLDER';
ELSE IF V1252 IN (-9, -8) THEN delete;

*CREATING LABEL FOR SCHOOL REGION*;
IF V507= 1 THEN SCHOOL_REGION= 'NORTHEAST';
ELSE IF V507= 2 THEN SCHOOL_REGION= 'MIDWEST';
ELSE IF V507= 3 THEN SCHOOL_REGION= 'SOUTH';
ELSE IF V507= 4 THEN SCHOOL_REGION= 'WEST';

*CREATING GRADE AVERAGE INTO NEW VARIABLE*;
IF V7221 IN (9,8) THEN GRADAVG= 'A';
ELSE IF V7221 IN (7,6,5)  THEN GRADAVG= 'B';
ELSE IF V7221   IN (4,3,2) THEN GRADAVG= 'C';
ELSE IF V7221= 1 THEN GRADAVG= 'D';

*CREATING FATHER EDUCATION INTO NEW VARIABLE*;
IF V7215 IN (1,2,3) THEN DAD_EDU= 'UP TO HIGH SCHOOL';
ELSE IF V7215 IN (4,5) THEN DAD_EDU= 'UP TO COLLEGE';
ELSE IF V7215= 6 THEN DAD_EDU= 'GRAD SCHOOL';
ELSE IF V7215= 7 THEN DAD_EDU= 'DONT KNOW';

*CREATING MOTHER EDUCATION INTO NEW VARIABLE*;
IF V7216 IN (1,2,3) THEN MOM_EDU= 'UP TO HIGH SCHOOL';
ELSE IF V7216 IN (4,5) THEN MOM_EDU= 'UP TO COLLEGE';
ELSE IF V7216 = 6 THEN MOM_EDU= 'GRAD SCHOOL';
ELSE IF V7216= 7 THEN MOM_EDU= 'DONT KNOW';

/* CREATING CIG 30 DAY DICHOTOMOUS VARIABLE */
IF V7102D= 0 THEN CIG30= 'NO ';
ELSE IF V7102D = 1 THEN CIG30= 'YES';

/* CREATING ALCOHOL 30 DAY DICHOTOMOUS VARIABLE */
IF V7107D= 0 THEN ALC30= 'NO ';
ELSE IF V7107D= 1 THEN ALC30= 'YES';

/* CREATING MARIJUANA 30 DAY DICHOTOMOUS VARIABLE */
IF V7114D= 0 THEN MJ30= 'NO ';
ELSE IF V7114D= 1 THEN MJ30= 'YES';

/* CREATING VAPE 30 DAY DICHOTOMOUS VARIABLE */
IF V7782= 1 THEN VAPE30= 'NO ';
ELSE IF V7782 IN (2,3,4,5,6,7) THEN VAPE30= 'YES';

*labeling variables FOR MISSING VALUES*; 
if v7202= -9 then v7202= .;
if v1070= -9 then v1070= .;
/* if v2102= -9 then v2102= .; */
/* if v2106= -9 then v2106= .; */
/* if v2117= -9 then v2117= .; */
if v7221= -9 then v7221= .;
IF V7215= -9 THEN V7215= .;
IF V7216= -9 THEN V7216= .; 
if V7102D= -9 then V7102D= .;
if v7107D= -9 then v7107D= .;
if V7114D= -9 then V7114D= .;
IF V7782= -9 THEN V7782= .;

label SEX= 'SEX'
      RACE= 'RACE'
      CIG30= '30 DAY CIGARETTE USE- DICHOTOMOUS'
      ALC30= '30 DAY ALCOHOL USE- DICHOTOMOUS'
      MJ30= '30 DAY MARIJUANA USE- DICHOTOMOUS'
      SCHOOL_REGION= 'SCHOOL REGION'
      DAD_EDU= 'FATHER EDUCATION LEVEL'
      MOM_EDU= 'MOTHER EDUCATION LEVEL'
      GRADAVG= 'AVERAGE GRADE'
      VAPE30= '30 DAY VAPE USE- DICHOTOMOUS';
     
      
*CONCATENATED DATA SETS 2019-2021 FOR 8TH AND 10TH GRADE*;

DATA SCHOOL1234;
    SET SCHOOL8_2019 SCHOOL8_2020 SCHOOL8_2021;
rename v5=weight;

RUN;

*FORMATTING FOR CHARACTER STRINGS*;
PROC FORMAT; *FOR CHARACTER STRINGS*;
VALUE AGE_T 1= 'UNDER 16' 2= '16 AND OLDER/UNDER 18' 3= 'OVER 18';
VALUE GRAD_T 1= '8TH GRADE' 2= '10TH GRADE' 3= '12TH GRADE';

*complete data set w all grades after concatenating*;
DATA SCHOOL12345;
SET SCHOOL123 SCHOOL1234;
keep vape30 gradlvl SEX RACE SCHOOL_REGION AGE GRADAVG DAD_EDU MOM_EDU CIG30 ALC30 MJ30 AGE_T GRAD_T WEIGHT;

*creating GROUPINGS/ FORMATTING ALL VARIABLES FOR MULTIVARIABLE ANALYSIS*;

if AGE= 'UNDER 16' then AGE_T= 1;
ELSE IF AGE='16 OR OL' or AGE= 'UNDER 18' THEN AGE_T= 2;
ELSE IF AGE='18 OR OL' THEN AGE_T= 3;

IF GRADLVL ='8TH' THEN GRAD_T= 1;
else if grade =12 THEN GRAD_T= 3;
else IF GRADLVL='10TH' THEN GRAD_T= 2;

FORMAT AGE_T AGE_T. GRAD_T GRAD_T.;

RUN;

PROC CONTENTS DATA= SCHOOL12345;


*frequency count with procsurvey no strata, all grades*;
proc surveyfreq data=school12345;
TABLES (SEX RACE SCHOOL_REGION GRADAVG DAD_EDU MOM_EDU CIG30 ALC30 MJ30 GRAD_T AGE_T)*vape30/chisq;
where vape30 is not missing;
weight WEIGHT;
title "procsurveyfreq ALL GRADES, NO STRATIFICATION, 2019-2021";
RUN;


*multivariable logistic analysis*;
proc surveylogistic data=school12345;
class AGE_T(REF='UNDER 16') GRAD_T(REF='8TH GRADE') sex(ref='MALE') RACE(REF='WHITE/CAUCASIAN') SCHOOL_REGION(REF='NORTHEAST') 
GRADAVG(REF='A') DAD_EDU(REF='UP TO COLLEGE') MOM_EDU(REF='UP TO COLLEGE') CIG30(REF='YES') ALC30(REF='YES') MJ30(REF='YES')/ PARAM=REF;
   weight weight;
   model vape30(EVENT='YES') = AGE_T GRAD_T SEX RACE SCHOOL_REGION GRADAVG DAD_EDU MOM_EDU alc30 cig30 mj30;
run;

 

 

3 REPLIES
ballardw
Super User

Any time you ask about an error, copy the code for the procedure or data step, along with all the messages for that step, from the LOG; then open a text box on the forum and paste the LOG text.

 

SAS often places diagnostic information in the log along with the actual error text and that tells where to start.

 

See this example of a similar error from PROC SURVEYREG (it was simpler to create the error message there).

1416  proc format;
1417  value myage
1418  11 ='Youngest'
1419  ;
NOTE: Format MYAGE is already on the library WORK.FORMATS.
NOTE: Format MYAGE has been output.
1420  /*reference error*/

1421  proc surveyreg data=sashelp.class;
1422     class age (ref='Youngest');
1423     model weight= age height;
1424  run;

ERROR: Invalid reference value for Age.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SURVEYREG used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

1425

1426  /* see what is missing in your code*/

1427  proc surveyreg data=sashelp.class;
1428     class age (ref='Youngest');
1429     model weight= age height;
1430     format age myage.;
1431  run;

NOTE: PROCEDURE SURVEYREG used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

If you use the FORMATTED value of a variable as the reference value, you had better make sure that the format is actually assigned to the variable. You will need to make sure the formats are assigned to every variable that uses a formatted reference value.
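A related check, offered only as a sketch: a formatted reference value can also fail to match when the level itself is no longer present. The toy data below are hypothetical (the data set name `demo` and its values are made up for illustration), and the sketch assumes the procedure sets aside observations that have a missing value in any analysis variable before it builds the CLASS levels. Under that assumption, listing the levels among complete rows shows whether '8TH GRADE' is still available to REF=.

```sas
proc format;
   value grad_t 1 = '8TH GRADE' 2 = '10TH GRADE' 3 = '12TH GRADE';
run;

/* Hypothetical toy data: every 8th-grade row has a missing AGE_T */
data demo;
   input grad_t age_t;
   format grad_t grad_t.;
   datalines;
1 .
1 .
2 1
2 2
3 2
3 3
;
run;

/* Levels of GRAD_T that remain once rows with any missing value are set aside */
proc freq data=demo;
   where cmiss(grad_t, age_t) = 0;
   tables grad_t;
run;
```

In this toy case the table lists only 10TH GRADE and 12TH GRADE, so a reference of '8TH GRADE' would have nothing to match. Running the same WHERE clause against the real data, with every model variable listed in CMISS, would show whether something similar is happening there.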

 

Age of students and grade level are highly correlated; grade is practically dependent on age for the majority of students. It is extremely likely that none of your 12th grade students are in the under 16 age group (one of the places I would not be surprised to see a zero in a PROC FREQ of age vs. grade). Similarly, you are likely to have very few, if any, grade 8 students who are over 18. So you are getting separation of data with both age and grade in the model. Using only grades 8, 10 and 12 will make the separation even stronger. You probably won't get much of a usable model with both age and school grade level.
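To make the empty-cell pattern concrete, here is a minimal sketch with made-up counts (the data set `demo_cells`, the count variable `n`, and all numbers are hypothetical, not taken from the survey). The codes follow the question's scheme: GRAD_T 1, 2, 3 for 8th, 10th, 12th grade and AGE_T 1, 2, 3 for the three age groups.

```sas
/* Hypothetical cell counts: only age/grade combinations that plausibly occur */
data demo_cells;
   input grad_t age_t n;
   datalines;
1 1 40
2 1 25
2 2 15
3 2 10
3 3 30
;
run;

/* Cross tabulation of grade by age; combinations with no rows appear as zero cells */
proc freq data=demo_cells;
   weight n;
   tables grad_t*age_t / norow nocol nopercent;
run;
```

In this toy table, 8th grade appears only in the youngest age group and 12th grade never does, so knowing one variable nearly determines the other. That is the kind of overlap that makes it hard to estimate separate effects for age and grade in one model.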

ayapow
Calcite | Level 5

Thank you for your response. This is the log including the procsurvey freq procedure and the error message.

 

 686        
 687        *multivariable logistic analysis*;
 688        proc surveylogistic data=school12345;
 689        class AGE_T(REF='UNDER 16') GRAD_T(REF='1') sex(ref='MALE') RACE(REF='WHITE/CAUCASIAN') SCHOOL_REGION(REF='NORTHEAST')
 690        GRADAVG(REF='A') DAD_EDU(REF='UP TO COLLEGE') MOM_EDU(REF='UP TO COLLEGE') CIG30(REF='YES') ALC30(REF='YES')
 690      ! MJ30(REF='YES')/ PARAM=REF;
 691           weight weight;
 692           model vape30(EVENT='YES') = AGE_T GRAD_T SEX RACE SCHOOL_REGION GRADAVG DAD_EDU MOM_EDU alc30 cig30 mj30;
 693        run;
 
 ERROR: Invalid reference value for GRAD_T.
 NOTE: The SAS System stopped processing this step because of errors.
 NOTE: PROCEDURE SURVEYLOGISTIC used (Total process time):
       real time           0.05 seconds
       user cpu time       0.05 seconds
       system cpu time     0.00 seconds
       memory              1997.43k
       OS Memory           33704.00k
       Timestamp           10/05/2023 11:27:12 AM
       Step Count                        36  Switch Count  1
       Page Faults                       0
       Page Reclaims                     340
       Page Swaps                        0
       Voluntary Context Switches        5
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           0
       

This is my first time completing a project essentially from beginning to end with analysis. Can you explain what you meant by separation of the data with both age and grade? I can change the model and I understand they correlate, just looking for clarification on that statement.

 


@ballardw wrote:

Age of students and grade level are highly correlated; grade is practically dependent on age for the majority of students. It is extremely likely that none of your 12th grade students are in the under 16 age group (one of the places I would not be surprised to see a zero in a PROC FREQ of age vs. grade). Similarly, you are likely to have very few, if any, grade 8 students who are over 18. So you are getting separation of data with both age and grade in the model. Using only grades 8, 10 and 12 will make the separation even stronger. You probably won't get much of a usable model with both age and school grade level.


Patrick
Opal | Level 21

In addition to the answers already given to your initial question, also consider replacing all these if/then/else statements with a format.

Using formats will not only increase the readability of your code, it will also avoid the need to repeat the same if/then/else logic in multiple data steps.

 

Based on your code, here is an example of how this could look:

 

proc format;
  value ageGroups
    1= 'UNDER 18'
    2= '18 OR OLDER'
    other= ' '
    ;
run;

data tableOut;
 set tableIn;
  /* CREATING AGE LABELS */
  age=put(respondent_age,ageGroups.);
  if missing(age) then delete;
run;

 

 

It's often not even necessary to create a new variable with the formatted values; you can just "attach" the format to the variable with the "raw" values. This way SAS will print the formatted values, and many SAS procedures let you choose whether to use the internal "raw" values or the formatted values.

data tableOut;
 set tableIn;
 format respondent_age ageGroups.;
 if put(respondent_age,ageGroups.)=' ' then delete;
run;
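
As a small self-contained illustration of attaching a format instead of creating a new variable, the sketch below uses a hypothetical toy data set (`demo_age` and its values are assumptions standing in for tableIn) and assigns the format only inside the procedure step.

```sas
proc format;
  value ageGroups
    1 = 'UNDER 18'
    2 = '18 OR OLDER';
run;

/* Hypothetical toy data standing in for tableIn */
data demo_age;
  input respondent_age @@;
  datalines;
1 2 2 1 2
;
run;

/* The format applies to this step only; the stored values remain 1 and 2 */
proc freq data=demo_age;
  tables respondent_age;
  format respondent_age ageGroups.;
run;
```

The frequency table is labeled UNDER 18 and 18 OR OLDER, while the data set itself still holds the numeric codes, so the same variable can be reused with a different grouping by swapping the format.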
