ksmielitz
Quartz | Level 8

I have two defined educational variables (resdaded and resmomed) for each respondent. I need to pull the max value for each respondent id. My data set is defined to order by id. Here's what I have so far:

 

 

*Residential Father's Highest Grade Completed (95= ungraded, only 3, so not concerned);
if R1302600 not in (.R, .D, .V, .I, .N) then do;
ResDad=R1302600; DadEducMiss=0;
if ResDad in (1:5) then elementary=1; else elementary=0;
if ResDad in (6:8) then middleschool=1; else middleschool=0;
if ResDad in (9:12) then highschool=1; else highschool=0;
if ResDad in (13:15) then somecollege=1; else somecollege=0;
if ResDad in (16:17) then Bachelors=1; else Bachelors=0;
if ResDad in (18:20) then PostGrad=1; else PostGrad=0;
if ResDad=95 then ResDad=.;
end;
if R1302600 in (.R, .D, .V, .I, .N) then do;
ResDad=R1302600; DadEducMiss=1;
end;

*Residential Mother's Highest Grade Completed (95=ungraded, only 5, so not concerned);
if R1302700 not in (.R, .D, .V, .I, .N) then do;
ResMom=R1302700; MomEducMiss=0;
if ResMom in (1:5) then elementary=1; else elementary=0;
if ResMom in (6:8) then middleschool=1; else middleschool=0;
if ResMom in (9:12) then highschool=1; else highschool=0;
if ResMom in (13:15) then somecollege=1; else somecollege=0;
if ResMom in (16:17) then Bachelors=1; else Bachelors=0;
if ResMom in (18:20) then PostGrad=1; else PostGrad=0;
if ResMom=95 then ResMom=.;
end;
if R1302700 in (.R, .D, .V, .I, .N) then do;
ResMom=R1302700; MomEducMiss=1;
end;

*Highest Residential Parent Education;
if resdad in (1:20) and resmom in (1:20) then do;
PARED=max (resdad resmom);

???????? (I'm not sure whether the way I'm defining "*Highest Residential Parent Education;" is correct...it's just a guess based on the info I got from my prof)

.........this is where I get stuck. Resdad and Resmom run in proc freq, so I know I'm good there, but if I want to identify whether resdad or resmom has the highest education in the household, how do I do that? 

 

I've searched Base SAS programming for different ways to run the code, and I'm not sure which proc statement to use to do that either.

 

Thanks in advance, 

Kate

Accepted Solution
Reeza
Super User

Do you only have one record per ID? If so, you can do something like the following, since your education seems to be ordinal.

You'll need to account for the missing values and the 95 code, which you can do with an IF condition. I'm also not sure what you want the output to look like, since you didn't specify.

 

if resDad>resMom then highest='resDad';
else if resMom>resDad then highest='resMom';
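
A minimal sketch of how the tie and missing cases could be handled alongside the comparison above. The data set name `want`, the `tie` label, and the sample rows are hypothetical, and it assumes the 95 code has already been set to missing. Treating a lone non-missing parent as the highest is one possible choice, not the only one.

```sas
data want;
  /* Made-up rows; only the variable names resDad/resMom come from the thread */
  input id resDad resMom;
  length highest $6;                       /* long enough for 'resDad' */
  if missing(resDad) and missing(resMom) then highest = ' ';  /* no information */
  else if missing(resMom) then highest = 'resDad';  /* assumption: lone parent wins */
  else if missing(resDad) then highest = 'resMom';
  else if resDad > resMom then highest = 'resDad';
  else if resMom > resDad then highest = 'resMom';
  else highest = 'tie';                    /* equal grades */
  datalines;
1 12 16
2 14 14
3 . 10
4 . .
;
run;

proc print data=want noobs;
run;
```

With only the two-line comparison, a record where both parents have the same grade (id 2 here) matches neither condition and highest stays blank; the explicit tie branch makes that case visible.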

 

 

 

 


13 REPLIES

ksmielitz
Quartz | Level 8

Reeza, 

 

Yes, just one record per ID. I feel confident I can address the issues with the 95 and the missing data. This is terrific. 

Thank you very much!

 

Frankly, I'm not sure what I want the output to look like? I'm concerned about using the resulting "ParEd" to help in defining a SES variable. (Can you tell I don't have much experience, LOLOL).

 

Once I have the highest value identified I need dummy variables for each of the education levels, so I've tried this, but it keeps saying "ParEd" not found.

*Highest Residential Parent Education;
if resdad in (1:20) and resmom in (1:20) then do;
ParEd=max (resdad resmom);
if resDad>resMom then highest=resDad; else if resMom>resDad then highest=resMom;
if ParEd in (1:5) then ParEdElem=1; else ParEdElem=0;
if ParEd in (6:8) then ParEdmiddle=1; else ParEdmiddle=0;
if ParEd in (9:12) then ParEdHS=1; else ParEdHS=0;
if ParEd in (13:15) then ParEdSC=1; else ParEdSC=0;
if ParEd in (16:17) then ParEdBSBA=1; else ParEdBSBA=0;
if ParEd in (18:20) then ParEdGrad=1; else ParEdGrad=0;
end;

 

K8

ksmielitz
Quartz | Level 8

Okay, I figured out what I did wrong above...I was trying to define "highest" and "ParEd" in the same section. So, I took out ParEd and subbed in "highest" and it ran. Phew! Now I just need to stick in my code to account for the missing data. *Wish me luck that I do it right the first time! 😉

ksmielitz
Quartz | Level 8

Okay, so I know I'm messing up something easy, but I'm not sure what...here's the code that runs, but still says missing. I thought I had the if statement correct, but no dice. 

*Highest Residential Parent Education;
if resdad in (1:20) and resmom in (1:20) then do;
if resDad>resMom then highest=resDad; else if resMom>resDad then highest=resMom;
if highest in (1:5) then ParEdElem=1; else ParEdElem=0;
if highest in (6:8) then ParEdmiddle=1; else ParEdmiddle=0;
if highest in (9:12) then ParEdHS=1; else ParEdHS=0;
if highest in (13:15) then ParEdSC=1; else ParEdSC=0;
if highest in (16:17) then ParEdBSBA=1; else ParEdBSBA=0;
if highest in (18:20) then ParEdGrad=1; else ParEdGrad=0;
end;
if highest not in (1:20) then do;
highestmiss=.;
end;
Reeza
Super User

Your code for highest could be 

 

highest=max(resDad, resMom);

 

I don't see anything wrong in your code. Maybe there is something in your log? Note that your last step sets highestmiss, rather than highest, to missing.

 

Do you already have a variable named highest? That could cause issues.

 

Run a proc freq to check results. 

 

Proc freq data=have;
where resDad in (1:20) and resMom in (1:20);
table highest;
run;
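
As a rough illustration of the MAX approach, the sketch below builds highest and the dummy variables from a few made-up rows. The data set name `demo` and the sample values are hypothetical; the dummy names follow the ones used earlier in the thread. MAX ignores missing arguments, so highest here is missing only when both parents are missing, which differs from code that requires both parents to be in range.

```sas
data demo;
  /* Made-up rows for illustration only */
  input id resDad resMom;
  highest = max(resDad, resMom);   /* larger of the non-missing values */
  if not missing(highest) then do;
    /* A comparison evaluates to 1 (true) or 0 (false) */
    ParEdElem   = (1  <= highest <= 5);
    ParEdmiddle = (6  <= highest <= 8);
    ParEdHS     = (9  <= highest <= 12);
    ParEdSC     = (13 <= highest <= 15);
    ParEdBSBA   = (16 <= highest <= 17);
    ParEdGrad   = (18 <= highest <= 20);
  end;
  datalines;
1 12 16
2 4 .
3 . .
;
run;

proc freq data=demo;
  tables highest ParEdElem ParEdBSBA / missing;
run;
```

The MISSING option on the TABLES statement shows the records where highest or a dummy was never assigned, which makes it easier to confirm the missing handling.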

ksmielitz
Quartz | Level 8

@Reeza you are the bomb shizzle! I'll try out what you suggested. After dinner...and a movie with my hubby, LOL.

THANKS!!!

ksmielitz
Quartz | Level 8

@Reeza the highest=max (resdad, resmom); appears to work. I fixed the highestmiss error.

I ran a proc freq and everything runs and looks like it will work. I attempted the proc freq you suggested, but it says

 

17041  Proc freq data=famses;
ERROR: File WORK.FAMSES.DATA does not exist.
17042  where resDad in (1:20) and resMom in (1:20);
WARNING: No data sets qualify for WHERE processing.
17043  table highest;
ERROR: No data set open to look up variables.
17044  run;

My code that got the above error was 

Proc freq data=famses;
where resDad in (1:20) and resMom in (1:20);
table highest;
run;

I had tried the "have" but didn't have "have" defined anywhere, so it couldn't find it. The variable comes from the famses data set that I am drawing data from.

 

But when I run a proc freq on highest/missing; (with highest=max (resDad, resMom)) or on one of the defined dummy variables (ParEdelem, for example), both work, and the total number of respondents and total number of missing are the same. Code for reference: 

*Highest Residential Parent Education;
if resdad in (1:20) and resmom in (1:20) then do;
if resDad>resMom then highest=resDad; else if resMom>resDad then highest=resMom;
highest=max(resdad, resmom);
if highest in (1:5) then ParEdElem=1; else ParEdElem=0;
if highest in (6:8) then ParEdmiddle=1; else ParEdmiddle=0;
if highest in (9:12) then ParEdHS=1; else ParEdHS=0;
if highest in (13:15) then ParEdSC=1; else ParEdSC=0;
if highest in (16:17) then ParEdBSBA=1; else ParEdBSBA=0;
if highest in (18:20) then ParEdGrad=1; else ParEdGrad=0;
end;
if highest not in (1:20) then do;
if highest=95 then ungraded=1; else ungraded=0;
if highest in (.R, .D, .V, .I, .N) then highestmiss=1; else highestmiss=0;
end;

So, I think I'm okay...(?)

Thanks again for your help!!

K8

 

 

 

 

 

 

Reeza
Super User

Post your full code. Where are your DATA and RUN statements? 

ksmielitz
Quartz | Level 8
LIBNAME NLSYa 'C:\Users\ProfessorKate\Desktop\K-State PhD Program\Heckman RC\NLSY\NLSY for Heckman RC';

DATA nlsyproject;
MERGE NLSYa.famses NLSYa.crimepercep NLSYa.otherincome;
BY ID;

*Gender;
Gender=R0536300;
if 0<Gender<3;
if gender=1 then male=1; else male=0;
if gender=2 then female=1; else female=0;

*Race;
Race=R1482600;
if 0<Race<5;
if race=1 then hispanic=1; else hispanic=0;
if race=2 then black=1; else black=0;
if race=3 then other=1; else other=0;
if race=4 then white=1; else white=0;

*HH receive welfare income?;
if R0611300 not in (.R, .D, .V, .I, .N) THEN DO;
Welfare=R0611300; WelfareMiss=0;
if Welfare=1 then welfareyes=1; else welfareyes=0;
if Welfare=0 then welfareno=1; else welfareno=0;
end;
if R0611300 in (.R, .D, .V, .I, .N) THEN DO;
Welfare=R0611300; WelfareMiss=1;
end;

/*Total Welfare income received (annual);
if R0611400 not in (.R, .D, .V, .I, .N) THEN DO;
TotalWelfare=R0611400; TotalWelfareMiss=0;
if TotalWelfare<500 then LT500=1; else LT500=0;
if 500<=TotalWelfare<1000 then BT500_1k=1; else BT500_1k=0;
if 1000<=TotalWelfare<1500 then BT1k_1500=1; else BT1k_1500=0;
if 1500<=TotalWelfare<2000 then BT1500_2k=1; else BT1500_2k=0;
if 2000<=TotalWelfare<2500 then BT2k_2500=1; else BT2k_2500=0;
if 2500<=TotalWelfare<3000 then BT2500_3k=1; else BT2500_3k=0;
if 3000<=TotalWelfare<3500 then BT3k_3500=1; else BT3k_3500=0;
if 3500<=TotalWelfare<4000 then BT3500_4k=1; else BT3500_4k=0;
if 4000<=TotalWelfare<4500 then BT4k_4500=1; else BT4k_4500=0;
if 4500<=TotalWelfare<5000 then BT4500_5k=1; else BT4500_5k=0;
if 5000<=TotalWelfare<7000 then BT5k_7k=1; else BT5k_7k=0;
if 7000<=TotalWelfare<10000 then BT7k_10k=1; else BT7k_10k=0;
if 10000<=TotalWelfare then TenKplus=1; else TenKplus=0;
end;
if R0611400 in (.R, .D, .V, .I, .N) THEN DO;
TotalWelfare=R0611400; TotalWelfareMiss=1;
end;
*/
*HH receive SSI?;
if R0611900 not in (.R, .D, .V, .I, .N) THEN DO; 
SSI=R0611900; SSIMiss=0;
if SSI=1 then SSIyes=1; else SSIyes=0;
if SSI=0 then SSIno=1; else SSIno=0;
end;
if R0611900 in (.R, .D, .V, .I, .N) THEN DO; 
SSI=R0611900; SSIMiss=1;
end;

/*Total SSI received (annual);
if R0611200 not in (.R, .D, .V, .I, .N) THEN DO;
RecSSI=R0611200; RecSSIMiss=0;
if RecSSI<500 then SSILT500=1; else SSILT500=0;
if 500<=RecSSI<1000 then SSIBT500_1k=1; else SSIBT500_1k=0;
if 1000<=RecSSI<1500 then SSIBT1k_1500=1; else SSIBT1k_1500=0;
if 1500<=RecSSI<2000 then SSIBT1500_2k=1; else SSIBT1500_2k=0;
if 2000<=RecSSI<2500 then SSIBT2k_2500=1; else SSIBT2k_2500=0;
if 2500<=RecSSI<3000 then SSIBT2500_3k=1; else SSIBT2500_3k=0;
if 3000<=RecSSI<3500 then SSIBT3k_3500=1; else SSIBT3k_3500=0;
if 3500<=RecSSI<4000 then SSIBT3500_4k=1; else SSIBT3500_4k=0;
if 4000<=RecSSI<4500 then SSIBT4k_4500=1; else SSIBT4k_4500=0;
if 4500<=RecSSI<5000 then SSIBT4500_5k=1; else SSIBT4500_5k=0;
if 5000<=RecSSI<7000 then SSIBT5k_7k=1; else SSIBT5k_7k=0;
if 7000<=RecSSI<10000 then SSIBT7k_10k=1; else SSIBT7k_10k=0;
if 10000<=RecSSI<20000 then SSIBT10k_20k=1; else SSIBT10k_20k=0;
if 20000<=recSSI<50000 then SSIBT20k_50k=1; else SSIBT20k_50k=0;
end;
if R0611200 in (.R, .D, .V, .I, .N) THEN DO; 
RecSSI=R0611200; RecSSIMiss=1;
end;
*/

*HH receive food stamps?;
if R0611600 not in (.R, .D, .V, .I, .N) THEN DO;
FoodStamps=R0611600; FoodStampsMiss=0;
if FoodStamps=1 then FSyes=1; else FSyes=0;
if FoodStamps=0 then FSno=1; else FSno=0;
end;
IF R0611600 in (.R, .D, .V, .I, .N) THEN DO;  
FoodStamps=R0611600; FoodStampsMiss=1; 
end;

/*Total Food Stamps received (annual);
if R0611700 not in (.R, .D, .V, .I, .N) THEN DO;
TotalFS=R0611700; TotalFSMiss=0;
if 0<=TotalFS<500 then LT500FS=1; else LT500FS=0;
if 500<=Totalfs<1000 then BT500_1kFS=1; else BT500_1kFS=0;
if 1000<=Totalfs<1500 then BT1k_1500FS=1; else BT1k_1500FS=0;
if 1500<=Totalfs<2000 then BT1500_2kFS=1; else BT1500_2kFS=0;
if 2000<=Totalfs<2500 then BT2k_2500FS=1; else BT2k_2500FS=0;
if 2500<=Totalfs<3000 then BT2500_3kFS=1; else BT2500_3kFS=0;
if 3000<=Totalfs<3500 then BT3k_3500FS=1; else BT3k_3500FS=0;
if 3500<=Totalfs<4000 then BT3500_4kFS=1; else BT3500_4kFS=0;
if 4000<=Totalfs<4500 then BT4k_4500FS=1; else BT4k_4500FS=0;
if 4500<=Totalfs<5000 then BT4500_5kFS=1; else BT4500_5kFS=0;
if 5000<=Totalfs<7000 then BT5k_7kFS=1; else BT5k_7kFS=0;
if 7000<=TotalFS then SevenKplusFS=1; else SevenKplusFS=0; 
end;
IF R0611700 in (.R, .D, .V, .I, .N) THEN DO;  
TotalFS=R0611700; TotalFSMiss=1; 
end;

*HH entitled to to child support?;
if R0612300 not in (.R, .D, .V, .I, .N) then do;
childsup=R0612300; childsupmiss=0;
if childsup=1 then csyes=1; else csyes=0;
if childsup=0 then csno=1; else csno=0;
end;
if R0612300 in (.R, .D, .V, .I, .N) then do;
childsup=R0612300; childsupmiss=1;
end;

*Total Child Support Received (annual);
if R0612400 not in (.R, .D, .V, .I, .N) THEN DO;
TotalCS=R0612400; TotalCSMiss=0;
if 0<=TotalCS<1000 then LT1KCS=1; else LT1KCS=0;
if 1000<=Totalcs<2000 then BT1k_2kCS=1; else BT1k_2kCS=0;
if 2000<=Totalcs<3000 then BT2k_3kCS=1; else BT2k_3kCS=0;
if 3000<=Totalcs<4000 then BT3k_4kCS=1; else BT3k_4kCS=0;
if 4000<=Totalcs<5000 then BT4k_5kCS=1; else BT4k_5kCS=0;
if 5000<=Totalcs<6000 then BT5k_6kCS=1; else BT5k_6kCS=0;
if 6000<=Totalcs<7000 then BT6k_7kCS=1; else BT6k_7kCS=0;
if 7000<=Totalcs<8000 then BT7k_8kCS=1; else BT7k_8kCS=0;
if 8000<=Totalcs<9000 then BT8k_9kCS=1; else BT8k_9kCS=0;
if 9000<=Totalcs<10000 then BT9k_10kCS=1; else BT9k_10kCS=0;
if 10000<=Totalcs<15000 then BT10k_15kCS=1; else BT10k_15kCS=0;
if 15000<=TotalCS then GTE15KplusCS=1; else GTE15KplusCS=0; 
end;
IF R0612400 in (.R, .D, .V, .I, .N) THEN DO;  
TotalCS=R0612400; TotalCSMiss=1; 
end;

*Mobile Home;
*NEED TO COMBINE 1-3 FOR SES/OWNERSHIP/RENTING PORTION OF SES VARIABLE);
IF R0617200 not in (.R, .D, .V, .I, .N) THEN DO;  
Mobilehome=R0617200; MHMiss=0; 
if MobileHome=1 then ownhomeandlot=1; else ownhomeandlot=0;
if MobileHome=2 then ownhome=1; else ownhome=0;
if MobileHome=3 then ownlot=1; else ownlot=0;
if MobileHome=4 then rentsMH=1; else rentsMH=0;
if MobileHome=7 then neitherownorrentMH=1; else neitherownorrentMH=0;
end;
IF R0617200 in (.R, .D, .V, .I, .N) THEN DO;  
Mobilehome=R0617200; MHMiss=1; 
end;

*Apartment/House;
If R0617900 not in (.R, .D, .V, .I, .N) THEN DO;
ApartHouse=R0617900; ApartHouseMiss=0;
if ApartHouse=1 then ownorbuying=1; else ownorbuying=0;
if ApartHouse=2 then AptHouserent=1; else AptHouserent=0;
if ApartHouse=3 then neitherownorrent=1; else neitherownorrent=0;
end;
if R0617900 in (.R, .D, .V, .I, .N) then do;
ApartHouse=R0617900; ApartHouseMiss=1;
end;
*/
*Checking, Savings, Money Market;
if R0620400 not in (.R, .D, .V, .I, .N) THEN DO;
Banked=R0620400; BankedMiss=0;
if Banked=1 then account=1; else account=0;
if Banked=0 then noaccount=1; else noaccount=0;
end;
if R0620400 in (.R, .D, .V, .I, .N) then do;
Banked=R0620400; BankedMiss=1;
end;

/*Average balance of accounts;
if R0620500 not in (.R, .D, .V, .I, .N) THEN DO;
AVGBAL=R0620500; AVGBALmiss=0;
if AVGBAL=0 then NOBAL=1; else NOBAL=0;
if 1<=AVGBAL<5000 then BT1_5kavgbal=1; else BT1_5kavgbal=0;
if 5000<=AVGBAL<10000 then BT5k_10kavgbal=1; else BT5k_10kavgbal=0;
if 10000<=AVGBAL<15000 then BT10k_15kavgbal=1; else BT10k_15kavgbal=0;
if 15000<=AVGBAL<20000 then BT15k_20kavgbal=1; else BT15k_20kavgbal=0;
if 20000<=AVGBAL<=25000 then BT20k_25kavgbal=1; else BT20k_25kavgbal=0;
if 25000<=AVGBAL<=30000 then BT25k_30kavgbal=1; else BT25k_30kavgbal=0;
if 30000<=AVGBAL<=40000 then BT30k_40kavgbal=1; else BT30k_40kavgbal=0;
if 40000<=AVGBAL<=70000 then BT40k_70kavgbal=1; else BT40k_70kavgbal=0;
if 70000<=AVGBAL then GTE70kplus=1; else GTE70kplus=0;
end;
if R0620500 in (.R, .D, .V, .I, .N) THEN DO;
AVGBAL=R0620500; AVGBALmiss=1;
end;
*/
/*Vehicles;
if R0621300 not in (.R, .D, .V, .I, .N) THEN DO;
Vehicle=R0621300; Vehiclemiss=0;
if vehicle=1 then owncar=1; else owncar=0;
if vehicle=0 then noowncar=1; else noowncar=0;
end;
if R0621300 in (.R, .D, .V, .I, .N) THEN DO;
Vehicle=R0621300; Vehiclemiss=1;
end;
*/

*Age of mother when R born;
if R1200200 not in (.R, .D, .V, .I, .N) THEN DO;
AgeofMom=R1200200; AgeofMomMiss=0;
if 9<=AgeofMom<=14 then BabyMom=1; else BabyMom=0;
if 15<=AgeofMom<=19 then TeenMom=1; else TeenMom=0;
if 20<=AgeofMom<=24 then YoungMom=1; else YoungMom=0;
if 25<=AgeofMom<=29 then late20sMom=1; else late20sMom=0;
if 30<=AgeofMom<=34 then early30sMom=1; else early30sMom=0;
if 35<=AgeofMom<=39 then late30sMom=1; else late30sMom=0;
if 40<=AgeofMom then GTE40yrsplusMom=1; else GTE40yrsplusMom=0;
end;
if R1200200 in (.R, .D, .V, .I, .N) THEN DO;
AgeofMom=R1200200; AgeofMomMiss=1;
end;

*Household income;
if R1204500 not in (.R, .D, .V, .I, .N) THEN DO;
HHIncome=R1204500; IncomeMiss=0;
if HHIncome<=0 then noincome=1; else noincome=0;
if 1<=HHIncome<=5000 then LT5k=1; else LT5k=0;
if 5001<=HHINcome<=10000 then Income5_10k=1; else Income5_10k=0;
if 10001<=HHIncome<=20000 then Income10_20k=1; else Income10_20k=0;
if 20001<=HHIncome<=30000 then Income20_30k=1; else Income20_30k=0;
if 30001<=HHIncome<=40000 then Income30_40k=1; else Income30_40k=0;
if 40001<=HHIncome<=50000 then Income40_50k=1; else Income40_50k=0;
if 50001<=HHIncome<=65000 then Income50_65k=1; else Income50_65k=0;
if 65001<=HHIncome<=80000 then Income65_80k=1; else Income65_80k=0;
if 80001<=HHIncome<=100000 then Income80_100k=1; else Income80_100k=0;
if 100001<=HHIncome<=150000 then Income100_150k=1; else Income100_150k=0;
if 150001<=HHIncome then IncomeGTE150kplus=1; else IncomeGTE150kplus=0;
end;
if R1204500 in (.R, .D, .V, .I, .N) THEN DO;
HHIncome=R1204500; IncomeMiss=1;
end;

*Residential Father's Highest Grade Completed (95= ungraded, only 3, so not concerned);
if R1302600 not in (.R, .D, .V, .I, .N) then do;
ResDad=R1302600; DadEducMiss=0;
if ResDad in (1:5) then elementary=1; else elementary=0;
if ResDad in (6:8) then middleschool=1; else middleschool=0;
if ResDad in (9:12) then highschool=1; else highschool=0;
if ResDad in (13:15) then somecollege=1; else somecollege=0;
if ResDad in (16:17) then Bachelors=1; else Bachelors=0;
if ResDad in (18:20) then PostGrad=1; else PostGrad=0;
if ResDad=95 then ResDad=.;
end;
if R1302600 in (.R, .D, .V, .I, .N) then do;
ResDad=R1302600; DadEducMiss=1;
end;

*Residential Mother's Highest Grade Completed (95=ungraded, only 5, so not concerned);
if R1302700 not in (.R, .D, .V, .I, .N) then do;
ResMom=R1302700; MomEducMiss=0;
if ResMom in (1:5) then elementary=1; else elementary=0;
if ResMom in (6:8) then middleschool=1; else middleschool=0;
if ResMom in (9:12) then highschool=1; else highschool=0;
if ResMom in (13:15) then somecollege=1; else somecollege=0;
if ResMom in (16:17) then Bachelors=1; else Bachelors=0;
if ResMom in (18:20) then PostGrad=1; else PostGrad=0;
if ResMom=95 then ResMom=.;
end;
if R1302700 in (.R, .D, .V, .I, .N) then do;
ResMom=R1302700; MomEducMiss=1;
end;

*Highest Residential Parent Education;
if resdad in (1:20) and resmom in (1:20) then do;
if resDad>resMom then highest=resDad; else if resMom>resDad then highest=resMom;
highest=max(resdad, resmom);
if highest in (1:5) then ParEdElem=1; else ParEdElem=0;
if highest in (6:8) then ParEdmiddle=1; else ParEdmiddle=0;
if highest in (9:12) then ParEdHS=1; else ParEdHS=0;
if highest in (13:15) then ParEdSC=1; else ParEdSC=0;
if highest in (16:17) then ParEdBSBA=1; else ParEdBSBA=0;
if highest in (18:20) then ParEdGrad=1; else ParEdGrad=0;
end;
if highest not in (1:20) then do;
if highest=95 then ungraded=1; else ungraded=0;
if highest in (.R, .D, .V, .I, .N) then highestmiss=1; else highestmiss=0;
end;

Proc freq data=famses;
where resDad in (1:20) and resMom in (1:20);
table highest;
run;
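
The earlier "File WORK.FAMSES.DATA does not exist" error is consistent with this code: famses lives in the NLSYa library, and the new variables (highest and the dummies) are written to the nlsyproject data set created by the DATA statement. A minimal sketch of the pattern, with made-up rows standing in for the MERGE of the NLSYa data sets:

```sas
/* Stand-in for the DATA step above; the rows are hypothetical */
data nlsyproject;
  input ID resDad resMom;
  highest = max(resDad, resMom);
  datalines;
1 12 16
2 9 10
3 . .
;
run;   /* explicit step boundary before the PROC */

/* Point PROC FREQ at the data set the DATA step created */
proc freq data=nlsyproject;
  where 1 <= resDad <= 20 and 1 <= resMom <= 20;
  tables highest;
run;
```

A PROC statement also ends the preceding DATA step, so the explicit RUN is mainly for readability; the essential change is the data set named in DATA=.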

 

 

 

ksmielitz
Quartz | Level 8

Coding for stats has been taught two different ways in my program. One professor (the one we worked with the longest) codes with the downloaded data as I have shown with my code. Another professor (one we spent 2 weeks with last summer) codes like this (different data set):

 

libname FINRA "C:\Users\ProfessorKate\Desktop\Summer 2015 Stats";

proc contents data=finra.finra12;
run;

data finra.working;
set finra.finra12;

/*sex*/
sex=A3;
if sex=1 then male=1; else male=0;

/*age*/
agecats=a3ar_w;
if agecats=1 then Under25=1; else Under25=0;
if agecats=2 then BT25_34=1; else BT25_34=0;
if agecats=3 then BT35_44=1; else BT35_44=0;
if agecats=4 then BT45_54=1; else BT45_54=0;
if agecats=5 then BT55_64=1; else BT55_64=0;
if agecats=6 then Over65=1; else Over65=0;

/*education*/
if a5_2012<98;
education=a5_2012;
if education=1 then LTHS=1; else LTHS=0;
if 2<=education<3 then HSGED=1; else HSGED=0;
if education=4 then somecoll=1; else somecoll=0;
if education=5 then college=1; else college=0;
if education=6 then collplus=1; else collplus=0;

/*financial confidence*/
if 0<m1_1<8;
daytoday=m1_1;
if 0<m1_2<8;
math=m1_2;
if 0<m4<8;
overall=m4;

finconf=(daytoday + math + overall)/3;

/*income*/
if 0<A8<98;
incomecats=A8;
if incomecats=1 then incomelt15=1; else incomelt15=0;
if incomecats=2 then income15to25=1; else income15to25=0;
if incomecats=3 then income25to35=1; else income25to35=0;
if incomecats=4 then income35to50=1; else income35to50=0;
if incomecats=5 then income50to75=1; else income50to75=0;
if incomecats=6 then income75to100=1; else income75to100=0;
if incomecats=7 then income100to150=1; else income100to150=0;
if incomecats=8 then income150plus=1; else income150plus=0;

/*race*/
if A4A_new_w>0;
racewhite=A4A_new_w;
if racewhite=2 then racewhite=0;

/*financial knowledge*/
if 0<m6<99;
if m6=1 then know1=1; else know1=0;
if 0<m7<99;
if m7=3 then know2=1; else know2=0;
if 0<m8<99;
if m8=2 then know3=1; else know3=0;
if 0<m9<99;
if m9=1 then know4=1; else know4=0;
if 0<m10<99;
if m10=1 then know5=1; else know5=0;

/* Best Practices*/
if 0<j5<99;
if j5=1 then efund=1; else efund=0;

if 0<f2_1<99;
if f2_1=1 then CCPIF=1; else CCPIF=0;

if 0<j11<99;
if j11=1 then ccheck=1; else ccheck=0;

if 0<C1_2012<99;
if 0<C4_2012<99;

if C1_2012=1 or C4_2012 =1 then retacct=1; else retacct=0;

if 0<b4<99;
if b4=2 then NoOD=1; else NoOD=0;

if 0<H1 <99;
if 0<H3 <99;
if H1 + H3 = 2 then insurance=1; else insurance=0; 

bestpractice= efund + CCPIF + ccheck + retacct + NoOD + insurance;


/*Fin sat*/
if 0<j1<98;
finsat=j1;

/*financial knowledge*/
if 0<M6<99;
if M6=1 then compoundinterest=1; else compoundinterest=0;

if 0<M7<99;
if M7=3 then inflation=1; else inflation=0;

if 0<M8<99;
if M8=2 then bonds=1; else bonds=0;

if 0<M9<99;
if M9=1 then mortgages=1; else mortgages=0;

if 0<M10<99;
if M10=2 then diversification=1; else diversification=0;

finknow= compoundinterest + inflation + bonds + mortgages + diversification;

/*marital status*/
if A7A>0;
marcats= A7A;
if marcats=1 then statmar=1; else statmar=0;
if marcats=2 then statcohab=1; else statcohab=0;
if marcats=3 then statsingle=1; else statsingle=0;

/*children*/
if A11<99;
childrencats=A11;
if childrencats<=4 then children=1; else children=0;

/*employment status*/
if A9<99;
employmentcats=A9;

if employmentcats=1 then workself=1; else workself=0;
if employmentcats=2 then workfull=1; else workfull=0;
if employmentcats=3 then workpart=1; else workpart=0;
if 4<=employmentcats<=6 then workother=1; else workother=0;
if employmentcats=7 then workunemployed=1; else workunemployed=0;
if employmentcats=8 then workretired=1; else workretired=0;

/*risktolerance*/
if J2<98;
risktol= J2;

run;

I *might* be able to reset how I have my data set up for the other code (which is my current, under deadline project), but if I can do that at a later time it's preferable.

 

 

ksmielitz
Quartz | Level 8

Reeza, I know you are wishing you'd never responded 😉

 

So, I'm thinking what I have for highest isn't working quite right.

When I run: 

*Highest Residential Parent Education;
if resdad in (1:20) and resmom in (1:20) then do;
if resDad>resMom then highest=resDad; else if resMom>resDad then highest=resMom;
highest=max(resdad, resmom);
if highest in (1:5) then ParEdElem=1; else ParEdElem=0;
if highest in (6:8) then ParEdmiddle=1; else ParEdmiddle=0;
if highest in (9:12) then ParEdHS=1; else ParEdHS=0;
if highest in (13:15) then ParEdSC=1; else ParEdSC=0;
if highest in (16:17) then ParEdBSBA=1; else ParEdBSBA=0;
if highest in (18:20) then ParEdGrad=1; else ParEdGrad=0;
end;
if highest not in (1:20) then do;
if highest=95 then ungraded=1; else ungraded=0;
if highest in (.R, .D, .V, .I, .N) then highestmiss=1; else highestmiss=0;
end;

with my proc freq I get 3626 missing, but when I add up the missing from ResDad and ResMom I get a total of 4259. 

Thanks again for all your help!

K8

 

Tom
Super User

If you are just worried about missing values then your numbers look fine to me.

If 4,259 is the count of those that are missing A or B, and 3,626 is the count of those with missing MAX(A,B), then that just means that you have 3,626 that are missing both A and B. That looks like a reasonable number.
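
To see how the separate missing counts relate to the combined one, the two missing indicators can be cross-tabulated. This is only a sketch: the data set name `check`, the flag names `dadMiss` and `momMiss`, and the rows are made up. The cell where both flags are 1 counts respondents missing both parents; adding the ResDad and ResMom missing counts separately counts that cell twice.

```sas
data check;
  /* Made-up rows for illustration only */
  input id resDad resMom;
  dadMiss = missing(resDad);   /* 1 if missing, 0 otherwise */
  momMiss = missing(resMom);
  datalines;
1 12 16
2 . 10
3 14 .
4 . .
5 . .
;
run;

proc freq data=check;
  /* Four cells: neither missing, dad only, mom only, both */
  tables dadMiss*momMiss / norow nocol nopercent;
run;
```

Which cells make up the missing count for highest depends on how it was built: in the code as posted, highest appears to be assigned only inside the block that requires both parents to be in 1:20, so it would be missing whenever either parent is missing, whereas MAX applied on its own is missing only when both are.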

ksmielitz
Quartz | Level 8

Tom, 

Thank you. That makes perfect sense! I appreciate your response!

 

Kate
