The below program runs without error and reads the amount of observations as are listed in the csv file. However, when I run frequency tables, I noticed certain categories were missing entirely. For example, in the job table, "recovery specialist" has been omitted entirely (not saying it's missing, it's just gone!) Does anyone know why this would happen?? It's happening to categories on the emp table as well. Please help!
proc import datafile = '/folders/myfolders/mhttc1.csv'
out=mhttc
dbms=csv
replace;
run;
proc format;
value place 1='NE'
2='IA'
3='MO'
4='KS'
5='NM'
6='Online';
value job 1='Clinician'
2='Clinical supervisor'
3='Recovery specialist'
4='Manager/coordinator/adiministrator'
5='Client/patient educator'
6='Case manager'
7='Prevention case manager'
8='Counselor'
9='Mental health terapist'
10='Parole/probation/re-entry support'
11='Outreach staff'
12='Disease intervention/investigator'
13='Resident/fellow'
14='Teacher/faculty'
15='Trainer/TA provider'
16='Group facilitator'
17='Not currently employed'
18='Other'
19='Missing';
value emp 1='Community or Faith-based service organization'
2='Government'
3='State/local health department'
4='School/university (academic dept)'
5='Hospital/hospital-affiliated clinic'
6='HMO/managed care organization'
7='Solo/group private practice'
8='Addictions treatment program (inpatient)'
9='Addictions treatment program (outpatient)'
10='Addictions treatment program (residential)'
11='Recovery support program'
12='School/university-based health clinic'
13='Correctional facility'
14='Probation/parole office'
15='Local law enforcement department'
16='Military/VA'
17='Tribal/Indian health services'
18='Community health center'
19='Not currently employed'
20='Other'
21='Missing';
data shells;
length location $21.;
set work.mhttc;
if location = 'Chadron, Nebra' or location = 'LaVista, Nebra' or location = 'Lincoln, NE' or location = 'Omaha, NE' or location = 'wayne' or location = 'Wayne, Nebrask'
then place = 1;
else if location = 'Council Bluffs' or location = 'Des Moines, IA' or location = 'Knoxville, IA'
then place = 2;
else if location = 'Columbia, MO'
then place = 3;
else if location = 'Kansas City, K'
then place = 4;
else if location = 'Santa Fe, NM'
then place = 5;
else if location = 'online'
then place = 6;
if workrole=. or workrole=-9 or workrole=-99
then job=19;
else if workrole=10 then job=1;
else if workrole=20 then job=2;
else if workrole=30 then job=3;
else if workrole=40 then job=4;
else if workrole=50 then job=5;
else if workrole=60 then job=6;
else if workrole=70 then job=7;
else if workrole=80 then job=8;
else if workrole=90 then job=9;
else if workrole=100 then job=10;
else if workrole=110 then job=11;
else if workrole=120 then job=12;
else if workrole=130 then job=13;
else if workrole=140 then job=14;
else if workrole=150 then job=15;
else if workrole=160 then job=16;
else if workrole=170 then job=17;
else if workrole=180 then job=18;
if worksetting=. or worksetting=-9 or worksetting =-99
then emp=21;
else if worksetting=10 then emp=1;
else if worksetting=20 then emp=2;
else if worksetting=30 then emp=3;
else if worksetting=40 then emp=4;
else if worksetting=50 then emp=5;
else if worksetting=60 then emp=6;
else if worksetting=70 then emp=7;
else if worksetting=80 then emp=8;
else if worksetting=90 then emp=9;
else if worksetting=100 then emp=10;
else if worksetting=110 then emp=11;
else if worksetting=120 then emp=12;
else if worksetting=130 then emp=13;
else if worksetting=140 then emp=14;
else if worksetting=150 then emp=15;
else if worksetting=160 then emp=16;
else if worksetting=170 then emp=17;
else if worksetting=180 then emp=18;
else if worksetting=190 then emp=19;
else if worksetting=200 then emp=20;
format place place. job job. emp emp.;
run;
*proc contents data=shells;
*run;
proc freq data=shells;
table job*place / nopercent nocol norow;
run;
proc freq data=shells;
table emp*place / nopercent nocol norow;
run;
Looking at your program:
If you get no obs with formatted value of "recovery specialist", then you have no obs with job=3,
which means no obs with workrole=30.
Do you have such cases in the mhttc data set?
OR …. you could have instances of job=3 that all occur with missing values for PLACE. Your proc freq by default is not showing a column for missing PLACE unless you use the MISSING option.
Also, on a smaller note, you can reduce the typing (and consequent typing errors) in your program by replacing code such as this:
if workrole=. or workrole=-9 or workrole=-99
then job=19;
else if workrole=10 then job=1;
else if workrole=20 then job=2;
else if workrole=30 then job=3;
else if workrole=40 then job=4;
else if workrole=50 then job=5;
else if workrole=60 then job=6;
else if workrole=70 then job=7;
else if workrole=80 then job=8;
else if workrole=90 then job=9;
else if workrole=100 then job=10;
else if workrole=110 then job=11;
else if workrole=120 then job=12;
else if workrole=130 then job=13;
else if workrole=140 then job=14;
else if workrole=150 then job=15;
else if workrole=160 then job=16;
else if workrole=170 then job=17;
else if workrole=180 then job=18;
with
if workrole in (.,-9,-99) then job=19 ; else
if workrole in (10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180) then job=workrole/10;
Looking at your program:
If you get no obs with formatted value of "recovery specialist", then you have no obs with job=3,
which means no obs with workrole=30.
Do you have such cases in the mhttc data set?
OR …. you could have instances of job=3 that all occur with missing values for PLACE. Your proc freq by default is not showing a column for missing PLACE unless you use the MISSING option.
Also, on a smaller note, you can reduce the typing (and consequent typing errors) in your program by replacing code such as this:
if workrole=. or workrole=-9 or workrole=-99
then job=19;
else if workrole=10 then job=1;
else if workrole=20 then job=2;
else if workrole=30 then job=3;
else if workrole=40 then job=4;
else if workrole=50 then job=5;
else if workrole=60 then job=6;
else if workrole=70 then job=7;
else if workrole=80 then job=8;
else if workrole=90 then job=9;
else if workrole=100 then job=10;
else if workrole=110 then job=11;
else if workrole=120 then job=12;
else if workrole=130 then job=13;
else if workrole=140 then job=14;
else if workrole=150 then job=15;
else if workrole=160 then job=16;
else if workrole=170 then job=17;
else if workrole=180 then job=18;
with
if workrole in (.,-9,-99) then job=19 ; else
if workrole in (10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180) then job=workrole/10;
Thank you for your help! I thought I had checked whether I had observations in the datafile for those that didn't show up in my frequency tables but I didn't....a lesson in getting to know your data! Thank you for teaching me how to simplify my code as well.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.