tainaj
Obsidian | Level 7
Also, this is part of the output from the code @Tom provided:

raceomb=East Asian
raceomb_002=.
raceombmulti="."
ethnicityomb=Not Hispanic or Latino
sex_5=Female
D_biep_White_Good_all=-0.367
Mn_RT_all_3467=981.71
N_3467=120
PCT_error_3467=2.5
Order=White+Good first
labels=.
Side_Good_34=Good left first
Side_White_34=White left first
pct_300=0.0
pct_400=0.0
pct_2K=3.3
pct_3K=0.0
pct_4K=0.0
tblack=8
tblack_1to11=8
tblack_0to10=.
twhite=5
twhite_1to11=5 neutral
twhite_0to10=.
att=3
att_7=I slightly prefer African Americans to European Americans.
att7=.
D_biep_White_Good_36=-0.553
D_biep_White_Good_47=-0.181
Mn_RT_all_3=1145.10
Mn_RT_all_4=989.58
Mn_RT_all_6=922.80
Mn_RT_all_7=921.60
SD_all_3=458.22
SD_all_4=419.54
SD_all_6=309.96
SD_all_7=326.82
N_3=20
N_4=40
N_5=40
N_6=20
N_7=40
Mn_RT_correct_3=1145.10
Mn_RT_correct_4=960.10
Mn_RT_correct_6=922.80
Mn_RT_correct_7=869.42
SD_correct_3=458.22
SD_correct_4=380.77
SD_correct_6=309.96
SD_correct_7=237.90
N_ERROR_3=0
N_ERROR_4=1
N_ERROR_6=0
N_ERROR_7=2
countrycit=" 1"
countryres=" 1"
edu=.
edu_14=.
studentornot=No
edunotstudent=Ph.D.
edustudent=.
employment="6"
occuself="nul"
occupation_self="29-1000"
Tom
Super User

To see the actual values for the numeric variables you might want to remove the format from them also.

format _numeric_ ;

For example, these two variables must be numeric since their values were not quoted.

ethnicityomb=Not Hispanic or Latino
sex_5=Female
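As a minimal sketch of what removing the format does, here is a toy data set. The data set name `demo` and the format `sexdemo` are made up for illustration and are not the ones attached to your file.

```sas
/* Hypothetical format standing in for whatever came with the import */
proc format;
  value sexdemo 1='Male' 2='Female';
run;

data demo;
  sex_5 = 2;
  format sex_5 sexdemo.;
run;

/* With the format attached, PUT writes the formatted value */
data _null_;
  set demo;
  put sex_5=;          /* writes sex_5=Female */
run;

/* FORMAT _NUMERIC_ with no format name removes the formats for this step */
data _null_;
  set demo;
  format _numeric_;
  put sex_5=;          /* writes sex_5=2 */
run;
```

The two PUT statements are in separate steps because FORMAT is a declarative statement and applies to the whole step it appears in.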

The value you posted for COUNTRYCIT has a single leading space, but it may have been altered when it was pasted into the body of your message. To see the actual bytes in a character variable you can use the $HEX format, which prints two hexadecimal digits for each byte. So the value ' 1' would print as 2031, since '20'x is the ASCII code for a space and '31'x is the ASCII code for the character 1.
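A small sketch of using $HEX, with made-up variable names (`one_space`, `three_spaces`) rather than your real data. The width of the format should be twice the length of the variable so that every byte is shown.

```sas
data _null_;
  length one_space $2 three_spaces $4;
  one_space    = ' 1';     /* one leading blank    */
  three_spaces = '   1';   /* three leading blanks */
  put one_space=    $hex4.;   /* writes one_space=2031        */
  put three_spaces= $hex8.;   /* writes three_spaces=20202031 */
run;
```

For your own data you could probably do the same thing with `put countrycit= $hex8.;` in a step that reads a few observations, assuming the variable is 4 bytes long.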

 

If they are spaces you could also just try using the LEFT() or STRIP() function to remove the leading spaces in your WHERE and/or IF statement.  So either of these statements should find that record.

where countrycit=' 1' ;
where left(countrycit)='1' ;
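Here is a rough illustration of the difference on toy data. The data set names and the three sample values are assumptions for the example only. SAS pads the shorter value with trailing blanks when it compares character strings, so extra leading blanks still make an exact comparison fail.

```sas
data demo;
  length countrycit $4;
  countrycit = '   1'; output;   /* three leading blanks */
  countrycit = '1';    output;   /* no leading blanks    */
  countrycit = '  12'; output;   /* a different code     */
run;

/* Exact comparison: ' 1' matches none of the three values (0 observations) */
data exact;
  set demo;
  where countrycit = ' 1';
run;

/* STRIP() removes leading and trailing blanks before the comparison,
   so '   1' and '1' are both selected (2 observations) */
data trimmed;
  set demo;
  where strip(countrycit) = '1';
run;
```

LEFT() would give the same result here, since it moves the leading blanks to the end of the value and trailing blanks are ignored in the comparison.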
tainaj
Obsidian | Level 7
I ran the numeric format and this was fine:
birthyear=1966
num="0"
num_002=.
birthsex=.
genderidentity=""
raceomb=2
raceomb_002=.
raceombmulti="."
ethnicityomb=2
sex_5=2
D_biep_White_Good_all=-0.36693193
Mn_RT_all_3467=981.70833333
N_3467=120
PCT_error_3467=2.5
Order=1
labels=.
Side_Good_34=1
Side_White_34=1
pct_300=0
pct_400=0
pct_2K=3.3333333333
pct_3K=0
pct_4K=0
tblack=8
tblack_1to11=8
tblack_0to10=.
twhite=5
twhite_1to11=5
twhite_0to10=.
att=3
att_7=3
att7=.
D_biep_White_Good_36=-0.552703649
D_biep_White_Good_47=-0.18116021
Mn_RT_all_3=1145.1
Mn_RT_all_4=989.575
Mn_RT_all_6=922.8
Mn_RT_all_7=921.6
SD_all_3=458.21931086
SD_all_4=419.53630615
SD_all_6=309.96342738
SD_all_7=326.81860412
N_3=20
N_4=40
N_5=40
N_6=20
N_7=40
Mn_RT_correct_3=1145.1
Mn_RT_correct_4=960.1025641
Mn_RT_correct_6=922.8
Mn_RT_correct_7=869.42105263
SD_correct_3=458.21931086
SD_correct_4=380.76666393
SD_correct_6=309.96342738
SD_correct_7=237.90102775
N_ERROR_3=0
N_ERROR_4=1
N_ERROR_6=0
N_ERROR_7=2
countrycit=" 1

I ran the $HEX format (first time doing this so hopefully it's right) and I got this:
countrycit=20202031

Based on what you said, this should be 3 spaces...? If so, I didn't receive any observations:
896 proc import out=work.IAT2016_raw
897 datafile="C:\Users\tjoseph6\Documents\My SAS Files\IAT2016.sav"
898 dbms=sav replace;
899 run;

NOTE: Variable Name Change. D_biep.White_Good_all -> D_biep_White_Good_all
NOTE: Variable Name Change. D_biep.White_Good_36 -> D_biep_White_Good_36
NOTE: Variable Name Change. D_biep.White_Good_47 -> D_biep_White_Good_47
NOTE: One or more variables were converted because the data type is not supported by the V9 engine.
For more details, run with options MSGLEVEL=I.
NOTE: The import data set has 1051105 observations and 534 variables.
NOTE: WORK.IAT2016_RAW data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 1:12.75
cpu time 27.49 seconds


900 data IAT2016;
901 set work.IAT2016_raw (keep=year birthyear sex_5 raceomb D_biep_White_Good_all countrycit
901! occupation_self politicalid_7);
902 Age=year-birthyear;
903 where countrycit eq ' 1' & put( occupation_self,$OCCUPAA. ) in ('29-1000','31-1000') & sex_5 in
903! (1,2);
904 if D_biep_White_Good_all ne .;
905 if politicalid_7 eq . then delete;
906 if Age lt 16 then delete;
907 if raceomb eq . then delete;
908 label sex_5='Gender'
909 raceomb='Race'
910 D_biep_White_Good_all='Overall IAT D Score'
911 politicalid_7='Political Ideology Spectrum'
912 occupation_self='Occupation'
913 countrycit='Country';
914 run;

NOTE: There were 0 observations read from the data set WORK.IAT2016_RAW.
WHERE (countrycit=' 1') and PUT(occupation_self, $OCCUPAA79.) in ('29-1000', '31-1000') and
sex_5 in (1, 2);
NOTE: The data set WORK.IAT2016 has 0 observations and 9 variables.
NOTE: DATA statement used (Total process time):
real time 38.00 seconds
cpu time 10.96 seconds


915 proc contents data=IAT2016;
916 run;

NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.04 seconds
cpu time 0.04 seconds


917 proc means data=IAT2016;
918 class occupation_self;
919 var D_biep_White_Good_all;
920 output out = Summary
921 mean =
922 n = / autoname;
923 format occupation_self $OCCUPAA.;
924 run;

NOTE: No observations in data set WORK.IAT2016.
NOTE: The data set WORK.SUMMARY has 0 observations and 5 variables.
NOTE: PROCEDURE MEANS used (Total process time):
real time 0.05 seconds
cpu time 0.03 seconds


I will use the suggestions for the where statement in one moment though!
tainaj
Obsidian | Level 7
The other where statement to remove the space did not result in observations. Did I put it correctly?

925 proc import out=work.IAT2016_raw
926 datafile="C:\Users\tjoseph6\Documents\My SAS Files\IAT2016.sav"
927 dbms=sav replace;
928 run;

NOTE: Variable Name Change. D_biep.White_Good_all -> D_biep_White_Good_all
NOTE: Variable Name Change. D_biep.White_Good_36 -> D_biep_White_Good_36
NOTE: Variable Name Change. D_biep.White_Good_47 -> D_biep_White_Good_47
NOTE: One or more variables were converted because the data type is not supported by the V9 engine.
For more details, run with options MSGLEVEL=I.
NOTE: The import data set has 1051105 observations and 534 variables.
NOTE: WORK.IAT2016_RAW data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 1:09.51
cpu time 25.56 seconds


929 data IAT2016;
930 set work.IAT2016_raw (keep=year birthyear sex_5 raceomb D_biep_White_Good_all countrycit
930! occupation_self politicalid_7);
931 Age=year-birthyear;
932 where left(countrycit)='1' & put( occupation_self,$OCCUPAA. ) in ('29-1000','31-1000') & sex_5 in
932! (1,2);
933 if D_biep_White_Good_all ne .;
934 if politicalid_7 eq . then delete;
935 if Age lt 16 then delete;
936 if raceomb eq . then delete;
937 label sex_5='Gender'
938 raceomb='Race'
939 D_biep_White_Good_all='Overall IAT D Score'
940 politicalid_7='Political Ideology Spectrum'
941 occupation_self='Occupation'
942 countrycit='Country';
943 run;

NOTE: There were 0 observations read from the data set WORK.IAT2016_RAW.
WHERE (LEFT(countrycit)='1') and PUT(occupation_self, $OCCUPAA79.) in ('29-1000', '31-1000')
and sex_5 in (1, 2);
NOTE: The data set WORK.IAT2016 has 0 observations and 9 variables.
NOTE: DATA statement used (Total process time):
real time 38.19 seconds
cpu time 9.75 seconds


944 proc contents data=IAT2016;
945 run;

NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds


946 proc means data=IAT2016;
947 class occupation_self;
948 var D_biep_White_Good_all;
949 output out = Summary
950 mean =
951 n = / autoname;
952 format occupation_self $OCCUPAA.;
953 run;

NOTE: No observations in data set WORK.IAT2016.
NOTE: The data set WORK.SUMMARY has 0 observations and 5 variables.
NOTE: PROCEDURE MEANS used (Total process time):
real time 0.05 seconds
cpu time 0.04 seconds

Tom
Super User

Take it step by step.

Apply the first criterion.

data step1;
  set IAT2016_raw;
  where left(countrycit)='1';
run;

Then the next:

data step2;
  set step1;
  set IAT2016_raw;
  where occupation_self in ('29-1000','31-1000');
run;

Then the next:

data step3;
  set step2;
  where  sex_5 in (1,2);
run;
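The same idea can be sketched on a toy data set, so the counts are easy to check by hand. The data set names (`raw_demo`, `s1`, `s2`, `s3`) and the four rows are invented for illustration. This sketch assumes each step should read only the output of the previous step, so each one has a single SET statement.

```sas
data raw_demo;
  length countrycit $4 occupation_self $7;
  countrycit='   1'; occupation_self='29-1000'; sex_5=1; output;
  countrycit='   1'; occupation_self='31-1000'; sex_5=.; output;
  countrycit='   1'; occupation_self='11-0000'; sex_5=2; output;
  countrycit='   2'; occupation_self='29-1000'; sex_5=2; output;
run;

/* Criterion 1: country code, ignoring leading blanks (keeps 3 rows) */
data s1;
  set raw_demo;
  where left(countrycit)='1';
run;

/* Criterion 2: occupation code, reading only S1 (keeps 2 rows) */
data s2;
  set s1;
  where occupation_self in ('29-1000','31-1000');
run;

/* Criterion 3: non-missing sex, reading only S2 (keeps 1 row) */
data s3;
  set s2;
  where sex_5 in (1,2);
run;
```

If a step had a second SET statement naming the raw data set, the values read from the raw data set would overwrite the same-named variables from the previous step, and the earlier filter would no longer be reflected in the output. The observation count in the log at each step shows which criterion is removing the records.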

 

tainaj
Obsidian | Level 7
I just applied each step, one after the other:
954 data step1;
955 set IAT2016_raw;
956 where left(countrycit)='1';
957 run;

NOTE: There were 502384 observations read from the data set WORK.IAT2016_RAW.
WHERE LEFT(countrycit)='1';
NOTE: The data set WORK.STEP1 has 502384 observations and 534 variables.
NOTE: DATA statement used (Total process time):
real time 38.80 seconds
cpu time 14.09 seconds


958 data step2;
959 set step1;
960 set IAT2016_raw;
961 where occupation_self in ('29-1000','31-1000');
962 run;

NOTE: There were 9106 observations read from the data set WORK.STEP1.
NOTE: There were 9105 observations read from the data set WORK.IAT2016_RAW.
WHERE occupation_self in ('29-1000', '31-1000');
NOTE: The data set WORK.STEP2 has 9105 observations and 534 variables.
NOTE: DATA statement used (Total process time):
real time 35.55 seconds
cpu time 9.37 seconds


963 data step3;
964 set step2;
965 where sex_5 in (1,2);
966 run;

NOTE: There were 8364 observations read from the data set WORK.STEP2.
WHERE sex_5 in (1, 2);
NOTE: The data set WORK.STEP3 has 8364 observations and 534 variables.
NOTE: DATA statement used (Total process time):
real time 1.55 seconds
cpu time 1.50 seconds
Tom
Super User

So is 8,364 the sample size you expected?

 

The issue was the occupation code.  You were comparing the formatted value to a list of the coded values because you used the PUT() function in the WHERE statement.
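A small sketch of why that comparison fails. The format `$occdemo` and its labels are made up for this example and are not the real $OCCUPAA format.

```sas
/* Hypothetical format: maps the stored code to a descriptive label */
proc format;
  value $occdemo
    '29-1000' = 'Healthcare practitioners'
    '31-1000' = 'Healthcare support';
run;

data _null_;
  occupation_self = '29-1000';
  /* PUT() returns the label, not the stored code */
  formatted = put(occupation_self, $occdemo.);
  put occupation_self= formatted=;

  /* The label is not in the list of codes, so this test is false */
  if formatted in ('29-1000','31-1000') then put 'Formatted value matches the code list';
  else put 'Formatted value does NOT match the code list';

  /* The stored value is in the list of codes, so this test is true */
  if occupation_self in ('29-1000','31-1000') then put 'Stored value matches the code list';
run;
```

So you either compare the stored value to the codes, or compare the PUT() result to the labels, but not a mix of the two.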

tainaj
Obsidian | Level 7
I am honestly unsure how to see what my expected sample size should be (I've only seen this in proc reg, proc glm, etc., but those don't work here since they report no observations), but it definitely should be in the thousands. I've run data from the years 2017-2020 so far and the sample sizes were in the tens of thousands for those.
Someone previously told me to add the PUT function along with the formatted value in the where statement, but I'll remove it and run again!
tainaj
Obsidian | Level 7
Oh wow it worked!! I honestly was close to losing hope. I cannot thank you enough for your knowledge and patience with this!!!


proc import out=work.IAT2016_raw
  datafile="C:\Users\tjoseph6\Documents\My SAS Files\IAT2016.sav"
  dbms=sav replace;
run;

data IAT2016;
  set work.IAT2016_raw (keep=year birthyear sex_5 raceomb D_biep_White_Good_all countrycit occupation_self politicalid_7);
  Age=year-birthyear;
  where left(countrycit)='1' & occupation_self in ('29-1000','31-1000') & sex_5 in (1,2);
  if D_biep_White_Good_all ne .;
  if politicalid_7 eq . then delete;
  if Age lt 16 then delete;
  if raceomb eq . then delete;
  label sex_5='Gender'
        raceomb='Race'
        D_biep_White_Good_all='Overall IAT D Score'
        politicalid_7='Political Ideology Spectrum'
        occupation_self='Occupation'
        countrycit='Country';
run;

proc contents data=IAT2016;
run;

proc means data=IAT2016;
  class occupation_self;
  var D_biep_White_Good_all;
  output out = Summary
    mean =
    n = / autoname;
run;
