tschmac
Calcite | Level 5

hi - the data I've downloaded had a bunch of dummy variables that were coded as essentially yes=2 and no=1. I tried to fix this by creating a new dataset and changing them to 1=yes and 0=no. This is my code below:

data dummies;
set IPUMS.usa_00003_f;
if HCOVPRIV=2 then private=1;
else private=0;
if HCOVPUB=2 then public=1;
else public=0;
if HINSCAID=2 then medicaid=1;
else medicaid=0;
if HINSCARE=2 then medicare=1;
else medicare=0;
if HINSEMP=2 then employer=1;
else employer=0;
if HINSPUR=2 then direct=1;
else direct=0;
if HINSTRI=2 then tricare=1;
else tricare=0;
if HINSVA=2 then va=1;
else va=0;

proc freq data=dummies;
tables private public medicaid medicare employer direct tricare va / out=coverage;
by year; run;
proc means; run;

 

I also ran a proc freq prior to this new dataset being made and it ran perfectly fine but now that I've made this new dataset, some variables are missing (age, sex, race, year). I have no idea why those variables went away and now my new dummy variables are all showing up as a value of 0, but there are 430 observations so I know that they are there but just all showing up as zero. Can someone please let me know what I've done wrong?

 

PaigeMiller
Diamond | Level 26

The most important step in debugging is to look at the data in IPUMS.usa_00003_f: check what it contains, check whether all of the desired variables are there, and then decide whether your code is doing the right thing to that data.

 

If you don't see an error, then show us the data in IPUMS.usa_00003_f.
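
A minimal sketch of that inspection, assuming the library and variable names from the question. It lists the stored variables, tabulates every value of the coverage variables (missing included), and prints a few rows:

```sas
/* List the variables actually stored in the source data set */
proc contents data=IPUMS.usa_00003_f;
run;

/* Show every value of the coverage variables, including missing ones */
proc freq data=IPUMS.usa_00003_f;
   tables HCOVPRIV HCOVPUB HINSCAID HINSCARE HINSEMP HINSPUR HINSTRI HINSVA / missing;
run;

/* Look at the first few rows */
proc print data=IPUMS.usa_00003_f (obs=10);
run;
```

If age, sex, race, and year are absent from the PROC CONTENTS listing, they were already gone before the recode step ran.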

 

As a side comment, you really don't need dummy variables for PROC FREQ, and depending on what you are doing, you may not need dummy variables for PROC SUMMARY/PROC MEANS either.
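
For instance, a format can label the original codes directly. This is only a sketch: the format name yesno is made up for illustration, and it assumes the variables are numeric with 2=yes and 1=no as described in the question.

```sas
/* Hypothetical format: labels follow the 2=yes, 1=no coding from the question */
proc format;
   value yesno
      1 = 'No'
      2 = 'Yes';
run;

/* Tabulate the original variables; no recoded copies are needed */
proc freq data=IPUMS.usa_00003_f;
   tables HCOVPRIV HCOVPUB / missing;
   format HCOVPRIV HCOVPUB yesno.;
run;
```

Any value other than 1 or 2 prints unformatted, so unexpected codes remain visible in the table.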

--
Paige Miller
Reeza
Super User

Do you not have any missing data?

 

If you do, this form of IF/THEN/ELSE codes the missing values as 0, because anything that is not 2 falls into the ELSE branch.

 

if HCOVPRIV=2 then private=1;
else private=0;
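
One possible way to keep missing values missing, shown for a single variable only (a sketch, not the only option):

```sas
data dummies;
   set IPUMS.usa_00003_f;
   /* Handle missing first so it never reaches the final ELSE */
   if missing(HCOVPRIV) then private = .;
   else if HCOVPRIV = 2 then private = 1;
   else private = 0;
run;
```

The same three-branch pattern would be repeated for each of the other coverage variables.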

ballardw
Super User

@tschmac wrote:

 

I also ran a proc freq prior to this new dataset being made and it ran perfectly fine but now that I've made this new dataset, some variables are missing (age, sex, race, year). I have no idea why those variables went away and now my new dummy variables are all showing up as a value of 0, but there are 430 observations so I know that they are there but just all showing up as zero. Can someone please let me know what I've done wrong?

 


If you ever rerun that recode syntax on data that has already been recoded, all 0s is exactly what will happen.

So make sure that you never use syntax like:

 

data samedataset;
   set samedataset;
   /* recode statements */
run;

 

I have seen this happen many times with syntax like the above. You run the code, check the result, and then decide you need another change of some sort, such as recoding another variable. But the first run already replaced the existing data set, so the original 2/1 values had already been turned into 1/0. Rerunning the code then changes every 1 to 0, because there are no longer any 2s.

 

I suspect you will need to go back to the original step that created your data set and start over.
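
A sketch of the safer pattern, assuming IPUMS.usa_00003_f is the untouched original: always read from the original and write to a different name, and check the source values before any rerun.

```sas
/* Read from the original, write to a different data set name */
data work.dummies;
   set IPUMS.usa_00003_f;
   private = (HCOVPRIV = 2);
run;

/* Before rerunning, confirm the source still holds 1/2 rather than 0/1 */
proc freq data=IPUMS.usa_00003_f;
   tables HCOVPRIV / missing;
run;
```

If the PROC FREQ output shows only 0 and 1, the source has already been recoded and needs to be rebuilt from the download.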

 

OR another approach, depending on how you read your data, might be to use a custom informat to read the values as wanted to start with.

A brief example that includes an exception for completely unexpected values. The informat will work slightly differently with INFILE depending on the file type.

proc format library=work;
   invalue my2level
      '2' = 1
      '1' = 0
      ' ','.' = .
      other = _error_   /* treat anything else as invalid data */
   ;
run;

data example;
   informat x my2level.;
   input x;
datalines;
2
1
. 
z
;

Also, when you are going to do the exact same thing to a bunch of variables, an array is likely the cleaner approach:

data dummies;
   set IPUMS.usa_00003_f;
   array h  HCOVPRIV HCOVPUB HINSCAID  HINSCARE  HINSEMP  HINSPUR HINSTRI  HINSVA;
   array v  private  public  medicaid  medicare  employer direct  tricare  va    ;
   do i= 1 to dim(h);
      v[i] = ( h[i] = 2);
   end;
   drop i;
run;

To map each variable to a matching new variable, use two arrays, and make sure the variables are listed in the same order in both.

The code above also relies on the nontrivial fact that SAS evaluates a true comparison to the numeric value 1 and a false one to 0.

Imagine that you have 50 identically coded variables that you need to recode. You can see that the array simplifies your code drastically. You only need to add the variables, in order, to the two ARRAY statements (in this case). You don't have to add an additional 42 blocks of IF/THEN/ELSE code.
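
Whichever recode you use, a quick cross-check of old against new values shows whether it did what you intended. A minimal sketch, assuming the dummies data set created by the array step above:

```sas
/* Each original code should pair with exactly one recoded value */
proc freq data=dummies;
   tables HCOVPRIV*private HCOVPUB*public / missing list;
run;
```

The LIST option prints one row per combination, so a missing or unexpected original code that ended up as 0 is easy to spot.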

 

Astounding
PROC Star

A safer method to convert your variables would be:

 


private = HCOVPRIV - 1;
public = HCOVPUB - 1;
etc.

That way, if there are unusual values in your original variables, they will still stand out when you are done processing the data.
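
Combined with the array approach from the earlier reply, the subtraction might look like this (a sketch using the variable names from the question):

```sas
data dummies;
   set IPUMS.usa_00003_f;
   array h HCOVPRIV HCOVPUB HINSCAID HINSCARE HINSEMP HINSPUR HINSTRI HINSVA;
   array v private public medicaid medicare employer direct tricare va;
   do i = 1 to dim(h);
      /* 2 becomes 1 and 1 becomes 0; any other code stays visibly odd */
      v[i] = h[i] - 1;
   end;
   drop i;
run;
```

A missing input stays missing here, and SAS writes a note to the log that missing values were generated.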
