gejoachim99
Fluorite | Level 6

I am using an existing dataset to create a bunch of new variables, all to make one final variable. I have that final variable in a continuous form (jointyears), and I want to make it categorical (jointyears_cat). I've used if/then statements to do so. However, when I run the if/then statements, the log tells me that the continuous variable, jointyears, is uninitialized, even though I created it immediately before and included a data and a set statement. I also ran proc contents on work.adenovar and work.adeno1, and in each the variables that I need for the coding are all present. Here is my code/log:

 

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 /*duration*/
74 data work.adenovar;
75 set work.adeno1;
76 if hashtot >= 1 and hashtot < =4 and hashfage1 ^= 99 and hashlage1 ^= 99 then hashdur1 = (hashlage1-hashfage1)+1;
77 if hashtot >= 2 and hashtot <= 4 and hashfage2 ^= 99 and hashlage1 ^= 99 then hashdur2 = (hashlage2-hashfage2)+1;
78 if hashtot >= 3 and hashtot <= 4 and hashfage3 ^= 99 and hashlage1 ^= 99 then hashdur3 = (hashlage3-hashfage3)+1;
79 else if hashtot = 4 and hashfage4 ^= 99 and hashlage4 ^= 99 then hashdur4 = (hashlage4-hashfage4)+1;
80
81 /*total duration*/
82 hashdurtot = sum(of hashdur1-hashdur4);
83
84 /*convert all times to days*/
85 if hashnumx1 = 1 then hashnumx1_day = "1";
86 if hashnumx1 = 2 then hashnumx1_day = "7";
87 if hashnumx1 = 3 then hashnumx1_day = "30";
88 if hashnumx1 = 4 then hashnumx1_day = "365.25";
89 if hashnumx1 = 5 then hashnumx1_day = "1";
90 else if 8 > hashnumx1 > 9 then hashnumx1_day = ".";
91 if hashnumx2 = 1 then hashnumx2_day = "1";
92 if hashnumx2 = 2 then hashnumx2_day = "7";
93 if hashnumx2 = 3 then hashnumx2_day = "30";
94 if hashnumx2 = 4 then hashnumx2_day = "365.25";
95 if hashnumx2 = 5 then hashnumx2_day = "1";
96 else if 8 > hashnumx2 > 9 then hashnumx2_day = ".";
97 if hashnumx3 = 1 then hashnumx3_day = "1";
98 if hashnumx3 = 2 then hashnumx3_day = "7";
99 if hashnumx3 = 3 then hashnumx3_day = "30";
100 if hashnumx3 = 4 then hashnumx3_day = "365.25";
101 if hashnumx3 = 5 then hashnumx3_day = "1";
102 else if 8 > hashnumx3 > 9 then hashnumx3_day = ".";
103 if hashnumx4 = 1 then hashnumx4_day = "1";
104 if hashnumx4 = 2 then hashnumx4_day = "7";
105 if hashnumx4 = 3 then hashnumx4_day = "30";
106 if hashnumx4 = 4 then hashnumx4_day = "365.25";
107 if hashnumx4 = 5 then hashnumx4_day = "1";
108 else if 8 > hashnumx4 > 9 then hashnumx4_day = ".";
109
110 /*inhalations/day calculation*/
111 /*1ep*/ if hashtot = 1 then inhl_day = ((hashinhl1*hashnum1)/hashnumx1_day);
112 /*2ep*/ if hashtot = 2 then inhl_day = ((hashinhl1*hashnum1)/hashnumx1_day) + ((hashinhl2*hashnum2)/hashnumx2_day);
113 /*3ep*/ if hashtot = 3 then inhl_day = ((hashinhl1*hashnum1)/hashnumx1_day) + ((hashinhl2*hashnum2)/hashnumx2_day) +
113 ! ((hashinhl3*hashnum3)/hashnumx3_day);
114 /*4ep*/ if hashtot = 4 then inhl_day = ((hashinhl1*hashnum1)/hashnumx1_day) + ((hashinhl2*hashnum2)/hashnumx2_day) +
114 ! ((hashinhl3*hashnum3)/hashnumx3_day) + ((hashinhl4*hashnum4)/hashnumx4_day);
115 else if 98 > hashtot > 99 then inhl_day = ".";
116
117 /*inhalations/day to joints/day*/
118 joints_day = (inhl_day / 12);
119
120 /*joint years calculation*/
121 jointyears = joints_day * hashdurtot;
122 run;
 
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
111:62 112:62 112:101 113:62 113:101 113:140 114:62 114:101 114:140 114:179 115:43
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column).
403 at 82:14 4 at 111:61 404 at 118:24 407 at 121:25
NOTE: There were 989 observations read from the data set WORK.ADENO1.
NOTE: The data set WORK.ADENOVAR has 989 observations and 2961 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.07 seconds
 
 
123
124 data work.adenovar;
125 set work.adeno1;
126 if jointyears = 0 then jointyears_cat = "0";
127 if 0 > jointyears >= 2 then jointyears_cat = "1";
128 if 2 > jointyears >= 5 then jointyears_cat = "2";
129 else if jointyears > 5 then jointyears_cat = "3";
130 run;
 
NOTE: Variable jointyears is uninitialized.
NOTE: There were 989 observations read from the data set WORK.ADENO1.
NOTE: The data set WORK.ADENOVAR has 989 observations and 2951 variables.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.06 seconds
 
 
131
132 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
144
 
Please help!! Thank you. 
Accepted Solution
vellad
Obsidian | Level 7
I think you meant to set adenovar instead of adeno1 in the second data step!
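
For readers who hit the same note, here is a minimal sketch of that fix, not the poster's final code. The log shows why the note appeared: the second step read work.adeno1, which has no jointyears, and wrote 2951 variables over the 2961-variable work.adenovar. The first step below builds a tiny stand-in for work.adenovar so the example runs on its own. The output name adenovar_cat and the cut points are assumptions: as posted, range tests such as `0 > jointyears >= 2` can never be true, so the sketch assumes the intended bins were 0, (0, 2], (2, 5] and above 5.

```sas
/* Stand-in for the poster's work.adenovar (hypothetical values) */
data work.adenovar;
  input jointyears;
datalines;
0
1.5
2
4.2
7
.
;

/* Read the data set that contains jointyears, and write to a new
   name (assumed) so the source data set is not overwritten */
data work.adenovar_cat;
  set work.adenovar;            /* was: set work.adeno1 */
  length jointyears_cat $ 1;
  if jointyears = 0 then jointyears_cat = '0';
  else if 0 < jointyears <= 2 then jointyears_cat = '1';
  else if 2 < jointyears <= 5 then jointyears_cat = '2';
  else if jointyears > 5 then jointyears_cat = '3';
  /* a missing jointyears matches no branch, so jointyears_cat stays blank */
run;

proc print data=work.adenovar_cat;
run;
```

Chaining the tests with else also means each row is assigned by at most one branch, which the original mix of independent if statements did not guarantee.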

gejoachim99
Fluorite | Level 6
That fixed it! thank you so much!
Reeza
Super User
Please mark the question as solved by selecting @vellad's answer as the correct solution.
ballardw
Super User

If you are using 99 in these variables to indicate "value not recorded", "value out of range", "not valid", "not available", or any related idea that means the value should never be used in a calculation, then you may want to consider learning that SAS has a concept of "missing value". By default, missing values are not used for most calculations.

 

This can potentially save you a LOT of "and thisvar ne 99 and thatvar ne 99 and othervar ne 99" type code. (I use ne instead of ^= because it is easier to type). Example:

data example;
  input hashtot hashlage1 hashfage1;

  /* no "ne 99" tests needed: a missing (.) age makes hashdur1 missing */
  if hashtot >= 1 and hashtot <= 4 then hashdur1 = (hashlage1 - hashfage1) + 1;
datalines;
0 3 4
1 2 3
1 . 17
1 1 .
1 . .
;

proc print data=example;
run;

The missing values are also excluded for summary statistics purposes:

proc means data=example n max min mean median;
   var hashtot hashlage1 hashfage1;
run;
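
If 99 really is a "not recorded" code in these variables (the thread does not confirm this), one way to get from the coded values to SAS missing values is a single recode pass with an array. This is only a sketch: the data set name example_recode, the loop index _i, and the sample rows are made up for illustration.

```sas
data work.example_recode;
  input hashtot hashfage1 hashlage1;

  /* Recode the assumed special value 99 to missing, once, up front */
  array ages {*} hashfage1 hashlage1;
  do _i = 1 to dim(ages);
    if ages{_i} = 99 then ages{_i} = .;
  end;
  drop _i;

  /* The duration calculation no longer needs any "ne 99" tests */
  if 1 <= hashtot <= 4 then hashdur1 = (hashlage1 - hashfage1) + 1;
datalines;
1 2 3
1 99 17
1 1 99
;

proc print data=work.example_recode;
run;
```

Only the first row has both ages recorded, so only that row gets a non-missing hashdur1.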

Naming every variable with a prefix like "hash" may be counterproductive, in that it is 4 extra characters you have to type all the time.

That many variables with suffixes makes me suspect that you may have a poor data structure and that is causing lots of repetitive code for the same variable meaning. Typically this occurs when translating a process from spreadsheet thinking.

 

gejoachim99
Fluorite | Level 6

Thank you for the reply, even though it is not at all related to the question I asked. I am fully aware that SAS has a concept of "missing value," as you put it, and that 99 is not used in calculations. I wrote my code to ensure those observations were excluded. I'm not sure how you came to those conclusions about my use of 99, as I gave very little information about my dataset. Following this, you literally have no information about my dataset, the size of it, or what I'm using it for, so I would refrain from telling me how I should name my variables, and declaring that I have poor data structure. Next time, try answering the questions that people ask. Have a good one! 

ballardw
Super User



We have a lot of people on this forum just learning SAS. The number of posts you have made does not indicate any great experience with SAS, so helpful hints should be considered before getting upset. I came to consider that 99 might be standing in for missing data because 1) you were excluding that value from calculation for multiple variables and 2) 99, 999, 9999 and similar values have been used as "special values" for a very long time, often in languages that do not have a concept like "missing", which requires many explicit "if" clauses to work around them.

 

I note, though, that you do not say anywhere that your data or process did not derive from a spreadsheet.

 

I can also tell that you are potentially mangling some date-related values (365.25 is NOT exactly one year), are unfamiliar with arrays (which would likely reduce all of the hashnumx code by a lot), don't care to use proper variable types (creating a bunch of "day" variables as character and then immediately using them in arithmetic), and could quite possibly use some experience with Formats to avoid that second data step entirely.
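
As a rough illustration of the array, variable-type, and format points, here is a sketch under stated assumptions rather than a rewrite of the original program. The data sets work.units and work.units_days, the format name jycat, the index _i, and the sample rows are all hypothetical. The unit mapping (1 and 5 to 1 day, 2 to 7, 3 to 30, 4 to 365.25, anything else to missing) is copied from the posted code, and the category cut points are assumed to be 0, (0, 2], (2, 5] and above 5. Storing the day values as numeric would also avoid the "Character values have been converted to numeric values" note in the posted log.

```sas
/* Stand-in data (hypothetical values); the real input is work.adeno1 */
data work.units;
  input hashnumx1-hashnumx4 jointyears;
datalines;
1 2 3 4 0
5 8 9 . 1.5
2 2 1 1 6
;

/* Assumed cut points; the format replaces a separate recoding step */
proc format;
  value jycat
    0        = '0'
    0<-2     = '1'
    2<-5     = '2'
    5<-high  = '3';
run;

data work.units_days;
  set work.units;
  /* One loop instead of four near-identical blocks of IF statements */
  array numx     {4} hashnumx1-hashnumx4;
  array numx_day {4} hashnumx1_day hashnumx2_day hashnumx3_day hashnumx4_day;
  do _i = 1 to dim(numx);
    select (numx{_i});
      when (1, 5) numx_day{_i} = 1;       /* numeric, not "1" */
      when (2)    numx_day{_i} = 7;
      when (3)    numx_day{_i} = 30;
      when (4)    numx_day{_i} = 365.25;
      otherwise   numx_day{_i} = .;       /* codes such as 8, 9, or missing */
    end;
  end;
  drop _i;
run;

/* Group the continuous variable at reporting time via the format */
proc freq data=work.units_days;
  tables jointyears / missing;
  format jointyears jycat.;
run;
```

With the format approach, jointyears stays continuous in the data set and the grouping is applied only where a procedure needs it.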

gejoachim99
Fluorite | Level 6

My only point is that you did not answer the question I asked. While I have only started using this forum recently, I have been using SAS for years now. You also have no idea of the PURPOSE of my coding, so I am again confused about how exactly you are coming to the conclusions you are. I would implore you to not make such assumptions. Lastly, I assure you, I would take what you had written into consideration without getting "upset," as you insist I am, if you would attempt to help in a less condescending and uninformed way.

JackHamilton
Lapis Lazuli | Level 10

These forums have participants who are SAS employees, but most participants are non-SAS volunteers.  You have no standing to specify what kind of help you are going to get. 

I agree with ballardw's comments on your code.  They do make the unwarranted assumption that you are interested in making your code easier to read and debug, but even if that is not your goal, it might be the goal of other people who read the messages here.  As of this writing, 140 people have looked at this message thread, and some of them might find ballardw's advice useful.  They might be newcomers who haven't yet learned that dividing by 365.25, or even 365.2422, is not the best way in SAS to calculate years, or that coding 99... is not the best way to represent unknown values in SAS (and for that matter, was probably never the best choice in any language when dealing with non-trivial data domains).
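
For newcomers wondering what the alternative to dividing by 365.25 looks like, here is a small sketch comparing three calculations on a pair of made-up dates. It applies to elapsed time between two SAS date values; the posted code uses 365.25 as a unit conversion rather than a date difference, so whether it matters there is a judgment call. The variable names and dates are hypothetical.

```sas
data _null_;
  start = '15MAR2000'd;
  stop  = '14MAR2020'd;

  approx = (stop - start) / 365.25;                    /* fixed-length year */
  exact  = yrdif(start, stop, 'AGE');                  /* calendar-aware years */
  whole  = intck('year', start, stop, 'continuous');   /* completed years only */

  put approx= exact= whole=;
run;
```

YRDIF with the 'AGE' basis and INTCK with the 'continuous' method both work from the actual calendar, so they account for leap years rather than assuming every year has the same length.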

 

 

gejoachim99
Fluorite | Level 6

I see your point. In that case, I truly hope that users insulting my code will be helpful to others. 

Reeza
Super User

To all involved, you can click on someone's name and then click Ignore. It's a great feature that I've started using to protect my time and sanity.

FreelanceReinh
Jade | Level 19



Hi @Reeza,

 

Sounds promising. Are there other implications of "ignoring" someone than just blocking their PMs?

Reeza
Super User
It blocks notifications as well, but not sure beyond that.
