I am working with an advertising database for a school project and I've been using the online SAS Studio. I am trying to create a new categorical variable ('TVSpend') from a numerical variable ('TVBudget' which shows the amount spent on TV Advertising) with my dataset. I am having issues with it, but the first 2 variables I created worked fine. I'm sure it is something small I am overlooking, can anyone help?
The error I am getting is right after "If TVBudget >= 100 and < 200 then TVSpend = 'Medium'; else" and the message is:
Here is my code:
Libname Final '/home/pavlockb30/my_courses/SASSpring2019/MyData/FinalProject';
Data Final.Advertising;
Infile '/home/pavlockb30/my_courses/SASSpring2019/MyData/FinalProject/Advertising.csv' DSD Firstobs=2;
Input Observation TVBudget RadioBudget NewspaperBudget Sales;
If Sales > Sum(TVBudget, RadioBudget, NewspaperBudget) then Profit = 'Yes'; else
If Sales < Sum(TVBudget, RadioBudget, NewspaperBudget) then Profit = 'No'; else
If Sales = Sum(TVBudget, RadioBudget, NewspaperBudget) then Profit = 'Breakeven';
If TVBudget > RadioBudget and NewspaperBudget then LargestExpenditure = 'TV'; else
If RadioBudget > TVBudget and NewspaperBudget then LargestExpenditure = 'Radio'; else
If NewspaperBudget > TVBudget and RadioBudget then LargestExpenditure = 'Newspaper';
If TVBudget = . then TVSpend = 'Missing'; else
If TVBudget < 100 then TVSpend = 'Low'; else
If TVBudget >= 100 and < 200 then TVSpend = 'Medium'; else
If TVBudget <= 200 then TVSpend = 'High';
Run;
Proc Print Data = Final.Advertising Noobs;
Run;
try changing this
If TVBudget >= 100 and < 200 then TVSpend = 'Medium'; else
to this
If TVBudget >= 100 and TVBudget < 200 then TVSpend = 'Medium'; else
If TVBudget >= 200 then TVSpend = 'High';
try changing this
If TVBudget >= 100 and < 200 then TVSpend = 'Medium'; else
to this
If TVBudget >= 100 and TVBudget < 200 then TVSpend = 'Medium'; else
If TVBudget >= 200 then TVSpend = 'High';
That was it! Thank you so much!
Or even
If 100 <= TVBudget < 200 then TVSpend = 'Medium'; else
Another approach is to create a custom format such as
Proc format; value tvspend . = 'Missing' low - <100 = 'Low' 100 -< 200 = 'Medium' 200 - High = 'High' ;
And use the format with the numeric value:
proc freq ; tables tvbudget; format tvbudget tvspend.; run;
Advantages:
1) I you need to create a different grouping or change boundaries just modify the Proc Format code or create a new format as needed
2) values will likely sort more consistently, as is "High" will come before "Low" or "Medium" when using character variables.
3) if you have multiple variables that would use the same boundaries you simply use the same format, no need for additional variables generally.
4) you don't have to address variable length issues that sometimes arise when your first character value is shorter than others and you get truncated values because you did not explicitly set a length for the target variable(s).
Most of the analysis and graphic procedures will honor groups created by proc format.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.