BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
maliha9999
Calcite | Level 5

Hi everyone, 

 

I am writing some SAS code for an example dataset below. I am trying to create a new variable to categorize my city/village variable into 14 different categories.

 


City_Village date Sales_Total LHIN
ActonOCT20188341 
ActonOCT201817558 
ActonOCT201811847 
ActonJAN20197428 
ActonJAN201918106 
ActonJAN201913147 
ActonAPR20193876 
ActonAPR201912083 

 

I have written the following code for this example dataset after importing it into SAS.

 

data Sales_LHIN;
set Sales;
LHIN = "";

if City_Village = ("Aurora", "Bradford", "Concord", "Holland Landing", "Keswick", "Maple", "Markham", "Newmarket", "Ontario Centre", "Richmond Hill", "Stouffville", "Thornhill", "Willowdale West", "Willowdale East", "Willowdale South", "King City") then LHIN = "Central";
else if City_Village = ("Ajax", "Bowmanville", "Cobourg", "Courtice", "East Durham Regional Municipality", "Kawartha Lakes And Haliburton County", "Lindsay", "Oshawa", "Peterborough", "Pickering", "Port Hope", "Port Perry", "Uxbridge", "West Durham Regional Municipality", "West Northumberland County", "Whitby") then LHIN = "Central East"; 
else if City_Village = ("Bolton", "Brampton", "Caledon", "Dufferin County", "Grimsby", "North Peel Regional Municipality", "Orangeville", "Woodbridge North", "Woodbridge South") then LHIN = "Central West";
else if City_Village = ("Arnprior", "Cornwall", "Hawkesbury", "Ottawa", "Pembroke", "Petawawa", "Prescott And Russell United Counties", "Renfrew", "Renfrew County And Lanark Highlands Township", "Rideau Lakes Area", "Rockland", "Russell", "Stormont and Glengarry United Counties") then LHIN = "Champlain"; 
else if City_Village = ("Amherstburg", "Chatham-Kent", "Essex", "Kingsville", "Lambton", "Leamington", "Windsor") then LHIN = "Erie St.Clair"; 
else if City_Village = ("Ancaster", "Baden", "Brant And Norfolk", "Burlington", "Caledon", "Delhi", "Dunnville", "East Haldimand County", "Fort Erie", "Hamilton", "La Salle", "Niagara Regional Municipality", "Paris", "Port Colborne", "St. Catherines", "Welland", "West Haldimand County") then LHIN = "Hamilton Niagara Haldimand Brant";
else if City_Village = ("Acton", "Halton Hills", "Milton", "Mississauga", "Oakville") then LHIN = "Mississauga Halton"; 
else if City_Village = ("Algoma", "Cochrane Region", "Elliot Lake", "Espanola", "Kapuskasing", "Kirkland Lake", "Manitoulin", "Nipissing", "North Bay", "Parry Sound", "Sault Ste. Marie Central", "Sault Ste. Marie East", "Sault Ste. Marie North", "Sturgeon Falls", "Timmins", "Timiskaming", "Greater Sudbury") then LHIN = "North East"; 
else if City_Village = ("Alliston", "Barrie", "Bracebridge", "Collingwood", "Georgian Bay", "Gravenhurst", "Huntsville", "Innisfil", "Lake Simcoe North Shore", "Lake Simcoe Southeast Shore", "Midland", "Orillia", "Penetanguishene", "Simcoe", "Wasaga Beach") then LHIN = "North Simcoe Muskoka";
else if City_Village = ("Dryden", "Fort Frances", "Kenora Region", "Lake of The Woods East Shore", "Lake Superior East Shore", "Lake Superior North Shore", "Neebing", "Northwestern Ontario", "Rainy River Region", "Sioux Lookout", "Thunder Bay") then LHIN = "North West";
else if City_Village = ("Amherstview", "Belleville", "Brockville", "Frontenac County, Addington County, Loyalist Shores And Southwest Leeds", "Gananoque", "Kingston", "Napanee", "Quinte Shores, East Northumberland County & Prince Edward County", "Smiths Falls", "South Leeds And Grenville United Counties", "Trenton") then LHIN = "South East"; 
else if City_Village = ("Aylmer", "Bruce Peninsula", "Goderich", "Hanover", "Huron", "Kincardine", "Listowel", "London-Middlesex", "Heaford", "Owen Sound", "Oxford", "Perth", "Port Stanley", "St. Mary's", "St. Thomas North", "St. Thomas South", "Stratford", "Tillsonburg")  then LHIN = "South West";
else if City_Village = ("Toronto", "Weston") then LHIN = "Toronto Central";
else if City_Village = ("Cambridge", "Elmira", "Fergus", "Guelph", "Kitchener", "Waterloo") then LHIN = "Waterloo Wellington";
run; 

 

 

I received the following error after running the code. 

 

if City_Village = ("Aurora", "Bradford", "Concord", "Holland Landing", "Keswick",
-
388
76
181! "Maple", "Markham", "Newmarket", "Ontario Centre", "Richmond Hill", "Stouffville",
181! "Thornhill", "Willowdale West", "Willowdale East", "Willowdale South", "King City") then
181! LHIN = "Central";
182 else if City_Village = ("Ajax", "Bowmanville", "Cobourg", "Courtice", "East Durham
-
388
76
182! Regional Municipality", "Kawartha Lakes And Haliburton County", "Lindsay", "Oshawa",
182! "Peterborough", "Pickering", "Port Hope", "Port Perry", "Uxbridge", "West Durham Regional
182! Municipality", "West Northumberland County", "Whitby") then LHIN = "Central East";
183 else if City_Village = ("Bolton", "Brampton", "Caledon", "Dufferin County", "Grimsby",
-
388
76
183! "North Peel Regional Municipality", "Orangeville", "Woodbridge North", "Woodbridge South")
183! then LHIN = "Central West";
184 else if City_Village = ("Arnprior", "Cornwall", "Hawkesbury", "Ottawa", "Pembroke",
-
388
76
184! "Petawawa", "Prescott And Russell United Counties", "Renfrew", "Renfrew County And Lanark
184! Highlands Township", "Rideau Lakes Area", "Rockland", "Russell", "Stormont and Glengarry
184! United Counties") then LHIN = "Champlain";
185 else if City_Village = ("Amherstburg", "Chatham-Kent", "Essex", "Kingsville",
-
----
-
388
180
180
200
185! "Lambton", "Leamington", "Windsor") then LHIN = "Erie St.Clair";
186 else if City_Village = ("Ancaster", "Baden", "Brant And Norfolk", "Burlington",
-
388
76
186! "Caledon", "Delhi", "Dunnville", "East Haldimand County", "Fort Erie", "Hamilton", "La
186! Salle", "Niagara Regional Municipality", "Paris", "Port Colborne", "St. Catherines",
186! "Welland", "West Haldimand County") then LHIN = "Hamilton Niagara Haldimand Brant";
187 else if City_Village = ("Acton", "Halton Hills", "Milton", "Mississauga", "Oakville")
-
----
-
388
180
180
200
187! then LHIN = "Mississauga Halton";
188 else if City_Village = ("Algoma", "Cochrane Region", "Elliot Lake", "Espanola",
-
388
76
188! "Kapuskasing", "Kirkland Lake", "Manitoulin", "Nipissing", "North Bay", "Parry Sound",
188! "Sault Ste. Marie Central", "Sault Ste. Marie East", "Sault Ste. Marie North", "Sturgeon
188! Falls", "Timmins", "Timiskaming", "Greater Sudbury") then LHIN = "North East";
189 else if City_Village = ("Alliston", "Barrie", "Bracebridge", "Collingwood", "Georgian
-
388
76
189! Bay", "Gravenhurst", "Huntsville", "Innisfil", "Lake Simcoe North Shore", "Lake Simcoe
189! Southeast Shore", "Midland", "Orillia", "Penetanguishene", "Simcoe", "Wasaga Beach") then
189! LHIN = "North Simcoe Muskoka";
190 else if City_Village = ("Dryden", "Fort Frances", "Kenora Region", "Lake of The Woods
-
388
76
190! East Shore", "Lake Superior East Shore", "Lake Superior North Shore", "Neebing",
190! "Northwestern Ontario", "Rainy River Region", "Sioux Lookout", "Thunder Bay") then LHIN =
190! "North West";
191 else if City_Village = ("Amherstview", "Belleville", "Brockville", "Frontenac County,
-
388
76
191! Addington County, Loyalist Shores And Southwest Leeds", "Gananoque", "Kingston", "Napanee",
191! "Quinte Shores, East Northumberland County & Prince Edward County", "Smiths Falls", "South
191! Leeds And Grenville United Counties", "Trenton") then LHIN = "South East";
192 else if City_Village = ("Aylmer", "Bruce Peninsula", "Goderich", "Hanover", "Huron",
-
388
76
192! "Kincardine", "Listowel", "London-Middlesex", "Heaford", "Owen Sound", "Oxford", "Perth",
192! "Port Stanley", "St. Mary's", "St. Thomas North", "St. Thomas South", "Stratford",
192! "Tillsonburg") then LHIN = "South West";
193 else if City_Village = ("Toronto", "Weston") then LHIN = "Toronto Central";
- ---- -
388 180 180
200
194 else if City_Village = ("Cambridge", "Elmira", "Fergus", "Guelph", "Kitchener",
-
----
-
388
180
180
200
194! "Waterloo") then LHIN = "Waterloo Wellington";
ERROR 388-185: Expecting an arithmetic operator.

ERROR 76-322: Syntax error, statement will be ignored.

ERROR 180-322: Statement is not valid or it is used out of proper order.

ERROR 200-322: The symbol is not recognized and will be ignored.

195 run;

NOTE: Character values have been converted to numeric values at the places given by:
(Line):(Column).
185:13 185:29 185:108 187:13 187:29 187:79 193:13 193:29 193:40
194:13 194:29 194:85
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.SALES_LHIN may be incomplete. When this step was stopped there were
0 observations and 5 variables.
WARNING: Data set WORK.SALES_LHIN was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.00 seconds

 

How can I resolve the error in the code?

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

You used = in such a manner that SAS expects the result on the right of the = to be something that assigns a value.

= as a comparison is for a single value.

 

You want to use the IN operator when looking to see if a value is one out of many

f City_Village IN ("Aurora", "Bradford", "Concord", "Holland Landing", "Keswick", "Maple", "Markham", "Newmarket", "Ontario Centre", "Richmond Hill", "Stouffville", "Thornhill", "Willowdale West", "Willowdale East", "Willowdale South", "King City") then LHIN = "Central";

If you are going to bother to include log include the whole thing: actual code and all the messages/errors/warnings/ notes. It is a good idea to paste such text from the log into a text box opened on the forum with the </> icon that appears above the message window to preserve the diagnostic characters in the correct position when provided.

 

I suspect if you do that then the _ underscore characters appear under the equal sign.

 

Note that SAS has a very powerful tool called a Format that means this code would not be needed.

proc format;
value $city_to_region

"Aurora", "Bradford", "Concord", "Holland Landing", "Keswick", 
"Maple", "Markham", "Newmarket", "Ontario Centre", "Richmond Hill", 
"Stouffville", "Thornhill", "Willowdale West", "Willowdale East", 
"Willowdale South", "King City"  = "Central"

"Ajax", "Bowmanville", "Cobourg", "Courtice", 
"East Durham Regional Municipality", 
"Kawartha Lakes And Haliburton County", 
"Lindsay", "Oshawa", "Peterborough", "Pickering", 
"Port Hope", "Port Perry", "Uxbridge", 
"West Durham Regional Municipality", 
"West Northumberland County", "Whitby"  = "Central East"


/* repeat for other region definitions*/
;
run;

/* use: */
Proc freq data=sales;
   tables City_Village;
   format City_Village $city_to_region. ;
run;

You would need to make sure that the custom format is available in any session that you want to use it such as rerunning the proc format code.

Note that the above Proc freq will also show you spelling differences because the value left of the = in the format definition must be the same, so if you have a value of "AUrora" it will show up by itself as not assigned to a group.

 

The format code is going to much easier to modify when you need to add a new city than into a bunch of it/then/else.

 

Note that there is an option Other to indicate what to do with values not listed. This can be useful to reduce scope of a report, such as you want to compare "Central East" with all others.

 

 

The groups formed by custom formats are honored by analysis procedures, such as Freq above, reporting procedures like Proc Report or Tabulate, and generally for graphing purposes.

View solution in original post

1 REPLY 1
ballardw
Super User

You used = in such a manner that SAS expects the result on the right of the = to be something that assigns a value.

= as a comparison is for a single value.

 

You want to use the IN operator when looking to see if a value is one out of many

f City_Village IN ("Aurora", "Bradford", "Concord", "Holland Landing", "Keswick", "Maple", "Markham", "Newmarket", "Ontario Centre", "Richmond Hill", "Stouffville", "Thornhill", "Willowdale West", "Willowdale East", "Willowdale South", "King City") then LHIN = "Central";

If you are going to bother to include log include the whole thing: actual code and all the messages/errors/warnings/ notes. It is a good idea to paste such text from the log into a text box opened on the forum with the </> icon that appears above the message window to preserve the diagnostic characters in the correct position when provided.

 

I suspect if you do that then the _ underscore characters appear under the equal sign.

 

Note that SAS has a very powerful tool called a Format that means this code would not be needed.

proc format;
value $city_to_region

"Aurora", "Bradford", "Concord", "Holland Landing", "Keswick", 
"Maple", "Markham", "Newmarket", "Ontario Centre", "Richmond Hill", 
"Stouffville", "Thornhill", "Willowdale West", "Willowdale East", 
"Willowdale South", "King City"  = "Central"

"Ajax", "Bowmanville", "Cobourg", "Courtice", 
"East Durham Regional Municipality", 
"Kawartha Lakes And Haliburton County", 
"Lindsay", "Oshawa", "Peterborough", "Pickering", 
"Port Hope", "Port Perry", "Uxbridge", 
"West Durham Regional Municipality", 
"West Northumberland County", "Whitby"  = "Central East"


/* repeat for other region definitions*/
;
run;

/* use: */
Proc freq data=sales;
   tables City_Village;
   format City_Village $city_to_region. ;
run;

You would need to make sure that the custom format is available in any session that you want to use it such as rerunning the proc format code.

Note that the above Proc freq will also show you spelling differences because the value left of the = in the format definition must be the same, so if you have a value of "AUrora" it will show up by itself as not assigned to a group.

 

The format code is going to much easier to modify when you need to add a new city than into a bunch of it/then/else.

 

Note that there is an option Other to indicate what to do with values not listed. This can be useful to reduce scope of a report, such as you want to compare "Central East" with all others.

 

 

The groups formed by custom formats are honored by analysis procedures, such as Freq above, reporting procedures like Proc Report or Tabulate, and generally for graphing purposes.

sas-innovate-2024.png

 

Secure your spot at the must-attend AI and analytics event of 2024: SAS Innovate 2024! Get ready for a jam-packed agenda featuring workshops, super demos, breakout sessions, roundtables, inspiring keynotes and incredible networking events.

 

Register by March 1 to snag the Early Bird rate of just $695! Don't miss out on this exclusive offer. 

 

Register now!

From SAS Users blog
Want more? Visit our blog for more articles like these.
5 Steps to Your First Analytics Project Using SAS

For SAS newbies, this video is a great way to get started. James Harroun walks through the process using SAS Studio for SAS OnDemand for Academics, but the same steps apply to any analytics project.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 1 reply
  • 297 views
  • 1 like
  • 2 in conversation