I am in the midst of classifying survey "Place of employment" responses with NAICS industry codes. I got great help the other day with the
suggestion to use the FINDW function. I used the following:
If findw(Place_of_employment,'CONSTRUCTION',,'spit')>0 then NAICS_Sector = "Goods";
The data is somewhat hierarchical. Place of employment is survey input collected by contact tracers.
Each place of employment is then classified as either
Services or Goods.
What I need to do now is go down one more level, depending on whether the employment is in the Goods or Services sector.
Something like this (I am only showing one category from Goods; Services is set up the same way). I can't copy a few lines from the sample SAS data set, so I have attached a few rows in text format to give an idea of the type of survey data that I am working with. I have started filling out other fields that will eventually be used for frequencies and other analysis. I haven't yet done the government level, which will take care of a lot of rows.
Level 1: Goods
Level 2: Manufacturing
Level 3: Food Manufacturing
         Wood Products Manufacturing
         Plastics and Rubber Products
         Primary Metals
         Computer and Electronic Product Manufacturing, and so on
The FINDW approach helped classify Level 1 (Goods or Services).
Since I still have to use the Place_of_employment field to glean any information, I was thinking of using something like
the colon modifier approach. What are better approaches (besides the colon modifier, if that is even a legitimate approach) that would help
fill the Sector_Type field in the data set? If I can get it to that level, that is probably sufficient for the first go-around.
Thank you.
You may get lucky with some values, but given the way Place_of_employment was collected you are going to have a very hard time getting any such detail from something like "Lake Ridge" [maybe if your target geography is extremely limited you might know more] or "Johnson & Sons", unless there is much more information available in your data than shown.
If the survey was conducted by telephone you might have some luck with one of the services that collects information about organizations by telephone. Some of these would be survey sample design companies, where typically you might ask them to generate a sample of XXX number of companies in Industry YYYY. Yes, $$$ would be involved. It may save a lot of time, which could be considered money as well. Ask for the right details and they might have them. Then merge the information back to your data set.
If the number of records isn't prohibitive, web searches with company names and some geography may allow you to get enough information. Create a data set to merge back to your data. If by any chance you happened to collect a URL, this might be a pretty quick process.
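As a rough sketch of the "create a data set to merge back" step, something like the following could work. The data set names (work.survey, work.employer_lookup), the Org_Type variable, and the sample rows are all hypothetical and only for illustration; the join key assumes the employer name is spelled the same way in both data sets apart from case and surrounding blanks.

```sas
/* Hypothetical survey rows, for illustration only */
data work.survey;
    infile datalines truncover;
    input Place_of_employment $char60.;
    datalines;
Lake Ridge
Johnson & Sons
;
run;

/* Hypothetical lookup built by hand from web searches */
data work.employer_lookup;
    infile datalines dsd truncover;
    length Place_of_employment $60 Org_Type $40;
    input Place_of_employment Org_Type;
    datalines;
LAKE RIDGE,High school
;
run;

/* Left join keeps every survey row; unmatched rows get a missing Org_Type */
proc sql;
    create table work.survey_enriched as
    select s.*, l.Org_Type
    from work.survey as s
    left join work.employer_lookup as l
        on upcase(strip(s.Place_of_employment)) =
           upcase(strip(l.Place_of_employment));
quit;
```

Rows that come back with a missing Org_Type are the ones still needing a search, so the lookup data set can grow a few rows at a time.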
Lesson learned from this survey is "collect all the types of information you need" in the body of the survey. I wouldn't expect a company to report on NAICS but you can ask "type of organization" or "what does your organization do?" questions. Of course that doesn't help this one.
I do feel your pain. I have worked with surveys where we were trying to classify the "role" our respondent filled in an organization. We asked for the person most familiar with certain aspects of billing. Our expected response list, developed over several cycles of the survey, had about 10 main categories. We still had to clean up the more than 30% of responses that selected "other", with an open-ended response collected, to fit them into the reporting categories. That was 30% of over 50,000 surveys.
Thank you for the response. Fortunately, the Place of employment field refers to entities in OR. For example, Lake Ridge is a high school. I haven't finished the NAICS_Sector classification yet (between Goods and Services). I was basically just jumping ahead a bit to see if SAS had some functions or methods to do the Level 2 classification. There is still a lot of manual work to be done - if I come to a place of employment that is unfamiliar, I will do a Google search on the firm just to help delineate whether it is a Goods or Services firm. But then another line of code needs to be added, so the code blocks can get long because of the repetitive nature of the exercise.
I can get my way through the Level 1 classification using the Findw method you provided. Thanks.
I was just crossing my fingers that there may be some workarounds in SAS that I am not familiar with (and there is plenty that I'm not familiar with!)
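One possible way to keep the IF/THEN lines from piling up is to move the keywords into a data set and loop over it, so that a newly researched employer means adding a row rather than another line of code. This is only a sketch under assumptions: the work.keywords table, its sample rows, the work.survey input, and the matched_sector / matched_type variables are hypothetical names, and the keyword-to-category assignments are illustrative rather than a checked NAICS mapping. It uses the same kind of whole-word FINDW test as above, ignoring case and treating blanks and punctuation as delimiters.

```sas
/* Hypothetical survey rows, for illustration only */
data work.survey;
    infile datalines truncover;
    input Place_of_employment $char60.;
    datalines;
ACME CONSTRUCTION, INC.
Riverside Bakery
Lake Ridge
;
run;

/* Hypothetical keyword table: one row per keyword, illustrative values */
data work.keywords;
    infile datalines dsd truncover;
    length keyword $40 kw_sector $8 kw_type $40;
    input keyword kw_sector kw_type;
    datalines;
CONSTRUCTION,Goods,Construction
BAKERY,Goods,Food Manufacturing
SAWMILL,Goods,Wood Products Manufacturing
;
run;

data work.classified;
    set work.survey;
    length matched_sector $8 matched_type $40;
    /* Walk the keyword table; stop at the first whole-word match */
    do k = 1 to n_keywords;
        set work.keywords point=k nobs=n_keywords;
        /* 'p' adds punctuation to the blank delimiter, 'i' ignores case */
        if findw(Place_of_employment, strip(keyword), ' ', 'pi') > 0 then do;
            matched_sector = kw_sector;
            matched_type   = kw_type;
            leave;
        end;
    end;
    drop keyword kw_sector kw_type;
run;
```

Rows with no matching keyword keep blank matched_sector and matched_type values, which makes them easy to pull out for manual review. The order of rows in the keyword table matters, since the first match wins.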