wlierman
Lapis Lazuli | Level 10

I am in the midst of classifying survey "Place of employment" responses with NAICS industrial codes. I got great help the other day with the suggestion to use the FINDW function. I used the following:

If findw(Place_of_employment,'CONSTRUCTION',,'spit')>0 then NAICS_Sector = "Goods";

 

The data is somewhat hierarchical. The Place of employment values come from contact tracers' survey input.

 

Place of employment  then classify between

 

                                     Services                                                 Goods

 

What I need to do now is go down one more level, depending on whether the employment is in the Goods or Services sector.

Something like this (I am only showing one category from Goods; the Services side is set up the same way). I can't copy a few lines from the sample SAS data set, so I have attached a few rows in text format to give an idea of the type of survey data that I am working with. I have started filling out other fields that will eventually be used for frequencies and other analysis. I haven't yet done the government level, which will take care of a lot of rows.

       

    Level 1     Goods

    Level 2         Manufacturing

    Level 3                Food Manufacturing

                               Wood Products Manufacturing

                               Plastics and Rubber Products

                               Primary Metals

                               Computer and Electronic Product Manufacturing      and so on

 

The Findw approach helped classify Level 1 (Goods or Services)

 

Since I still have to use the Place_of_employment field to glean any information, I was thinking of using something like the colon modifier approach. What are better approaches (if the colon modifier is even a legitimate approach) that would help fill the Sector_Type field in the data set? If I can get it to that level, that is probably sufficient for the first go-around.

 

Thank you.


2 REPLIES
ballardw
Super User

You may get lucky with some values, but given the way Place_of_employment was collected you are going to have a very hard time getting any such detail from something like "Lake Ridge" [maybe if your target geography is extremely limited you might know more] or "Johnson & Sons" unless there is much more information available in your data than shown.

 

If the survey was conducted by telephone you might have some luck with one of the services that collects information about organizations by telephone. Some of these would be survey sample design companies where typically you might ask them to generate a sample of XXX number of companies in Industry YYYY. Yes $$$ would be involved. May save a lot of time which could be considered money as well. Ask for the right details and they might have it. Then merge the information back to your data set.

 

If the number of records isn't prohibitive, web searches with company names and some geography may allow you to get enough information. Create a data set to merge back to your data. If by any chance you happened to collect a URL, this might be a pretty quick process.
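A minimal sketch of that merge-back step, assuming a hand-built lookup keyed on an upper-cased employer name. The data set names (work.survey, work.employer_lookup), the lookup column names, and the "Educational Services" value are made up for illustration; only Place_of_employment comes from the question.

```sas
/* Stand-in for the real survey data; only the employer field is shown */
data work.survey;
   infile datalines dsd truncover;
   length Place_of_employment $60;
   input Place_of_employment;
   datalines;
Lake Ridge
Johnson & Sons
;
run;

/* Hypothetical lookup filled in by hand after researching each employer */
data work.employer_lookup;
   infile datalines dsd truncover;
   length Match_name $60 Lookup_Sector $8 Lookup_Sector_Type $40;
   input Match_name Lookup_Sector Lookup_Sector_Type;
   datalines;
LAKE RIDGE,Services,Educational Services
;
run;

/* Left join keeps every survey row; employers not yet researched
   come through with missing lookup values */
proc sql;
   create table work.survey_classified as
   select s.*, l.Lookup_Sector, l.Lookup_Sector_Type
   from work.survey as s
   left join work.employer_lookup as l
     on upcase(strip(s.Place_of_employment)) = l.Match_name;
quit;
```

Rows that come back with a missing Lookup_Sector are the ones still waiting for a search, so the same table doubles as a to-do list.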

 

Lesson learned from this survey is "collect all the types of information you need" in the body of the survey. I wouldn't expect a company to report on NAICS but you can ask "type of organization" or "what does your organization do?" questions. Of course that doesn't help this one.

 

I do feel your pain. I have worked with surveys where we were trying to classify the "role" our respondent filled in an organization. We asked for the person most familiar with certain aspects of billing. Our expected response list, developed over several cycles of the survey, had about 10 main categories. We still had to clean up the more than 30% of responses that selected "other", with an open-ended response collected, to fit them for reporting. That is 30% of over 50,000 surveys.

 

 

 

wlierman
Lapis Lazuli | Level 10

Thank you for the response. Fortunately, the Place of employment field refers to entities in OR. For example, Lake Ridge is a high school. I haven't finished up the NAICS_sector classification yet (between Goods and Services). I was basically just jumping ahead a bit to see if SAS had some functions or methods to do the Level 2 classification. There is still a lot of manual work to be done: if I come to a place of employment that is unfamiliar, I will do a Google search on the firm just to help delineate whether it is a Goods or Services firm. But then another line of code needs to be added, so the code blocks can get long because of the repetitive nature of the exercise.
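One possible way to keep those blocks from growing, sketched here under assumptions rather than tested on the real data: hold the keywords and their target categories in paired temporary arrays and loop over them with the same FINDW call. The keyword list, the category values, the sample employer names, and the data set names are all hypothetical; only Place_of_employment, NAICS_Sector and Sector_Type come from the thread.

```sas
/* Stand-in rows with Level 1 already assigned */
data work.survey;
   infile datalines dsd truncover;
   length Place_of_employment $60 NAICS_Sector $8;
   input Place_of_employment NAICS_Sector;
   datalines;
Example Construction LLC,Goods
Sample Lumber Co,Goods
Lake Ridge,Services
;
run;

data work.survey_level2;
   set work.survey;
   length Sector_Type $40;

   /* Hypothetical keyword -> category pairs; a new employer type
      means a new array entry, not a new IF statement */
   array kw[3] $20 _temporary_ ('CONSTRUCTION', 'MANUFACTURING', 'LUMBER');
   array st[3] $40 _temporary_ ('Construction', 'Manufacturing', 'Manufacturing');

   if NAICS_Sector = 'Goods' then do;
      /* Stop at the first keyword found as a whole word */
      do i = 1 to dim(kw) while (missing(Sector_Type));
         if findw(Place_of_employment, kw[i], , 'spit') > 0 then
            Sector_Type = st[i];
      end;
   end;
   drop i;
run;
```

Because the first matching keyword wins, more specific keywords would need to be listed ahead of general ones. Rows with no match keep a missing Sector_Type and still need a manual look.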

 

I can get my way through the Level 1 classification using the Findw method you provided.  Thanks.

 

I was just crossing my fingers that there may be some work-arounds in SAS that I am not familiar with (and there is plenty that I'm not familiar with!)
