BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
souji
Obsidian | Level 7

Hi,

 

I have input data set with variable Job , some observation has computer in the Job variable , and other observations has computer and Electrical. I want  keep all computer in separate new variable and Electrical also into new variable.

please help me how to separate the values with and combined values ( computer and Electrical)

 

Thank You

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@souji 

If it's really only either 1 word or 3 words then using the SCAN() function is a possibility.

data have;
  infile datalines truncover;
  input job $40.;
  datalines;
computer and Electrical 
Electrical 
computer
;

data want;
  set have;
  job1=scan(job,1);
  job2=scan(job,3);
run;
proc print;
run;

If above doesn't solve your problem then please amend the data step have with additional representative use cases and post it. 

View solution in original post

11 REPLIES 11
PaigeMiller
Diamond | Level 26

Show us a small example of what the input data set looks like, and what the output data set should look like.

--
Paige Miller
PGStats
Opal | Level 21

You don't say what you want the new variables to contain. To create binary variables that signal the presence of a word in the Job string, use the findw function:

 

computer = findw(job, "computer", " ", 'i');

electrical = findw(job, "electrical", " ", 'i');

PG
souji
Obsidian | Level 7

Thank you,

 

here the details: Input data set has the variable JOB, The variable JOB should contain only one value listed like 'computer' in observations.

 But some observations have two values listed  like ' computer and Electrical'  , it is separated by 'and'.

I need to use the word 'and' to identify the second job, if I found that put the second job in a new variable called SecondJob.

 

Please help me to separate the second job (Electrical) and put that into SecondJob variable

 

Patrick
Opal | Level 21

@souji 

If it's really only either 1 word or 3 words then using the SCAN() function is a possibility.

data have;
  infile datalines truncover;
  input job $40.;
  datalines;
computer and Electrical 
Electrical 
computer
;

data want;
  set have;
  job1=scan(job,1);
  job2=scan(job,3);
run;
proc print;
run;

If above doesn't solve your problem then please amend the data step have with additional representative use cases and post it. 

PaigeMiller
Diamond | Level 26

@souji wrote:

here the details: Input data set has the variable JOB, The variable JOB should contain only one value listed like 'computer' in observations.

 But some observations have two values listed  like ' computer and Electrical'  , it is separated by 'and'.

I need to use the word 'and' to identify the second job, if I found that put the second job in a new variable called SecondJob.

 


This is a different explanation than the original explanation you gave. 

 

As I said, you need to SHOW us an example of what the input data set looks like, and SHOW us what the final data set should look like. Showing us examples will be a better way to help us provide the proper code than the words you are using. 

--
Paige Miller
Astounding
PROC Star

In real life, it would be shocking if all your jobs were a single word.  Surely, you should be prepared to separate out:

 

Chief cook and bottle washer

 

Here's a way to do that.

 

data want;
   set have;
   split = index(job, ' and ');
   if split then do;
      job2 = substr(job, split + 5);
      job = substr(job, 1, split);
   end;
   drop split;
run;
souji
Obsidian | Level 7

Can you please explain me, why did you put split and +5 in substr function

 

data want;
   set have;
   split = index(job, ' and ');
   if split then do;
      job2 = substr(job, split + 5);
      job = substr(job, 1, split);
   end;
   drop split;
run;

Astounding
PROC Star

SPLIT is the location where the 5 characters " and " appear.  If those characters do not appear, SPLIT will be 0 and the rest of the processing gets skipped.

 

If " and " does appear, SPLIT is the location on the line where it begins.  So everything after " and " begins 5 characters later.  That's why split+5 appears, and why the SUBSTR function for JOB2 does not have a third parameter.  (You may need to read a little about SUBSTR.)  Following that, the formula for JOB uses SPLIT to select all the characters before "and ".

souji
Obsidian | Level 7

can you please provide document to read SUBSTR..I am still in learning stage

 

thank you

FreelanceReinh
Jade | Level 19

Hi @souji,

 

Assuming that you don't have a local SAS installation (in which case the Help menu and the SAS Documentation Viewer would be available), the online documentation is one of the best sources of information on SAS programming. You can enter keywords (e.g. SUBSTR) into the search field on the documentation homepage, https://support.sas.com/en/documentation.html, but if you know already which category the keyword belongs to (e.g., it's a function), the page providing "Quick Links" to all those categories may be even more helpful. This is because the general search will often result in a large number of hits and, unfortunately (especially for beginners), the most appropriate link will not always be the first of the list.

 

So, in your example, you would click the link "Functions" (in section "Language Elements" under "Syntax - Quick Links"), then the letter "S" and scroll down the alphabetical list until you find "SUBSTR" -- but be prepared to find two or more entries for the same keyword. Avoid any links that refer to (the special topics) DS2 or FedSQL (unless that's what you're interested in) and prefer links that refer to general SAS programming, in this case from the source "Document: SAS Functions and CALL Routines: Reference." This leaves you with two links because the SUBSTR function (as an exceptional case) exists in two variants: one "right of =" (this is appropriate here, see Astounding's code) and one "left of =". Under the most important section "Syntax" you will virtually always find a section containing one or more examples. You should copy these into a SAS session and play around with them to see (and try to understand) how modifications to the code (e.g. function arguments) change the results.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 11 replies
  • 7085 views
  • 2 likes
  • 6 in conversation