Hi,
I have input data set with variable Job , some observation has computer in the Job variable , and other observations has computer and Electrical. I want keep all computer in separate new variable and Electrical also into new variable.
please help me how to separate the values with and combined values ( computer and Electrical)
Thank You
If it's really only either 1 word or 3 words then using the SCAN() function is a possibility.
data have;
infile datalines truncover;
input job $40.;
datalines;
computer and Electrical
Electrical
computer
;
data want;
set have;
job1=scan(job,1);
job2=scan(job,3);
run;
proc print;
run;
If above doesn't solve your problem then please amend the data step have with additional representative use cases and post it.
Show us a small example of what the input data set looks like, and what the output data set should look like.
You don't say what you want the new variables to contain. To create binary variables that signal the presence of a word in the Job string, use the findw function:
computer = findw(job, "computer", " ", 'i');
electrical = findw(job, "electrical", " ", 'i');
Thank you,
here the details: Input data set has the variable JOB, The variable JOB should contain only one value listed like 'computer' in observations.
But some observations have two values listed like ' computer and Electrical' , it is separated by 'and'.
I need to use the word 'and' to identify the second job, if I found that put the second job in a new variable called SecondJob.
Please help me to separate the second job (Electrical) and put that into SecondJob variable
If it's really only either 1 word or 3 words then using the SCAN() function is a possibility.
data have;
infile datalines truncover;
input job $40.;
datalines;
computer and Electrical
Electrical
computer
;
data want;
set have;
job1=scan(job,1);
job2=scan(job,3);
run;
proc print;
run;
If above doesn't solve your problem then please amend the data step have with additional representative use cases and post it.
@souji wrote:
here the details: Input data set has the variable JOB, The variable JOB should contain only one value listed like 'computer' in observations.
But some observations have two values listed like ' computer and Electrical' , it is separated by 'and'.
I need to use the word 'and' to identify the second job, if I found that put the second job in a new variable called SecondJob.
This is a different explanation than the original explanation you gave.
As I said, you need to SHOW us an example of what the input data set looks like, and SHOW us what the final data set should look like. Showing us examples will be a better way to help us provide the proper code than the words you are using.
In real life, it would be shocking if all your jobs were a single word. Surely, you should be prepared to separate out:
Chief cook and bottle washer
Here's a way to do that.
data want;
set have;
split = index(job, ' and ');
if split then do;
job2 = substr(job, split + 5);
job = substr(job, 1, split);
end;
drop split;
run;
Thank you
Can you please explain me, why did you put split and +5 in substr function
data want;
set have;
split = index(job, ' and ');
if split then do;
job2 = substr(job, split + 5);
job = substr(job, 1, split);
end;
drop split;
run;
SPLIT is the location where the 5 characters " and " appear. If those characters do not appear, SPLIT will be 0 and the rest of the processing gets skipped.
If " and " does appear, SPLIT is the location on the line where it begins. So everything after " and " begins 5 characters later. That's why split+5 appears, and why the SUBSTR function for JOB2 does not have a third parameter. (You may need to read a little about SUBSTR.) Following that, the formula for JOB uses SPLIT to select all the characters before "and ".
can you please provide document to read SUBSTR..I am still in learning stage
thank you
Hi @souji,
Assuming that you don't have a local SAS installation (in which case the Help menu and the SAS Documentation Viewer would be available), the online documentation is one of the best sources of information on SAS programming. You can enter keywords (e.g. SUBSTR) into the search field on the documentation homepage, https://support.sas.com/en/documentation.html, but if you know already which category the keyword belongs to (e.g., it's a function), the page providing "Quick Links" to all those categories may be even more helpful. This is because the general search will often result in a large number of hits and, unfortunately (especially for beginners), the most appropriate link will not always be the first of the list.
So, in your example, you would click the link "Functions" (in section "Language Elements" under "Syntax - Quick Links"), then the letter "S" and scroll down the alphabetical list until you find "SUBSTR" -- but be prepared to find two or more entries for the same keyword. Avoid any links that refer to (the special topics) DS2 or FedSQL (unless that's what you're interested in) and prefer links that refer to general SAS programming, in this case from the source "Document: SAS Functions and CALL Routines: Reference." This leaves you with two links because the SUBSTR function (as an exceptional case) exists in two variants: one "right of =" (this is appropriate here, see Astounding's code) and one "left of =". Under the most important section "Syntax" you will virtually always find a section containing one or more examples. You should copy these into a SAS session and play around with them to see (and try to understand) how modifications to the code (e.g. function arguments) change the results.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.