Hi....I am trying to separate a string of characters into separate variables. The separation is to occur at the first numeric number not including blanks and 'AND'. I am not sure what is the best way to approach this as first thought was to first get rid of the 'AND' first. Any suggestions. Thanks in Advance.
data Have; length SrMasterID 8 FormulaText $ 150; format SrMasterID F12. FormulaText $char150.; informat SrMasterID best12. FormulaText $char150.; infile datalines4 dlm='\' missover dsd; input SrMasterID : best12. FormulaText : $char150.; datalines4; 1578\AM12020S AND AM13020S AND AM14030S AND AM15040S AND AM18040S AND AM110 1581\BAA01125S AND BAA03130S AND BAA04125S AND BAA05040S AND BAA06030S AND BAA07040S AND BAA08540S AND BAA08730S AND BAA09040S AND BAA10040S AND BAA16040S 1582\CAR01140S AND CAR02120S AND CAR03120S AND CAR04130S AND CAR05140S AND CAR120 ;;;; Want: SrMasterID Department Course 1578 AM 12020S 1578 AM 13020S 1578 AM 14030S 1578 AM 15040S 1578 AM 18040S 1578 AM 110 1581 BAA 01125S 1581 BAA 03130S 1581 BAA 04125S 1581 BAA 05040S 1581 BAA 06030S 1581 BAA 07040S 1581 BAA 08540S 1581 BAA 08730S 1581 BAA 09040S 1581 BAA 10040S 1581 BAA 16040S 1582 CAR 01140S 1582 CAR 02120S 1582 CAR 03120S 1582 CAR 04130S 1582 CAR 05140S 1582 CAR 120
FYI - your data seems to have some random whitespace characters that I sort of cleaned up manually. If that's true in your actual data may need to use COMPRESS to clean it up ahead of time. You'll see this in the output with extra spaces around values.
data Have;
length SrMasterID 8 FormulaText $ 150;
format SrMasterID F12. FormulaText $char150.;
informat SrMasterID best12. FormulaText $char150.;
infile datalines4 dlm='\' TRUNCOVER dsd;
input SrMasterID FormulaText ;
datalines4;
1578\ AM12020S AND AM13020S AND AM14030S AND AM15040S AND AM18040S AND AM110
1581\ BAA01125S AND BAA03130S AND BAA04125S AND BAA05040S AND BAA06030S AND BAA07040S AND BAA08540S AND BAA08730S AND BAA09040S AND BAA10040S AND BAA16040S
1582\ CAR01140S AND CAR02120S AND CAR03120S AND CAR04130S AND CAR05140S AND CAR120
;;;;
data want;
set have;
nWords = countw(formulaText);
do i=1 to nWords by 2;
term = trim(scan(formulaText, i));
firstDigit = anydigit(term);
Department = trim(substr(term, 1, firstDigit-1));
Course = trim(substr(term, firstDigit));
output;
end;
keep srMasterID term firstDigit Department Course;
run;
@twildone wrote:
Hi....I am trying to separate a string of characters into separate variables. The separation is to occur at the first numeric number not including blanks and 'AND'. I am not sure what is the best way to approach this as first thought was to first get rid of the 'AND' first. Any suggestions. Thanks in Advance.
data Have; length SrMasterID 8 FormulaText $ 150; format SrMasterID F12. FormulaText $char150.; informat SrMasterID best12. FormulaText $char150.; infile datalines4 dlm='\' missover dsd; input SrMasterID : best12. FormulaText : $char150.; datalines4; 1578\AM12020S AND AM13020S AND AM14030S AND AM15040S AND AM18040S AND AM110 1581\BAA01125S AND BAA03130S AND BAA04125S AND BAA05040S AND BAA06030S AND BAA07040S AND BAA08540S AND BAA08730S AND BAA09040S AND BAA10040S AND BAA16040S 1582\CAR01140S AND CAR02120S AND CAR03120S AND CAR04130S AND CAR05140S AND CAR120 ;;;; Want: SrMasterID Department Course 1578 AM 12020S 1578 AM 13020S 1578 AM 14030S 1578 AM 15040S 1578 AM 18040S 1578 AM 110 1581 BAA 01125S 1581 BAA 03130S 1581 BAA 04125S 1581 BAA 05040S 1581 BAA 06030S 1581 BAA 07040S 1581 BAA 08540S 1581 BAA 08730S 1581 BAA 09040S 1581 BAA 10040S 1581 BAA 16040S 1582 CAR 01140S 1582 CAR 02120S 1582 CAR 03120S 1582 CAR 04130S 1582 CAR 05140S 1582 CAR 120
FYI - your data seems to have some random whitespace characters that I sort of cleaned up manually. If that's true in your actual data may need to use COMPRESS to clean it up ahead of time. You'll see this in the output with extra spaces around values.
data Have;
length SrMasterID 8 FormulaText $ 150;
format SrMasterID F12. FormulaText $char150.;
informat SrMasterID best12. FormulaText $char150.;
infile datalines4 dlm='\' TRUNCOVER dsd;
input SrMasterID FormulaText ;
datalines4;
1578\ AM12020S AND AM13020S AND AM14030S AND AM15040S AND AM18040S AND AM110
1581\ BAA01125S AND BAA03130S AND BAA04125S AND BAA05040S AND BAA06030S AND BAA07040S AND BAA08540S AND BAA08730S AND BAA09040S AND BAA10040S AND BAA16040S
1582\ CAR01140S AND CAR02120S AND CAR03120S AND CAR04130S AND CAR05140S AND CAR120
;;;;
data want;
set have;
nWords = countw(formulaText);
do i=1 to nWords by 2;
term = trim(scan(formulaText, i));
firstDigit = anydigit(term);
Department = trim(substr(term, 1, firstDigit-1));
Course = trim(substr(term, firstDigit));
output;
end;
keep srMasterID term firstDigit Department Course;
run;
@twildone wrote:
Hi....I am trying to separate a string of characters into separate variables. The separation is to occur at the first numeric number not including blanks and 'AND'. I am not sure what is the best way to approach this as first thought was to first get rid of the 'AND' first. Any suggestions. Thanks in Advance.
data Have; length SrMasterID 8 FormulaText $ 150; format SrMasterID F12. FormulaText $char150.; informat SrMasterID best12. FormulaText $char150.; infile datalines4 dlm='\' missover dsd; input SrMasterID : best12. FormulaText : $char150.; datalines4; 1578\AM12020S AND AM13020S AND AM14030S AND AM15040S AND AM18040S AND AM110 1581\BAA01125S AND BAA03130S AND BAA04125S AND BAA05040S AND BAA06030S AND BAA07040S AND BAA08540S AND BAA08730S AND BAA09040S AND BAA10040S AND BAA16040S 1582\CAR01140S AND CAR02120S AND CAR03120S AND CAR04130S AND CAR05140S AND CAR120 ;;;; Want: SrMasterID Department Course 1578 AM 12020S 1578 AM 13020S 1578 AM 14030S 1578 AM 15040S 1578 AM 18040S 1578 AM 110 1581 BAA 01125S 1581 BAA 03130S 1581 BAA 04125S 1581 BAA 05040S 1581 BAA 06030S 1581 BAA 07040S 1581 BAA 08540S 1581 BAA 08730S 1581 BAA 09040S 1581 BAA 10040S 1581 BAA 16040S 1582 CAR 01140S 1582 CAR 02120S 1582 CAR 03120S 1582 CAR 04130S 1582 CAR 05140S 1582 CAR 120
Hi Rezza….that was quick and it works perfectly...thanks a ton.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.