BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
twildone
Pyrite | Level 9

Hi....I am trying to separate a string of characters into separate variables. The separation is to occur at the first numeric number not including blanks and 'AND'. I am not sure what is the best way to approach this as first thought was to first get rid of the 'AND' first. Any suggestions. Thanks in Advance.

 

 

 

data Have;
    length SrMasterID 8 FormulaText $ 150;
    format SrMasterID F12. FormulaText $char150.;
    informat SrMasterID  best12. FormulaText $char150.;
    infile datalines4 dlm='\' missover dsd;
    input SrMasterID : best12. FormulaText : $char150.;
datalines4;
1578\AM12020S AND AM13020S AND AM14030S AND AM15040S AND AM18040S AND AM110
1581\BAA01125S AND BAA03130S AND BAA04125S AND BAA05040S AND BAA06030S AND BAA07040S AND BAA08540S AND BAA08730S AND BAA09040S AND BAA10040S AND BAA16040S
1582\CAR01140S AND CAR02120S AND CAR03120S AND CAR04130S AND CAR05140S AND CAR120
;;;;

Want:
SrMasterID	Department 	Course
1578	AM	12020S
1578	AM	13020S
1578	AM	14030S 
1578	AM	15040S
1578	AM	18040S
1578	AM	110
1581	BAA	01125S
1581	BAA	03130S
1581	BAA	04125S
1581	BAA	05040S
1581	BAA	06030S
1581	BAA	07040S
1581	BAA	08540S
1581	BAA	08730S
1581	BAA	09040S
1581	BAA	10040S
1581	BAA	16040S
1582	CAR	01140S
1582	CAR	02120S
1582	CAR	03120S
1582	CAR	04130S
1582	CAR	05140S
1582	CAR	120
1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

FYI - your data seems to have some random whitespace characters that I sort of cleaned up manually. If that's true in your actual data may need to use COMPRESS to clean it up ahead of time. You'll see this in the output with extra spaces around values.

 

data Have;
    length SrMasterID 8 FormulaText $ 150;
    format SrMasterID F12. FormulaText $char150.;
    informat SrMasterID  best12. FormulaText $char150.;
    infile datalines4 dlm='\' TRUNCOVER dsd;
    input SrMasterID  FormulaText  ;
datalines4;
1578\ AM12020S AND AM13020S AND AM14030S AND AM15040S AND AM18040S AND AM110
1581\ BAA01125S AND BAA03130S AND BAA04125S AND BAA05040S AND BAA06030S AND BAA07040S AND BAA08540S AND BAA08730S AND BAA09040S AND BAA10040S AND BAA16040S
1582\ CAR01140S AND CAR02120S AND CAR03120S AND CAR04130S AND CAR05140S AND CAR120
;;;;

data want;
set have;

nWords = countw(formulaText);

do i=1 to nWords by 2;
term = trim(scan(formulaText, i));
firstDigit = anydigit(term);
Department = trim(substr(term, 1, firstDigit-1));
Course = trim(substr(term, firstDigit));
output;
end;

keep srMasterID term firstDigit Department Course;
run;

@twildone wrote:

Hi....I am trying to separate a string of characters into separate variables. The separation is to occur at the first numeric number not including blanks and 'AND'. I am not sure what is the best way to approach this as first thought was to first get rid of the 'AND' first. Any suggestions. Thanks in Advance.

 

 

 

data Have;
    length SrMasterID 8 FormulaText $ 150;
    format SrMasterID F12. FormulaText $char150.;
    informat SrMasterID  best12. FormulaText $char150.;
    infile datalines4 dlm='\' missover dsd;
    input SrMasterID : best12. FormulaText : $char150.;
datalines4;
1578\AM12020S AND AM13020S AND AM14030S AND AM15040S AND AM18040S AND AM110
1581\BAA01125S AND BAA03130S AND BAA04125S AND BAA05040S AND BAA06030S AND BAA07040S AND BAA08540S AND BAA08730S AND BAA09040S AND BAA10040S AND BAA16040S
1582\CAR01140S AND CAR02120S AND CAR03120S AND CAR04130S AND CAR05140S AND CAR120
;;;;

Want:
SrMasterID	Department 	Course
1578	AM	12020S
1578	AM	13020S
1578	AM	14030S 
1578	AM	15040S
1578	AM	18040S
1578	AM	110
1581	BAA	01125S
1581	BAA	03130S
1581	BAA	04125S
1581	BAA	05040S
1581	BAA	06030S
1581	BAA	07040S
1581	BAA	08540S
1581	BAA	08730S
1581	BAA	09040S
1581	BAA	10040S
1581	BAA	16040S
1582	CAR	01140S
1582	CAR	02120S
1582	CAR	03120S
1582	CAR	04130S
1582	CAR	05140S
1582	CAR	120

 

View solution in original post

2 REPLIES 2
Reeza
Super User

FYI - your data seems to have some random whitespace characters that I sort of cleaned up manually. If that's true in your actual data may need to use COMPRESS to clean it up ahead of time. You'll see this in the output with extra spaces around values.

 

data Have;
    length SrMasterID 8 FormulaText $ 150;
    format SrMasterID F12. FormulaText $char150.;
    informat SrMasterID  best12. FormulaText $char150.;
    infile datalines4 dlm='\' TRUNCOVER dsd;
    input SrMasterID  FormulaText  ;
datalines4;
1578\ AM12020S AND AM13020S AND AM14030S AND AM15040S AND AM18040S AND AM110
1581\ BAA01125S AND BAA03130S AND BAA04125S AND BAA05040S AND BAA06030S AND BAA07040S AND BAA08540S AND BAA08730S AND BAA09040S AND BAA10040S AND BAA16040S
1582\ CAR01140S AND CAR02120S AND CAR03120S AND CAR04130S AND CAR05140S AND CAR120
;;;;

data want;
set have;

nWords = countw(formulaText);

do i=1 to nWords by 2;
term = trim(scan(formulaText, i));
firstDigit = anydigit(term);
Department = trim(substr(term, 1, firstDigit-1));
Course = trim(substr(term, firstDigit));
output;
end;

keep srMasterID term firstDigit Department Course;
run;

@twildone wrote:

Hi....I am trying to separate a string of characters into separate variables. The separation is to occur at the first numeric number not including blanks and 'AND'. I am not sure what is the best way to approach this as first thought was to first get rid of the 'AND' first. Any suggestions. Thanks in Advance.

 

 

 

data Have;
    length SrMasterID 8 FormulaText $ 150;
    format SrMasterID F12. FormulaText $char150.;
    informat SrMasterID  best12. FormulaText $char150.;
    infile datalines4 dlm='\' missover dsd;
    input SrMasterID : best12. FormulaText : $char150.;
datalines4;
1578\AM12020S AND AM13020S AND AM14030S AND AM15040S AND AM18040S AND AM110
1581\BAA01125S AND BAA03130S AND BAA04125S AND BAA05040S AND BAA06030S AND BAA07040S AND BAA08540S AND BAA08730S AND BAA09040S AND BAA10040S AND BAA16040S
1582\CAR01140S AND CAR02120S AND CAR03120S AND CAR04130S AND CAR05140S AND CAR120
;;;;

Want:
SrMasterID	Department 	Course
1578	AM	12020S
1578	AM	13020S
1578	AM	14030S 
1578	AM	15040S
1578	AM	18040S
1578	AM	110
1581	BAA	01125S
1581	BAA	03130S
1581	BAA	04125S
1581	BAA	05040S
1581	BAA	06030S
1581	BAA	07040S
1581	BAA	08540S
1581	BAA	08730S
1581	BAA	09040S
1581	BAA	10040S
1581	BAA	16040S
1582	CAR	01140S
1582	CAR	02120S
1582	CAR	03120S
1582	CAR	04130S
1582	CAR	05140S
1582	CAR	120

 

twildone
Pyrite | Level 9

Hi Rezza….that was quick and it works perfectly...thanks a ton.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 454 views
  • 2 likes
  • 2 in conversation