BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
apeeape
Fluorite | Level 6
I have a set of data, the variable unit_number is supposed to contain only the numeric data, and unit_alphabet is supposed to contain only the one character data. However the original data may contain unit_number information in the unit_alpha variable to indicate that there are two unit_number information. If there is more than one unit_number information, there will be a comma followed by number in the unit_alphabet variable to separate the unit_number information. And I want each row of data only contain one unit number information, if the unit_number information is more than one (so there is a comma in the unit_alphabet variable followed by number), I want to add another row of data to indicate this. And it is possible there there are two character in the unit_alphabet variable, I want the first character to be assigned to the first unit_number, and the second character to be assigned to the second unit_number.

The data set I have:
Data have;
Input Name $ Sex $ Unit_number $ unit_alphabet$: $20.;
cards;
Amy F 103 ,104
Bob M 203 A
Carolin F 204 B,205B
David M 208 ,210
Emily F 513 C,515
Run;

I want the data set looks like:
Name Sex Unit_Number Unit_Alphabet
Amy F 103
Amy F 104
Bob M 203 A
Carolin F 204 B
Carolin F 205 B
David M 208
David M 210
Emily F 513 C
Emily F 515

Thank you!!
1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

Try this: (it should work even if there are more than 2 commas... even though you say that won't happen, you get the functionality at no extra charge!)

 

Data have;
Input Name $ Sex $ units & $100.;
cards;
Amy F 103 ,104
Bob M 203 A
Carolin F 204 B,205B
David M 208 ,210
Emily F 513 C,515
Run;
data want;
    set have;
    length thisstr $ 100;
    units=compress(units);
    thisstr='ZZZZ';
    count=0;
    do while(not missing(thisstr));
        count=count+1;
        thisstr=scan(units,count,',');
        call missing(unit_number);
        unit_alpha=' ';
        where = anyalpha(thisstr);
        if where=0 then unit_number=thisstr;
        else do;
            unit_number=substr(thisstr,1,where-1);
            unit_alpha=substr(thisstr,where);
        end;
        if not missing(thisstr) then output;
    end;
    keep name sex unit_number unit_alpha;
run;

        
--
Paige Miller

View solution in original post

4 REPLIES 4
PaigeMiller
Diamond | Level 26

So, is this really the way the actual data appears, with inconsistent spacing before the comma (sometimes there is a space and sometimes not) and no indication of a missing value for some records; or is it just poor typing?

 

Would simply removing the blanks after the person's sex (in the unit number or unit alphabet variable) cause problems?

 

Can there ever be more than one comma, indicating the result would have to wind up on 3 or more lines?

--
Paige Miller
apeeape
Fluorite | Level 6
Yes this is the way the actual data appears, all the data after the space is under the unit_alphabet variable. Eg(“,104”,”B,205B”,”,210”). Some data may not have unit_alphabet records. The maximum number of comma is one, so the result will not be more than 2 lines for each data. Thank you!
PaigeMiller
Diamond | Level 26

Try this: (it should work even if there are more than 2 commas... even though you say that won't happen, you get the functionality at no extra charge!)

 

Data have;
Input Name $ Sex $ units & $100.;
cards;
Amy F 103 ,104
Bob M 203 A
Carolin F 204 B,205B
David M 208 ,210
Emily F 513 C,515
Run;
data want;
    set have;
    length thisstr $ 100;
    units=compress(units);
    thisstr='ZZZZ';
    count=0;
    do while(not missing(thisstr));
        count=count+1;
        thisstr=scan(units,count,',');
        call missing(unit_number);
        unit_alpha=' ';
        where = anyalpha(thisstr);
        if where=0 then unit_number=thisstr;
        else do;
            unit_number=substr(thisstr,1,where-1);
            unit_alpha=substr(thisstr,where);
        end;
        if not missing(thisstr) then output;
    end;
    keep name sex unit_number unit_alpha;
run;

        
--
Paige Miller
apeeape
Fluorite | Level 6
This works out perfectly!! Thank you very much for your help!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 460 views
  • 3 likes
  • 2 in conversation