SASTad
Fluorite | Level 6

Hi All,

 

I'm trying to write a data step with IF THEN ELSE statements.

 

I have 2 columns and I'm trying to create a third column based on these two columns.

table:

column1   column2   new_column
A         1         X
A         4         X
A         5         X
B         2         Y
B         3         Y
B         8         Y
C         1         X,Y
C         2         X,Y
C         7         X,Y

 

If a column1 value has the value 1 in column2 on any row, then new_column should be X on every row for that column1 value (A in the table).

If a column1 value has the value 2 in column2 on any row, then new_column should be Y on every row for that column1 value (B in the table).

If a column1 value has both 1 and 2 in column2, then new_column should be X,Y on every row for that column1 value (C in the table).

 

Could you help me with the syntax for this logic? The data set I'm working on has more than 300,000 rows. Thank you.

 

 

ACCEPTED SOLUTION
Astounding
PROC Star

This is a little more difficult, because you have to base the result on more than one observation.  But then you have to go back and insert the proper result on previous observation(s).

 

Here's one way:

 

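proc sort data=have;
   by column1 column2;
run;

data want;
   length new_column $ 3;
   /* Pass 1: read every row of the current COLUMN1 group and build NEW_COLUMN. */
   /* Assumes COLUMN2 is numeric; compare to '1' and '2' if it is character.    */
   do until (last.column1);
      set have;
      by column1;
      if column2=1 then new_column='X';
      else if column2=2 then do;
         if new_column=' ' then new_column='Y';
         else if new_column='X' then new_column = 'X,Y';
      end;
   end;
   /* Pass 2: read the same group again and output each row with the finished value. */
   do until (last.column1);
      set have;
      by column1;
      output;
   end;
run;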

 

The top loop examines all records for a value of COLUMN1, and creates NEW_COLUMN.  The bottom loop reads those same records back in, and outputs (including the value for NEW_COLUMN).
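
To try the two-loop step on a small case, the sample table from the question can be loaded as a test data set. This is only a sketch: it assumes COLUMN1 is character and COLUMN2 is numeric, which the original post does not confirm, and it uses the data set name HAVE from the code above.

```sas
/* Test data copied from the table in the question.          */
/* Assumption: column1 is character and column2 is numeric.  */
data have;
   input column1 $ column2;
   datalines;
A 1
A 4
A 5
B 2
B 3
B 8
C 1
C 2
C 7
;
run;
```

After running the PROC SORT and DATA WANT steps against this data, PROC PRINT on WANT should show X on all three A rows, Y on all three B rows, and X,Y on all three C rows. A COLUMN1 group that has neither 1 nor 2 in COLUMN2 would be left with a blank NEW_COLUMN.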

LinusH
Tourmaline | Level 20
Since your logic is a bit awkward, you need to describe the real data and the business requirement; otherwise any suggestion might not work in your real task. That includes the data set sort order and the business keys.
Data never sleeps
SASTad
Fluorite | Level 6

Thank you so much.

 

Best.
