BookmarkSubscribeRSS Feed
Toffeeman
Calcite | Level 5

Program 1:

 

DATA EUROPE_CAR;
SET SASHELP.CARS;
IF MAKE IN ('BMW');
IF MAKE EQ 'Audi' OR MAKE EQ 'BMW' THEN DO;
KEEP TYPE;
END;
IF MAKE EQ 'Audi' THEN DO;
KEEP Origin;
END;
RUN;

 

Program 2:

DATA EUROPE_CARS;
SET SASHELP.CARS;
IF MAKE IN ('Audi');
IF MAKE EQ 'BMW' THEN DO;
KEEP TYPE;
END;
IF MAKE EQ 'Audi' THEN DO;
KEEP Origin;
END;
RUN;

 

Question:

I expect program 1 to keep TYPE variable only while Program 2 to keep ORIGIN variable; both programs keep TYPE and ORIGIN.

 

Anyone can help? Thanks.

 

 

10 REPLIES 10
Reeza
Super User
The KEEP statement is a Global statement, not a conditional statement. Placing it in the IF condition doesn't impact the output.

Your initial IF statement limits your data set to only Audi cars though because it subsets the data so the second IF (BMW) will never be true.
Toffeeman
Calcite | Level 5

Is there any other way to keep the variables that i wanted base on the conditions on both programs?

LinusH
Tourmaline | Level 20

We might help you if you tell us why you want to do this.

As @Reeza said, KEEP is global. Meaning that executes regardless of conditions in the data step. Which is logical. The data step needs to know how the output data should look like, before creating and add observations to it.

 

What would you suspect that the output data set to like if there are makes BMW and Audi?

Perhaps you are confusing assignment logic with subsetting (where you use WHERE).

Data never sleeps
kannand
Lapis Lazuli | Level 10

Take a look at my comments and clarify if you can so we understand what you need:

 

Program 1:
 
DATA EUROPE_CAR;
SET SASHELP.CARS;

IF MAKE IN ('BMW');
/* =======================================================
The above statement will eliminate all other records that are not BMW. 
Which means, your "IF" statement below does not have to check for "Audi" as 
it has already been dropped out of the dataset. Clarify if
this is what you are expecting ======================================================= */ IF MAKE EQ 'Audi' OR MAKE EQ 'BMW' THEN DO; KEEP TYPE; /* This KEEP statement keeps the TYPE. Sure.
But you are not dropping MAKE. Therefore, you will have
both in the output. You need to clarify what you are
expecting in this step */ END; IF MAKE EQ 'Audi' THEN DO; KEEP Origin; /*same comment as above */ END; RUN; Program 2: /*======================================================= comments from Program:1 also apply for Program#2.
Please clarify so we can help ======================================================= */ DATA EUROPE_CARS; SET SASHELP.CARS; IF MAKE IN ('Audi'); IF MAKE EQ 'BMW' THEN DO; KEEP TYPE; END; IF MAKE EQ 'Audi' THEN DO; KEEP Origin; END; RUN;
Kannan Deivasigamani
Toffeeman
Calcite | Level 5

OK.  I have another situation:

 

Dataset:

Customer    HP_info MP_info Email_info    Home_Phone   Mobile_Phone   Email   

A1               Y           Y            Y                  123                  445                   xxx

A2               Y           Y            N                   119                  110                   

A3                N          Y            N                                          124

 

What I wanted is:

A1 dataset

Customer    HP_info MP_info Email_info    Home_Phone   Mobile_Phone   Email   

A1               Y           Y            Y                  123                  445                   xxx

 

A2 dataset

Customer    HP_info MP_info Email_info    Home_Phone   Mobile_Phone 

A2               Y           Y            N                   119                  110            

   

A3 dataset

Customer    HP_info MP_info Email_info    Mobile_Phone 

    A3                N          Y            N             124            

Patrick
Opal | Level 21

Both data sets and variables get created during the compilation phase of a data step but you only process data during the execution phase. That's why both data sets and variables can't get created during data step execution.

http://support.sas.com/documentation/cdl/en/lrcon/68089/HTML/default/viewer.htm#p08a4x7h9mkwqvn16jg3...

 

The way around this is by either dynamically create SAS code and then execute this SAS code as a next step or to create hash tables which can get created during exection phase (not the variables though). All these techniques are advanced and require quite a bit of coding.

 

It is normally not a good idea to split up data the way you want it. There is by group processing, where clauses and if statements to process data in chunks or conditionally.

data have;
  infile datalines truncover;
  input (Customer HP_info MP_info Email_info Home_Phone Mobile_Phone Email) ($);
  datalines;
A1 Y Y Y 123 445 xxx
A2 Y Y N 119 110 
A3 N Y N 124
run;

data  A1  A2(drop=Email)  A3;
  set have;
  if customer='A1' then
    output A1;
  else if customer='A2' then
    output A2;
  else if customer='A3' then
    output A3;
run;

 

Tom
Super User Tom
Super User

Can you explain WHY you want the output shown?

If it is predetermined then just code what you want to happen.

 

data a1 a2(drop=email) a3(drop=home_phone email);
  set have ;
  if customer='A1' then output A1;
  else if customer='A2' then output A2;
  else if customer='A3' then output A3;
run;

 If instead it is some type of report then look at the NOZERO option on the DEFINE statement in PROC REPORT.

data have ;
 length Customer $10 HP_info MP_info Email_info $1 Home_Phone Mobile_Phone $10 Email $20 ;
 input customer -- email;
cards;
A1 Y  Y  Y 123   445 xxx
A2 Y  Y  N 119   110 .  
A3 N  Y  N   .   124 .
;;;;
proc report data=have ;
  by customer ;
  define _all_ / display nozero ;
run;
Toffeeman
Calcite | Level 5

Here is the 2nd situation i highlighted earlier.  Below only show 3 customers but i could have 100 customers, and my end result will be in excel listing.  All info from the customers are unknown different combinations and we only know if the info is available through N or Y indicator.

 

Dataset:

Customer    HP_info MP_info Email_info    Home_Phone   Mobile_Phone   Email   

A1               Y           Y            Y                  123                  445                   xxx

A2               Y           Y            N                   119                  110                   

A3                N          Y            N                                          124

 

What I wanted is:

A1 dataset

Customer    HP_info MP_info Email_info    Home_Phone   Mobile_Phone   Email   

A1               Y           Y            Y                  123                  445                   xxx

 

A2 dataset

Customer    HP_info MP_info Email_info    Home_Phone   Mobile_Phone 

A2               Y           Y            N                   119                  110            

   

A3 dataset

Customer    HP_info MP_info Email_info    Mobile_Phone 

    A3                N          Y            N             124            

Tom
Super User Tom
Super User

So in that case with 3 binary variables there 8 possible combinations. So you can split your data into 8 separate datasets if you want.

 

data 
  yyy
  yyn (drop=Email_info)
  yny (drop=MP_info)
  ynn (drop=MP_info Email_info)
  nyy (drop=HP_info)
  nyn (drop=HP_info Email_info)
  nny (drop=HP_info MP_info)
  nnn (drop=HP_info MP_info Email_info)
;
  set have;
  select (cats(of HP_info MP_info Email_info)) ;
  when ('YYY') output yyy;
  when ('YYN') output yyn;
  when ('YNY') output yny;
  when ('YNN') output ynn;
  when ('NYY') output nyy;
  when ('NYN') output nyn;
  when ('NNY') output nny;
  when ('NNN') output nnn;
  end;
run;

 

NOTE: There were 3 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.YYY has 1 observations and 7 variables.
NOTE: The data set WORK.YYN has 1 observations and 6 variables.
NOTE: The data set WORK.YNY has 0 observations and 6 variables.
NOTE: The data set WORK.YNN has 0 observations and 5 variables.
NOTE: The data set WORK.NYY has 0 observations and 6 variables.
NOTE: The data set WORK.NYN has 1 observations and 5 variables.
NOTE: The data set WORK.NNY has 0 observations and 5 variables.
NOTE: The data set WORK.NNN has 0 observations and 4 variables.

 

Astounding
PROC Star

There are ways for macro language to do something similar to what you ask, but it is simpler to create multiple data sets:

 

data audi_only (keep=origin)

        bmw_only (keep=type);

   set sashelp.cars;

   if make='Audi' then output audi_only;

   else output bmw_only;

run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 1569 views
  • 0 likes
  • 7 in conversation