DATA Step, Macro, Functions and more

Understand Each Conditions Affect on My Data Set

Reply
Super Contributor
Posts: 395

Understand Each Conditions Affect on My Data Set

Hello everyone,

 

I have a sample code as below and I have few condition statements to decrease size of the data set. At this stage, my purpose is to get effect of conditions. What I mean is to find out how much percentage/count of decrease conditions seperately.

 

Here is my sample code;

 

Data Have;
Length CustomerID 8 ModelYear $ 4 Vehicle $ 20;
Infile Datalines Missover;
Input CustomerID ModelYear Vehicle;
Datalines;
001 2013 Ford
002 2014 Citroen
003 2015 Volkswagen
004 2016 Audi
005 2013 BMW
006 2013 Ford
007 2014 Citroen
008 2015 Volkswagen
009 2016 Audi
010 2013 BMW
011 2013 Ford
012 2014 Citroen
013 2015 Volkswagen
014 2016 Audi
015 2013 BMW
016 2013 Ford
017 2014 Citroen
018 2015 Volkswagen
019 2016 Audi
020 2013 BMW
;
Run;
Proc Sql;
Create Table Want As
Select *
From Have
/*First condition*/Where Vehicle NOT IN('Ford','BMW') 
/*Second condition*/And ModelYear In('2015'); 
Quit;

And here is my desired output;

Sample.png

What SAS code should I write to get my desired output?

 

Thank you,

Super User
Posts: 19,815

Re: Understand Each Conditions Affect on My Data Set

They're not mutually exclusive though. A record could meet both conditions. 

 

For this use a SQL query. 

 

Select 

sum(condition)  as condition1,

sum(condition) as condition2 

from have ;

quit; 

Super Contributor
Posts: 395

Re: Understand Each Conditions Affect on My Data Set

Thank you so much but this statement works like true/false logic. I mean if the condition is true then it brings "1" if it is false brings "0". How can I display these values as percentage and count in Proc Sql statement?

 

Thank you

Super User
Posts: 19,815

Re: Understand Each Conditions Affect on My Data Set

For a binary variable 

 

SUM -> count

MEAN -> percentage 

Super Contributor
Posts: 395

Re: Understand Each Conditions Affect on My Data Set

Thank you so much Smiley Happy

PROC Star
Posts: 7,474

Re: Understand Each Conditions Affect on My Data Set

Since you don't have any 2015 Ford or BMWs, the following will come close to what you want. You could use proc tabulate to get the precise output you indicated.

 

Data Have;
Length CustomerID 8 ModelYear $ 4 Vehicle $ 20;
Infile Datalines Missover;
Input CustomerID ModelYear Vehicle;
Datalines;
001 2013 Ford
002 2014 Citroen
003 2015 Volkswagen
004 2016 Audi
005 2013 BMW
006 2013 Ford
007 2014 Citroen
008 2015 Volkswagen
009 2016 Audi
010 2013 BMW
011 2013 Ford
012 2014 Citroen
013 2015 Volkswagen
014 2016 Audi
015 2013 BMW
016 2013 Ford
017 2014 Citroen
018 2015 Volkswagen
019 2016 Audi
020 2013 BMW
;
Run;
proc format;
value cond
1='Vehicle NOT IN('Ford','BMW')'
2="ModelYear in('2015')"
other=.
;
run;
data want;
set have;
if vehicle in ('Ford','BMW') then condition=1;
else if ModelYear in ('2015') then condition=2;
run;

proc freq data=want;
tables condition/missing nocum;
format condition cond.;
run;

 

HTH,

Art, CEO, AnalystFinder.com

 

Super User
Posts: 19,815

Re: Understand Each Conditions Affect on My Data Set


art297 wrote:

Since you don't have any 2015 Ford or BMWs, the following will come close to what you want. You could use proc tabulate to get the precise output you indicated.

 

 


@art297 I no longer trust the 'sample' data to reflect the actual situation, experience says the next question is what if there's multiples/by groups/overlaps Smiley Happy

PROC Star
Posts: 7,474

Re: Understand Each Conditions Affect on My Data Set

@Reeza: I agree, but that's also something newbees have to learn.

 

Ask a Question
Discussion stats
  • 7 replies
  • 229 views
  • 2 likes
  • 3 in conversation