BookmarkSubscribeRSS Feed
Mirisage
Obsidian | Level 7

Hi SAS Forum,

I have this dataset (readily readable).

data disaster;

informat date date9.;

input country    Eco_zone  $ 5-7 date   Disaster $22-29;

format date date9.;

datalines;

10  111    30SEP2010 Break

10  111    31OCT2010 Pandemic

10  111    30NOV2010 Break

10  111    30JUN2011 Pandemic

10  111    01JAN2012 Tonardo

40  333    30NOV2010 Break

40  333    31DEC2010 Break

40  333    30NOV2010 Flood

100 111    30APR2011 Tonardo

100 111    31MAY2011 Flood

100 111    31JUL2011 Break

100 111    31AUG2011 Pandemic

80  111    31DEC2011 Flood

40  666    03JAN2012 Break

40  999    03JAN2012 Pandemic

90  111    31DEC2011 Break

70  111    31OCT2010 Break

70  111    30NOV2010 Break

70  111    30JUN2011 Break

;

run;

Below are the varaibles that consititute a unique record. 

country      Eco_zone       date

Q:

I need to identify follwing records:

1. First “Break” incident in a given country

2. First consecutive “Break” followed by “Pandemic” pair

Answer:

Blue highlighted shueld be the correct records.

1

30SEP2010

10

111

Break

2

31OCT2010

10

111

Pandemic

3

30NOV2010

10

111

Break

4

30JUN2011

10

111

Pandemic

5

01JAN2012

10

111

Tonardo

6

30NOV2010

40

333

Break

7

30NOV2010

40

333

Flood

8

31DEC2010

40

333

Break

9

03JAN2012

40

666

Break

10

03JAN2012

40

999

Pandemic

11

31OCT2010

70

111

Break

12

30NOV2010

70

111

Break

13

30JUN2011

70

111

Break

14

31DEC2011

80

111

Flood

15

31DEC2011

90

111

Break

16

30APR2011

100

111

Tonardo

17

31MAY2011

100

111

Flood

18

31JUL2011

100

111

Break

19

31AUG2011

100

111

Pandemic

My following (poor) code identify only one record correctly.

Could you help me to identify other qualified records too.

proc sort data=disaster out=one;

by country    Eco_zone  date;

run;

proc print;run;

data two;

set  one;

by country    Eco_zone  date;

if (first.country and last.country) and disaster = "Break" then want_flag =1;

run;

Thank a lot.

Mirisa

6 REPLIES 6
Haikuo
Onyx | Level 15

"I have this dataset (readily readable)."  You have no idea how much I appreciate this! Sometimes I spend more time to get the data in than working on the question itself.

data disaster;

     informat date date9.;

     input country    Eco_zone $ 5-7 date Disaster $22-29;

     format date date9.;

     datalines;

10  111 30SEP2010 Break

10  111 31OCT2010 Pandemic

10  111 30NOV2010 Break

10  111 30JUN2011 Pandemic

10  111 01JAN2012 Tonardo

40  333 30NOV2010 Break

40  333 31DEC2010 Break

40  333 30NOV2010 Flood

100 111    30APR2011 Tonardo

100 111    31MAY2011 Flood

100 111    31JUL2011 Break

100 111    31AUG2011 Pandemic

80  111 31DEC2011 Flood

40  666 03JAN2012 Break

40  999 03JAN2012 Pandemic

90  111 31DEC2011 Break

70  111 31OCT2010 Break

70  111 30NOV2010 Break

70  111 30JUN2011 Break

;

run;

proc sort data=disaster out=have;

     by country Eco_zone date;

run;

data want;

     set have nobs= nobs;

     by country;

     if _n_<nobs then

           set have(firstobs=2 keep= disaster rename=disaster=_dis);

     if first.country then

           call missing(_f, _f1);

     if last.country then

           call missing(_dis);

     if disaster='Break' and _dis='Pandemic' then

           _f+1;

     if disaster='Break' then

           _f1+1;

     if (_f1=1 and disaster='Break') or (_f=1 and  (disaster='Break' and _dis='Pandemic' or disaster='Pandemic' and lag(disaster)='Break')) then

           flag=1;

     drop _:;

run;

Regards,

Haikuo

updated after reading more of your question.

Mirisage
Obsidian | Level 7

Hi Hai.kuo,

Great! Thank you very much.

You may have no idea how much I appreciate this, I have been trying this several days during my commute to and from the workplace.

Thanks again and I have a lot to study in this code.

Regards

Mirisa

P.S. I tried several times to click the "green" button but it did not appear in my computer.

Ksharp
Super User
data disaster;
informat date date9.;
input country    Eco_zone  $ 5-7 date   Disaster $22-29;
format date date9.;
datalines;
10  111    30SEP2010 Break
10  111    31OCT2010 Pandemic
10  111    30NOV2010 Break
10  111    30JUN2011 Pandemic
10  111    01JAN2012 Tonardo
40  333    30NOV2010 Break
40  333    31DEC2010 Break
40  333    30NOV2010 Flood
100 111    30APR2011 Tonardo
100 111    31MAY2011 Flood
100 111    31JUL2011 Break
100 111    31AUG2011 Pandemic
80  111    31DEC2011 Flood
40  666    03JAN2012 Break
40  999    03JAN2012 Pandemic
90  111    31DEC2011 Break
70  111    31OCT2010 Break
70  111    30NOV2010 Break
70  111    30JUN2011 Break
;
run;
proc sort data=disaster ; by country Eco_zone date;run;
data want(drop=foundP foundB);
 set disaster;
 by country Eco_zone;  
 retain foundB foundP 0;
 if first.Eco_zone then do; foundB=0 ;foundP=0; end;
 if (not foundB and disaster='Break') then do;output;foundB=1;end;
 if (not foundP and disaster='Pandemic' and lag(disaster)='Break' and lag(Eco_zone)=Eco_zone  and lag(country)=country )   /* have two group variables*/
then do;
  output;foundP=1;
end;
run;

Xia Keshan

消息编辑者为:xia keshan

Mirisage
Obsidian | Level 7

Hi Ksharp,

Your code worked very well.

Thank you very much.

I am experimenting to understand how do these magical logics do the job.

Thanks again!

Mirisa

Ksharp
Super User

One more thing need to remind you . if data like :

10  111    30SEP2010 Break

10  111    31OCT2010 Break

10  111    30NOV2010 Pandemic

10  111    30JUN2011 Pandemic

10  111    01JAN2012 Tonardo

my code will get

10  111    30SEP2010 Break

10  111    31OCT2010 Break

10  111    30NOV2010 Pandemic

10  111    30JUN2011 Pandemic

10  111    01JAN2012 Tonardo

Mirisage
Obsidian | Level 7

Thanks Ksharp for this additional note.

Regards

Mirisa

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1132 views
  • 1 like
  • 3 in conversation