BookmarkSubscribeRSS Feed
podarum
Quartz | Level 8

Hi,

If I have a dataset by City, and date and price. How can I get a count of consecutive months that the price is missing by City.

Have

City     Month   Price

Tor       01          345

Tor       02           .

Tor       03           .

Tor       04          367

Tor       05          678

Mon     01           .

Mon     02         235

Mon     03          .

Mon     04          .

Win      01        234

Win      02       456

Win      03         .

Win      04         .

Win      05         .

Want

City     ConsMonth

Tor             2

Mon           1

Win           5

Thanks

14 REPLIES 14
Linlin
Lapis Lazuli | Level 10

How did you come up the number in data set want?

podarum
Quartz | Level 8

Well, first of all it would help if I gave out the right want results.:smileyblush: Sorry about that..

It should be

Want

City   ConsMonth

Tor             2

Mon           2

Win            3

How many consecutive blank months (shown by prices)... Again, sorry about that.

MikeZdeb
Rhodochrosite | Level 12

hi ... here's one way ...

data x;

input city :$3.  month price @@;

datalines;

Tor    01  345  Tor    02    .  Tor    03    .  Tor    04  367  Tor    05  678

Mon    01    .  Mon    02  235  Mon    03    .  Mon    04    .  Win    01  234

Win    02  456  Win    03    .  Win    04    .  Win    05    .

;

proc summary data=x missing nway;

by city price notsorted;

output out=stats;

run;

proc sql;

select city, max(_freq_) as consmonths from stats group by city;

quit;

city  consmonths

Mon            2

Tor            2

Win            3

Ksharp
Super User

Mike,

Your code is not right ,try the data below.

Tor   01      345
Tor   02       345
Tor   03       .
Tor   04      367
Tor   05      678
Mon 01       .
Mon 02     235
Mon 03      .
Mon 04      .
MikeZdeb
Rhodochrosite | Level 12

hi ... you are correct, I forgot the WHERE in SUMMARY ...

proc summary data=x missing nway;

by city price notsorted;

output out=stats (where=(missing(price)));

run;

proc sql;

select city, max(_freq_) as consmonths from stats group by city;

quit;

Ksharp
Super User

Mike,

First of all, I like your code.

But After the discussion with mkeintz at another session , I found an interesting problem, if missing value is a special like .A .Z  .Result is totally different.

Ksharp

data_null__
Jade | Level 19

You can fix that with a FORMAT.

proc format;

   value miss(default=8) ._-.Z = .;

   run;

proc summary data=x missing nway;

   by city price notsorted;

   output out=stats(where=(missing(price)));

   format price miss.;

   run;

MikeZdeb
Rhodochrosite | Level 12

hi ... correct again, so group all possible missing values with a format ......

data x;

input city :$3.  month price @@;

datalines;

Tor    01  345  Tor    02    .H  Tor    03    .  Tor    04  367  Tor    05  678

Mon    01    .  Mon    02   235  Mon    03    .  Mon    04    . 

Win    01  234  Win    02   456  Win    03    .  Win    04    .L Win    05   .L

;

proc format;

value price ._ - .Z = ' ';

run;

proc summary data=x missing nway;

by city price notsorted;

output out=stats (where=(missing(price)));

format price price.;

run;

proc sql;

select city, max(_freq_) as consmonths from stats group by city;

quit;


city  consmonths

Mon            2

Tor            2

Win            3

Ksharp
Super User

How about:

data have;
input City $    Month   Price ;
cards;
Tor       01          345
Tor       02           .
Tor       03           .
Tor       04          367
Tor       05          678
Mon     01           .
Mon     02         235
Mon     03          .
Mon     04          .
Win      01        234
Win      02       456
Win      03         .
Win      04         .
Win      05         .
;
run;
data have;
 set have;
 if city ne lag(city) or not missing(price) then n=0;
 if missing(price) then n+1;
run;

proc sql;
create table want(where=(n ne 0)) as
 select city,max(n) as n
  from have 
   group by city ;
quit;

Ksharp

Message was edited by: xia keshan

mkeintz
PROC Star

This is a case where the nested DOW loop approach, in combination with the NOTSORTED by option,  is a compact solution:

data want (keep=city nmonths);

  do until (last.city);

    do n=1 by 1 until (last.price);

      set have;

      by city price notsorted;

    end;

  nmonths=max(nmonths,ifn(missing(price),n,0));

  end;

run;;


I understood this problem to be finding the maximum number of consecutive months with a missing pri9ce.  This program assumes that the data are sorted by city and month.

But see data _null_'s followup below to treat the entire range of missing values (._, .A, .B ... .Z, .) as a single BY level for price.

Message was edited by: Mark Keintz

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
data_null__
Jade | Level 19

Add the GROUPFORMAT option to the BY statement and apply format to group all missing values together in the case of special missing.

data want (keep=city nmonths);

  do until (last.city);

    do n=1 by 1 until (last.price);

      set x;

      by city price notsorted groupformat;

    end;

  nmonths=max(nmonths,ifn(missing(price),n,0));

  format price miss.;

  end;

run;

podarum
Quartz | Level 8

Thanks all..  I like Mikeintz's method because it does not count other missing values that belong to the same city but consecutively numbered lower than the larger consecutive series..  for example if there are 5 missing months and then 6 priced months and then again 4 missing months. then it counts only 5 and not 5+4 = 9 ..

MikeZdeb
Rhodochrosite | Level 12

hi .. yes, that solution to your problem is very nice

but, in case you did not notice, the SUMMARY+SQL solution also does this

"for example if there are 5 missing months and then 6 priced months and then again 4 missing months. then it counts only 5 and not 5+4 = 9"

Linlin
Lapis Lazuli | Level 10

Ksharp's code also gave you 5, not 5+4=9.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 2473 views
  • 3 likes
  • 6 in conversation