Solved: Re: Month wise new observation

india2016 · Posted 06-19-2017 06:08 AM

DATA KK;
input ID $ Month $9.;
datalines;
12KN15 Jan
14KP12 Jan
18KP16 Jan

12KN15 Feb
14KP12 Feb
18KP16 Feb
15SD15 Feb
16DK14 Feb

12KN15 Mar
14KP12 Mar
18KP16 Mar
15SD15 Mar
16DK14 Mar
24FG12 Mar
35HG16 Mar
;
run;

I want, the new ID's which were added in new month (e.g. we can see '15SD15' and '16DK14' new ID's in Feb, '24FG12' and '35HG16' in Mar)which I want as output

Patrick · Posted 06-19-2017 07:21 AM

@india2016

Using a hash table is one way to go. Method REF() only adds a key if it doesn't exist already in the hash.

data kk;
  input id $ month $9.;
  mnth=month(input(cats('01',month,'2017'),date9.));
datalines;
12KN15 Jan
14KP12 Jan
18KP16 Jan
12KN15 Feb
14KP12 Feb
18KP16 Feb
15SD15 Feb
16DK14 Feb
12KN15 Mar
14KP12 Mar
18KP16 Mar
15SD15 Mar
16DK14 Mar
24FG12 Mar
35HG16 Mar
;
run;

data _null_;
  if _n_=1 then
    do;
      dcl hash h1();
      h1.defineKey('id');
      h1.defineData('id','month','mnth');
      h1.defineDone();
    end;
  set kk end=last;
  by mnth;
  h1.ref();
  if last then h1.output(dataset:'want(where=(mnth ne 1))');
run;

proc print data=want noobs;
run;

View solution in original post

RW9 · Posted 06-19-2017 06:25 AM

Step 1 is to put month into a useable format, what you have doesn't sort. Once you have that, just sort and output the first of each id:

data kk;
  input id $ month $9.;
  mnth=month(input(cats('01',month,'2017'),date9.));
datalines;
12KN15 Jan
14KP12 Jan
18KP16 Jan
12KN15 Feb
14KP12 Feb
18KP16 Feb
15SD15 Feb
16DK14 Feb
12KN15 Mar
14KP12 Mar
18KP16 Mar
15SD15 Mar
16DK14 Mar
24FG12 Mar
35HG16 Mar
;
run;

proc sort data=kk;
  by id mnth;
run;

data want;
  set kk;
  by id;
  if first.id then output;
run;

Gives you a list of id's first time they occur.

india2016 · Posted 06-19-2017 06:33 AM

12KN15 Jan 1
14KP12 Jan 1
15SD15 Feb 2
16DK14 Feb 2
18KP16 Jan 1
24FG12 Mar 3
35HG16 Mar 3

I got these output. Actually, I didn't want first month's observations because I want only those observations which are new from the second month. Could we solve this by right join?

RW9 · Posted 06-19-2017 06:50 AM

Don't know what you think right join will do? Simply add a bit to the output:

data want;
  set kk;
  by id;
  if first.id and mnth ne 1 then output;
run;

india2016 · Posted 06-19-2017 07:06 AM

I want only those ID's which are not in previous month.

In "Feb"

15SD15
16DK14

these are new ID's as compare to "Jan"

In "Mar"

24FG12 Mar
35HG16 Mar

these are new ID's As Compare to "Jan" and "Feb"

and these 4 ID's wants as Output.

Patrick · Posted 06-19-2017 07:21 AM

@india2016

Using a hash table is one way to go. Method REF() only adds a key if it doesn't exist already in the hash.

data kk;
  input id $ month $9.;
  mnth=month(input(cats('01',month,'2017'),date9.));
datalines;
12KN15 Jan
14KP12 Jan
18KP16 Jan
12KN15 Feb
14KP12 Feb
18KP16 Feb
15SD15 Feb
16DK14 Feb
12KN15 Mar
14KP12 Mar
18KP16 Mar
15SD15 Mar
16DK14 Mar
24FG12 Mar
35HG16 Mar
;
run;

data _null_;
  if _n_=1 then
    do;
      dcl hash h1();
      h1.defineKey('id');
      h1.defineData('id','month','mnth');
      h1.defineDone();
    end;
  set kk end=last;
  by mnth;
  h1.ref();
  if last then h1.output(dataset:'want(where=(mnth ne 1))');
run;

proc print data=want noobs;
run;

india2016 · Posted 06-19-2017 07:44 AM

Thanks a lot, @Patrick @RW9

please give me one solution

how can I add one column ,

12KN15 Jan
14KP12 Jan
18KP16 Jan 3
12KN15 Feb
14KP12 Feb
18KP16 Feb
15SD15 Feb
16DK14 Feb 8
12KN15 Mar
14KP12 Mar
18KP16 Mar
15SD15 Mar
16DK14 Mar
24FG12 Mar
35HG16 Mar 15

which gives me a count of ID's in particular month.

RW9 · Posted 06-19-2017 07:48 AM

Do you have t have it only on the last record? Reason is my SQL can be simply updated to add count, but it would be on every row. Otherwise, simple datastep:

data want (drop=c);
  set have; 
  retain c;
  by month;
  if _n_=1 then c=1;
  else c=c+1;
  if last.month then new_col=c;
  else new_col=.;
run;

RW9 · Posted 06-19-2017 07:35 AM

Ah, ok I see, then several ways:

data kk;
  input id $ month $9.;
  mnth=month(input(cats('01',month,'2017'),date9.));
datalines;
12KN15 Jan
14KP12 Jan
18KP16 Jan
12KN15 Feb
14KP12 Feb
18KP16 Feb
15SD15 Feb
16DK14 Feb
12KN15 Mar
14KP12 Mar
18KP16 Mar
15SD15 Mar
16DK14 Mar
24FG12 Mar
35HG16 Mar
;
run;

proc sql;
  create table WANT as
  select  A.ID,
          A.MONTH,
          B.MONTH as TME
  from    KK A
  left join KK B
  on      A.ID=B.ID
  and     A.MNTH=B.MNTH+1
  where   A.MONTH ne "Jan" and B.MONTH="";
quit;

india2016 · Posted 06-20-2017 02:29 AM

@RW9

Why we are using0

A.MNTH=B.MNTH+1 in Join Condition ??

RW9 · Posted 06-20-2017 03:54 AM

Its a way of saying, add to the data the previous month so A.MNTH is one month more than B.MNTH.

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away