BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
india2016
Pyrite | Level 9

 

DATA KK;
input ID $ Month $9.;
datalines;
12KN15 Jan
14KP12 Jan
18KP16 Jan

12KN15 Feb
14KP12 Feb
18KP16 Feb
15SD15 Feb
16DK14 Feb

12KN15 Mar
14KP12 Mar
18KP16 Mar
15SD15 Mar
16DK14 Mar
24FG12 Mar
35HG16 Mar
;
run;

 

 

 


I want, the new ID's which were added in new month (e.g. we can see '15SD15' and '16DK14' new ID's in Feb, '24FG12'  and '35HG16'  in Mar)which I want as output

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@india2016

Using a hash table is one way to go. Method REF() only adds a key if it doesn't exist already in the hash.

data kk;
  input id $ month $9.;
  mnth=month(input(cats('01',month,'2017'),date9.));
datalines;
12KN15 Jan
14KP12 Jan
18KP16 Jan
12KN15 Feb
14KP12 Feb
18KP16 Feb
15SD15 Feb
16DK14 Feb
12KN15 Mar
14KP12 Mar
18KP16 Mar
15SD15 Mar
16DK14 Mar
24FG12 Mar
35HG16 Mar
;
run;

data _null_;
  if _n_=1 then
    do;
      dcl hash h1();
      h1.defineKey('id');
      h1.defineData('id','month','mnth');
      h1.defineDone();
    end;
  set kk end=last;
  by mnth;
  h1.ref();
  if last then h1.output(dataset:'want(where=(mnth ne 1))');
run;

proc print data=want noobs;
run;

  

 

View solution in original post

10 REPLIES 10
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Step 1 is to put month into a useable format, what you have doesn't sort.  Once you have that, just sort and output the first of each id:

data kk;
  input id $ month $9.;
  mnth=month(input(cats('01',month,'2017'),date9.));
datalines;
12KN15 Jan
14KP12 Jan
18KP16 Jan
12KN15 Feb
14KP12 Feb
18KP16 Feb
15SD15 Feb
16DK14 Feb
12KN15 Mar
14KP12 Mar
18KP16 Mar
15SD15 Mar
16DK14 Mar
24FG12 Mar
35HG16 Mar
;
run;

proc sort data=kk;
  by id mnth;
run;

data want;
  set kk;
  by id;
  if first.id then output;
run;

Gives you a list of id's first time they occur.

india2016
Pyrite | Level 9

12KN15 Jan 1
14KP12 Jan 1
15SD15 Feb 2
16DK14 Feb 2
18KP16 Jan 1
24FG12 Mar 3
35HG16 Mar 3

 

 

I got these output. Actually, I didn't want first month's observations because I want only those observations which are new from the second month. Could we solve this by right join?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Don't know what you think right join will do?  Simply add a bit to the output:

data want;
  set kk;
  by id;
  if first.id and mnth ne 1 then output;
run;
india2016
Pyrite | Level 9

 

 

 

I want only those ID's which are not in previous month.

 

 

In "Feb" 

 

15SD15
16DK14

 

these are new ID's as compare to "Jan"

 

 

 

 

In "Mar"

 

24FG12 Mar
35HG16 Mar

 

these are new ID's As Compare to "Jan" and "Feb" 

 

 

 

 

and these 4 ID's wants as Output.

 

 

Patrick
Opal | Level 21

@india2016

Using a hash table is one way to go. Method REF() only adds a key if it doesn't exist already in the hash.

data kk;
  input id $ month $9.;
  mnth=month(input(cats('01',month,'2017'),date9.));
datalines;
12KN15 Jan
14KP12 Jan
18KP16 Jan
12KN15 Feb
14KP12 Feb
18KP16 Feb
15SD15 Feb
16DK14 Feb
12KN15 Mar
14KP12 Mar
18KP16 Mar
15SD15 Mar
16DK14 Mar
24FG12 Mar
35HG16 Mar
;
run;

data _null_;
  if _n_=1 then
    do;
      dcl hash h1();
      h1.defineKey('id');
      h1.defineData('id','month','mnth');
      h1.defineDone();
    end;
  set kk end=last;
  by mnth;
  h1.ref();
  if last then h1.output(dataset:'want(where=(mnth ne 1))');
run;

proc print data=want noobs;
run;

  

 

india2016
Pyrite | Level 9

Thanks a lot, @Patrick@RW9

 

 please give me one solution 

 

 

how can I add one column ,

 

12KN15 Jan
14KP12 Jan
18KP16 Jan  3
12KN15 Feb
14KP12 Feb
18KP16 Feb
15SD15 Feb
16DK14 Feb  8
12KN15 Mar
14KP12 Mar
18KP16 Mar
15SD15 Mar
16DK14 Mar
24FG12 Mar
35HG16 Mar  15

 

 

which gives me a count of ID's in particular month.

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Do you have t have it only on the last record?  Reason is my SQL can be simply updated to add count, but it would be on every row.  Otherwise, simple datastep:

data want (drop=c);
  set have; 
  retain c;
  by month;
  if _n_=1 then c=1;
  else c=c+1;
  if last.month then new_col=c;
  else new_col=.;
run;
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Ah, ok I see, then several ways:

data kk;
  input id $ month $9.;
  mnth=month(input(cats('01',month,'2017'),date9.));
datalines;
12KN15 Jan
14KP12 Jan
18KP16 Jan
12KN15 Feb
14KP12 Feb
18KP16 Feb
15SD15 Feb
16DK14 Feb
12KN15 Mar
14KP12 Mar
18KP16 Mar
15SD15 Mar
16DK14 Mar
24FG12 Mar
35HG16 Mar
;
run;

proc sql;
  create table WANT as
  select  A.ID,
          A.MONTH,
          B.MONTH as TME
  from    KK A
  left join KK B
  on      A.ID=B.ID
  and     A.MNTH=B.MNTH+1
  where   A.MONTH ne "Jan" and B.MONTH="";
quit;
india2016
Pyrite | Level 9

@RW9

Why we are using0

 

A.MNTH=B.MNTH+1 in Join Condition ??

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Its a way of saying, add to the data the previous month so A.MNTH is one month more than B.MNTH.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1143 views
  • 3 likes
  • 3 in conversation