BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
brulard
Pyrite | Level 9

hi,

 

I have dataset containing  IDs, with different rank, count, and dates. 

 

If you could offer a tip to meet the following: if condition true, then flag 1 for all same IDs.

 

Example HAVE:

 

Id  Rank Count Date

001  01     01      2017

001  02     01      9999

002  01     03     9999

003  01    02      2018

003  02    02     9999

004  01    02       9999

 

Example WANT

New Var Conditions: 

midflag:where year 9999 and count<3  /*optional variable*/

endflag:where midlflag=1 and ID (appears once or is duplicate from different row) 

 

Id  Rank Count Date       MIDflag  Endflag

001  01     01      2017      0           1

001  02     01      9999      1           1

002  01     03     9999      0            0

003  01    02      2018      0            1

003  02    02     9999      1           1

004  01    02       9999     1         1

 

Thanks in advance

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

You can interleave the dataset with itself on an ID by ID basis, such that for each id:

  1. The subset of cases with date=9999 and count<3 are read an extra time prior to reading all cases for the id.  If any such preliminary observations are found, then ENDFLAG must be 1
    1. RETAIN the ENDFLAG
    2. but don't keep the above records (they are going to be re-read).  Hence the subsetting IF statement  "IF INKEEP;"
  2. Followed by reading all cases for the ID, with MIDFLAG set record by record.

 

data have;
  input Id  Rank Count Date;
datalines;
001  01    01     2017
001  02    01     9999
002  01    03     9999
003  01    02     2018
003  02    02     9999
004  01    02     9999
run;

data want;
  set have (where=(date=9999 and count<3) in=inmid)
      have (in=inkeep);
  by id;

  if first.id then ENDflag=0;
  if inmid then ENDflag=1;
  retain endflag;

  if inkeep;
  if date=9999 and count<3 then midflag=1;
  else midflag=0;
run;

 

 

    

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

6 REPLIES 6
Reeza
Super User

Is the count in the criteria (count<3) the same as the count in the data set, which looks to be a character variable (leading zeros?)?

 

brulard
Pyrite | Level 9
it is a number formatted 1.
mkeintz
PROC Star

You can interleave the dataset with itself on an ID by ID basis, such that for each id:

  1. The subset of cases with date=9999 and count<3 are read an extra time prior to reading all cases for the id.  If any such preliminary observations are found, then ENDFLAG must be 1
    1. RETAIN the ENDFLAG
    2. but don't keep the above records (they are going to be re-read).  Hence the subsetting IF statement  "IF INKEEP;"
  2. Followed by reading all cases for the ID, with MIDFLAG set record by record.

 

data have;
  input Id  Rank Count Date;
datalines;
001  01    01     2017
001  02    01     9999
002  01    03     9999
003  01    02     2018
003  02    02     9999
004  01    02     9999
run;

data want;
  set have (where=(date=9999 and count<3) in=inmid)
      have (in=inkeep);
  by id;

  if first.id then ENDflag=0;
  if inmid then ENDflag=1;
  retain endflag;

  if inkeep;
  if date=9999 and count<3 then midflag=1;
  else midflag=0;
run;

 

 

    

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
brulard
Pyrite | Level 9
hi,
thank you for the tip, + explanation... as an aside, if there are any key books or articles you have to recommend, please let me know!
novinosrin
Tourmaline | Level 20
data have;
  input Id  Rank Count Date;
datalines;
001  01    01     2017
001  02    01     9999
002  01    03     9999
003  01    02     2018
003  02    02     9999
004  01    02     9999
run;

proc sql;
create table want1 as
select *,(date=9999 and count<3) as MIDflag,max(calculated MIDflag) as Endflag
from have
group by Id
order by id, rank;
quit;
brulard
Pyrite | Level 9
thank you for the tip! very helpful

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 3938 views
  • 2 likes
  • 4 in conversation