BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SeanZ
Obsidian | Level 7

I have a dataset with variable id and time among many. I created a variable count to count the number of observations for each id in each time. For example, for id 1 and time 1, there are totally 30 observations, the count variable will show 1, 2, until 30, for id 1 time 2, there are totally 36 observations, count will be 1 to 36. Now I want to drop any id that has less than 36 observations. What code should I use? I tried to use where count=36, but apparently that does not work as it will only keep observations with count variable = 36. Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

proc sql;

create table want as

  select *

   from have

    group by id,time

     having count(*) ge 36 ;

quit;

Ksharp

View solution in original post

8 REPLIES 8
umashankersaini
Quartz | Level 8


Hi,

Try to use :

if count <36 then delete;

Regards

Uma Shanker Saini

SeanZ
Obsidian | Level 7

This cannot work b/c I want only ids that have full 36 observations. Please see example below.

id     time     count

1     1          1

1     1          2

...

1     1          30

1     2          1

1     2          2

...

1     2          36

I want to delete all obs with id=1 and time=1 since only 30 observations for id=1 and time=1. I want to keep all obs with id=1 and time=2.

umashankersaini
Quartz | Level 8

Hi,

rewriting my understanding..

you are having a data, It has 30 observation with ID=1 and time =1 and it also has 36 observation with ID=1 and time =2.

You just want to keep only those observation which have ID=1 and Time =2.

i am just writing only one if statement for this...

If Id=1 and time =2 then output datasetname; * it would create a new data set *

or

If id=1 and time=1 then delete;

else output DSN;

or

Data have;

set want;

if id=1 and time=2;

run;

Kindly correctly me, if i am wrong..

SeanZ
Obsidian | Level 7

HI, you are right, but I have a huge dataset with different id and time. It's impossible for me to write the way you showed. I cannot predict which id and time combination meets requirement.

Ksharp
Super User

proc sql;

create table want as

  select *

   from have

    group by id,time

     having count(*) ge 36 ;

quit;

Ksharp

Scott_Mitchell
Quartz | Level 8

Hi Sean,

Is this what you are trying to achieve?

PROC SQL;

CREATE TABLE WANT AS

SELECT A.*

FROM

  (SELECT *

  FROM WORK.HAVE

     ) A /*Creates an Alias for use in the where clause*/,

  (

     SELECT ID,

         TIME1,

         MAX(COUNT) AS MAX_COUNT

     FROM WORK.HAVE

  GROUP BY ID, TIME1

  HAVING CALCULATED MAX_COUNT >= 36

     ) B /*Creates an Alias for use in the where clause*/

  WHERE A.ID = B.ID

  AND A.TIME1 = B.TIME1

  ORDER BY ID, TIME1, COUNT

  ;

QUIT;

If you aren't familiar with SQL you could also use a SAS merge as follows, which is more efficient.

DATA WANT;

  MERGE HAVE (IN=A WHERE=(COUNT >= 36))

       HAVE (IN=B);

  BY ID TIME1;

  IF A AND B;

RUN;

I hope that this is what you were after.

Regards,

Scott

SeanZ
Obsidian | Level 7

Hi Scott. You are right in this way, but a bit long in coding compared to the one solution above. Thanks anyway.

kuridisanjeev
Quartz | Level 8

Hi @Seanz.

If i understand correctly here is the solution.

with the bellow code you will get observations that ID have full 4 observations.others can be deleted.

Data A;

input ID time;

cards;

1 1

1 1

1 1

1 2

1 2

1 2

1 2

1 3

1 3

;

run;

Data Aa;

set a;

by time;

if First.time then count=0;

count+1;

run;

Proc Sql;

Select Max(Time) into :Values separated by ' '  from aa

Group by Time

having max(Count) lt 4;

quit;

Data Aaa(drop=count);

set AA;

if time in (&values) then delete;

run;

Thanks &Regards.

Sanjeev.K

Message was edited by: sajeev kuridi

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 14427 views
  • 10 likes
  • 5 in conversation