DATA Step, Macro, Functions and more

delete rows

Accepted Solution Solved
Reply
Contributor
Posts: 52
Accepted Solution

delete rows

Hi,

 

How do I delete rows with less than 60 observations with the following type of data set.

Fundid200801200802200803200804200805200806200807200808
1        
2        
3        
4        
5        

Accepted Solutions
Solution
‎04-13-2018 07:30 PM
Esteemed Advisor
Posts: 5,482

Re: delete rows

Count the number of non-missing numeric variables :

 

data want;
set have;
if n(of _numeric_) >= 61; /* Including FundId, if FundId is numeric */
run;
PG

View solution in original post


All Replies
Super User
Posts: 23,332

Re: delete rows

Observation is typically a row, so the way you're using that term doesn't align. 

 

How are you defining an observation? How would you calculate how many obs in a data set to implement the rule?

Can you show an expanded sample data set and an illustration of what you would expect as output?

 

 

Contributor
Posts: 52

Re: delete rows

For example, I have the following dataset,

Fundid

200801

200802

200803

200804

200805

200806

200807

200808

1

 3

4 5

6 7

 8

 

 

 

2

 56

 23

 6

 3

 4

56 

 

 

3

 65

 87

 3

 6

 8

23 

 

 

4

 

 

 

 

 

 

 

 

5

 

 

 

 

 

 

 

 

 

 I would like to keep those fundid's with atleast 6 observations and delete those with less than 6 observations. So I would want the output to look like this,

Fundid

200801

200802

200803

200804

200805

200806

200807

200808

2

 56

 23

 6

 3

 4

56 

 

 

3

 65

 87

 3

 6

 8

23 

 

 

Super User
Posts: 23,332

Re: delete rows

Ok, my translation would be to : You want observations (rows) with 6 non-missing values in the date variables (200801-200808). 

 

1. Create an array of the date variables

2. Use N() function to determine the number of non-missing values

3. If results #2 < 6 then delete. 

 

array dates(*) Month200802 - Month201202; /*your column names are not valid SAS names so I have no idea how you've named them*/

if n(of dates(*)) < 6 then delete;

PS. Observations are rows, variables are columns, variable values are the elements.  Although I managed to figure out what you meant, your terminology is non-standard that others may not be able to follow.

 


@Amalik wrote:

For example, I have the following dataset,

Fundid

200801

200802

200803

200804

200805

200806

200807

200808

1

 3

4 5

6 7

 8

 

 

 

2

 56

 23

 6

 3

 4

56 

 

 

3

 65

 87

 3

 6

 8

23 

 

 

4

 

 

 

 

 

 

 

 

5

 

 

 

 

 

 

 

 

 

 I would like to keep those fundid's with atleast 6 observations and delete those with less than 6 observations. So I would want the output to look like this,

Fundid

200801

200802

200803

200804

200805

200806

200807

200808

2

 56

 23

 6

 3

 4

56 

 

 

3

 65

 87

 3

 6

 8

23 

 

 


 

Solution
‎04-13-2018 07:30 PM
Esteemed Advisor
Posts: 5,482

Re: delete rows

Count the number of non-missing numeric variables :

 

data want;
set have;
if n(of _numeric_) >= 61; /* Including FundId, if FundId is numeric */
run;
PG
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 134 views
  • 0 likes
  • 3 in conversation