How to delete some rows in the data set?

YangYY · Posted 09-25-2019 01:26 AM

1  AF  Afghanistan  0.213 0.188 0.0997 0.0891 0.08 0.0727 0.066 0.0597 0.0552 0.0423 0.0385 0.039 0.0487 0.0518 0.0394 0.0529 0.0637 0.0854 0.154 0.242 0.294 0.412 0.35
2  AL  Albania  1.68 1.31 0.776 0.732 0.613 0.672 0.652 0.499 0.565 0.958 0.968 1.03 1.2 1.38 1.34 1.38 1.28 1.3 1.46 1.48 1.56 1.79 1.68
3  DZ  Algeria  2.97 2.98 2.95 2.96 3.05 3.3 2.92 3.53 2.99 2.82 2.67 2.81 2.83 2.7 3.22 2.99 3.19 3.16 3.42 3.3 3.29 3.46
4  AS  American Samoa  .
5  AD  Andorra  7.47 7.18 6.91 6.74 6.49 6.66 7.07 7.24 7.66 7.98 8.02 7.79 7.59 7.32 7.36 7.3 6.75 6.52 6.43 6.12 5.87 5.92
6  AO  Angola  0.42 0.405 0.401 0.431 0.281 0.769 0.712 0.489 0.471 0.574 0.58 0.573 0.721 0.498 0.996 0.98 1.1 1.2 1.18 1.23 1.24 1.25 1.33

Hi,

As you can see, this is part of my output data. I want to delete all the rows that show "." in the columns (row 4 is just one of them; there are more rows like row 4 that need to be deleted). Can anyone tell me how to do that?

Any help will be appreciated!!

Thank you!!

KachiM · Posted 09-25-2019 02:02 AM

@YangYY

If your input data set is "HAVE" then use:

data want;
   set have;
   array k _numeric_;
   if dim(k) = n(of k[*]);
run;

YangYY · Posted 09-25-2019 02:11 AM

Hi,

Thank you for your reply.

I have tried your code and it runs. However, the output data set has 152 rows, but the given answer says it should have 176 rows.

Do you have another solution that I can try?

Thank you

KachiM · Posted 09-25-2019 04:00 AM

@YangYY

If all Numeric variables in an observation are missing then use this:

data want;
   set have;
   array k _numeric_;
   if nmiss(of k[*]) = dim(k) then delete;
run;
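
The two programs posted so far answer different questions, which may explain a row count that differs from the expected one. Here is a minimal sketch of the difference, assuming a small made-up data set (the names x, y1, y2, y3, complete_only and not_all_missing are illustrative only, not from the original data):

```sas
/* Hypothetical sample: one complete row, one partly missing, one all missing. */
data have;
   input x :$8. y1 y2 y3;
   datalines;
AAAAA 10 20 30
BBBBB 10 .  .
CCCCC  . .  .
;

/* Keep only rows with NO missing numeric value: keeps AAAAA. */
data complete_only;
   set have;
   if nmiss(of _numeric_) = 0;
run;

/* Keep rows with AT LEAST ONE non-missing numeric value: keeps AAAAA and BBBBB. */
data not_all_missing;
   set have;
   if n(of _numeric_) > 0;
run;
```

If rows with only some values missing should stay in the output, the second form is the one to use.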

YangYY · Posted 09-26-2019 12:08 AM

I tried this code but it doesn't work.
But thanks!

KachiM · Posted 09-26-2019 12:41 AM

@YangYY

Maybe I have not understood your specification. Let us work with an example.

data have;
input x :$8. y1 y2 y3;
datalines;
AAAAA 10 20 30
BBBBB 10 .  .
CCCCC  . .  .
;
run;

In this data set, I understand that you want only the first two rows. So my code, with the suggestion made by @Astounding:

data want;
   set have;
   if nmiss(of _numeric_ ) = dim(k) then delete;
run;

produces the first two records.

Can you explain your issue using this example?

YangYY · Posted 09-26-2019 08:43 AM

Yes, but the log shows an error.
72 data combined;
73 set combined;
74 if nmiss(of _numeric_) = dim(k) then delete;
ERROR: The DIM, LBOUND, and HBOUND functions require an array name for the first argument.
75 run;

KachiM · Posted 09-26-2019 10:40 AM

@YangYY

The revised code was not tested. Here is the solution:

data want;
   set have;
   array k _numeric_;
   if nmiss(of _numeric_ ) =  dim(k) then delete;
run;

Also, the code shown by @Astounding can be adapted as:

data want;
   set have;
   if n(of _numeric_) NE 0;
run;

Hope this solves your difficulty. All the best.

YangYY · Posted 09-28-2019 12:00 AM

After trying both programs, the output data set still has 269 observations, the same as the original data, and the observations with all values missing are not deleted.
I am sure I understand the data sets "want" and "have".
Thanks anyway.

KachiM · Posted 09-28-2019 01:58 AM

@YangYY

Both programs work fine. I suspect your input data set. Run PROC PRINT on your data set to inspect it and check for any inconsistencies, particularly in the delimiters.

I was curious and tried your example data set. It works fine. See the output:

YangYY · Posted 09-28-2019 08:09 AM

I tried again and it still doesn't work.

Can you please copy my code and use the data set "combined" to try your code again and see if it works?

data world_attr;
set mapsgfk.world_attr;
run;

proc sort data=world_attr;
by IDNAME;
run;

data world_attr;
set world_attr;
rename IDNAME=country;
run;

data combined;
merge world_attr co2_emission;
by country;
run;

KachiM · Posted 09-28-2019 10:56 AM

@YangYY

I looked at the data for 176 countries in the Excel sheet giving CO2 emissions. I don't see the world_attr data set; you have not told me where it is located.

Further, in the Excel sheet I do not see "American Samoa", which you showed in your first post.

I insist that you closely examine your input data and do the homework before asking further questions. Show your log for all your steps. If it is voluminous, use a sample of 20 rows. I am not your programmer to do all your work.

Good luck finding the solution.

PeterClemmensen · Posted 09-25-2019 02:08 AM

Do you want to delete observations where all values are missing or observations where at least one value is missing?


YangYY · Posted 09-25-2019 02:12 AM

I want to delete the observations where all values are missing.
Thank you

Astounding · Posted 09-25-2019 07:14 AM

First, simplify the code. This comparison works but doesn't require arrays:

if n(of _numeric_)=0 ;

To find the 24 lost observations, you have to roll up your sleeves and examine the data. Perhaps there are some character variables to the right that need to be considered. Perhaps some numeric variables contain special missing values (such as .A or .B which are different from .) that should not be deleted. Or perhaps the answer you were given is wrong and 152 is correct.
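
One more thing worth checking, offered as a guess rather than a diagnosis: if the data set contains a numeric variable that is never missing (an ID column, for example), then n(of _numeric_) can never be 0 and no row is deleted. The sketch below uses a made-up data set; the names id, name, y1-y3, want_all_numeric and want_measures are hypothetical and stand in for whatever the real columns are called.

```sas
/* Hypothetical data: id is numeric and always populated. */
data have;
   input id name :$8. y1 y2 y3;
   datalines;
1 AAAAA 10 20 30
2 BBBBB 10 .  .
3 CCCCC  . .  .
;

/* id is counted too, so n() is never 0: all 3 rows are kept. */
data want_all_numeric;
   set have;
   if n(of _numeric_) > 0;
run;

/* Check only the measurement columns: row 3 is dropped, 2 rows remain. */
data want_measures;
   set have;
   if n(of y1-y3) > 0;
run;

/* Quick look at how many values are present and missing per column. */
proc means data=have n nmiss;
   var y1-y3;
run;
```

Listing the variables explicitly also keeps the check stable when a later merge adds more numeric columns.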
