SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Subsetting observations

Reply
New User
Posts: 1

Subsetting observations

Hello,

 

Suppose I have this dataset:

 

Name   Value

A              --

A              22

A              --

A              --

A              35

A              --

A              --

B              --

B              3

B              7

B              --

B              --

B              90

B              --

 

What I want to accomplish is that for every "Name" I want to subset the observations between first time the value was observed and last time the value was observed.

Desired dataset:

Name       Value

A               22

A               -- 

A               --

A               35

B               3

B               7

B               --

B               -- 

B               90

 

Please help. Thanks.

PROC Star
Posts: 1,095

Re: Subsetting observations

I think this is going in the right direction.

 

Tom

 

data have;

length NAME $8;

input NAME VALUE;

SeqNo = _N_;

cards;

A .

A 22

A .

A .

A 35

A .

A .

B .

B 3

B 7

B .

B .

B 90

B .

run;

data inter1;

set have;

by NAME;

retain GoodSwitch;

if first.NAME then

GoodSwitch = 0;

if ^missing(VALUE) then

GoodSwitch = 1;

run;

proc sort data=inter1 out=inter2;

by descending SeqNo;

run;

data inter3;

set inter2;

by NAME notsorted;

retain FirstFound;

if first.NAME then

FirstFound = 0;

if ^missing(VALUE) then

FirstFound = 1;

if FirstFound = 0 then

GoodSwitch = 0;

run;

proc sort data=inter3 out=inter4;

by SeqNo;

run;

data want;

set inter4;

if GoodSwitch then

output;

run;

Respected Advisor
Posts: 4,654

Re: Subsetting observations

The double do until is ideal for this

 

data have;
input Name $  Value :??best.;
datalines;
A              --
A              22
A              --
A              --
A              35
A              --
A              --
B              --
B              3
B              7
B              --
B              --
B              90
B              --
C              --
;

data want;
do i=1 by 1 until(last.name);
    set have; by name;
    if not missing(value) then do;
        if missing(first) then first = i;
        last = i;
        end;
    end;
do i=1 by 1 until(last.name);
    set have; by name;
    if i >= first then
        if i <= last then output;
    end;
drop first last i;
run;

proc print; run;

PG
Ask a Question
Discussion stats
  • 2 replies
  • 330 views
  • 0 likes
  • 3 in conversation