BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

Hello all,

 

I would like to create subset of observations based on a variable limiting the count to 2 or 1 depending on the value in that variable.

 

data want;

  infile want_file;

    input

      @ 001 ln_no                   9.

      @ 010 ln_status             $1.

      @ 011 ln_excep_stat      $1

    ;

 

ln_status can have values of '1', 2', '3', '4' or '5'  ;

ln_excep_stat can have values of  '1', '2', '3' or '4';

 

If ln_excep_stat is '2', '3' or '4' and ln_stat is '2', '3' or '4', need to limit the observations to 2 obs if there are more than 2 per ln_no.

If ln_excep_stat is '1' or '5' and ln_stat is '1' or '5', need to limit to 1 if there are more than 2.

 

I really appreciate your time and effort on this.

 

thank you in advance,

 

Neal.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

@saslovethemost wrote:

for the 1 record, I want the bottom one, for the two records, I want to have bottom two records.


Your example above was incorrect then or is there some other variable that indicates order?

 

Either way, here's one long winded approach, yes there are easier ways.

 

ata cobdla;
set cobdla;
by ln_no;
if first.ln_no then count=1;
else count+1;
run;

proc sort data=cobdla; 
by ln_no descending count;
run;

data want;
set cobdla;
by ln_no;
retain max;

if first.ln_no then max=count;

if (ln_status in ('2','3','4') or ln_excep_stat in ('2','3','4')) and count >= (max - 1) then flag=1;
else if (ln_status in ('1', '5') or ln_excep_stat in ('1','5'))  and count=max then flag=2;
else flag=3;

if flag in (1,2);

run;

View solution in original post

18 REPLIES 18
PaigeMiller
Diamond | Level 26

Please provide the data, using these instructions: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

Do not provide the data in any other form.

 

Also, show us the code you have tried already.

--
Paige Miller
saslovethemost
Quartz | Level 8
data combdla;
 infile datalines dsd truncover;
 input ln_no:9. ln_status:$1. ln_excep_stat:$1. name:$4.;
 datalines;
 999114964 2 3 NEAL
 999114964 2 3 DAVE
 999114964 2 3 PHIL
;

data combdlq;
  set combdla;
  if ln_status=('2' or '3' or '4') and
     ln_excep_stat=('2' or '3' or '4') then
     do;
         noobs=2;
    end;
  else ln_status=('1' or '5') and
          ln_excep_stat =('1' or '5')
          do;
              noobs=2;
          end;
;

I have tried the above code, not getting the right results. is there a way that SAS can do it in a different way than structured language like COBOL?

saslovethemost
Quartz | Level 8

sorry, in the second data statement should be like the following:

data combdlq;
  set combdla;
  if ln_status=('2' or '3' or '4') and
     ln_excep_stat=('2' or '3' or '4') then
     do;
         noobs=2;
    end;
  else if ln_status=('1' or '5') and
             ln_excep_stat =('1' or '5')
          do;
              noobs=1;
          end;
  else
       do;
             noobs =1;
       end; 
;
PaigeMiller
Diamond | Level 26

It's still not clear to me what the requirement "need to limit the observations to 2 obs if there are more than 2 per ln_no" means. Show us the desired output.

--
Paige Miller
saslovethemost
Quartz | Level 8
I think I am trying to get 2 observations out of 3 input datalines.

output should be like below:
999114964 2 3 NEAL
999114964 2 3 DAVE
PaigeMiller
Diamond | Level 26

It's very tough to generalize from one example, so that code can be written that works in all cases. Are there rules that determine why you pick NEAL and DAVE in this case, and not two other records?

--
Paige Miller
Reeza
Super User
It would be really helpful if you could lay out your problem. Show us what you have, what you've tried and what you expect as output.

Currently, your input looks like your output with the exception of the last record. But how did you know you need 2 vs 1?

I would also argue that this would be much easier to first read in and then filter or at least control the output. Is it guaranteed that the ln_status will be the same across the IDs? What uniquely identifies each group?
saslovethemost
Quartz | Level 8
data cobdla;
   infile datalines dsd truncover;
   input ln_no:9. ln_status:$1. ln_excep_stat:$1. name:$4.;
   datalines;
            999114961 2 3 NEAL
            999114961 2 3 DAVE
            999114961 2 3 PHIL
            999114962 3 4 JOHN
            999114962 3 4 RICK
            999114962 3 4 LILI
            999114963 4 5 JOHN
            999114963 4 5 RICK
            999114963 4 5 LILI
            999114964 1 5 JOHN
            999114964 1 5 RICK
;
sorry for the confusion I created, 

Here is what I want: last two records for each ln_no if the ln_status and ln_excep_stat is in '2' '3' or '4', if they are '1' and '5' I want only 1 record.

 

 

Tom
Super User Tom
Super User

A hint for using in-line data in a data step.  Do not indent the data.  Figuring out what column to read what column is causing errors, editing etc. will be much harder.  To make it easier also do not indent the DATALINES (or CARDS) statement that comes before that data or the semi-colon line (or four semi-colon line) that marks the end of the data (and the end of the data step).

Reeza
Super User

For the 1 record, does it matter which one? I noticed for the loan status you said last two but in your example here you actually picked the first two. Can you confirm? Its much easier if it's the first two rather than last two - because you don't know how many there are until you're done.

saslovethemost
Quartz | Level 8

for the 1 record, I want the bottom one, for the two records, I want to have bottom two records.

Reeza
Super User

@saslovethemost wrote:

for the 1 record, I want the bottom one, for the two records, I want to have bottom two records.


Your example above was incorrect then or is there some other variable that indicates order?

 

Either way, here's one long winded approach, yes there are easier ways.

 

ata cobdla;
set cobdla;
by ln_no;
if first.ln_no then count=1;
else count+1;
run;

proc sort data=cobdla; 
by ln_no descending count;
run;

data want;
set cobdla;
by ln_no;
retain max;

if first.ln_no then max=count;

if (ln_status in ('2','3','4') or ln_excep_stat in ('2','3','4')) and count >= (max - 1) then flag=1;
else if (ln_status in ('1', '5') or ln_excep_stat in ('1','5'))  and count=max then flag=2;
else flag=3;

if flag in (1,2);

run;
saslovethemost
Quartz | Level 8

Thanks Reeza for the reply., please see the reply I sent to Tom.

 

I pasted it below for your reference.

 

"Thanks Tom for the reply, sorry I misguided you with my code, but I do not want the no of obs values, I want the actual data mentioned in the datalines but subset of it. For a ln_no if there are more than 2 records, I want to retain last 2 records for ln_stat and ln_excep_status '2' or '3' or '4' . same way, I want to retain 1 record if ln_stat and ln_excep_status '1' or '5'  if there are more than 1 record. If there is only one record, I want to retain that one record. sorry for all the confusion".

Reeza
Super User

@saslovethemost wrote:

Thanks Reeza for the reply., please see the reply I sent to Tom.

 

I pasted it below for your reference.

 

"Thanks Tom for the reply, sorry I misguided you with my code, but I do not want the no of obs values, I want the actual data mentioned in the datalines but subset of it. For a ln_no if there are more than 2 records, I want to retain last 2 records for ln_stat and ln_excep_status '2' or '3' or '4' . same way, I want to retain 1 record if ln_stat and ln_excep_status '1' or '5'  if there are more than 1 record. If there is only one record, I want to retain that one record. sorry for all the confusion".


That's exactly what my code does. Did you run it and check the output?

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 18 replies
  • 1400 views
  • 0 likes
  • 4 in conversation