BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Nitish1003
Obsidian | Level 7

Hi I am trying to find out difference between dates as per Name.

data test;
INPUT NAME $ DOB;
INFORMAT DOB DATE9.;
FORMAT DOB DATE9.;
DATALINES;
NITISH 10MAR2016
PRIYA 10DEC2016
PRIYA 01JAN2016

SAKSHI 20DEC2016

SAKSHI 20APR2016
SAKSHI 02JAN2016
;
RUN;

So output will be like

NAME            DIFF
NITISH        (10MAR2016-10MAR2016) i.e. 0 days
PRIYA         (10DEC2016-10DEC2016)
SAKSHI       (20DEC2016-20APR2016)
SAKSHI       (20DEC2016-02JAN2016)
SAKSHI       (20APR2016-02JAN2016)

 

Please help me to reach to solution. Itried using proc sort and appling lag in my dataset but was not able to achieve the results

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

I think you're looking for something like:

 

proc sql;                  
  create table want as    
    select a.*, count(a.name) as count, b.dob as dob2,
           abs(dob2-a.dob) as diff           
      from test a join test b
        on a.name eq b.name
          group by a.name
            order by a.id
  ;              
quit; 

Art, CEO, AnalystFinder.com

View solution in original post

8 REPLIES 8
collinelliot
Barite | Level 11

I don't understand your desired results. Priya, for example, has "PRIYA (10DEC2016-10DEC2016)" but I don't see any result for her second row of data with 01JAN2016.

 

Shouldn't there be a result for PRIYA 10DEC2016 and 01JAN2016 with a difference of 344 days?

 

If you can tighten up what you expect to see, it will be easier to come up with a solution. 

 

My apologies if I simply don't understand.

 

Thanks,

 

Collin

Nitish1003
Obsidian | Level 7

Sorry for the typo the output will be like 

NAME            DIFF
NITISH        (10MAR2016-10MAR2016) i.e. 0 days
PRIYA         (10DEC2016-01JAN016)
SAKSHI       (20DEC2016-20APR2016)
SAKSHI       (20DEC2016-02JAN2016)
SAKSHI       (20APR2016-02JAN2016)

 

This would ne my deisre reult. Please Collin if you check it and let me know how to solve this problem.

 

Thanks

Nitish

mkeintz
PROC Star

You want to calculate the maximum date range of DOB for each name, and then assign that diff to every record in the data set.  If your data test is grouped (but not neccessarily in ascending or descending order) by name, then:

 


data want (drop=mindob maxdob);
 do until (last.name);
   set test;
   by name notsorted;
   mindob=min(mindob,dob);
   maxdob=max(maxdob,dob);
 end;
 diff=maxdob-mindob;
 do until (last.name);
   set test;
   by name notsorted;
   output;
 end;
run;

 

This program;

  1. Read in all the records for a single id, while updating MINDOB and MAXDOB for that ID
  2. Calculates DIFF
  3. Re-reads all the records for the same id, and outputs the record, now with correct DIFF
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
art297
Opal | Level 21

I think you're looking for something like:

 

proc sql;                  
  create table want as    
    select a.*, count(a.name) as count, b.dob as dob2,
           abs(dob2-a.dob) as diff           
      from test a join test b
        on a.name eq b.name
          group by a.name
            order by a.id
  ;              
quit; 

Art, CEO, AnalystFinder.com

Nitish1003
Obsidian | Level 7

Hi Sir

Your solution allmost worked for me but facing one issue.I am getting duplicate data i.e.

Example for Priya 

Name     DOB             count     DOB2              diff

Priya    01JAN2016       4         10DEC2016    344

Priya    10DEC2016      4         01JAN2016     344

Is there way to Over come this because i want only one row.

 

art297
Opal | Level 21

Is the following what you want?:

 

data test;
  INPUT NAME $ DOB;
  INFORMAT DOB DATE9.;
  FORMAT DOB DATE9.;
  id=_n_;
  DATALINES;
NITISH 10MAR2016
PRIYA 10DEC2016
PRIYA 01JAN2016
SAKSHI 20DEC2016
SAKSHI 20APR2016
SAKSHI 02JAN2016
;
proc sql;                  
  create table want as    
    select a.*, count(a.name) as count, b.dob as dob2,
           abs(dob2-a.dob) as diff           
      from test a join test b
        on a.name eq b.name
          group by a.name
            having a.id le b.id
              order by a.id
  ;              
quit;

Art, CEO, AnalystFinder.com

Nitish1003
Obsidian | Level 7
Thanks sir for the help. I missed the Id part in first solution.Thanks a lot for help 🙂
mkeintz
PROC Star

@Nitish1003

 

The reason you saw duplicates in the output of @art297's first response is because that is how you displayed the desired output in your initial post.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 3877 views
  • 4 likes
  • 4 in conversation