BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jackie32
Calcite | Level 5

Hi,

I'm not sure if this is the right community to post in, but this is my issue. I have a dataset with millions of rows set up like this:

Record NumberDOS
1

1/1/2006

26/5/2007
11/24/2009
37/1/2011
45/22/2012
38/14/2010

What I am trying to make is a data set that looks like this (with a first DOS and a binary for if they had a date of service in each year):

Record NumberFirst DOS20062007200820092010201120122013
11/1/200610010000
26/5/200701000000
38/14/201000001100
45/22/201200000010

I have no idea how to get from A to B. I'm very new to SAS and any help would be much appreciated!

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
DBailey
Lapis Lazuli | Level 10

I think this might be passed to the underlying database to do the work:

proc sql;

create table want as

select

     Record_Number,

     Min(dos) as First_DOS,

     max(case when dos>='01jan2006:0:0'dt and dos < '01Jan2007:0:0'dt  then 1 else 0 end) as DOS_2006,

     max(case when dos>='01jan2007:0:0'dt and dos < '01Jan2008:0:0'dt  then 1 else 0 end) as DOS_2007

from

     have

group by record_number;

quit;

View solution in original post

9 REPLIES 9
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Well you have two parts to your request there.  Firstly getting a transposed list of years, the second part being getting the first DOS date.  See the steps below:

data have;
  record_num=1; dos='01JAN2006'd; output;
  record_num=2; dos='06MAY2006'd; output;
  record_num=1; dos='24JAN2009'd; output;
run;

data inter;  /* Assign a text year */
  set have;
  length year $4.;
  year=put(year(dos),4.);
  flag=1;
run;

proc sort data=inter nodupkey;  /* Get rid of duplicates as we only need one per rec-num */
  by record_num year;
run;
proc transpose data=inter out=t_want;  /* Transpose the data to get yearly values */
  by record_num;
  var flag;
  id year;
  idlabel year;
run;

proc sql;   /* Merge on the minimum date */
  create table WANT (drop=_name_) as
  select  A.*,
          B.MIN_DOS format=date9.
  from    T_WANT A
  left join (select distinct RECORD_NUM,MIN(DOS)as MIN_DOS from HAVE group by RECORD_NUM) B
  on      A.RECORD_NUM=B.RECORD_NUM;
quit;

jackie32
Calcite | Level 5

Thanks- I can't do the first step though. The data already exists through an ODBC connection and is 65+ million rows, so I'm not sure how to go through and reformat the dates. I tried the second step without doing the first, just in case because it is set up like 25JAN2014 (I misspoke before) but also has 0:00:0000:00 included in the date, so I'm not sure how to complete the first step.

data_null__
Jade | Level 19

Look up the DATEPART function.

stat_sas
Ammonite | Level 13

data have;

input RecordNumber DOS :mmddyy10.;

format DOS mmddyy10.;

datalines;

1 1/1/2006

2 6/5/2007

1 1/24/2009

3 7/1/2011

4 5/22/2012

3 8/14/2010

;

data want(drop=i);

set have;

array yr{*} y2006-y2013;

do i=2006 to 2013;

if year(dos)=i then yr{i-2005}=1;

else yr{i-2005}=0;

end;

run;

art297
Opal | Level 21

My suggestion is very similar to stat's:

data have;

  informat record_number 8.;

  informat dos anydtdtm.;

  infile cards dlm=',';

  input record_number dos;

  cards;

1,01JAN2006 14:30:08

2,05JUN2007 14:30:08

1,24JAN2009 14:30:08

3,01JUL2011 14:30:08

4,22MAY2012 14:30:08

3,22AUG2010 14:30:08

;

proc sort data=have out=want;

  by record_number descending dos;

run;

data want;

  format First_DOS mmddyy10.;

  set want (rename=(dos=First_DOS));

  by record_number;

  array years(2006:2012) _2006-_2012;

  retain years;

  if first.record_number then do _n_=2006 to 2012;

    years(_n_)=0;

  end;

  years(year(datepart(First_DOS)))=1;

  First_DOS=datepart(First_DOS);

  if last.record_number then output;;

run;

DBailey
Lapis Lazuli | Level 10

I think this might be passed to the underlying database to do the work:

proc sql;

create table want as

select

     Record_Number,

     Min(dos) as First_DOS,

     max(case when dos>='01jan2006:0:0'dt and dos < '01Jan2007:0:0'dt  then 1 else 0 end) as DOS_2006,

     max(case when dos>='01jan2007:0:0'dt and dos < '01Jan2008:0:0'dt  then 1 else 0 end) as DOS_2007

from

     have

group by record_number;

quit;

jackie32
Calcite | Level 5

Thank you SO much! This worked amazingly well - I seriously appreciate it so much.

jackie32
Calcite | Level 5

Got it to work! Thank you everyone for your help- I'd be lost otherwise! I really appreciate all your input!

Ksharp
Super User
data have;
input RecordNumber DOS :mmddyy10.;
format DOS mmddyy10.;
datalines;
1 1/1/2006
2 6/5/2007
1 1/24/2009
3 7/1/2011
4 5/22/2012
3 8/14/2010
;
run;

proc sort data=have; by RecordNumber DOS;run;
data have;
 set have;
 by  RecordNumber;
 retain first_dos var 1;
 if first.RecordNumber then first_dos=DOS;
 year=year(DOS);
 format  first_dos date9.;
run;
proc transpose data=have out=temp(drop=_name_) prefix=_;
 by RecordNumber first_dos;
 id year;
 var var;
run;
proc stdize data=temp out=want missing=0 reponly;run;

Xia Keshan

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2571 views
  • 7 likes
  • 7 in conversation