BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jackie32
Calcite | Level 5

Hi,

I'm not sure if this is the right community to post in, but this is my issue. I have a dataset with millions of rows set up like this:

Record NumberDOS
1

1/1/2006

26/5/2007
11/24/2009
37/1/2011
45/22/2012
38/14/2010

What I am trying to make is a data set that looks like this (with a first DOS and a binary for if they had a date of service in each year):

Record NumberFirst DOS20062007200820092010201120122013
11/1/200610010000
26/5/200701000000
38/14/201000001100
45/22/201200000010

I have no idea how to get from A to B. I'm very new to SAS and any help would be much appreciated!

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
DBailey
Lapis Lazuli | Level 10

I think this might be passed to the underlying database to do the work:

proc sql;

create table want as

select

     Record_Number,

     Min(dos) as First_DOS,

     max(case when dos>='01jan2006:0:0'dt and dos < '01Jan2007:0:0'dt  then 1 else 0 end) as DOS_2006,

     max(case when dos>='01jan2007:0:0'dt and dos < '01Jan2008:0:0'dt  then 1 else 0 end) as DOS_2007

from

     have

group by record_number;

quit;

View solution in original post

9 REPLIES 9
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Well you have two parts to your request there.  Firstly getting a transposed list of years, the second part being getting the first DOS date.  See the steps below:

data have;
  record_num=1; dos='01JAN2006'd; output;
  record_num=2; dos='06MAY2006'd; output;
  record_num=1; dos='24JAN2009'd; output;
run;

data inter;  /* Assign a text year */
  set have;
  length year $4.;
  year=put(year(dos),4.);
  flag=1;
run;

proc sort data=inter nodupkey;  /* Get rid of duplicates as we only need one per rec-num */
  by record_num year;
run;
proc transpose data=inter out=t_want;  /* Transpose the data to get yearly values */
  by record_num;
  var flag;
  id year;
  idlabel year;
run;

proc sql;   /* Merge on the minimum date */
  create table WANT (drop=_name_) as
  select  A.*,
          B.MIN_DOS format=date9.
  from    T_WANT A
  left join (select distinct RECORD_NUM,MIN(DOS)as MIN_DOS from HAVE group by RECORD_NUM) B
  on      A.RECORD_NUM=B.RECORD_NUM;
quit;

jackie32
Calcite | Level 5

Thanks- I can't do the first step though. The data already exists through an ODBC connection and is 65+ million rows, so I'm not sure how to go through and reformat the dates. I tried the second step without doing the first, just in case because it is set up like 25JAN2014 (I misspoke before) but also has 0:00:0000:00 included in the date, so I'm not sure how to complete the first step.

data_null__
Jade | Level 19

Look up the DATEPART function.

stat_sas
Ammonite | Level 13

data have;

input RecordNumber DOS :mmddyy10.;

format DOS mmddyy10.;

datalines;

1 1/1/2006

2 6/5/2007

1 1/24/2009

3 7/1/2011

4 5/22/2012

3 8/14/2010

;

data want(drop=i);

set have;

array yr{*} y2006-y2013;

do i=2006 to 2013;

if year(dos)=i then yr{i-2005}=1;

else yr{i-2005}=0;

end;

run;

art297
Opal | Level 21

My suggestion is very similar to stat's:

data have;

  informat record_number 8.;

  informat dos anydtdtm.;

  infile cards dlm=',';

  input record_number dos;

  cards;

1,01JAN2006 14:30:08

2,05JUN2007 14:30:08

1,24JAN2009 14:30:08

3,01JUL2011 14:30:08

4,22MAY2012 14:30:08

3,22AUG2010 14:30:08

;

proc sort data=have out=want;

  by record_number descending dos;

run;

data want;

  format First_DOS mmddyy10.;

  set want (rename=(dos=First_DOS));

  by record_number;

  array years(2006:2012) _2006-_2012;

  retain years;

  if first.record_number then do _n_=2006 to 2012;

    years(_n_)=0;

  end;

  years(year(datepart(First_DOS)))=1;

  First_DOS=datepart(First_DOS);

  if last.record_number then output;;

run;

DBailey
Lapis Lazuli | Level 10

I think this might be passed to the underlying database to do the work:

proc sql;

create table want as

select

     Record_Number,

     Min(dos) as First_DOS,

     max(case when dos>='01jan2006:0:0'dt and dos < '01Jan2007:0:0'dt  then 1 else 0 end) as DOS_2006,

     max(case when dos>='01jan2007:0:0'dt and dos < '01Jan2008:0:0'dt  then 1 else 0 end) as DOS_2007

from

     have

group by record_number;

quit;

jackie32
Calcite | Level 5

Thank you SO much! This worked amazingly well - I seriously appreciate it so much.

jackie32
Calcite | Level 5

Got it to work! Thank you everyone for your help- I'd be lost otherwise! I really appreciate all your input!

Ksharp
Super User
data have;
input RecordNumber DOS :mmddyy10.;
format DOS mmddyy10.;
datalines;
1 1/1/2006
2 6/5/2007
1 1/24/2009
3 7/1/2011
4 5/22/2012
3 8/14/2010
;
run;

proc sort data=have; by RecordNumber DOS;run;
data have;
 set have;
 by  RecordNumber;
 retain first_dos var 1;
 if first.RecordNumber then first_dos=DOS;
 year=year(DOS);
 format  first_dos date9.;
run;
proc transpose data=have out=temp(drop=_name_) prefix=_;
 by RecordNumber first_dos;
 id year;
 var var;
run;
proc stdize data=temp out=want missing=0 reponly;run;

Xia Keshan

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1102 views
  • 7 likes
  • 7 in conversation