BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
woodcook
Calcite | Level 5

Hello,

Please am new here. I am having a problem with my data set. I have been able to arrange it in a row format given that I have multiple failure time.

current data.

patid    start_time     f ailed_time     time1      time2            time3        ...         time20

1          1/20/1981     1/20/2009     3/4/1999    4/8/2000     8/18/2003   ...         .

2           3/15/2000     5/1/2012      2/3/2002    3/9/2008        .            ....         .

NB. All id do not have time for all time points. Just that when a time point for an id ends it ends all other subsequent time points for such id has . till time20.

The result I am expecting is: single row into multiple  follo

patid date0             date1

1      1/20/1981       3/4/1999

1       3/4/1999       4/8/2000

1      4/8/2000       8/18/2003

1     8/18/2003      1/20/2009

2   3/15/2000        2/3/2002

2   2/3/2002         3/9/2008

2   3/9/2008        5/1/2012.

ie

patid     date0            date1

1        start_time         time1

1          time1              time2

  1         time2              time3

1           time3             failed_time

2          start_time       time1

2          time1              time2

2          time2             failed_time

Thanks in anticipation of a swift response.

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

I'm confused.  The variables shown are not the same ones shown in your proc contents' result.

One possibility, other than the above discrepancy, is that the file was imported as mixed variable type.  If some of the start_dates are actually dates shown as numbers but in a character field, the following might work:

data have;

  infile cards truncover;

  input pat_id start_time $14. (fail_time time1-time13 time14__e) (:mmddyy10.);

  format fail_time time1--time14__e mmddyy10.;

  cards;

1  7690           1/20/2009     3/4/1999  4/8/2000  8/18/2003

2  14684          5/1/2012      2/3/2002  3/9/2008

3  1/20/1981      1/20/2009     3/4/1999  4/8/2000  8/18/2003

4  3/15/2000      5/1/2012      2/3/2002  3/9/2008

;

data want (keep=pat_id date0 date1);

  set have;

  if input(left(start_time), ?? mmddyy10.) gt 0 then

   start_t=input(left(start_time),mmddyy10.);

  else start_t=input(start_time,5.);

  format date0 date1 mmddyy10.;

  array t start_t time1--time14__e;

      do i=1 to dim(t) while (not missing(t(i)));

          date0=t(i);

          date1=ifn(missing(t(i+1)),fail_time,t(i+1));

          output;

       end;

run;

Otherwise, we would really have to see a couple of the actual records.

View solution in original post

25 REPLIES 25
Haikuo
Onyx | Level 15

Try this one.

data have;

infile cards truncover;

input id (start_time fail_time time1-time20) (:mmddyy10.);

format start_time fail_time time1-time20 mmddyy10.;

cards;

1          1/20/1981     1/20/2009     3/4/1999    4/8/2000     8/18/2003 

2           3/15/2000     5/1/2012      2/3/2002    3/9/2008

;

data want (keep=id date0 date1);

set have;

format date0 date1 mmddyy10.;

array t start_time time1-time20;

      do i=1 to dim(t) while (not missing(t(i)));

          date0=t(i);

          date1=ifn(missing(t(i+1)),fail_time,t(i+1));

          output;

       end;

run;

proc print;run;

Haikuo

woodcook
Calcite | Level 5

Hello Haiuko/Linli,

I guess the code was okay. But at my end I got an error message the following message;

1124    array _time(*) start_time time: fail_time;

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

ERROR: All variables in array list must be the same type, i.e., all numeric or character.

1125    do i=1 to dim(_time);

1126    date0=_time(i);

1127    if date0 ne . then output;

1128    end;

1129    format date0 mmddyy10.;

                     ---------

                     48

ERROR 48-59: The format $MMDDYY was not found or could not be loaded.

1130    run;

I guess the problem is with the start_time. I merged several files to obtain the data am supposed to "arrange using the code you gave me. I noticed that the start_time is in number format. I tried formatting it using mmddyy10 but it does seem to work. I have to export the data file to excel where I converted it to number. Because it appear to be in a format I can't tell (it appears in excel with the little green mark in front), i converted it in numeric format and then changed it into the date format 23/03/1981. Re-imported it into sas but still appear in the numeric form. I then ran the code you sent but got the above error message. Sas is still new to me, am just trying to get my way round it.

While am at it trying to find out what the problem is, should you have an idea of what may have gone wrong please I would appreciate.Meanwhile, I didn't use the input file part, because my data was store in excel file so I just imported the whole file.

art297
Opal | Level 21

run the following on your dataset and post the results back here:

proc contents data=the_name_of_your_SAS_dataset;

run;

proc print data=the_name_of_your_SAS_dataset (obs=5);

run;

woodcook
Calcite | Level 5

Hello Arthur,

I don't have the authorization to share that. The sample I posted above was a sample I carved out from what my data set looks like. If you could help with the above sample that would be great.

Thanks.

Haikuo
Onyx | Level 15

Since we kinda know how you data look like, now you need at least to tell us the type of those variables involved, character or numeric? You don't need to post every thing that proc content spit out, just info on type should do.

Haikuo

PGStats
Opal | Level 21

... then only give the column names and type with the result of :

proc sql;

select name, type from dictionary.columns

where upcase(memname)="YOUR_DATASET_NAME_UPPERCASE" and libname="WORK";

quit;

PG

PG
woodcook
Calcite | Level 5

Hello PG/Haiuto,

please is the info requested.

Alphabetic List of Variables and Attributes

               #    Variable      Type    Len    Format       Informat    Label

               3    Fail_time     Num       8    MMDDYY10.                Fail_time

               1    Pat_ID        Num       8    BEST12.                  Pat_ID

               2    start_time    Char     14    $14.         $14.        start_time

               4    time1         Num       8    MMDDYY10.                time1

               5    time2         Num       8    MMDDYY10.                time2

               6    time3         Num       8    MMDDYY10.                time3

               7    time4         Num       8    MMDDYY10.                time4

               8    time5         Num       8    MMDDYY10.                time5

               9    time6         Num       8    MMDDYY10.                time6

              10    time7         Num       8    MMDDYY10.                time7

              11    time8         Num       8    MMDDYY10.                time8

              12    time9         Num       8    MMDDYY10.                time9

              13    time10        Num       8    MMDDYY10.                time10

              14    time11        Num       8    MMDDYY10.                time11

              15    time12        Num       8    MMDDYY10.                time12

              16    time13        Num       8    MMDDYY10.                time13

              17    time14__e     Num       8    MMDDYY10.                time14

Haikuo
Onyx | Level 15

Ok, it seems that 'start_time' is the culprit. You need somehow convert it to numeric. Here is an approach based on my first response, please note that 'start_time' is $14., then it is converted to 'start_t' as mmddyy10., the latter goes into array().

data have;

infile cards truncover;

input id start_time :$14. (fail_time time1-time20) (:mmddyy10.);

format fail_time time1-time20 mmddyy10.;

cards;

1          1/20/1981     1/20/2009     3/4/1999    4/8/2000     8/18/2003

2           3/15/2000     5/1/2012      2/3/2002    3/9/2008

;

data want (keep=id date0 date1);

set have;

start_t=input(left(start_time),mmddyy10.);

format date0 date1 mmddyy10.;

array t start_t time1-time20;

      do i=1 to dim(t) while (not missing(t(i)));

          date0=t(i);

          date1=ifn(missing(t(i+1)),fail_time,t(i+1));

          output;

       end;

run;

proc print;run;

BTW, you don't need to refer my name in your reply, while if you do, please refer it correctly, thanks.

Haikuo

art297
Opal | Level 21

This is just a restatement of Haikuo's suggested code, but with variable names that match those shown on the proc contents results that you posted:

data have;

  infile cards truncover;

  input pat_id start_time $14. (fail_time time1-time13 time14__e) (:mmddyy10.);

  format fail_time time1--time14__e mmddyy10.;

  cards;

1  1/20/1981      1/20/2009     3/4/1999  4/8/2000  8/18/2003

2  3/15/2000      5/1/2012      2/3/2002  3/9/2008

;

data want (keep=pat_id date0 date1);

  set have;

  start_t=input(left(start_time),mmddyy10.);

  format date0 date1 mmddyy10.;

  array t start_t time1--time14__e;

      do i=1 to dim(t) while (not missing(t(i)));

          date0=t(i);

          date1=ifn(missing(t(i+1)),fail_time,t(i+1));

          output;

       end;

run;

woodcook
Calcite | Level 5

It ran but it dropped the first 53 observations. with the following error message for all the observations dropped

NOTE: Invalid argument to function INPUT at line 1857 column 9.

PatID=1 start_time=28764 Fail_time=06/06/2009 time1=. time2=. time3=. time4=. time5=. time6=.

time7=. time8=. time9=. time10=. time11=. time12=. time13=. time14=. start_t=. date0=. date1=.

_I_=. time15=. time16=. time17=. time18=. time19=. time20=. i=1 _ERROR_=1 _N_=1.

please how do i fix this?

art297
Opal | Level 21

I'm confused.  The variables shown are not the same ones shown in your proc contents' result.

One possibility, other than the above discrepancy, is that the file was imported as mixed variable type.  If some of the start_dates are actually dates shown as numbers but in a character field, the following might work:

data have;

  infile cards truncover;

  input pat_id start_time $14. (fail_time time1-time13 time14__e) (:mmddyy10.);

  format fail_time time1--time14__e mmddyy10.;

  cards;

1  7690           1/20/2009     3/4/1999  4/8/2000  8/18/2003

2  14684          5/1/2012      2/3/2002  3/9/2008

3  1/20/1981      1/20/2009     3/4/1999  4/8/2000  8/18/2003

4  3/15/2000      5/1/2012      2/3/2002  3/9/2008

;

data want (keep=pat_id date0 date1);

  set have;

  if input(left(start_time), ?? mmddyy10.) gt 0 then

   start_t=input(left(start_time),mmddyy10.);

  else start_t=input(start_time,5.);

  format date0 date1 mmddyy10.;

  array t start_t time1--time14__e;

      do i=1 to dim(t) while (not missing(t(i)));

          date0=t(i);

          date1=ifn(missing(t(i+1)),fail_time,t(i+1));

          output;

       end;

run;

Otherwise, we would really have to see a couple of the actual records.

woodcook
Calcite | Level 5

Hello Arthur,

It works but I noticed that a date that was supposed to be 1978 was output as 2038 ie 60 years were added to the "start_time date". Don't know why. If it is going to be too much trouble then I will have to run with it. The worse case scenario is I will have to correct the date manually. Though I have more than 60,000 patid.

Thanks a lot.

art297
Opal | Level 21

That might confirm my original thought that the data had been imported from Excel.

You can get the correct result by running the code as:

data want (keep=pat_id date0 date1);

  set have;

  if input(left(start_time), ?? mmddyy10.) gt 0 then

   start_t=input(left(start_time),mmddyy10.);

  else start_t=input(start_time,5.)-21916;

  format date0 date1 mmddyy10.;

  array t start_t time1--time14__e;

      do i=1 to dim(t) while (not missing(t(i)));

          date0=t(i);

          date1=ifn(missing(t(i+1)),fail_time,t(i+1));

          output;

       end;

run;

The correction, of subtracting 21916 from the numeric dates, is explained in the following paper:

http://www2.sas.com/proceedings/sugi29/068-29.pdf

woodcook
Calcite | Level 5

Hello Arthur,

that's perfect. I am really grateful for you help and time. also my thanks goes to Haikuo (pardon me I have to mention your name again) , Linlin and PG for your assistance as well. Thanks all and do have a great evening.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 25 replies
  • 9024 views
  • 6 likes
  • 7 in conversation