BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
devsas
Pyrite | Level 9

Thanks in advance!

Basically I want to create new columns with dates starting from date_from date and going until days_supply -1 and populate the valid dates with perday number. details below

data have;

input id $ id2 $ date_from drug $ days_supply perday;

cards;

xxxxxx1 yyyyyy1 6/24/2016 HYDROMORPHONE HCL 8 60
xxxxxx2 yyyyyy2 7/5/2016 MORPHINE SULFATE ER 14 120
xxxxxx3 yyyyyy3 7/5/2016 FENTANYL 15 60
xxxxxx4 yyyyyy4 7/5/2016 TRAMADOL HCL 14 40

;

run

 

I want the output to look like this-

 

id1 id2 drug 6/24/2016 6/25/2016 6/26/2016 6/27/2016 6/28/2016 6/29/2016 6/30/2016 7/1/2016 7/2/2016 7/3/2016 7/4/2016 7/5/2016 7/6/2016 7/7/2016 7/8/2016 7/9/2016 7/10/2016 7/11/2016 7/12/2016 7/13/2016 7/14/2016 7/15/2016 7/16/2016 7/17/2016 7/18/2016 7/19/2016
xxxxxx1 yyyyyy1 HYDROMORPHONE HCL 60 60 60 60 60 60 60 60                                    
xxxxxx2 yyyyyy2 MORPHINE SULFATE ER                       120 120 120 120 120 120 120 120 120 120 120 120 120 120  
xxxxxx3 yyyyyy3 FENTANYL                       60 60 60 60 60 60 60 60 60 60 60 60 60 60 60
xxxxxx4 yyyyyy4 TRAMADOL HCL                       40 40 40 40 40 40 40 40 40 40 40 40 40 40  
1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Getting closer then.  The variable names are still not valid because of the slashes.  I would recommend putting the year first in the variable names, such as d_20160624.  That way, the variable names will be in the same order as the dates they represent.

 

Follow-up questions to get the next step:

 

Do you want a variable for every date in the time period even if no meds were used on that date?

 

Do you know the time period, or do you need to find it based on the data?

View solution in original post

9 REPLIES 9
Astounding
PROC Star

You would be well advised to actually discuss your intended result.  There may be much easier ways to get there.

 

Regarding what you asked for:

 

  • The column names you illustrated are not valid names for variables in SAS.  You would have to supply different names.
  • The number of new variables could be in the thousands.  How would that be useful to you?
devsas
Pyrite | Level 9

Thanks a lot for your reply. To answer your questions, i know about that..perhaps i was thinking columns starting with d_6/24/2016 and so on. I wont have thousands of variable names, maximum 300-400 variable names for all dates in a year.

The intended result output is the same as i posted above. I tried using this code, but its not getting me much.

 

data want;
set have;
array i _numeric_;
do i=date_from to ((date_from + days_supply) -1);
i= perday;
end;
run;

Astounding
PROC Star

Getting closer then.  The variable names are still not valid because of the slashes.  I would recommend putting the year first in the variable names, such as d_20160624.  That way, the variable names will be in the same order as the dates they represent.

 

Follow-up questions to get the next step:

 

Do you want a variable for every date in the time period even if no meds were used on that date?

 

Do you know the time period, or do you need to find it based on the data?

devsas
Pyrite | Level 9

Thanks again! To answer your questions.

 

1. ideally, yes, but if there is a date on which none of the id's used the drug, its fine not to have that date column. 

 

2. Based on the data-meaning start the columns from the first date which appears and end with the last date possible. For example, in the given sample, the first date column starts from 20160624 and end with 20160719.

Astounding
PROC Star

OK, I might just take the easy way out here.  The results will be difficult to work with no matter what, but all the necessary dates will be there.  So ...

 

data halfway_there;

set have;

do date = date_from to date_from + days_supply - 1;

   column_header = 'd_' || put(date, yymmddn8.);

   output;

end;

drop date;

run;

 

proc transpose data=halfway_there out=want;

   by id1 id2 drug;

   var days_supply;

   id column_header;

run;

 

There may be an extra variable or two hanging around that you need to drop, and you will need to sort your data set ahead of time.  But it will get you there with a minimal amount of coding.

 

(It's also untested.  I don't expect problems, but let me know if any debugging is needed.)

devsas
Pyrite | Level 9

thanks so much, it worked. var was perday number not days_supply, but that was a minor thing and yes i dropped _name_ and_label_ variables.

ballardw
Super User

How are you going to use the result?

There is great potential for a difficult to manage dataset with hundreds of variables.

Shmuel
Garnet | Level 18

As your final datae will hold variables for all dates created in all observations,

even if some of them are nor relevant and shall be missing value, I would prefer

calculate the minimum of date_from, the maximum of date_last, how many dates should be in the array

and create a list of variable names per date as a mcaro variable.

On next step declare the list of variables and an array to parse the per_day:

 

data _NULL_;

 set have end=eof;

       retain min_date max_date;

       length var_names $1000;   /* up to 800 variable names each of 12 characters long and a space as seperator */

       if  _N_ = 1 then do;

           min_date = date_from;

           max_date = date_from + days_supply;

      end; else do;

           min_date = min( min_date, date_from);

           max_date = max(max_date, date_from + days_supply);

      end;

      if eof then do;

         dates_count = max_date - min_date +1;

         do i=min_date to max_date;

               vname =  cat('V_', put(i , yymmdd10.);

               var_names = catx(' ', var_names , vname);

        end;

      PUT dates_count=   min_date= date9.   max_date= date9. ;  /* check in LOG */ 

       call symput('dates_count',  left(dates_count));

       call symput('date_min', left(min_date));;

       call symput('var_names' , trim(var_names);

   end;

run;

 

data want;

  set have;

        length &var_names 8;   /* thus adding/declaring the variables */

        array dtx {&dates_count} &var_names;

        retain date_min = symget('date_min');

        i1 = date_from - date_min +1;

        do i = i1 to i1 + days_supply -1;

              dtx(i) = perday;

       end;

run;

devsas
Pyrite | Level 9

Thanks for the solution, i will try this one too.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1830 views
  • 1 like
  • 4 in conversation