BookmarkSubscribeRSS Feed
sasecn
Quartz | Level 8

I have a given data that indicates user active proportion for each month. It depends on the active days during each month. Given this, I tried to write loop to get active proportion for each quarter. Just cannot find a better way to index the loop. For example, one user active only in January, then the active proportion during first quarter (Jan. Feb. and March) would be 31/(31+28+31), or the average of active proportion for Jan, Feb and March.

Thanks for any help!

given data like:

userIDac12022ac22022ac32022ac42022ac52022ac62022ac72022ac82022ac92022ac102022ac112022ac122022
11.001.001.001.001.001.001.001.001.001.001.001.00
20.000.381.001.001.001.001.001.001.001.001.001.00

wanted data like: (e.g. first quarter active proportion = (0 + 0.38 +1) / 3

userIDacq12022acq22022acq32022acq42022
11.001.001.001.00
20.461.001.001.00
5 REPLIES 5
PaigeMiller
Diamond | Level 26

You are working with variables which have calendar information in the variable name. This is never a good idea, and it is the cause of all of your programming problems. Better to create a long data set with calendar information in a variable called MONTH.

 

So lets create example data in the improved (long) layout.

 

data example;
    do user_id=1 to 2;
        do month=1 to 12;
	        mdy=mdy(month,1,2022);
	        /* create random value for proportion */
	        prop=rand('uniform');
	        output;
        end;
    end;
run;

 

Now the programming to get quarterly averages is simple, no looping needed.

 

proc summary data=example nway;
    class user_id mdy;
    format mdy yyq6.;
    var prop;
    output out=_stats_ mean=quarter_mean;
run;

 

--
Paige Miller
sasecn
Quartz | Level 8

Thanks for your reply. It works on the sample data I provided. The real data I am working on has more other types of variables. And I cannot change the whole data structure to long data. Ideally, I would like just add quarterly active proportion variable after each quarter. Part of the data like:

userIDac12022ac22022ac32022acq12022ac42022ac52022ac62022acq22022ac72022
11.001.001.001.001.001.001.001.001.00
20.000.381.000.791.001.001.001.001.00
PaigeMiller
Diamond | Level 26

@sasecn wrote:

And I cannot change the whole data structure to long data.


Yes you can. And yes you should. This makes the calculations much much much much easier. From there, you can create the report in any format you want.

--
Paige Miller
ballardw
Super User

@sasecn wrote:

Thanks for your reply. It works on the sample data I provided. The real data I am working on has more other types of variables. And I cannot change the whole data structure to long data. Ideally, I would like just add quarterly active proportion variable after each quarter. Part of the data like:

userID ac12022 ac22022 ac32022 acq12022 ac42022 ac52022 ac62022 acq22022 ac72022
1 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
2 0.00 0.38 1.00 0.79 1.00 1.00 1.00 1.00 1.00

Please note that inserting or adding the quarter value is changing the structure of the data set. You can't add a "column" to a data set without replacing the entire set in SAS. Any of the steps you take to modify the data before creating that final set is entirely up to you and the provider would never know.

 

Additional to the bit about the date information in the variable name you chose a very complicated method. If you had used a Year/Mo approach, such as AC20221 (better AC202201) then you could reference variables as lists and get them to process in order: Ac202201-Ac202203 for instance in many place would process those three sequential named variables in order.

 

The process you are currently contemplating means that the code that works this time would need significant revision, i.e. new variable names in more places and conditional executions when you get the next data with 1, 2 or 3 more months added. For each month.

Kurt_Bremser
Super User

Maxim 19: Long Beats Wide.

Do not keep data (e.g. dates or date-related information) in structure (variable names). If you have other, non-sequential variables, keep them in one dataset with one observation per userid, but transform the other values into a long dataset.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 739 views
  • 0 likes
  • 4 in conversation