BookmarkSubscribeRSS Feed
Macro
Obsidian | Level 7

Hello Great SAS Gurus,

I have a longitudinal data set infile, 1000 rows of records, 63 columns (variables). Within the 63 variables, 3 varibles are fixed (not time dependent), called VarF and VarN, and cstatus_0. Other 60 variables are time-indexed, they are cstatus_k, status_k, m_k, for k = 1 to 20 and k is the year index. cstastus_k and status_k variables are simply indicator variables. My goal is quite simple, to summarize this data set to get one record of summarization consisting of time-indexed variables. This could be achieved through proc sql, data step, or proc means. Say, if I would like to do this by proc sql

proc sql;

  create table outfile as

  select   

    sum((cstatus_(k-1)=0)*VarF) format=comma25. as A_k,

    sum((cstatus_(k-1)=0)*VarN) format=comma25. as B_k,

    sum((cstatus_(k-1)=0)*m_k) format=comma25. as C_k,

    sum(calculated C_k *VarN) as D_k,

    sum(status_k=1) as E_k,

    sum((status_k=1)*VarN) as F_k

  from infile;

quit;

I actually want the above summary be done for k=1 to 20 so that I can have a summary file outfile consisting one record including following variables like

A_1, B_1, C_1, D_1, E_1, F_1, A_2, B_2, C_2, D_2, E_2, F_2, ...., A_20, B_20, C_20, D_20, E_20, F_20

I prefer not to transpose my infile to a long format file (ie, create a column to record the year index). Are there any efficient ways to achive my goal through abbreviated variable list, arrays, macro variables, looping with proc sql, or data step, or other summary procedures like proc means? Thanks.

11 REPLIES 11
Reeza
Super User

Can you post a small example of your data and expected output.

Too much text for a programmer/math person like me Smiley Wink

Macro
Obsidian | Level 7

 

VarF

 

VarN

 

M1

 

Status1

 

CStatus1

 

M2

 

Status2

 

CStatus2

 

M3

 

Status3

 

CStatus3

 

100000

 

11054.4

 

0.000297

 

0

 

0

 

0.000331

 

0

 

0

 

0.000368

 

0

 

0

 

100000

 

10963.28

 

0.000297

 

0

 

0

 

0.000331

 

0

 

0

 

0.000368

 

0

 

0

 

100000

 

11652.98

 

0.000431

 

0

 

0

 

0.000446

 

0

 

0

 

0.000498

 

0

 

0

 

200000

 

23408.25

 

0.00148

 

0

 

0

 

0.001582

 

0

 

0

 

0.001655

 

0

 

0

 

120000

 

14283.6

 

0.001615

 

0

 

0

 

0.001707

 

0

 

0

 

0.001766

 

0

 

0

 

120000

 

14323.5

 

0.001615

 

0

 

0

 

0.001707

 

0

 

0

 

0.001766

 

0

 

0

 

100000

 

11643.45

 

0.004211

 

0

 

0

 

0.004503

 

0

 

0

 

0.004886

 

0

 

0

 

100000

 

10947.15

 

0.000512

 

0

 

0

 

0.000531

 

0

 

0

 

0.000609

 

0

 

0

 

100000

 

10851.9

 

0.000512

 

0

 

0

 

0.000531

 

0

 

0

 

0.000609

 

0

 

0

 

200000

 

5258.85

 

0.000578

 

0

 

0

 

0.000621

 

0

 

0

 

0.000679

 

0

 

0

The above is the a small example of infile, I reduce 1000 rows to 10 rows, and 20 time-indexed variables to 3 time-indexed variabls. Expected output will be

as above with 20 replaced by 3, one summary record with varibles A_1, ...., F_1, A_2, ..., F_2, A_3,..., F_3

Reeza
Super User

Can you include the output you'd want?

From your SQL code you also refer to cstatus_(k-1) - but K starts at 1?

Macro
Obsidian | Level 7

Actually my infile contains a column called cstatus0 (or cstatus_0), which is missing in the small sample file above.  The output is the output from proc sql. I don't know what output you are refering to.

Reeza
Super User

The results you'd like to see based on the input provided.  The SQL you provided won't work with the data provided. 

Have the battle in solving a problem is defining it, so both you and someone else can understand it.

UrvishShah
Fluorite | Level 6

Hi,

I think it quite eassy by using %DO Loop within proc sql...But i am able to understand the Select caluse in your Proc Sql...However, if you want perform repeated action on so many variables in SAS than use %DO Loop within Macro Definition...

%macro test;

        proc sql feedback;

               create table outfile as

               select

                        %do i = 1 %to 20;

                               sum((cstatus_&i. = 0)*VarF) format=comma25. as A_&i.,

                               sum((cstatus_&i. = 0)*VarN) format=comma25. as B_&i.,

                               sum((cstatus_&i. = 0)*m_&i.) format=comma25. as C_&i.,

                               sum(calculated C_&i. *VarN) as D_&i.,

                               sum(status_&i. = 1) as E_&i.,

                               sum((status_&i. = 1)*VarN) as F_&i.

                        %end;

               from infile;

         quit;

%mend;


%test;

Hope it works....

Thanks,

Urvish

Reeza
Super User

I think an array in a data step is probably better and easier to maintain.

Macro
Obsidian | Level 7

Hi Reeza,

Thank you for the response. I would like to see your data step way to solve the same problem. Like what Urvish did, you just show the code and main idea, don't worry about small details.

UrvishShah
Fluorite | Level 6

Yes Reeza it is possible by using SAS Arrays in a data step...

Thanks,

Urvish


Macro
Obsidian | Level 7

Hi Urvish,

Thanks for your code which works. I used a similar code as yours before, but I got an error message. Then I thought that proc sql may not allow loops. Finally I found that error was caused by using sum() function to calculated variable, which was not allowed in sql. I guess this might be the reason that Reeza said my code did not work. Once I replaced the calculated variable by the expression used to calculate it, and used your looping, then it worked. Another thing I recognized is that your code should be modified by the following:

%macro test;

        proc sql feedback;

               create table outfile as

               select

                        %do i = 1 %to 19;

                               sum((cstatus_&i. = 0)*VarF) format=comma25. as A_&i.,

                               sum((cstatus_&i. = 0)*VarN) format=comma25. as B_&i.,

                               sum((cstatus_&i. = 0)*m_&i.) format=comma25. as C_&i.,

                               sum(calculated C_&i. *VarN) as D_&i.,

                               sum(status_&i. = 1) as E_&i.,

                               sum((status_&i. = 1)*VarN) as F_&i.,

                        %end;

                             sum((cstatus_&20. = 0)*VarF) format=comma25. as A_&20.,  

                               sum((cstatus_&20. = 0)*VarN) format=comma25. as B_&20.,

                               sum((cstatus_&20. = 0)*m_&20.) format=comma25. as C_&20.,

                               sum(calculated C_&20. *VarN) as D_&20.,

                               sum(status_&20. = 1) as E_&20.,

                               sum((status_&20. = 1)*VarN) as F_&20.

               from infile;

         quit;

%mend;


%test;

since proc sql need to use comma to separate variables. I don't know why you use period after each &var. I removed them, it also worked. Thanks again for your help.

UrvishShah
Fluorite | Level 6

Hi...

Ya you are right...Actually when i saw your Question posted in this forum, i was at work...and i do not have SAS at my local so can't copy and paste your sample data to my server...So i thought it works but it's good that you god idea about how to get output...

Happy to discuss with you...

Thanks,

Urvish

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1196 views
  • 0 likes
  • 3 in conversation