sas code help

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 8
Accepted Solution

sas code help

hi i need help in writing a code for the following data set and output 

ID         variable_1              variable_2              figure

A          3456                    3452                    200                       

A          3456                    3453                   300                       

A          3456                    3454                    400                      

A          3456                    3455                   2400

A          3456                    3456                    4200

A          3456                    3457                    2020                      

A          3456                    3458                   2200                       

A          3456                    3459                    1200                        

A          3456                    3460                    20

A          3458                    3454                    2700

A          3458                    3455                    2200                       

A          3458                    3456                    7200

A          3458                    3457                    3200

A          3458                    3458                    2300

A          3458                    3459                    4200

A          3458                    3460                    200

A          3458                    3461                    2300

A          3458                    3462                    5200

I have similar data for more ID's...like B, C, D where variable_1 takes two values and correspondingly variable_2 takes values equal to, lower and higher values than variable_1. and for each combination, I have a different figure value. now I want to take sum  of figure values for 2 entries up and 3 entries down of row where variable_1 = variable_2. and I want the final output as:

ID     variable_1     sum_figure_up         sum_figure_down

A     3456               2800                         5420

A     3458               10400                         6700


for all ID's and variable_1 combinations. PLEASE HELP ASAP!!!


DONT USE SQL.


Accepted Solutions
Solution
‎02-24-2014 12:55 PM
Super User
Posts: 17,829

Re: sas code help

data want;

  set temp;

  by id line;

  retain ind1 ind2; *keep ind1/ind2 values between rows;

  diff=bill-line; *Calculate the difference;

  if first.line then do; *If this is the first line then set summations to 0;

       ind1=0; ind2=0;

   end;

  if diff in (-1, -2) then ind1=sum(bal, ind1); *Use a rolling sum to capture the sums;

  if diff in (1,2,3) then ind2=sum(bal, ind2);

  if last.line then output; *only output on last line;

  drop diff;

run;


View solution in original post


All Replies
Super User
Posts: 17,829

Re: sas code help

Can we assume that var2 will be sequential as in the example?

Occasional Contributor
Posts: 8

Re: sas code help

may or may not be. most probably will be sequential. go ahead and assume it sequential.

Super User
Posts: 17,829

Re: sas code help

1. Subtract the numbers.

2. Keep only those that are between -2 and 3, not including 0

3. Sum

Occasional Contributor
Posts: 8

Re: sas code help

could you help with the code?

how to take sum and put the final data as the output that I have shown?

Super User
Posts: 17,829

Re: sas code help

No, because the ASAP and no sql makes me think assignment or test.

Occasional Contributor
Posts: 8

Re: sas code help

no reeza. actually i have just joined a data analytics firm and getting myself familiar with sas coding and they have specific rule of not using proc sql as it can only be used to pull data from the datawarehouse. No assignment or test. LOL

For the sum, I can run proc summary with class as ID, var_1 and difference variable. but that is giving me the sum for all values of var_1. but I want sum for only first value of var_1 and I am not sure how to delete the rows with second var_1 values.(proc sort with nodupkey isn't working). That's why I asked for the help. this is a client deliverable that's why I wrote asap. Please help me with the code of sum part as in the output format but only for one combination of ID*var_1 for all the IDs.

output:

ID     variable_1     sum_figure_up         sum_figure_down

A     3456               2800                         5420

B     3462              some value              some value


like this.

Super User
Posts: 17,829

Re: sas code help

Perhaps post what you've tried.

Occasional Contributor
Posts: 8

Re: sas code help

this is what i have tried....

data temp;

infile DATALINES dsd missover;

input ID line bill bal;

CARDS;

788,3456, 3452,200                     

788,         3456,                    3453,                   300                     

788,          3456,                    3454,                    400                    

788,          3456,                    3455,                   2400

788,          3456,                    3456,                    4200

788,          3456,                    3457,                    2020                    

788,          3456,                    3458,                   2200                     

788,          3456,                    3459,                    1200                      

788,          3456,                    3460,                    20

788,          3458,                    3454,                    2700

788,          3458,                    3455,                    2200                     

788,          3458,                    3456,                    7200

788,      3458,                    3457,                    3200

788,      3458,                    3458,                    2300

788,       3458,                    3459,                    4200

788,        3458,                    3460,                    200

788,          3458,                    3461,                    2300

788,          3458,                    3462,                   5200

;

run;

data temp2;

set temp;

diff = bill-line;

run;

data temp3;

set temp2;

if diff in (-2,-1) then ind = 1;

if diff in (1,2,3) then ind = 2;

run;

proc summary data= temp3 nway missing;

class ID line ind;

var bal;

output out= temp4 sum=;

run;

data temp5(keep= ID line bal);

set temp4;

where ind in (1,2);

run;

Now the problem is that I am getting the second sum(when the diff is 1 2 3) in the same column, not in a different column. How to get that?

Also if there is a better way to do this, please tell.

current output:

IDlinebal
78834562800
78834565420
788345810400
78834586700
Super User
Posts: 17,829

Re: sas code help

data want;

  set temp;

  by id line;

  retain ind1 ind2;

  diff=bill-line;

  if diff in (-1, -2) then ind1=sum(bal, ind1);

  if diff in (1,2,3) then ind2=sum(bal, ind2);

  if last.line then output;

  drop diff;

run;


Occasional Contributor
Posts: 8

Re: sas code help

could you please explain how this will work....and also it is giving wrong output for second value of line for the same ID....

Solution
‎02-24-2014 12:55 PM
Super User
Posts: 17,829

Re: sas code help

data want;

  set temp;

  by id line;

  retain ind1 ind2; *keep ind1/ind2 values between rows;

  diff=bill-line; *Calculate the difference;

  if first.line then do; *If this is the first line then set summations to 0;

       ind1=0; ind2=0;

   end;

  if diff in (-1, -2) then ind1=sum(bal, ind1); *Use a rolling sum to capture the sums;

  if diff in (1,2,3) then ind2=sum(bal, ind2);

  if last.line then output; *only output on last line;

  drop diff;

run;


Occasional Contributor
Posts: 8

Re: sas code help

many thanks to all....but i did it using various data and proc steps....actually I have just started using sas and couldn't do the entire process in one step...but the methods suggested are really good...thanks again.

Valued Guide
Posts: 3,208

Re: sas code help

What kind of data analytics firm is thinking "proc sql" with SAS is just for extracting data out of a RDBMS?

You are joking?

SAS tables are almost as good relational as wiht a DBMS. That is why proc sql can run on SAS tables and RDBMS as well
Never mind the analytics fucntions a lot of RDBMS has build in their native SQL is something SAS Always had although sometimes in other ways.
As being not standardized for the more specialized functions it is difficult to approach them in a standardized way

When processing a SAS data-step it is a sequential approach.

You can do some evaluations on prior records when using retain or lag functions. Retain is easier to understand

The first.<var> and last.<var> are supporting grouping on  ordered data. By that you have your breakpoints for the calculations.    

---->-- ja karman --<-----
Valued Guide
Posts: 3,208

Re: sas code help

extreme simplified coding......

data want (drop=idcnt ind1 ind2 );
set temp;
by id line;

retain ind1 ind2; /* two new variables holding previous record values before cond */
    retain idcnt  ;  /* new variable holding counter after cond */  
    retain balpre balaft ; /* new variables holding accumulation, before output  */

if first.line then do;
     ind1=.;ind2=.;indcnt=.; /* set counters for grouped condition */
    end;  

if ( bill-line = 0) then do; /* condition to check */
  balpre=sum(ind1, ind2);
  idcnt=0; /* start summing */
end;

    ind2=ind1; ind1=bal; /* simulate lag */
    if (1 <= idcnt <= 3) then balaft=sum(balaft,bal) ;

if (last.line & idcnt < 3)   or idcnt=3 then do;
      output;
end;
idcnt=idcnt+1 ;
run;

Result

id    lined bill bal balpre balaft

788 3456 3459 1200 2800 5420 .

788 3458 3461 2300 10400 12120 .

It is simplfied to understand.

Default settings, Array-s and lag fucntionality and more to be added for the next coders levels.


---->-- ja karman --<-----
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 15 replies
  • 502 views
  • 0 likes
  • 4 in conversation