Re: I need help writing a macro that loops through a column of data

RikeshPunja · Posted 09-24-2020 07:21 AM

To help let me explain the data that I have and what I want to be done I have attached the below. I am sure it is simple.

I have an SQL table that is very very large. It looks like the below but much longer:

I just want a table that has a single line for each unique ID when the first time the amount changes above 0.

%macro test;
  
proc sql NOPRINT;
 connect to odbc (***************);
 create table work.TEST as 
 select * from connection to odbc 
	(Select ID, ID_2, Amount, Date  from Database.Source
	 where Amount > 0
	 and ID = 0001
	 order by ID, ID_2, Date asc
	 );
QUIT;

PROC SORT DATA = TEST;
	by ID ID_2 Date;
RUN;

DATA First_Amount;
SET TEST;
by ID ID_2 Date;
if first.ID and first.Date;
RUN;

%mend test;

I know I can create a table with the unique ID's but I am unsure how to write the code to pass through each ID in the macro above.

If anyone knows an easy solution to my output I'd really appreciate it.

PaigeMiller · Posted 09-24-2020 07:28 AM

I think using macros here is unnecessary, and complicates the code unnecessarily.

You can do this with PROC SORT and a DATA step. This will be much simpler.

/* UNTESTED CODE */
proc sort data=have out=have1;
    by id date;
run;
data want;
    set have1;
    by id date;
    prev_amount=lag(amount);
    if not first.id and prev_amount=0 and amount>0 then output;
    drop prev_amount;
run;

--
Paige Miller

RikeshPunja · Posted 09-24-2020 12:21 PM

Hi Paige Miller,

Thank you for your response. I tried as above but it didn't quite work. I received an empty table.

If you have any further ideas it would be greatly appreciated.

Thank you

PaigeMiller · Posted 09-24-2020 12:27 PM

@RikeshPunja wrote:

Thank you for your response. I tried as above but it didn't quite work. I received an empty table.

For your future reference, saying "it didn't work" never moves the conversation forward, it doesn't help, we have no idea what happened. We need to see the code you tried and the results (LOG or OUTPUT) where we can see that it didn't work and where we can see WHY it didn't work. Furthermore, we would need an example of a portion of your data as a SAS data step (and not in any other format and not as a screen capture).

--
Paige Miller

RichardDeVen · Posted 09-24-2020 01:24 PM

You can submit a pass-through query that does everything you want. Should be quite fast if SQL Server table is enormous because no detail records are passed into SAS for processing, only the final result set.

Example:

A deep nested query joins of rows of zero amounts to rows of nonzero amount. That join result is measured groupwise with row_number () over each id, and that finally selected for the first row of the group which is the row of non-zero amount first transitioning from a row of a zero amount.

SQL Passthrough

CREATE TABLE have
    (id integer, id2 integer, amount real, [date] date)
;

INSERT INTO have
VALUES
  (1, 10,  0, '2020-08-20')
, (2, 10,  0, '2020-08-20')
, (3, 10,  0, '2020-08-20')
, (4, 10,  0, '2020-08-20')
, (5, 10, 50, '2020-08-20')
, (1, 10, 50, '2020-08-21')
, (2, 10, 50, '2020-08-21')
, (3, 10, 50, '2020-08-21')
, (4, 10,  0, '2020-08-21')
, (1, 10, 50, '2020-08-22')
, (2, 10, 50, '2020-08-22')
, (3, 10, 50, '2020-08-22')
, (4, 10, 50, '2020-08-22')
;

Query

select * from 
( select
    seek.* ,
    row_number() over (
      partition by id
      order by [date]
    ) [seqnum]
  from
  (
    select 
      [nzeros].*
    from
      ( select id, [date] from have where amount = 0 ) [zeros]
    right join
      ( select id, [date] from have where amount > 0 ) [nzeros]
    on
      [zeros].id = [nzeros].id
      and [zeros].date < [nzeros].date
  ) [seek]
) [groups]
where 
  [groups].seqnum = 1
;

I need help writing a macro that loops through a column of data