Hi,
My task is to identify clients who spread the payment over time. If the event occurs, the value is 1, otherwise 0. I'm working in SAS guide so I prefer solution in query builder. I got example data table:
Customer MAY JUNE JULY Target
A 100 0 200 1
B 300 0 0 0
How can I achieve this?
simple task for array,
data have;
input Customer $ MAY JUNE JULY;
cards;
A 100 0 200
B 300 0 0
;
run;
data want;
set have;
array months MAY JUNE JULY;
target =0;
do over months;
target + (months > 0);
end;
target = (target > 1);
run;
but I would suggest to keep data in "long" format:
data have_long;
input Customer $ months $ amount;
cards;
A MAY 100
A JUNE 0
A JULY 200
B MAY 300
B JUNE 0
B JULY 0
;
run;
proc sql;
select Customer,
case when sum(amount > 0) > 1 then 1
else 0
enD as target
from
have_long
group by
Customer
;
quit;
All the best
Bart
simple task for array,
data have;
input Customer $ MAY JUNE JULY;
cards;
A 100 0 200
B 300 0 0
;
run;
data want;
set have;
array months MAY JUNE JULY;
target =0;
do over months;
target + (months > 0);
end;
target = (target > 1);
run;
but I would suggest to keep data in "long" format:
data have_long;
input Customer $ months $ amount;
cards;
A MAY 100
A JUNE 0
A JULY 200
B MAY 300
B JUNE 0
B JULY 0
;
run;
proc sql;
select Customer,
case when sum(amount > 0) > 1 then 1
else 0
enD as target
from
have_long
group by
Customer
;
quit;
All the best
Bart
If you want to do this in the query builder, you have to transpose first.
Otherwise you need a manually written data step with array processing, as shown by @yabwon
Hi @PatrykSAS,
Would a single numeric expression be more amenable to the Query Builder? (I don't know as I don't use EG.) If so, you could try something like this:
target=largest(2, of may--july)>0;
Note that the inequality is equivalent to "There are at least two positive numbers among the values of the variables in the name range list may--july in the current observation."
The query builder uses SQL, which does not support the use of OF in a function, so you would still need to write all columns separately.
Using the "point and click" facilities takes several steps, but each one is fairly easy.
1. First, sort by Customer (needed for the transpose).
2. Then run a Data -> Transpose that has Customer as the "Group Analysis By" variable, and all of the other variables as the transpose variables. The result should look like this:
Customer | NAME OF FORMER VARIABLE | Column1 |
A | MAY | 100 |
A | JUNE | 0 |
A | JULY | 200 |
B | MAY | 300 |
B | JUNE | 0 |
B | JULY | 0 |
3. Now, run a query to select only the rows where Column1 is greater than zero (get rid of your zero values).
4. Finally, you can either use a query that counts the number of records for each customer, or use Describe -> Summary Statistics to find the customers that have count 1, or greater than 1.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.