Hello
I have the following challenge.
Let's say that there is an input data set .
I want to check which numbers in last column are new in comparison to previous column(one column before last column).
I want to check which numbers in one column before last column are not existing in last column.
I think that in the program we need to check how many columns are in in input data set and then we can compare last column with one column before last.
For example:
In the following example if we compare last column with one column before last:
9,10,11 are new members
8,9 left
I want to create a new outcome data set with two columns: new and left
data tbl;
input x1801 x1802 x1803 x1804 x1805 x1806;
cards;
1 1 6 1 1 2
2 2 1 2 2 3
3 3 2 3 3 4
4 4 3 4 5 5
5 5 4 8 8 8
5 9 9 10
;
run;
If you compare the last column with the next-to-last column, you compare x1806 to x1805. With that, you'd get this output:
2 3 4 10
I think you meant to compare the 6th row with the contents of the 5th?
Hello
In each column I have list of customers which belong to club.
The target is to identify new customer and customers who left .
The comparison is just between last column and one column before last ( next-to-last column).
The issue is that the input table including columns from start of year (JAN) until lat month.
So in June the last column is 1806 and in July last column is 1807 and so on.
The first question is to perform analysis(Merge) of last column with next-to-last column.
In the merge I don't want to say to merge X1806 with X1805....
I want to say to merge last column with next-to-last column
thanks
@Ronein wrote:
Hello
In each column I have list of customers which belong to club.
The target is to identify new customer and customers who left .
The comparison is just between last column and one column before last ( next-to-last column).
The issue is that the input table including columns from start of year (JAN) until lat month.
So in June the last column is 1806 and in July last column is 1807 and so on.
The first question is to perform analysis(Merge) of last column with next-to-last column.In the merge I don't want to say to merge X1806 with X1805....
I want to say to merge last column with next-to-last column
thanks
Define an array (say var), and compare var{dim(var)} with var{dim(var)-1}.
@Ronein wrote:
Hello
In each column I have list of customers which belong to club.
The target is to identify new customer and customers who left .
The comparison is just between last column and one column before last ( next-to-last column).
The issue is that the input table including columns from start of year (JAN) until lat month.
So in June the last column is 1806 and in July last column is 1807 and so on.
The first question is to perform analysis(Merge) of last column with next-to-last column.In the merge I don't want to say to merge X1806 with X1805....
I want to say to merge last column with next-to-last column
thanks
Define an array (say var), and compare var{dim(var)} with var{dim(var)-1}.
You also suffer (once again) from a dumb data model that keeps data (months) in structure (variable names).
I'm quite sure that an intelligent model will make a solution (once we know what you really want) MUCH easier.
Try next code:
%let vars = x1801 x1802 x1803 x1804 x1805 x1806;
data tbl;
retain maxv; drop maxv;
maxv = countw("&vars");
input &vars;
array vx &vars;
if vx(maxv) = vx(maxv -1) then ... ;
else ..... ;
cards;
1 1 6 1 1 2
2 2 1 2 2 3
3 3 2 3 3 4
4 4 3 4 5 5
5 5 4 8 8 8
5 9 9 10
;
run;
It is nice but in your code the user need to enter manually list of columns(% let ).
The task is to do it automatically (without user definition of the columns names)
It is easy to get list of available variables in a data set, by:
proc sql;
select name into : vars separated by ' '
from dictionary.columns
where libname = "WORK' and memname = 'TBL' ;
quit;
%put &vars;
then use it in my original code.
Can you show what you want out at the end? Not following your logic as:
"9,10,11 are new members" - 11 is not in the data you present.
"8,9 left" - what does this mean?
Generally you can assign an array, and then use the elements:
data tbl;
input x1801 x1802 x1803 x1804 x1805 x1806;
array x{6};
if x{5} ne x{6} then not_equal="Y";
cards;
1 1 6 1 1 2
2 2 1 2 2 3
3 3 2 3 3 4
4 4 3 4 5 5
5 5 4 8 8 8
5 9 9 10
;
run;
Shows comparing last to position 5.
In last column we have :1,2,3,5,9,10,11
In one column before last column we have:1,2,3,5,8,9
Common members are: 1,2,3,5,9
New members(Exist in last but not in column before); 10,11
Left members(Exist in column before but not in last column); 8
There is No 11 in the test data you presented.
Sorry
Common memebers: 2,3,5,8,
New memebers: 4,10
Left members: 1,9
@Ronein wrote:
Hello
I have the following challenge.
Let's say that there is an input data set .
I want to check which numbers in last column are new in comparison to previous column(one column before last column).
I want to check which numbers in one column before last column are not existing in last column.
I think that in the program we need to check how many columns are in in input data set and then we can compare last column with one column before last.
For example:
In the following example if we compare last column with one column before last:
9,10,11 are new members
8,9 left
I want to create a new outcome data set with two columns: new and left
data tbl; input x1801 x1802 x1803 x1804 x1805 x1806; cards; 1 1 6 1 1 2 2 2 1 2 2 3 3 3 2 3 3 4 4 4 3 4 5 5 5 5 4 8 8 8 5 9 9 10 ; run;
You should make it a habit to check-read your posts. NO 11. Nowhere.
As @Kurt_Bremser remarked, your initial data model is not very good. But I know how it is: somebody gave you this Excel file for whatever...
Given this data
data tbl;
input x1801 x1802 x1803 x1804 x1805 x1806;
cards;
1 1 6 1 1 2
2 2 1 2 2 3
3 3 2 3 3 4
4 4 3 4 5 5
5 5 4 8 8 8
. . 5 9 9 10
;
run;
I would start by changing it to a long format:
data long;
set tbl;
array months x:;
length month $4;
if _N_=1 then
call symputx('last_month',substr(vname(months(dim(months))),2));
do _N_=1 to dim(months);
if months(_N_)=. then
continue;
month=substr(vname(months(_N_)),2);
member_id=months(_N_);
output;
end;
keep month member_id;
run;
I understand that you do not want to have to enter all the variable names every time, as the number of months may change. Hence the "x:" construct. To see if a member left the club, we have to see if they were still there the last month. So I used the SYMPUTX to put the last month into a macro variable.
Then, I would sort the data, and finally run through it to see when the individual members came and left:
proc sort;
by member_id month;
run;
data want;
do until(last.member_id);
set long;
by member_id;
if first.member_id then
new=month;
end;
if month<"&last_month" then
left=month;
drop month;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.