Re: compare last column with one column before last

Ronein · Posted 08-13-2018 03:52 AM

Hello

I have the following challenge.

Let's say that there is an input data set .

I want to check which numbers in last column are new in comparison to previous column(one column before last column).

I want to check which numbers in one column before last column are not existing in last column.

I think that in the program we need to check how many columns are in in input data set and then we can compare last column with one column before last.

For example:

In the following example if we compare last column with one column before last:

9,10,11 are new members

8,9 left

I want to create a new outcome data set with two columns: new and left

data tbl;
input x1801 x1802 x1803 x1804 x1805 x1806;
cards;
1 1 6 1 1 2
2 2 1 2 2 3
3 3 2 3 3 4
4 4 3 4 5 5
5 5 4 8 8 8
    5 9 9 10
;
run;

Kurt_Bremser · Posted 08-13-2018 04:08 AM

If you compare the last column with the next-to-last column, you compare x1806 to x1805. With that, you'd get this output:

I think you meant to compare the 6th row with the contents of the 5th?

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Ronein · Posted 08-13-2018 04:14 AM

Hello

In each column I have list of customers which belong to club.

The target is to identify new customer and customers who left .

The comparison is just between last column and one column before last ( next-to-last column).

The issue is that the input table including columns from start of year (JAN) until lat month.

So in June the last column is 1806 and in July last column is 1807 and so on.
The first question is to perform analysis(Merge) of last column with next-to-last column.

In the merge I don't want to say to merge X1806 with X1805....

I want to say to merge last column with next-to-last column

thanks

Kurt_Bremser · Posted 08-13-2018 04:21 AM

@Ronein wrote:

Hello

In each column I have list of customers which belong to club.

The target is to identify new customer and customers who left .

The comparison is just between last column and one column before last ( next-to-last column).

The issue is that the input table including columns from start of year (JAN) until lat month.

So in June the last column is 1806 and in July last column is 1807 and so on.
The first question is to perform analysis(Merge) of last column with next-to-last column.

In the merge I don't want to say to merge X1806 with X1805....

I want to say to merge last column with next-to-last column

thanks

Define an array (say var), and compare var{dim(var)} with var{dim(var)-1}.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Kurt_Bremser · Posted 08-13-2018 04:21 AM

@Ronein wrote:

Hello

In each column I have list of customers which belong to club.

The target is to identify new customer and customers who left .

The comparison is just between last column and one column before last ( next-to-last column).

The issue is that the input table including columns from start of year (JAN) until lat month.

So in June the last column is 1806 and in July last column is 1807 and so on.
The first question is to perform analysis(Merge) of last column with next-to-last column.

In the merge I don't want to say to merge X1806 with X1805....

I want to say to merge last column with next-to-last column

thanks

Define an array (say var), and compare var{dim(var)} with var{dim(var)-1}.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Kurt_Bremser · Posted 08-13-2018 04:31 AM

You also suffer (once again) from a dumb data model that keeps data (months) in structure (variable names).

I'm quite sure that an intelligent model will make a solution (once we know what you really want) MUCH easier.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Shmuel · Posted 08-13-2018 04:14 AM

Try next code:

%let vars = x1801 x1802 x1803 x1804 x1805 x1806;
data tbl;
    retain maxv; drop maxv;
    maxv = countw("&vars");

    input &vars;

    array vx &vars;
    if vx(maxv) = vx(maxv -1) then ... ;
    else ..... ;

cards;
1 1 6 1 1 2
2 2 1 2 2 3
3 3 2 3 3 4
4 4 3 4 5 5
5 5 4 8 8 8
    5 9 9 10
;
run;

Ronein · Posted 08-13-2018 04:28 AM

It is nice but in your code the user need to enter manually list of columns(% let ).

The task is to do it automatically (without user definition of the columns names)

Shmuel · Posted 08-13-2018 06:14 AM

It is easy to get list of available variables in a data set, by:

proc sql;
   select name into : vars separated by ' '
   from dictionary.columns
   where libname = "WORK' and memname = 'TBL' ;
quit;
%put &vars;

then use it in my original code.

RW9 · Posted 08-13-2018 04:18 AM

Can you show what you want out at the end? Not following your logic as:

"9,10,11 are new members" - 11 is not in the data you present.

"8,9 left" - what does this mean?

Generally you can assign an array, and then use the elements:

data tbl;
  input x1801 x1802 x1803 x1804 x1805 x1806;
  array x{6};
  if x{5} ne x{6} then not_equal="Y";
cards;
1 1 6 1 1 2
2 2 1 2 2 3
3 3 2 3 3 4
4 4 3 4 5 5
5 5 4 8 8 8
    5 9 9 10
;
run;

Shows comparing last to position 5.

Ronein · Posted 08-13-2018 04:53 AM

In last column we have :1,2,3,5,9,10,11

In one column before last column we have:1,2,3,5,8,9

Common members are: 1,2,3,5,9

New members(Exist in last but not in column before); 10,11

Left members(Exist in column before but not in last column); 8

RW9 · Posted 08-13-2018 05:20 AM

There is No 11 in the test data you presented.

Ronein · Posted 08-14-2018 02:10 AM

Sorry

Common memebers: 2,3,5,8,

New memebers: 4,10

Left members: 1,9

Kurt_Bremser · Posted 08-13-2018 06:13 AM

@Ronein wrote:

Hello

I have the following challenge.

Let's say that there is an input data set .

I want to check which numbers in last column are new in comparison to previous column(one column before last column).

I want to check which numbers in one column before last column are not existing in last column.

I think that in the program we need to check how many columns are in in input data set and then we can compare last column with one column before last.

For example:

In the following example if we compare last column with one column before last:

9,10,11 are new members

8,9 left

I want to create a new outcome data set with two columns: new and left
data tbl;
input x1801 x1802 x1803 x1804 x1805 x1806;
cards;
1 1 6 1 1 2
2 2 1 2 2 3
3 3 2 3 3 4
4 4 3 4 5 5
5 5 4 8 8 8
    5 9 9 10
;
run;

You should make it a habit to check-read your posts. NO 11. Nowhere.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

s_lassen · Posted 08-13-2018 08:59 AM

As @Kurt_Bremser remarked, your initial data model is not very good. But I know how it is: somebody gave you this Excel file for whatever...

Given this data

data tbl;
input x1801 x1802 x1803 x1804 x1805 x1806;
cards;
1 1 6 1 1 2
2 2 1 2 2 3
3 3 2 3 3 4
4 4 3 4 5 5
5 5 4 8 8 8
. . 5 9 9 10
;
run;

I would start by changing it to a long format:

data long;
  set tbl;
  array months x:;
  length month $4;
  if _N_=1 then 
    call symputx('last_month',substr(vname(months(dim(months))),2));
  do _N_=1 to dim(months);
    if months(_N_)=. then 
      continue;
    month=substr(vname(months(_N_)),2);
    member_id=months(_N_);
    output;
    end;
  keep month member_id;
run;

I understand that you do not want to have to enter all the variable names every time, as the number of months may change. Hence the "x:" construct. To see if a member left the club, we have to see if they were still there the last month. So I used the SYMPUTX to put the last month into a macro variable.

Then, I would sort the data, and finally run through it to see when the individual members came and left:

proc sort;
  by member_id month;
run;

data want;
  do until(last.member_id);
    set long;
	by member_id;
	if first.member_id then
	  new=month;
	end;
  if month<"&last_month" then
    left=month;
  drop month;
run;

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away