BookmarkSubscribeRSS Feed
abhee
Calcite | Level 5

I have a query regarding SAS array.

I want to compare column values, the condition is as mentioned below:

1) if col1 = col2 then flag = 1

2) OR if col1 = col4 then flag = 1

3) OR if col4 = col22 then flag =1

array should check each condition one by on i.e., first col1 to col2, then col1 to col3 and so on....

If any of the above condition matches, pointer should move to next position

e.g.,

If 1) then next check should be col2 = col3...and so on...

If 2) then next check should be col4 = col5...and so on...

If 3) then next check should be col22 = col23...and so on...

Please find attached the sample example.

Thanks,

Abhee

6 REPLIES 6
DR_Majeti
Quartz | Level 8

Hi Abhee,

I need a clear information about logic , should we have to get flagging for every comparison of columns , OR

for every col1=col2 flag=1 , col1=col3 flag=1...

For every if col1=col2 , col1=col3 .................. col1=col50 then flag=1

If i am wrong please can you show it on a sample output  with five columns

abhee
Calcite | Level 5

Hi DR.Majeti,

Yes, we have to get flagging for every comparison of columns whenever it get two column with same value, we should create a flag for that.

e.g., suppose you have col1 = apple and col2 = apple then flag1 = 1; if col1 = apple and col2= mango then flag1 = 0. Similarly, if col1 = apple and col3 = apple then flag1 = 0, flag2 = 0, flag3 = 1 and so on....

Once we get col1 = apple and col2 = apple, we should compare col2 with other columns.

e.g., col2 with col3, col 2 with col4, col2 with col5 and so on.....and whenever we get match....create a flag for that.

Please see attached .text file in my query. There are columns d1, d2, d3......so on.....

Let me know if you need more information.

Thanks,

Abhee

Vince28_Statcan
Quartz | Level 8

Assume you read your text file in a dataset called HAVE

data want;

     set have;

     array dt {*} dt: ;

     flag=0;

     do i=2 to (dim(dt)-2) by 2 until (flag=1);

          if dt{i} = dt{i+1} then flag=1;

     end;

run;

Syntax not tested but logic is there. The until loop control is just saving some loop iteration when one condition is true early.

As for quick explanation: using {*} and dt: in the array statement is where its all at. It uses array indexing to "point" to a given variable in the data vector allowing you to easily do conditions or arithmetics accross variables from a relative-index standpoint.

Hope I understood your question properly

Vince

abhee
Calcite | Level 5

Hi Vince,


Thanks for the reply.

The scenario is as below:

we have to get flagging for every comparison of columns whenever it get two column with same value, we should create a flag for that and this is matching of columns not in continuous manner. There may be situation like the value of col1 matches with col15....

e.g., suppose you have col1 = apple and col2 = apple then flag1 = 1; if col1 = apple and col2= mango then flag1 = 0. Similarly, if col1 = apple and col3 = apple then flag1 = 0, flag2 = 0, flag3 = 1 and so on....

Once we get col1 = apple and col2 = apple, we should compare col2 with other columns.

e.g., col2 with col3, col 2 with col4, col2 with col5 and so on.....and whenever we get match....create a flag for that.

Fugue
Quartz | Level 8

Just to confirm:

You want to compare every column with every adjacent column and create a separate flag variable for each pairwise comparison?

Then, you want to compare every other column with the first matching column (of each pairwise comparison) if the values of two adjacent columns match? Presumably, you will want a separate flag variable for each of these subsequent pairwise comparisons?

Vince28_Statcan
Quartz | Level 8

I am really not clear with what all flag# refer to.

Similarly, if col1 = apple and col3 = apple then flag1 = 0, flag2 = 0, flag3 = 1

Shouldn't the above that you've given as an example have flag1=1, flag2=0, flag3=1 ?

How do you go about if col1=col4 and col2=col3? Should you only have flag4=1 and all other flags=0 because you mentioned the pointer should've been moved to 4 from there?

Taking my example up there again and adding a few things - note that since flags don't already exist and you can use relative dimensions to create arrays, you would need to use like a proc sql on dictionary.columns to figure get the flag count to be created in a macro variable first. For simplicity at this point, I'll use a %let flagdim

%let flagdim=50;

data want;

     set have;

     array dt {*} dt: ;

     array flag {*} flag1-flag&flagdim. (&flagdim.*0); /* initiate all flags to 0 instead of missing */

     flag=0;

     do i=1 to (dim(dt)-1);

               if dt(i) = dt(i+1) then do;

                    flag(i) = 1;

                    i+1;

               end; /* give flag=1 to current dt if the immediate successor is equal else it should be given to the other dates' flag??? */

               else do j=i+1 to (dim(dt)) until (flag(j)=1);

                    if dim(i) = dim(j) then do;

                         flag(j) = 1;

                         i=j+1;

                    end;

               end;

     end;

run;

I still really am not clear on how you are defining your flags. The example I copied above where you say if dt(1)=dt(3) only flag(3) should be equal to 1 is really not a usual approach to flagging. Shouldn't they both be flagged? Why is the later dt flagged when its not the immediate successor of the former dt() but the former dt() is flagged when its immediate successor is equal?

What are you looking to achieve with your flags afterwards? Maybe their definition isn't completely clear and we could help with that as well.

Looking back at your example again

if col1 = apple and col3 = apple then flag1 = 0, flag2 = 0, flag3 = 1

If col4 is not equal to apple, will you then replace flag3 with a 0?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 803 views
  • 0 likes
  • 4 in conversation