BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nanmz
Fluorite | Level 6

Hi everyone,

 

I am still pretty much new to SAS and would appreciate any sort of guidance/pointers from the experts here.

 

I have a dataset that looks like this:

IDA1B1C1
a0.1230promo1
b0.1230promo1
b0.1530promo1
b0.1230promo1
b0.1840promo2
c0.1230promo1
c0.1840promo2
c0.16540promo2
c0.1840

promo2

 

My main aim is to flag (0 or 1) for rows where if their IDs are the same and having the same values in column C1, same values in column B1 BUT having different values in column A1, to be flagged as 1. Thus, the output that I'm looking for would look something like this:

 

IDA1B1C1Flag
a0.1230promo10
b0.1230promo11
b0.1530promo11
b0.1230promo11
b0.1840promo20
c0.1230promo10
c0.1840promo21
c0.16540promo21
c0.1840

promo2

1

 

The "incorrect" values/rows to be flagged are of those highlighted in red.  I initially wanted to transpose the column from long to wide and have the table to be unique based on their ID. However the duplicate IDs have up to 60+ rows hence I feel that it is not as efficient having hundreds of columns that I might not even use. Is there any other alternative in approaching this? Would appreciate any form of input.  Thanks in advance.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @nanmz,

 

Try this to obtain your desired result:

/* Create sample data */

data have;
input ID $ A1 B1 C1 $;
cards;
a 0.12 30 promo1
b 0.12 30 promo1
b 0.15 30 promo1
b 0.12 30 promo1
b 0.18 40 promo2
c 0.12 30 promo1
c 0.18 40 promo2
c 0.165 40 promo2
c 0.18 40 promo2
;

/* Create a unique sort key */

data _tmp / view=_tmp;
set have;
_seqno=_n_;
run;

/* Flag groups of observations with inconsistent A1 values */

proc sql;
create table want(drop=_seqno) as
select *, count(distinct A1)>1 as Flag
from _tmp
group by ID, B1, C1
order by _seqno;
drop view _tmp;
quit;

View solution in original post

6 REPLIES 6
AMSAS
SAS Super FREQ

Hi,

This can be done with FIRST. and LAST. DATA Step Variables

 

Although I'm not clear on what you are attempting to achieve. I don't understand why flag=1 in rows 2 & 7

ID A1 B1 C1 Flag
a 0.12 30 promo1 0
b 0.12 30 promo1 1
b 0.15 30 promo1 1
b 0.12 30 promo1 1
b 0.18 40 promo2 0
c 0.12 30 promo1 0
c 0.18 40 promo2 1
c 0.165 40 promo2 1
c 0.18 40

promo2

1

nanmz
Fluorite | Level 6

Hi,

 

Thanks for your reply. I wanted to flag rows 2 and 7 as well because they belong in the same group (i.e. rows 2,3, and 4 are in the same group, rows 7, 8 and 9 are in the same group as well). My goal is to find inaccuracies in column A1 in terms of their value as I don't have an external lookup table that would be able to map column A1, B1 and C1 respectively.

Shmuel
Garnet | Level 18

You input test tables are each 9 observations.

By comparing each possible pair of the two tables observations the output contains more observations then you posted as wanted:

data table1;
  infile cards dlm='0920'x truncover;
  input ID $ A1 B1 c1 $;
cards;
a	0.12	30	promo1
b	0.12	30	promo1
b	0.15	30	promo1
b	0.12	30	promo1
b	0.18	40	promo2
c	0.12	30	promo1
c	0.18	40	promo2
c	0.165	40	promo2
c	0.18	40	promo2
;run;
data table2;
  infile cards dlm='0920'x truncover;
  input ID $ A1 B1 c1 $;
cards;
a	0.12	30	promo1	
b	0.12	30	promo1	
b	0.15	30	promo1	
b	0.12	30	promo1	
b	0.18	40	promo2	
c	0.12	30	promo1	
c	0.18	40	promo2	
c	0.165	40	promo2	
c	0.18	40	promo2  
;run;
proc sql;
  create table want as
  select a.*,
  case when (a.a1 ne b.a1) 
      then 1 else 0 end as flag 
  from table1 as a
  left join table2 as b
  on (a.ID=b.ID and a.c1=b.c1 and a.b1=b.b1);
quit;

AS for the first compare method mentioned:

 

 

 

 

FreelanceReinh
Jade | Level 19

Hi @nanmz,

 

Try this to obtain your desired result:

/* Create sample data */

data have;
input ID $ A1 B1 C1 $;
cards;
a 0.12 30 promo1
b 0.12 30 promo1
b 0.15 30 promo1
b 0.12 30 promo1
b 0.18 40 promo2
c 0.12 30 promo1
c 0.18 40 promo2
c 0.165 40 promo2
c 0.18 40 promo2
;

/* Create a unique sort key */

data _tmp / view=_tmp;
set have;
_seqno=_n_;
run;

/* Flag groups of observations with inconsistent A1 values */

proc sql;
create table want(drop=_seqno) as
select *, count(distinct A1)>1 as Flag
from _tmp
group by ID, B1, C1
order by _seqno;
drop view _tmp;
quit;
nanmz
Fluorite | Level 6

Thanks for your reply @FreelanceReinh ! this worked out great for me. Just a follow up question that I'm still trying to wrap my head around:

 

1) how does creating a unique sort key (_seqno) prior to the proc sql help me achieve the desired output?

 

Thanks again!

 

 

FreelanceReinh
Jade | Level 19

@nanmz wrote:

Thanks for your reply @FreelanceReinh ! this worked out great for me. Just a follow up question that I'm still trying to wrap my head around:

 

1) how does creating a unique sort key (_seqno) prior to the proc sql help me achieve the desired output?


You're welcome. I introduced the sort key just to ensure that the sort order of WANT is the same as that of HAVE. If there's another suitable sort key in your real data, you can use it instead of _seqno (and omit the intermediate DATA step). Also, if you don't care about the sort order of WANT, you can omit everything related to _seqno as well and thus simplify the code and reduce run time.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1691 views
  • 3 likes
  • 4 in conversation