BookmarkSubscribeRSS Feed
Smitha9
Fluorite | Level 6

Hi,

I have a dataset of 3000 ID's. I need to flag specific 200ID's as "Not sure". Since it is huge number of ID's, I am not sure how to flag the specific 200 ID's. this should be done everyweek.

 

 

 

5 REPLIES 5
ballardw
Super User

What are the rules involved with setting the flags?

Where do the "specific 200" id values come from?

Are the ID values duplicated in the data?

What would the flag values look like?

Smitha9
Fluorite | Level 6

Data set A

ID

1

2

3

4

5

 

dataset B

ID

2

3

4

 

I want to flag the dataset B ID's in Dataset A

 

Output dataset A:

ID    Flag

1      

2       1

3       1

4       1

5

 

I have the dataset which has around 2000ID's and to be flagged ID's are around 200.

ballardw
Super User

One way:

Data setA;
  input ID;
datalines;
1
2
3
4
5
;
 
data setB;
  input ID;
datalines;
2
3
4
;

proc sort seta;
  by id;
run;
proc sort setb;
  by id;
run;

data want;
  merge seta setb (in=inb);
  by id;
  if inb then flag=1;
run;

First please note the DATA step as a way to provide example data.

The sets are sorted to allow use of the BY Id in the Want data step.

The data set option In= creates a temporary variable (not added to the set) that indicates whether a particular data set contributes to the current observation and is valued 1/0 for yes/no or true/false if your prefer. Since the concern is that Setb contribute the match then that is on the set that needs in. We can test the value of the value of the In variable to do things conditionally such as set this flag.

If you had wanted a 1/0 coding flag value on each observation you would change the code to

data want;
  merge seta setb (in=inb);
  by id;
  flag=inb;
run;

Note: IF the ID value is duplicated in both sets this will not work as desired and there will be a warning in the Log.

Tom
Super User Tom
Super User

This is a simple MERGE operation.

So if you have two datasets like this:

data A;
  input ID ;
cards;
1
2
3
4
5
;

data B;
  input ID;
cards;
2
3
4
;

You can easily run a data step to make a NEW dataset that includes your flag variable.

data want;
  merge a(in=in1) b(in=in2);
  by id;
  if in1;
  flag=in2;
run;

Results:

Obs    ID    flag

 1      1      0
 2      2      1
 3      3      1
 4      4      1
 5      5      0

 

Quentin
Super User

Can you describe your question a bit more, and add some example data?  You can make it a small example, say a datasets of 5 ID's and you want to flag 3 of them.

 

Do you have a dataset of the records you want to flag?  If so, could you merge your "flagme" dataset onto your main dataset, to identify the records that need to be flagged?

The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in January 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our our email list.

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1299 views
  • 1 like
  • 4 in conversation