Solved: Re: Identify Unique Values in Data Set and Create New Fields

JH74 · Posted 04-04-2024 10:08 AM

Hi. I have a data set that contains some variables that I need to identify if it's the first and only occurrence, or first of multiple, and create new fields in a new data set that can be reviewed further. The data sets have thousands of lines.

Here's an example of what I am starting with:

id	line	exclude_ind	reimb_amt
12345	01	00	20
12345	02	00	0
54321	01	01	10
54321	02	01	10
54321	03	00	5
34567	01	05	0
34567	02	05	15
34567	03	09	20
34567	04	03	10

From the above data set, I want a field named id_count, which identifies unique id's. Each unique id gets a value = 01.

I want to create a field called exclude_ind_count, which is made of a value = 01 for each unique value in the exclude_ind field, within the unique id field.

So, the resulting new data set looks like this:

id	id_count	line	exclude_ind	exclude_ind_count	reimb_amt
12345	01	01	00	01	20
12345	00	02	00	00	0
54321	01	01	01	01	10
54321	00	02	01	00	10
54321	00	03	00	01	5
34567	01	01	05	01	0
34567	00	02	05	00	15
34567	00	03	09	01	20
34567	00	04	03	01	10

So, each id_count field should, when summed, = 1 for each unique id. Basically I want a count of how many unique id's are in my data set.

And, as shown, within each unique id, each unique instance of exclude_ind has a resulting exclude_ind_count = 01. So, when summed, I can get an idea of how many id's have either 1 exclude_ind, or multiple exclude_ind.

data_null__ · Posted 04-04-2024 10:28 AM

I think this is what you want, or close. You may need to sort by id exclude_ind.

data have;
   infile cards dlm='09'x dsd;
   input id line exclude_ind reimb_amt;
   cards;
12345	01	00	20
12345	02	00	0
54321	01	01	10
54321	02	01	10
54321	03	00	5
34567	01	05	0
34567	02	05	15
34567	03	09	20
34567	04	03	10
;;;;
   run;
   
data want;
   set have;
   by id exclude_ind notsorted;
   id_count = first.id;
   exclude_ind_count = first.exclude_ind;
   run;

View solution in original post

ballardw · Posted 04-04-2024 10:25 AM

Is that id_count supposed to be numeric or character? Not at all clear as showing two characters makes it a bit odd.

Is your data sorted by ID?

If your data is at least grouped by ID (all the ID values in adjacent rows) AND the Exclude_ind are also grouped together then creating NUMERIC values for this is easy. If not grouped then you may need to provide an example of that data and what you expect, which might require sorting the data.

data have;
  input id 	line 	exclude_ind 	reimb_amt;
datalines;
12345 	01 	00 	20
12345 	02 	00 	0
54321 	01 	01 	10
54321 	02 	01 	10
54321 	03 	00 	5
34567 	01 	05 	0
34567 	02 	05 	15
34567 	03 	09 	20
34567 	04 	03 	10
;

data want;
   set have;
   by id exclude_ind notsorted;
   id_count=first.id;
   exclude_ind_count=first.exclude_ind;
run;

If you must see two digits for those variables then assign a Z2. format. Of you want a character version then use

put(first.var,z2.)

When you use BY group processing in a data step SAS provides automatic numeric 1/0 valued (1=true 0=false) variables FIRST. and LAST. for each variable on the By statement indicating whether the current observation is the first or last of that group.

JH74 · Posted 04-04-2024 10:40 AM

Thank you. I appreciate your help. Pretty much the same answer as the other user added. It worked.

data_null__ · Posted 04-04-2024 10:28 AM

I think this is what you want, or close. You may need to sort by id exclude_ind.

data have;
   infile cards dlm='09'x dsd;
   input id line exclude_ind reimb_amt;
   cards;
12345	01	00	20
12345	02	00	0
54321	01	01	10
54321	02	01	10
54321	03	00	5
34567	01	05	0
34567	02	05	15
34567	03	09	20
34567	04	03	10
;;;;
   run;
   
data want;
   set have;
   by id exclude_ind notsorted;
   id_count = first.id;
   exclude_ind_count = first.exclude_ind;
   run;

Identify Unique Values in Data Set and Create New Fields

Re: Identify Unique Values in Data Set and Create New Fields

Re: Identify Unique Values in Data Set and Create New Fields

Re: Identify Unique Values in Data Set and Create New Fields

Re: Identify Unique Values in Data Set and Create New Fields

Registration is open

SAS Training: Just a Click Away