Dear all,
I have a data set which has a variable containing some card nos with duplicates. I want to give numbering to all the cards in a separate column in such a way that all the same card nos hold the same number(new column) please see the below example and kindly advise:
Have dataset= only column1 from below
want dataset= column1 and 2 as given below
column1 column2
456789 1
456789 1
234567 2
456789 1
543672 3
543672 3
234567 2
891234 4
if I filter with cilumn2 with 1 I shd get the card no 456789 three times ( as it is present 3 times in the dataset and I don't want to remove the duplicates)
thank you in advance!!
Hi, I would probably do it this way:
proc format;
value newcode
456789 = '1' /* if this information comes from a file you can create the format with a CTLIN file, or simply merge that file with the data */
234567 = '2'
543672 = '3'
891234 = '4';
run;
data a;
length col1 8;
input col1;
datalines;
456789
456789
234567
456789
543672
543672
234567
891234
;
run;
data b;
set a;
col2 = put(col1, newcode.);
run;
proc freq data = b; table col2 * col1 / nopercent nocum list ; run;
Hope that helps.
That requires hardcoding the proc format. You can of course do it simply by sorting the dataset then:
data want;
set have;
retain column2;
if _n_=1 then column2=1;
else if column1 ne lag(column1) then column2=1;
else column2=column2+1;
run;
If you, for some unknown reason need the data in the order it previously was then add _n_ into the dataset as it stands, then sort, add the counter, then sort by the old _n_.
E.g.
data inter;
set have;
ord=_n_;
run;
proc sort data=inter;
by column1;
run;
data want;
set inter;
retain column2;
if _n_=1 then column2=1;
else if column1 ne lag(column1) then column2=1;
else column2=column2+1;
run;
proc sort data=want;
by ord;
run;
A solution amongst others.
/*******************************/
/**** random sample dataset ****/
/*******************************/
data t_have(keep=id);
do i= 1 to 10;
id = 10+ ceil(3*ranuni(7));
output;
end;
run;
/*****************************/
/**** a possible solution ****/
/*****************************/
data t_want(keep=id rnk);
length id rnk 8.;
if _N_=1 then do;
declare hash ha(multidata:'N');
ha.definekey('zId');
ha.definedata('zId','zRnk');
ha.definedone();
end;
rnk=0;
do until(aDone);
set t_have end=aDone;
zId=id;
rc=ha.find();
if (rc>0) then do; rnk+1; zRnk=rnk; ha.add(); output; end;
else do; rnk=zRnk; output; end;
end;
run;
t_want becomes:
====================
id rnk
11 1
13 2
13 2
13 2
12 3
13 2
13 2
13 2
12 3
12 3
====================
hope this helps.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.