Dear all,
I have a data set which has a variable containing some card nos with duplicates. I want to give numbering to all the cards in a separate column in such a way that all the same card nos hold the same number(new column) please see the below example and kindly advise:
Have dataset= only column1 from below
want dataset= column1 and 2 as given below
column1 column2
456789 1
456789 1
234567 2
456789 1
543672 3
543672 3
234567 2
891234 4
if I filter with cilumn2 with 1 I shd get the card no 456789 three times ( as it is present 3 times in the dataset and I don't want to remove the duplicates)
thank you in advance!!
Hi, I would probably do it this way:
proc format;
value newcode
456789 = '1' /* if this information comes from a file you can create the format with a CTLIN file, or simply merge that file with the data */
234567 = '2'
543672 = '3'
891234 = '4';
run;
data a;
length col1 8;
input col1;
datalines;
456789
456789
234567
456789
543672
543672
234567
891234
;
run;
data b;
set a;
col2 = put(col1, newcode.);
run;
proc freq data = b; table col2 * col1 / nopercent nocum list ; run;
Hope that helps.
That requires hardcoding the proc format. You can of course do it simply by sorting the dataset then:
data want;
set have;
retain column2;
if _n_=1 then column2=1;
else if column1 ne lag(column1) then column2=1;
else column2=column2+1;
run;
If you, for some unknown reason need the data in the order it previously was then add _n_ into the dataset as it stands, then sort, add the counter, then sort by the old _n_.
E.g.
data inter;
set have;
ord=_n_;
run;
proc sort data=inter;
by column1;
run;
data want;
set inter;
retain column2;
if _n_=1 then column2=1;
else if column1 ne lag(column1) then column2=1;
else column2=column2+1;
run;
proc sort data=want;
by ord;
run;
A solution amongst others.
/*******************************/
/**** random sample dataset ****/
/*******************************/
data t_have(keep=id);
do i= 1 to 10;
id = 10+ ceil(3*ranuni(7));
output;
end;
run;
/*****************************/
/**** a possible solution ****/
/*****************************/
data t_want(keep=id rnk);
length id rnk 8.;
if _N_=1 then do;
declare hash ha(multidata:'N');
ha.definekey('zId');
ha.definedata('zId','zRnk');
ha.definedone();
end;
rnk=0;
do until(aDone);
set t_have end=aDone;
zId=id;
rc=ha.find();
if (rc>0) then do; rnk+1; zRnk=rnk; ha.add(); output; end;
else do; rnk=zRnk; output; end;
end;
run;
t_want becomes:
====================
id rnk
11 1
13 2
13 2
13 2
12 3
13 2
13 2
13 2
12 3
12 3
====================
hope this helps.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.