BookmarkSubscribeRSS Feed
iSAS
Quartz | Level 8

data have;
length jobs $ 100;
input jobs $;
datalines;
11,2,11,2
11,2
2,11,2
2,11,3,11,4,11
;
run;

 

From the dataset given, May I know how to add a column that has this logic:

1. If a number appeared only once, it will be removed

2. If a number appeared more than once, it will be retained BUT it will only appear once

 

To explain further, the output of the added column should be like this:

data want;
length final $ 100;
input final $;
datalines;
11,2
.
2
11
;
run;

 

7 REPLIES 7
Reeza
Super User
Please show the exact expected output for this input data set.
iSAS
Quartz | Level 8

The expected output should be like this:

data want;
length final $ 100;
input final $;
datalines;
11,2
.
2
11
;
run;

Kurt_Bremser
Super User
data have;
length jobs $ 100 job $5;
input jobs $;
n = _N_;
do i = 1 to countw(jobs,',');
  job = scan(jobs,i,',');
  output;
end;
keep n job;
datalines;
11,2,11,2
11,2
2,11,2
2,11,3,11,4,11
;

proc sort
  data=have
  dupout=intermediate
  nodupkey
;
by n job;
run;

proc sort
  data=intermediate
  nodupkey
;
by n job;
run;

data want;
set intermediate;
by n;
length jobs $100;
retain jobs;
if first.n
then jobs = job;
else jobs = catx(',',jobs,job);
if last.n;
drop job;
run;

Although I would recommend to keep the long dataset layout instead.

iSAS
Quartz | Level 8
Hello KurtBremser, thank you for the reply.
In the final dataset work.want, the second observation of work.have was removed. What if, though the value is blank, I still want it to appear it work.want? What should be done if it should appear like this?:
data want;
length final $ 100;
input final $;
datalines;
11,2
.
2
11
;
run;
Kurt_Bremser
Super User

OK, that was the old code.

See this restructured code:

data have;
length jobs $ 100 job $5;
input jobs $;
n = _N_;
datalines;
11,2,11,2
11,2
2,11,2
2,11,3,11,4,11
;

data int1;
set have;
do i = 1 to countw(jobs,',');
  job = scan(jobs,i,',');
  output;
end;
keep n job;
run;

proc sort
  data=int1
  dupout=int2
  nodupkey
;
by n job;
run;

proc sort
  data=int2
  nodupkey
;
by n job;
run;

data want;
merge
  int2
  have (keep=n)
;
by n;
length jobs $100;
retain jobs;
if first.n
then jobs = job;
else jobs = catx(',',jobs,job);
if last.n;
drop job;
run;

proc print data=want noobs;
run;

Result:

n	jobs
1	11,2
2	 
3	2
4	11
Kurt_Bremser
Super User
data have;
length jobs $ 100 job $5;
input jobs $;
n = _N_;
do i = 1 to countw(jobs,',');
  job = scan(jobs,i,',');
  output;
end;
keep n job;
datalines;
11,2,11,2
11,2
2,11,2
2,11,3,11,4,11
;

proc sort
  data=have
  dupout=intermediate
  nodupkey
;
by n job;
run;

proc sort
  data=intermediate
  nodupkey
;
by n job;
run;

data want;
set intermediate;
by n;
length jobs $100;
retain jobs;
if first.n
then jobs = job;
else jobs = catx(',',jobs,job);
if last.n;
drop job;
run;

You can join the result back to have by n.

Ksharp
Super User
data have;
input jobs $40.;
n = _N_;
do i = 1 to countw(jobs,',');
  job = scan(jobs,i,',');
  output;
end;
keep n job;
cards;
11,2,11,2
11,2
2,11,2
2,11,3,11,4,11
;
proc freq data=have noprint;
table n*job/out=freq list;
run;
data freq;
 set freq;
 if count=1 then call missing(job);
run;
data want;
 do until(last.n);
   set freq;
   by n;
   length jobs $ 200;
   jobs=catx(',',jobs,job);
 end;
 keep n jobs;
 run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1187 views
  • 0 likes
  • 4 in conversation