BookmarkSubscribeRSS Feed
iSAS
Quartz | Level 8

data have;
length jobs $ 100;
input jobs $;
datalines;
11,2,11,2
11,2
2,11,2
2,11,3,11,4,11
;
run;

 

From the dataset given, May I know how to add a column that has this logic:

1. If a number appeared only once, it will be removed

2. If a number appeared more than once, it will be retained BUT it will only appear once

 

To explain further, the output of the added column should be like this:

data want;
length final $ 100;
input final $;
datalines;
11,2
.
2
11
;
run;

 

7 REPLIES 7
Reeza
Super User
Please show the exact expected output for this input data set.
iSAS
Quartz | Level 8

The expected output should be like this:

data want;
length final $ 100;
input final $;
datalines;
11,2
.
2
11
;
run;

Kurt_Bremser
Super User
data have;
length jobs $ 100 job $5;
input jobs $;
n = _N_;
do i = 1 to countw(jobs,',');
  job = scan(jobs,i,',');
  output;
end;
keep n job;
datalines;
11,2,11,2
11,2
2,11,2
2,11,3,11,4,11
;

proc sort
  data=have
  dupout=intermediate
  nodupkey
;
by n job;
run;

proc sort
  data=intermediate
  nodupkey
;
by n job;
run;

data want;
set intermediate;
by n;
length jobs $100;
retain jobs;
if first.n
then jobs = job;
else jobs = catx(',',jobs,job);
if last.n;
drop job;
run;

Although I would recommend to keep the long dataset layout instead.

iSAS
Quartz | Level 8
Hello KurtBremser, thank you for the reply.
In the final dataset work.want, the second observation of work.have was removed. What if, though the value is blank, I still want it to appear it work.want? What should be done if it should appear like this?:
data want;
length final $ 100;
input final $;
datalines;
11,2
.
2
11
;
run;
Kurt_Bremser
Super User

OK, that was the old code.

See this restructured code:

data have;
length jobs $ 100 job $5;
input jobs $;
n = _N_;
datalines;
11,2,11,2
11,2
2,11,2
2,11,3,11,4,11
;

data int1;
set have;
do i = 1 to countw(jobs,',');
  job = scan(jobs,i,',');
  output;
end;
keep n job;
run;

proc sort
  data=int1
  dupout=int2
  nodupkey
;
by n job;
run;

proc sort
  data=int2
  nodupkey
;
by n job;
run;

data want;
merge
  int2
  have (keep=n)
;
by n;
length jobs $100;
retain jobs;
if first.n
then jobs = job;
else jobs = catx(',',jobs,job);
if last.n;
drop job;
run;

proc print data=want noobs;
run;

Result:

n	jobs
1	11,2
2	 
3	2
4	11
Kurt_Bremser
Super User
data have;
length jobs $ 100 job $5;
input jobs $;
n = _N_;
do i = 1 to countw(jobs,',');
  job = scan(jobs,i,',');
  output;
end;
keep n job;
datalines;
11,2,11,2
11,2
2,11,2
2,11,3,11,4,11
;

proc sort
  data=have
  dupout=intermediate
  nodupkey
;
by n job;
run;

proc sort
  data=intermediate
  nodupkey
;
by n job;
run;

data want;
set intermediate;
by n;
length jobs $100;
retain jobs;
if first.n
then jobs = job;
else jobs = catx(',',jobs,job);
if last.n;
drop job;
run;

You can join the result back to have by n.

Ksharp
Super User
data have;
input jobs $40.;
n = _N_;
do i = 1 to countw(jobs,',');
  job = scan(jobs,i,',');
  output;
end;
keep n job;
cards;
11,2,11,2
11,2
2,11,2
2,11,3,11,4,11
;
proc freq data=have noprint;
table n*job/out=freq list;
run;
data freq;
 set freq;
 if count=1 then call missing(job);
run;
data want;
 do until(last.n);
   set freq;
   by n;
   length jobs $ 200;
   jobs=catx(',',jobs,job);
 end;
 keep n jobs;
 run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1788 views
  • 0 likes
  • 4 in conversation