Dear All,
I am using SAS Data Management Studio 9.4. It is a front-end interface that offers only drag-and-drop nodes and the option to write a few expressions.
I am in a situation where the data is repeated and I need to eliminate the repeats from the file. I have tried clustering by grouping on the primary key, but I am stuck on the next condition. Please suggest an appropriate solution; the data is provided below.
Required: if a roll no has area = 123A, then remove that roll no. For example, 0012 and 0064 have area 123A, so all rows for 0012 and 0064 should be removed from the data. The output should consist only of roll no 12345.
Input:
Roll No | Name | Area |
0012 | KKKKK | 123A |
0012 | KKKKK | 3333 |
0012 | KKKKK | 7869 |
0012 | KKKKK | 7777 |
0012 | KKKKK | 913B |
12345 | LLLLL | 7869 |
12345 | LLLLL | 123A |
12345 | LLLLL | 3333 |
0064 | MMMM | 7869 |
0064 | MMMM | 7869 |
0064 | MMMM | 3333 |
0064 | MMMM | 123A |
0064 | MMMM | 7869 |
0064 | MMMM | 6666 |
0064 | MMMM | 913B |
Output:
Roll No | Name | Area |
12345 | LLLLL | 7869 |
12345 | LLLLL | 123A |
12345 | LLLLL | 3333 |
Regards,
Shaheen
data have;
infile cards missover;
/* Read Roll_No as character so leading zeros (0012, 0064) are preserved */
input Roll_No $ Name $ Area $;
cards;
0012 KKKKK 123A
0012 KKKKK 3333
0012 KKKKK 7869
0012 KKKKK 7777
0012 KKKKK 913B
12345 LLLLL 7869
12345 LLLLL 123A
12345 LLLLL 3333
0064 MMMM 7869
0064 MMMM 7869
0064 MMMM 3333
0064 MMMM 123A
0064 MMMM 7869
0064 MMMM 6666
0064 MMMM 913B
;
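proc sql;
/* b.sum = number of roll numbers in which each area appears */
create table test as
select a.*, b.sum
from have as a
left join
  (select sum(count) as sum, area
   from (select count(distinct area) as count, area, roll_no
         from have
         group by area, roll_no)
   group by area) as b
on a.area = b.area
/* keep only areas that appear in all 3 roll numbers of the sample */
where b.sum >= 3
/* ORDER BY (not ORDER); roll_no comes from a, since b has no roll_no column */
order by a.area, a.roll_no;
quit;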
data want;
set test;
by area roll_no;
if last.area;
run;
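The step above depends on a hard-coded count of 3 and on 12345 sorting last, so it may not hold once the areas and roll numbers change. A more general sketch of the stated rule ("drop every roll number that has a given area") is shown below. The dataset names `rolls` and `rolls_kept` and the macro variable `flag_area` are hypothetical, chosen only for illustration. Note that in the sample, 12345 also has area 123A, so applying the rule literally with 123A would remove all three roll numbers. 913B is the only area that 0012 and 0064 have and 12345 does not, so the sketch assumes 913B as the flagged value; that is an assumption about the intent, not something the question confirms.

/* Sample data from the question; Roll_No is read as character to keep leading zeros */
data rolls;
infile cards missover;
input Roll_No $ Name $ Area $;
cards;
0012 KKKKK 123A
0012 KKKKK 3333
0012 KKKKK 7869
0012 KKKKK 7777
0012 KKKKK 913B
12345 LLLLL 7869
12345 LLLLL 123A
12345 LLLLL 3333
0064 MMMM 7869
0064 MMMM 7869
0064 MMMM 3333
0064 MMMM 123A
0064 MMMM 7869
0064 MMMM 6666
0064 MMMM 913B
;
run;

/* Hypothetical parameter: the area value that disqualifies a roll number (assumed, see note above) */
%let flag_area = 913B;

proc sql;
/* Step 1 (subquery): collect the roll numbers that have the flagged area.
   Step 2 (outer query): keep only rows whose roll number is not in that list. */
create table rolls_kept as
select *
from rolls
where Roll_No not in
  (select Roll_No from rolls where Area = "&flag_area");
quit;

Because the flagged roll numbers are looked up from the data each time, nothing has to be edited when new roll numbers or areas arrive; only `flag_area` changes if the rule changes. A drag-and-drop job would need to reproduce the same two steps: first build the list of roll numbers that have the flagged area, then exclude every row belonging to those roll numbers.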
Dear Jag,
I use the nodes supplied by DataFlux. It is a front-end tool in which a data source node imports the input files, and data validation and expression nodes are used to define the conditions, so I cannot use the code you have provided. Please suggest a short condition that will help me reach the output. Also note that my data is dynamic: the areas will keep changing and the number of roll nos will grow as requirements change. The data provided was just a sample of my input.
Regards,
Shaheen
Please don't post the same question twice.
Dear Patrick,
Sorry for posting the query twice. I was unsure which forum to choose, SAS Studio or Data Management. Please ignore the duplicate.
Thank you.
Regards,
Shaheen