Hi,
I have a datset final in which there are columns like
Id,name,function,status,reason, department and has values
123,flowers-gift,nw,onboard,blank SLS,human resource
123,flowers-gifts,nw,onboard,missing country,Human resource
123,onboard gift,nw,onboard,missing country,
I need to take unique rows so I performed first. In the dataset
Data new;
Set final;
If first.id and first.name and first.department then output;
Run;
The output I'm getting is only the first row that is with the reason blank SSL but I also need the second row with missing country.
Could anyone please help me on this
I have a feeling that you are not representing the data in your SAS data set properly. Please provide the data (in this thread, and from now on in all future threads) as working SAS data step code (examples and instructions).
@Akshaya_1397 wrote:
Sorry for that.
I have now edited the it properly
This is not the format we requested.
i have dataset name merged file as shown below:
i wrote a code to obtain unique rows:
DATA fin;
set Merged_file;
by ID dep status reason;
if first.id and first.dep and first.status and first.reason then output;
run;
the output which im getiing is only the first row.but the output im expecting is the first two rows as the reason is different
can anyone please help me on this
Please provide the data in the requested form (we provided a link with examples and instructions). Please don't start new threads on the same topic.
@Akshaya_1397 wrote:
i have dataset name merged file as shown below:
i wrote a code to obtain unique rows:
DATA fin;
set Merged_file;
by ID dep status reason;
if first.id and first.dep and first.status and first.reason then output;
run;
the output which im getiing is only the first row.but the output im expecting is the first two rows as the reason is different
can anyone please help me on this
This test
if first.id and first.dep and first.status and first.reason then output;
indicates a misunderstanding of how FIRST. flags are created.
That test is the same as
if first.id then output;
because the time when all of the FIRST. flags will be true is when you start a new value of the first BY variable, in this case the ID variable.
If you want to find the distinct observations then you should just test the value of the last BY variable, in this case the REASON variable.
if first.reason then output;
Thanks for ur reply tom.
But we have exact same reasons for other I'd too that's why I used first.id and first.reason.
Is there any other way to do it?
@Akshaya_1397 wrote:
Thanks for ur reply tom.
But we have exact same reasons for other I'd too that's why I used first.id and first.reason.
Is there any other way to do it?
Do WHAT?
Are you look to get one observation per distinct values of REASON?
If so then sort by REASON.
proc sort data=have out=one_per_reason nodupkey ;
by reason;
run;
All of the rows in your example dataset are unique already. Please provide example data that has one or more duplicates that you would like to remove. Explain which ones should be removed (or which ones kept) and why.
data have ;
infile cards dsd truncover ;
input Id $ name :$20. function :$5. status :$20. reason :$20. department :$20.;
cards;
123,flowers-gift,nw,onboard,blank SLS,human resource
123,flowers-gifts,nw,onboard,missing country,Human resource
123,onboard gift,nw,onboard,missing country,
;
Obs Id name function status reason department 1 123 flowers-gift nw onboard blank SLS human resource 2 123 flowers-gifts nw onboard missing country Human resource 3 123 onboard gift nw onboard missing country
data have ;
infile cards dsd truncover ;
input Id $ name :$20. function :$5. status :$20. reason :$20. department :$20.;
cards;
123,flowers-gift,nw,onboard,blank SLS,human resource
123,flowers-gifts,nw,onboard,missing country,Human resource
123,onboard gift,nw,onboard,missing country,
124,flowers-gift,nw,onboard,blank SLS,human resource
124,flowers-gifts,nw,onboard,missing country,Human resource
124,onboard gift,nw,onboard,missing country,
125,flowers-gift,nw,onboard,blank SLS,human resource
126,flowers-gifts,nw,onboard,missing country,Human resource
126,onboard gift,nw,onboard,missing country,
;
proc sort data= have; by reason; run;
data haveuniq;
set have;
by reason;
if first.reason then output;
run;
the output which im expecting is shown below:
123,flowers-gift,nw,onboard,blank SLS,human resource
123,flowers-gifts,nw,onboard,missing country,Human resource
124,flowers-gift,nw,onboard,blank SLS,human resource
124,flowers-gifts,nw,onboard,missing country,Human resource
125,flowers-gift,nw,onboard,blank SLS,human resource
126,flowers-gifts,nw,onboard,missing country,Human resource
the output im getting is
this is what is confusing me.
Why does the output confuse you? There are only two distinct values of REASON so you should only get two observations out when selecting based on distinct values of REASON.
What is the reason that you want to remove the three observations you want removed?
The only thing I can see that would produce that output would be to just delete the observations with blank values of DEPARTMENT.
data have ;
infile cards dsd truncover ;
input Id $ name :$20. function :$5. status :$20. reason :$20. department :$20.;
cards;
123,flowers-gift,nw,onboard,blank SLS,human resource
123,flowers-gifts,nw,onboard,missing country,Human resource
123,onboard gift,nw,onboard,missing country,
124,flowers-gift,nw,onboard,blank SLS,human resource
124,flowers-gifts,nw,onboard,missing country,Human resource
124,onboard gift,nw,onboard,missing country,
125,flowers-gift,nw,onboard,blank SLS,human resource
126,flowers-gifts,nw,onboard,missing country,Human resource
126,onboard gift,nw,onboard,missing country,
;
data expect ;
infile cards dsd truncover ;
input Id $ name :$20. function :$5. status :$20. reason :$20. department :$20.;
cards;
123,flowers-gift,nw,onboard,blank SLS,human resource
123,flowers-gifts,nw,onboard,missing country,Human resource
124,flowers-gift,nw,onboard,blank SLS,human resource
124,flowers-gifts,nw,onboard,missing country,Human resource
125,flowers-gift,nw,onboard,blank SLS,human resource
126,flowers-gifts,nw,onboard,missing country,Human resource
;
data no_miss_department;
set have;
if not missing(department) then output;
run;
proc compare data=no_miss_department compare=expect;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.