01-21-2015 10:12 AM
Hi everyone! I'm kind of new in SAS Guide. Right now Im in the process of shifting excel reports to guide reports.
I have a big problems managing rows in general. I want to delete rows according to a specific criteria. In the Excel report I sort the matrix with three criterias then I create an extra column with a formula like this: if C1=C2 and D1=D2 and H1=H2, L1= "Yes", "No", after that I filter this column and select only the "Yes" value. Then I delete every row that has the "Yes" value excepting the rows that have a specific value in another column. I guess this is a very standart procedure for an Excel user, but in guide it seems to be more complex. Im not so sure how to do it because I know that the query builder gives you the option to delete duplicated rows but I want to delete rows that are duplicated in some columns not in every column, and besides I want to select which one of that group, not to delete. I would really appreciate any kind of help that I could get. If somebody doesnt get the question I would gladly explain it throughly.
01-21-2015 01:40 PM
You might need to consider a code node and a data step using the lag function. You would have to sort of flip your criteria from using the next row to criteria using the previous row. The lag function allows a data step to look at the previous record.
First step would be to sort your dataset as you need and then do something like:
data want (drop= lagc lagd lagh);
if c=lagc and d=lagd and h=lagh then duprec='Y';
if duprec='N' then output;
01-21-2015 01:45 PM
Thank you so much for your help! Im not so sure what a lag function is, but I will google it! :smileygrin:
Thank you so much for your help!
01-21-2015 02:45 PM
If you're removing duplicates you may want to look at the options for removing duplicates in the SORT task instead. Or consider using DISTINCT in the QUERY builder as well.
The key to switching from Excel to SAS EG is to learn the best method in SAS EG, which may not be the same as the method in Excel. It may be better to explain what you're trying to do in more detail and get the best method in SAS EG.
01-21-2015 07:00 PM
If you provide several rows of data, dummy is okay as long as it exhibits the different choices likely to occur, and what the final output should look like it will help us provide answers.
I suspect that in SAS you may not even need any thing more complicated than a WHERE dataset option.
For example to print the records from the SAS supplied dataset SASHELP.CLASS where we don't want to print if the record is Female and Age is 13:
Proc print data=sashelp.class (where=( not(sex='F' and age=13)));