BookmarkSubscribeRSS Feed
clinsas
Calcite | Level 5

I want to validate my data set with certain condition where i am i am validating two rows of one fields. I want to create a data set with below listed conditions.(sample .csv attached)

 

1)if act ship has a date then act=Results should also have date(if date is present for ship and missing for Results then i want to output such records)

if act=ship and not missing(date) and act=result and missing(date);


2)if act ship has a date then then all pars of act Gp except rq should also have Value(if date is present for ship but value is missng for GP(x,xx,xxx,a,aa,aaa,c,cc,ccc,cq,freq) then i want to output such records)

if act=ship and not missing(date) and act=Gp and pars in (x,xx,xxx,a,aa,aaa,c,cc,ccc,cq,freq) and missing(values);

 

3)if act ship has a date then Pars 'case2' of act 'tot' should also have value(if date is present for ship and value is missng for case2 then i want to output such records)

if act=ship and not missing(date) and act=tot and pars=case2 and missing(Value);

 

 

 

Thanks in advance

 

3 REPLIES 3
PeterClemmensen
Tourmaline | Level 20

Hi and welcome to the SAS community 🙂

 

I have converted your csv file into a data step. This makes it much easier for users in here to provide a usable code answer for you.

 

Now, you have listed a few guidelines for your result. But if you really want to help us help you, post what your desired resuls looks like from this data set.

 

data have;
input id type DT date $ act $ pars $ Value;
infile datalines dlm=',' dsd;
datalines;
1,1,9,04JUL2016,Ship,,
1,1,9,U,Ship2,,
1,1,9,U,Ship3,,
1,1,10,U,GP,x,
1,1,10,U,GP,xx,
1,1,10,U,GP,xxx,
1,1,10,U,GP,a,
1,1,10,U,GP,aa,
1,1,10,U,GP,aaa,
1,1,10,U,GP,c,
1,1,10,U,GP,ccc,
1,1,10,U,GP,ccc,
1,1,10,U,GP,rq,
1,1,10,U,GP,cq,
1,1,10,U,GP,freq,
1,1,12,,Results,,
1,1,15,21SEP2016,tot,case1,0
1,1,15,21SEP2016,tot,case2,
2,1,9,11MAY2016,Ship,,
2,1,9,U,Ship2,,
2,1,9,U,Ship3,,
2,1,10,U,GP,x,1
2,1,10,U,GP,xx,1
2,1,10,U,GP,xxx,1
2,1,10,U,GP,a,1
2,1,10,U,GP,aa,1
2,1,10,U,GP,aaa,1
2,1,10,U,GP,c,1
2,1,10,U,GP,cc,1
2,1,10,U,GP,ccc,1
2,1,10,U,GP,rq,
2,1,10,U,GP,cq,1
2,1,10,U,GP,freq,1
2,1,12,21SEP2016,Results,,
2,1,15,21SEP2016,tot,case1,0
2,1,15,21SEP2016,tot,case2,2
;
clinsas
Calcite | Level 5

I want to validate my data set with certain condition where i am i am validating two rows of one fields. I want to create a data set with below listed conditions.(sample .csv attached)

 

1)if act ship has a date then act=Results should also have date(if date is present for ship and missing for Results then i want to output such records)

if act=ship and not missing(date) and act=result and missing(date);


2)if act ship has a date then then all pars of act Gp except rq should also have Value(if date is present for ship but value is missng for GP(x,xx,xxx,a,aa,aaa,c,cc,ccc,cq,freq) then i want to output such records)

if act=ship and not missing(date) and act=Gp and pars in (x,xx,xxx,a,aa,aaa,c,cc,ccc,cq,freq) and missing(values);

 

3)if act ship has a date then Pars 'case2' of act 'tot' should also have value(if date is present for ship and value is missng for case2 then i want to output such records)

if act=ship and not missing(date) and act=tot and pars=case2 and missing(Value);

 

 

The output should look something like this or something more meaningful.

like for id 1 act=ship has date but act=result date is missing.

Ship has date but value is missing for case2 and pars of GP.(exclude rq)

we can add a flag as identifier. 

id   type  DT     date       act                pars Value flag1
1    1       9   04-Jul-16 Ship                                   1
1    1     12                    Results                              1
1    1     10                    GP                     x              1
1    1    10                     GP                    xx             1
1    1    10                     GP                   xxx            1
1    1    10                     GP                     a             1
1    1    10                     GP                   aa             1
1    1    10                     GP                  aaa            1
1    1    10                     GP                   c               1
1    1    10                     GP                 ccc              1
1    1    10                     GP                 ccc              1
1    1   15    21-Sep-16 tot                 case2           1

 

 

sample dataset below.

data have;
input id type DT date $ act $ pars $ Value;
infile datalines dlm=',' dsd;
datalines;1,1,9,04JUL2016,Ship,,
1,1,9,U,Ship2,,
1,1,9,U,Ship3,,
1,1,10,U,GP,x,
1,1,10,U,GP,xx,
1,1,10,U,GP,xxx,
1,1,10,U,GP,a,
1,1,10,U,GP,aa,
1,1,10,U,GP,aaa,
1,1,10,U,GP,c,
1,1,10,U,GP,ccc,
1,1,10,U,GP,ccc,
1,1,10,U,GP,rq,
1,1,10,U,GP,cq,
1,1,10,U,GP,freq,
1,1,12,,Results,,
1,1,15,21SEP2016,tot,case1,0
1,1,15,21SEP2016,tot,case2,
2,1,9,11MAY2016,Ship,,
2,1,9,U,Ship2,,
2,1,9,U,Ship3,,
2,1,10,U,GP,x,1
2,1,10,U,GP,xx,1
2,1,10,U,GP,xxx,1
2,1,10,U,GP,a,1
2,1,10,U,GP,aa,1
2,1,10,U,GP,aaa,1
2,1,10,U,GP,c,1
2,1,10,U,GP,cc,1
2,1,10,U,GP,ccc,1
2,1,10,U,GP,rq,
2,1,10,U,GP,cq,1
2,1,10,U,GP,freq,1
2,1,12,21SEP2016,Results,,
2,1,15,21SEP2016,tot,case1,0
2,1,15,21SEP2016,tot,case2,2;

 

Thanks and Regards

clinsas
Calcite | Level 5

Just want to add extra notes that its not necessary to apply all all filters in one go we can split the filter conditions  like
1) if act=ship has a date and act=result has nissng(date) then output1.
2)if   act=ship has a date then parameter case2 of act=tot  shoud have value if not the output2;  and so on.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 749 views
  • 0 likes
  • 2 in conversation