I want to validate my data set with certain condition where i am i am validating two rows of one fields. I want to create a data set with below listed conditions.(sample .csv attached)
1)if act ship has a date then act=Results should also have date(if date is present for ship and missing for Results then i want to output such records)
if act=ship and not missing(date) and act=result and missing(date);
2)if act ship has a date then then all pars of act Gp except rq should also have Value(if date is present for ship but value is missng for GP(x,xx,xxx,a,aa,aaa,c,cc,ccc,cq,freq) then i want to output such records)
if act=ship and not missing(date) and act=Gp and pars in (x,xx,xxx,a,aa,aaa,c,cc,ccc,cq,freq) and missing(values);
3)if act ship has a date then Pars 'case2' of act 'tot' should also have value(if date is present for ship and value is missng for case2 then i want to output such records)
if act=ship and not missing(date) and act=tot and pars=case2 and missing(Value);
Thanks in advance
Hi and welcome to the SAS community 🙂
I have converted your csv file into a data step. This makes it much easier for users in here to provide a usable code answer for you.
Now, you have listed a few guidelines for your result. But if you really want to help us help you, post what your desired resuls looks like from this data set.
data have;
input id type DT date $ act $ pars $ Value;
infile datalines dlm=',' dsd;
datalines;
1,1,9,04JUL2016,Ship,,
1,1,9,U,Ship2,,
1,1,9,U,Ship3,,
1,1,10,U,GP,x,
1,1,10,U,GP,xx,
1,1,10,U,GP,xxx,
1,1,10,U,GP,a,
1,1,10,U,GP,aa,
1,1,10,U,GP,aaa,
1,1,10,U,GP,c,
1,1,10,U,GP,ccc,
1,1,10,U,GP,ccc,
1,1,10,U,GP,rq,
1,1,10,U,GP,cq,
1,1,10,U,GP,freq,
1,1,12,,Results,,
1,1,15,21SEP2016,tot,case1,0
1,1,15,21SEP2016,tot,case2,
2,1,9,11MAY2016,Ship,,
2,1,9,U,Ship2,,
2,1,9,U,Ship3,,
2,1,10,U,GP,x,1
2,1,10,U,GP,xx,1
2,1,10,U,GP,xxx,1
2,1,10,U,GP,a,1
2,1,10,U,GP,aa,1
2,1,10,U,GP,aaa,1
2,1,10,U,GP,c,1
2,1,10,U,GP,cc,1
2,1,10,U,GP,ccc,1
2,1,10,U,GP,rq,
2,1,10,U,GP,cq,1
2,1,10,U,GP,freq,1
2,1,12,21SEP2016,Results,,
2,1,15,21SEP2016,tot,case1,0
2,1,15,21SEP2016,tot,case2,2
;
I want to validate my data set with certain condition where i am i am validating two rows of one fields. I want to create a data set with below listed conditions.(sample .csv attached)
1)if act ship has a date then act=Results should also have date(if date is present for ship and missing for Results then i want to output such records)
if act=ship and not missing(date) and act=result and missing(date);
2)if act ship has a date then then all pars of act Gp except rq should also have Value(if date is present for ship but value is missng for GP(x,xx,xxx,a,aa,aaa,c,cc,ccc,cq,freq) then i want to output such records)
if act=ship and not missing(date) and act=Gp and pars in (x,xx,xxx,a,aa,aaa,c,cc,ccc,cq,freq) and missing(values);
3)if act ship has a date then Pars 'case2' of act 'tot' should also have value(if date is present for ship and value is missng for case2 then i want to output such records)
if act=ship and not missing(date) and act=tot and pars=case2 and missing(Value);
The output should look something like this or something more meaningful.
like for id 1 act=ship has date but act=result date is missing.
Ship has date but value is missing for case2 and pars of GP.(exclude rq)
we can add a flag as identifier.
id type DT date act pars Value flag1
1 1 9 04-Jul-16 Ship 1
1 1 12 Results 1
1 1 10 GP x 1
1 1 10 GP xx 1
1 1 10 GP xxx 1
1 1 10 GP a 1
1 1 10 GP aa 1
1 1 10 GP aaa 1
1 1 10 GP c 1
1 1 10 GP ccc 1
1 1 10 GP ccc 1
1 1 15 21-Sep-16 tot case2 1
sample dataset below.
data have;
input id type DT date $ act $ pars $ Value;
infile datalines dlm=',' dsd;
datalines;1,1,9,04JUL2016,Ship,,
1,1,9,U,Ship2,,
1,1,9,U,Ship3,,
1,1,10,U,GP,x,
1,1,10,U,GP,xx,
1,1,10,U,GP,xxx,
1,1,10,U,GP,a,
1,1,10,U,GP,aa,
1,1,10,U,GP,aaa,
1,1,10,U,GP,c,
1,1,10,U,GP,ccc,
1,1,10,U,GP,ccc,
1,1,10,U,GP,rq,
1,1,10,U,GP,cq,
1,1,10,U,GP,freq,
1,1,12,,Results,,
1,1,15,21SEP2016,tot,case1,0
1,1,15,21SEP2016,tot,case2,
2,1,9,11MAY2016,Ship,,
2,1,9,U,Ship2,,
2,1,9,U,Ship3,,
2,1,10,U,GP,x,1
2,1,10,U,GP,xx,1
2,1,10,U,GP,xxx,1
2,1,10,U,GP,a,1
2,1,10,U,GP,aa,1
2,1,10,U,GP,aaa,1
2,1,10,U,GP,c,1
2,1,10,U,GP,cc,1
2,1,10,U,GP,ccc,1
2,1,10,U,GP,rq,
2,1,10,U,GP,cq,1
2,1,10,U,GP,freq,1
2,1,12,21SEP2016,Results,,
2,1,15,21SEP2016,tot,case1,0
2,1,15,21SEP2016,tot,case2,2;
Thanks and Regards
Just want to add extra notes that its not necessary to apply all all filters in one go we can split the filter conditions like
1) if act=ship has a date and act=result has nissng(date) then output1.
2)if act=ship has a date then parameter case2 of act=tot shoud have value if not the output2; and so on.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.