Hi all,
I have been trying to find ways by which I could filter out my dataset based on the following requirement.
I have a variable named path (eg: wwb/dat/<ta>/<3digitnumber>/<3digitnumber>/) which should contain:
a) <ta> which could be ab, bd, ge, mf, ac.
b) either only the first field containing <3digitnumber> or both
Kindly advice on how I should approach this.....
Use SCAN() to get the different parts of the path
USE IN to check if it matches your criteria
Demo code, untested :
letter_part = scan(word, 3, '/');
number_part1 = scan(word, 4, '/');
number_part2 = scan(word, 5, '/');
if letter_part in ('ab', 'bc', 'cd') and <some criteria about number_part1) and <some criteria about number_part2>;
Try posting some examples so we can actually see what your working with/wanting out. Post test data in the form of a datastep so we don't have to guess what you data structure is.
It looks a bit like HTML or XML, if so maybe use that to import (XML libname for instance). But still just guessing.
malai,luch,ab/007/007/stab/unblin/raw,old,xxxx,full,11/8/2004,16:14:43
malai,luch,ab/007/007/stab/unblin/raw,new,modify,read,11/8/2004,16:14:43
malai,malai,/wwb/data/bd/007/007/stab/unblin/views,new,add,read,11/10/2004,10:54:49
malai,malai,/ww/data/ge/007/007/stab/unblin/./views,old,xxxx,read,11/10/2004,10:57:49
malai,malai,/wwb/data/mf/007/007/stab/unblin/./views,new,modify,full,11/10/2004,10:57:49
malai,luch,/wwb/data/mf/007/007/stab/unblin/views,new,add,full,11/10/2004,10:59:35
malai,boco,/wwb/data/ac/007/007/stab/unblin/views,new,add,full,11/10/2004,10:59:35
malai,luch,/ww/data/mf/007/007/stab/unblin/views,old,xxxx,full,11/10/2004,11:0:7
malai,luch,/ww/data/mf/007/007/stab/unblin/views,new,modify,read,11/10/2004,11:0:7
malai,boco,/wwb/data/bd/007/007/stab/unblin/views,old,xxxx,full,11/10/2004,11:0:7
malai,boco,/wwbd/data/ab/007/007/stab/unblin/views,new,modify,read,11/10/2004,11:0:7
malai,luch,/wwb/data/ab/007/007/stab/unblin/views,old,xxxx,read,11/10/2004,11:0:39
malai,luch,/ww/data/bd/007/007/stab/unblin/views,new,delete,,11/10/2004,11:0:39
malai,boco,/wwb/data/mf/007/007/stab/unblin/views,old,xxxx,read,11/10/2004,11:0:39
malai,boco,/ww/data/ac/007/007/stab/unblin/views,new,delete,,11/10/2004,11:0:39
malai,malai,/wwb/data/ac/007/007/stab/blin/raw,new,add,full,11/10/2004,11:37:19
malai,malai,/apps/bin/toolsdev/ge,old,xxxx,full,11/10/2004,16:34:25
malai,malai,/apps/bin/toolsdev/ge,new,modify,full,11/10/2004,16:34:25
luch,luch,/wwb/data/ac/007/007/stab/blin/**bleep**,new,add,full,11/11/2004,9:44:37
boco,boco,/ww/data/bd/101/800/stab/blin/views,new,add,full,11/16/2004,17:32:57
boco,boco,/wwb/data/bd/101/800/stab/blin/views,old,xxxx,full,11/16/2004,17:37:43
gaise,freem,/wwb/clin/proja/emb,new,add,read,11/10/2005,15:10:26
gaise,hard,/wwb/cl/pro/emb,new,add,read,11/10/2005,15:10:38
gaise,hard,/wwb/cli/pr/emba,old,xxxx,read,11/10/2005,15:10:44
gaise,hard,/wwb/cl/pro/emb,new,modify,full,11/10/2005,15:10:44
gaise,perr,/wwb/cli/pr/embc,new,add,full,11/10/2005,15:10:51
The above is a sample of the data to be processed. Order as follows:
user1
user2
path
old/new
action
access
date
time
Is that in one variable? Posting test data in the form of a datastep makes things far simpler. Anyways, if that is one string, then first step is to parse it out into its component parts. Can't really help there as no idea what the component parts are, maybe it is comma delimited? Once you have it in variables, then identify what set of filters to apply.
This is comma delimited indeed. the variables are in the order mentioned earlier , i.e:
a)user1
b)user2
c)path
d)old/new
e)action
f)access
g)date
h)time
Ok, so if you want to filter on:
a) <ta> which could be ab, bd, ge, mf, ac.
From var C, something like:
data want; set have; length str code $200; str=scan(yourstring,2,","); code=ifc(scan(str,1,"/")="",scan(str,2,"/"),scan(str,1,"/")); if code in ("ab","bd","ge","mf","ac") then do; end; run;
Its basically just about getting the items you want into variables so you can use them. You can use scan with the given delimter to chop out the big string into smaller parts, and then scan the smaller parts etc.
And output for that example "data" would be what?
this is just sample of two observations:
STD SNP User1 User2 Path oldnew acctype1 acctype2 date time
AB007007 | Stab | luch | malai | ab/007/007/stab/unblin/raw | old | xxxx | full | 11/8/2004 | 16:14:43 |
BD007007 | stab | malai | malai | /wwb/data/bd/007/007/stab/unblin/views | new | add | read | 11/10/2004 | 10:54:49 |
If you can formulate a rule for the things you want to find, this rule can be put into code for the computer. If you cannot state such a rule, there's nothing we can do.
And whoever came up with
"paths doesn't really have a fixed format."
needs to find a profession that does NOT involve computers at all.
Here is a case where Pearl Regular Expressions shine:
data want; set have; if prxmatch('/\b(ab|bd|ge|mf|ac)\/\d{3,3}\/\d{3,3}\//',path); run;
A short explanation of the match string:
\b is a word boundary
(ab|bd|ge|mf|ac) means either of the letter combinations you are looking for
\/ means a slash ("/", which has to be escaped with a backslash)
\d{3,3} means exactly 3 digits
thank you very much for making me understand the Pattern matching method in sas, but now I am facing another issue, where the path is having two consecutive paths included like so:
chadh,EO,/gb/pro/cli/da/bd/185/047/stab/blin/le0/td.sas7bdat /gb/pro/cl/da/bd/185/047/stab/blin/le0/tf.sas7bdat,old,xxxx,none,1/10/2008,12:13:53
chadh,EO,/gb/pro/cli/da/bd/185/047/stab/blin/le0/td.sas7bdat /gb/pro/cli/da/bd/185/047/stab/blin/le0/tf.sas7bdat,new,add,full,1/10/2008,12:13:53
what do I do in this case?
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.