BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
carl_miles
Calcite | Level 5

Consider the dataset 

Data Exam;

format date date9.;
Input ID $ exam_date Test1 $;
Datalines;
9000001  15jan A
9000001  16jan A+B
9000001  17jan A+C

9000001  15jan A
9000001  16jan D
9000001  17jan A+E

 

my answer should be 

if it contains A, B Or in combination then 1 if they have something other with them then 0

9000001  15jan A  ---------- 1
9000001  16jan A+B ------- 1
9000001  17jan A+C ------- 0

9000001  15jan A ---------- 1
9000001  16jan D ---------- 0
9000001  17jan A+E ------- 0


Can you write a code which can help me in  this ode this flag 
here we have only A,B two values max to max four combination i will get (A,B,A+B,B+A) 
But I want code that will handle as many as values 

1 ACCEPTED SOLUTION

Accepted Solutions
Oligolas
Barite | Level 11

Hi,

here you go:

data want;
   set exam;
   length flag 8;
   flag=ifn(verify(compress(test1,,'ak'),'AB'),0,1);
run;
________________________

- Cheers -

View solution in original post

12 REPLIES 12
Oligolas
Barite | Level 11

Hi,

here you go:

data want;
   set exam;
   length flag 8;
   flag=ifn(verify(compress(test1,,'ak'),'AB'),0,1);
run;
________________________

- Cheers -

carl_miles
Calcite | Level 5
Thank you so much


Could you please explain how this step is working
flag=ifn(verify(compress(test1,,'ak'),'AB'),0,1);
Oligolas
Barite | Level 11

 

flag=ifn(verify(compress(test1,,'ak'),'AB'),0,1);

 

in SAS double click on a function and press F1 to start the help on the corresponding function page.

 

Start with the innermost function which is compress.

compress is used to remove signs.

The modifier 'a' means to select the alphabetic characters to the list of characters (to remove)

but the following 'k' modifier means to keep the 'keep the characters in the list instead of removing them'

so 'ak' means delete anything but keep the alphatic characters from test1.

 

Verify 'Returns the position of the first character in a string that is not in any of several other strings'

so it returns a numeric>0 if another character is found than the ones specified

Verify AB returns a numeric>0 if the result of compress contains anything else than A or B

The ifn function 'Returns a numeric value based on whether an expression is true, false, or missing.'

So if the Verify result is>0 than anything else than A or B was found so the flag should be 0

in the other case the flag should be 1

 

________________________

- Cheers -

carl_miles
Calcite | Level 5

Thank you so much for the teaching

Actually I have case that 

900001 sebastian 

900001 east

900001 nested 

900001 test

 

 

by following your code:

flag=ifn(verify(compress(test1,,'ak'),'sebastian', 'nested'),0,1);

I am getting flag

900001 sebastian ----------1

900001 east ----------------1

900001 nested -------------1

900001 test ----------------1

I want this case should be 

900001 sebastian ----------1

900001 east ----------------0

900001 nested -------------1

900001 test ----------------0

Could you kindly help me with this

Oligolas
Barite | Level 11

verify checks the presence of a character not if a word is matched, so if you specify 'sebastian' it will check for an 's', an 'e', a 'b' and so on...

if you want to match words you could use the findw function or use a regular expression

 

Data Exam;
Input ID $ Test1 $20.;
Datalines;
9000001 Sebastian
9000001 East
9000001 Nested
9000001 Test
;
run;

*case sensitive approach == no match;
data want1;
   set exam;
   length flag1 flag2 8;
   flag1=ifn(findw(test1,'sebastian') OR findw(test1,'nested'),1,0);
   flag2=prxmatch('/\bsebastian\b/',strip(test1));*with regular expression;
run;

*case insensitive approach;
data want2;
   set exam;
   length flag1 flag2 8;
   flag1=ifn(findw(test1,'sebastian', ' ','i') OR findw(test1,'nested', ' ','i'),1,0);
   flag2=prxmatch('/\bsebastian\b|\bnested\b/i',strip(test1));*with regular expression;
run;
________________________

- Cheers -

carl_miles
Calcite | Level 5
Data Exam;
Input ID $ Test1 $40.;
Datalines;
9000001 Sebastian
9000001 East
9000001 Nested, test
9000001 Test
9000001 Sebastian,Test
9000001 Sebastian,Nested
;
run;

*case sensitive approach == no match;
data want1;
   set exam;
   length flag1 flag2 8;
   flag1=ifn(findw(test1,'sebastian') OR findw(test1,'nested'),1,0);
   flag2=prxmatch('/\bsebastian\b/',strip(test1));*with regular expression;
run;

*case insensitive approach;
data want2;
   set exam;
   length flag1 flag2 8;
   flag1=ifn(findw(test1,'sebastian', ' ','i') OR findw(test1,'nested', ' ','i'),1,0);
   flag2=prxmatch('/\bsebastian\b|\bnested\b/i',strip(test1));*with regular expression;
run;

The code is really helpful 

Its just I should have mentioned this scenerio earlier 
Right now I am getting something in the attached photo 

carl_miles_0-1694005706495.png

 

So where i highlighted where it should be 0 but I getting 1

Could you help me like how can we approach this problem where i need the word in combination as well if it is present as single or multiple in combination 

 

So I assumed the same logic instead A, B. C i used words 
Could give help me to get the code like in words


Oligolas
Barite | Level 11

Hi,

I see, in this case I'd loop over the words to check the match


*case insensitive approach;
data want2;
   set exam;
   length flag1 flag2 8 CurrentWord $200;

   do i=1 to countw(test1); *loop over the words consult the help to check which delimiters are used;
      CurrentWord=scan(test1,i);
      flag1=ifn(findw(CurrentWord,'sebastian', ' ','i') OR findw(CurrentWord,'nested', ' ','i'),1,0);
      flag2=prxmatch('/^\bsebastian\b|\bnested\b$/i',strip(CurrentWord)); *with regular expression;
      if flag1 ne flag2 then put 'ERROR: Please check inconsistency ' test1= CurrentWord=;
      if flag1 eq 0 then leave; *stop loop in case a word is not matched and return 0;
   end;
   drop CurrentWord i;
run;
________________________

- Cheers -

carl_miles
Calcite | Level 5
Hi Oligolas,

data CustomerOrders2;
input CustomerID2 OrderDate2 :date9.;
format OrderDate2 date9.;
datalines;
1001 01JAN2023
1001 15FEB2023
1001 12DEC2023
1001 25JAN2024
;
run;
So in above code we can see that Orderdate2 

I want one more column that will be next date following will come as below mentioned output i want 
CustomerID2  OrderDate2   Orderdateprev

1001                 01JAN2023      15FEB2023 

1001                 15FEB2023       12DEC2023

1001                 12DEC2023      25JAN2024

1001                 25JAN2024           -



So the next immediate will append and if there is none then it will be null but it will still in cohort 

Could you guide me here 
Oligolas
Barite | Level 11

Here you go: lag function

data CustomerOrders2;
input CustomerID2 OrderDate2 :date9.;
format OrderDate2 date9.;
datalines;
1001 01JAN2023
1001 15FEB2023
1001 12DEC2023
1001 25JAN2024
;
run;

proc sort; by CustomerID2 descending OrderDate2 ; run;

data want;
   set CustomerOrders2;
   by CustomerID2 descending OrderDate2;
   length NextDate 8;
   format NextDate date9.;
   NextDate=lag1(OrderDate2);
   if first.CustomerID2 then NextDate=.;
run;

proc sort; by CustomerID2 OrderDate2 ; run;
________________________

- Cheers -

carl_miles
Calcite | Level 5

Hi Oligolas,

thanks for the reply earlier 
I have an dataset like this

cust_id   date1                date2                 date3

1001        15JAN2023    20FEB2023     30MAR2023

I want an ouput like 

cust_id   date1                date2                 date3                  date 4

1001        15JAN2023    20FEB2023     30MAR2023    15JAN2023

1001        15JAN2023    20FEB2023     30MAR2023    20FEB2023 

 

if you see date1 and date2 are appended against date3 

like that i want an output 
i have dataset of 15000 rows
kindly guide me here pls

Oligolas
Barite | Level 11

Difficult to say without any explanation of the derivation of Date4 

Anyway I do not see the relation of your last questions with the topic of this thread and if there is, why you don't follow the guidance as explained before

Kindly ask you create a new topic to treat this new question pls

________________________

- Cheers -

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 898 views
  • 0 likes
  • 2 in conversation