- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Posted 09-19-2022 07:19 AM
(2650 views)
Hi Everyone. I've a variable which has value like below. Here every part is seperated by / .
Data have; Input tpc $30.; datalines; abcde1234&9&Y / defgh0987&2&N / attire1790&4&Y Daikin1290&6&Y / rets0987&2&Y / attire1790&4&Y
Daikin1290&6&N / rets0987&2&N / attire1790&4&N
; run;
Required output; Tpc countY countN abcde1234&9&Y / defgh0987&2&N / attire1790&4&Y 13 2 Daikin1290&6&Y / rets0987&2&Y / attire1790&4&Y 12 0
Daikin1290&6&N / rets0987&2&N / attire1790&4&N 0 12
I require the count of digits preceded by & in two different fields named countY, countN.
For example: Daikin1290&6&Y / rets0987&2&Y / attire1790&4&Y in this part the digits preceded by & is 6,2,4 so the county is 12 and countN is 0 ,because there's no substring preceded by N. Likewise this approach applicable to each string value. Looking for some support. Thankyou!.
13 REPLIES 13
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I don't get it. Why 12 and 13?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It's the sum of the digits preceded by &. My bad, I said it as count l.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
is &Y / &N always the last part of the string?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes, you're right. And we need to put the sum of digits for &Y , &N in two seperate columns.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Data have;
Input tpc $50.;
datalines;
abcde1234&9&Y / defgh0987&2&N / attire1790&4&Y
Daikin1290&6&Y / rets0987&2&Y / attire1790&4&Y
Daikin1290&6&N / rets0987&2&N / attire1790&4&N
;
data want;
set have;
do i = 1 to countw(tpc, ' / ');
w = scan(tpc, i, ' / ');
l = substr(w, max(1, length(w) - 1));
d = prxchange('s/.*&(\d).*/$1/', -1, w);
if l = '&Y' then sumY = sum(sumY, input(d, 8.));
if l = '&N' then sumN = sum(sumN, input(d, 8.));
end;
keep tpc sumY sumN;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Appreciate your help. I've millions of records like these I just posted a chunk of data here.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Sure. Does the code work for you?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It did work but, instead of 0's it got period.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Do you want 0s ?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you want 0s then simply do
Data have;
Input tpc $50.;
datalines;
abcde1234&9&Y / defgh0987&2&N / attire1790&4&Y
Daikin1290&6&Y / rets0987&2&Y / attire1790&4&Y
Daikin1290&6&N / rets0987&2&N / attire1790&4&N
;
data want;
set have;
sumY = 0;
sumN = 0;
do i = 1 to countw(tpc, ' / ');
w = scan(tpc, i, ' / ');
l = substr(w, max(1, length(w) - 1));
d = prxchange('s/.*&(\d).*/$1/', -1, w);
if l = '&Y' then sumY = sum(sumY, input(d, 8.));
if l = '&N' then sumN = sum(sumN, input(d, 8.));
end;
keep tpc sumY sumN;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
When I used your code against millions of records, it didn't work. I've got 0's for all the records evenif I initialise SumY, SumN as 0.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Show us the data for a row where the code does not give the right answer.
--
Paige Miller
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
In my data I've newly added special characters so, for them the if condition mentioned in his code isn't working n also for a long string his code for d taking the last substring followed by /. For instance, in this long string it is considering only the last substring ie. attire1790&4&Y
abcde1234&9&Y / defgh0987&2&N / attire1790&4&Y.
abcde1234&9&Y / defgh0987&2&N / attire1790&4&Y.