Hi there
I have the following data set:
Name | Code | code date | Code km |
0LH68150 | 0064370100 | 20011212 | 5584 |
0LH68150 | 6331023700 | 20021017 | 24097 |
0LH68150 | 6610051500 | 20021017 | 24097 |
0LH68150 | 99110475MP | 20021017 | 24097 |
0LH68150 | 99600228MP | 20021017 | 24097 |
0LH68150 | 6400311200 | 20030904 | 44415 |
0LH68150 | 6711021200 | 20030904 | 44415 |
0LH68150 | 99110054MP | 20030904 | 44415 |
0LH68150 | 99210429MP | 20030904 | 44415 |
0LH68150 | 6162065600 | 20031124 | 48366 |
0LH68150 | 99510440MP | 20031124 | 48366 |
0LH68150 | 99600239MP | 20031124 | 48366 |
0LH68150 | 99740535MP | 20031124 | 48366 |
0LH68150 | 99114106MP | 20040121 | 53155 |
0LH68150 | 99110475MP | 20040806 | 67368 |
0LH68150 | 99340069MP | 20040806 | 67368 |
0LH68150 | 99600229MP | 20041203 | 74874 |
0LH68150 | 99639912MP | 20041203 | 74874 |
0LH68150 | 99100121MP | 20050124 | 78658 |
0LH68150 | 99639912MP | 20050124 | 78658 |
0LH68150 | 1151011500 | 20050125 | 78661 |
0LH68150 | 1151011500 | 20050125 | 78661 |
0LH68150 | 1153004800 | 20050125 | 78661 |
0LH68150 | 99210429MP | 20050324 | 84101 |
0LH68150 | 3160003900 | 20050727 | 92087 |
0LH68150 | 99110475MP | 20050727 | 92087 |
0LH68150 | 99510249MP | 20050727 | 92087 |
0LH68150 | 99639912MP | 20050727 | 92087 |
0LH68150 | 6121001200 | 20050811 | 92236 |
0LH68150 | 99600265MP | 20051005 | 93268 |
0LH68150 | 2121005200 | 20051111 | 95961 |
0LH68150 | 5121014100 | 20051121 | 96296 |
0LH68150 | 6577081500 | 20051121 | 96296 |
0LH68150 | 6577091500 | 20051121 | 96296 |
0LH68150 | 2300030800 | 20060505 | 104569 |
0LH68150 | 5135003900 | 20060505 | 104569 |
0LH68150 | 6211208600 | 20060505 | 104569 |
0LH68150 | 6325051200 | 20060505 | 104569 |
0LH68150 | 99110037MP | 20060505 | 104569 |
0LH68150 | 99639912MP | 20060505 | 104569 |
0LH68150 | 6112547900 | 20061128 | 114629 |
0LH68150 | 99600229MP | 20061128 | 114629 |
0LH68150 | 99639912MP | 20061128 | 114629 |
0LH68150 | 99510249MP | 20070108 | 117880 |
0LH68150 | 5172006900 | 20070131 | 119229 |
0LH68150 | 99639912MP | 20070625 | 125672 |
0LH68150 | 5133122900 | 20070921 | 129551 |
0LH68150 | 99639912MP | 20071002 | 129784 |
0LH68150 | 3441002900 | 20080107 | 130255 |
0LH68150 | 1231001200 | 20080116 | 131010 |
0LH68150 | 3441006500 | 20080116 | 131010 |
0LH68150 | 6121001200 | 20080116 | 131010 |
0LH68150 | 6453014800 | 20080116 | 131010 |
0LH68150 | 6321007900 | 20080122 | 131304 |
0LH68150 | 99110475MP | 20080208 | 131919 |
0LH68150 | 99210429MP | 20080208 | 131919 |
I would like to select codes that started with (11) and code_km > 30000 , where there was none of the following codes (99600228MP and 99639912MP) done before the code_Km 30000
data have;
input Name $ Code $10. code_date: YYMMDD8. Code_km;
cards;
0LH68150 0064370100 20011212 5584
0LH68150 6331023700 20021017 24097
0LH68150 6610051500 20021017 24097
0LH68150 99110475MP 20021017 24097
0LH68150 99600228MP 20021017 24097
0LH68150 6400311200 20030904 44415
0LH68150 6711021200 20030904 44415
0LH68150 99110054MP 20030904 44415
0LH68150 99210429MP 20030904 44415
0LH68150 6162065600 20031124 48366
0LH68150 99510440MP 20031124 48366
0LH68150 99600239MP 20031124 48366
0LH68150 99740535MP 20031124 48366
0LH68150 99114106MP 20040121 53155
0LH68150 99110475MP 20040806 67368
0LH68150 99340069MP 20040806 67368
0LH68150 99600229MP 20041203 74874
0LH68150 99639912MP 20041203 74874
0LH68150 99100121MP 20050124 78658
0LH68150 99639912MP 20050124 78658
0LH68150 1151011500 20050125 78661
0LH68150 1151011500 20050125 78661
0LH68150 1153004800 20050125 78661
0LH68150 99210429MP 20050324 84101
0LH68150 3160003900 20050727 92087
0LH68150 99110475MP 20050727 92087
0LH68150 99510249MP 20050727 92087
0LH68150 99639912MP 20050727 92087
0LH68150 6121001200 20050811 92236
0LH68150 99600265MP 20051005 93268
0LH68150 2121005200 20051111 95961
0LH68150 5121014100 20051121 96296
0LH68150 6577081500 20051121 96296
0LH68150 6577091500 20051121 96296
0LH68150 2300030800 20060505 104569
0LH68150 5135003900 20060505 104569
0LH68150 6211208600 20060505 104569
0LH68150 6325051200 20060505 104569
0LH68150 99110037MP 20060505 104569
0LH68150 99639912MP 20060505 104569
0LH68150 6112547900 20061128 114629
0LH68150 99600229MP 20061128 114629
0LH68150 99639912MP 20061128 114629
0LH68150 99510249MP 20070108 117880
0LH68150 5172006900 20070131 119229
0LH68150 99639912MP 20070625 125672
0LH68150 5133122900 20070921 129551
0LH68150 99639912MP 20071002 129784
0LH68150 3441002900 20080107 130255
0LH68150 1231001200 20080116 131010
0LH68150 3441006500 20080116 131010
0LH68150 6121001200 20080116 131010
0LH68150 6453014800 20080116 131010
0LH68150 6321007900 20080122 131304
0LH68150 99110475MP 20080208 131919
0LH68150 99210429MP 20080208 131919
;
data want;
set have;
format pcode $10. ;
retain pcode;
if substr(code,1,2)= "11" and Code_km >30000 and reverse(substr(reverse(pcode),1,2)) ~= "MP" then output;
pcode=code;
run;
hi Mohamed
let me try and explain further:
I have over a million observations with different names.
Let’s try look at this in another way
Each name has many codes, each code has a code_date and code_km.
I want to know all those names where the difference between (11)
code with its code_km has a difference
of 30000 from another code say 994561258.
I want to keep the name only.
Well, you've pretty much written it yourself there. Start by adding in your logic to the dataset, something like the below. However to use the below you need to look at your data a bit more, as in your example there are three codes with 11 as prefix, in this example they all have the same distance, but maybe not always, how will you handle that?
proc sql;
create table INTER as
select A.*,
case when -11 <= (A.CODE_KM - B.CODE_KM) <= 11 then "Y"
else "N" end as RES
from HAVE A
left join (select distinct NAME,CODE_KM from HAVE) B
on A.NAME=B.NAME;
quit;
This will give you the dataset you had before, plus a Y/N column. From that you can proc sort by name nodupkey to get your list (or you can add into the above SQL, I keep it separate to explain).
hi again
where does the 30000 km differnce come in
..
Sorry, I had put 11:
proc sql;
create table INTER as
select A.*,
case when -3000 <= (A.CODE_KM - B.CODE_KM) <= 3000 then "Y"
else "N" end as RES
from HAVE A
left join (select distinct NAME,CODE_KM from HAVE) B
on A.NAME=B.NAME;
quit;
Note, have allowed =/- 3000
hi can you explain to me what the second part of the program is doing .. thanks
Yes, I really should write the code correctly:
proc sql;
create table INTER as
select A.*,
case when -3000 <= (A.CODE_KM - B.CODE_KM) <= 3000 then "Y"
else "N" end as RES
from HAVE A
left join (select distinct NAME,CODE_KM from HAVE where substr(CODE,1,2)="11") B
/* Merge on the KM from record with code prefixed by 11, so 78661 is merge onto all records in the example, so the calculated RES can be done. */
on A.NAME=B.NAME;
quit;
hi
if i want to compare the mileage og ay just 2 other code sto the 11 defect code..say the km differnce between
99110037MP
9999125481
can anyone assist me.
hi
if i want to compare the mileage og ay just 2 other code sto the 11 defect code..say the km differnce between
99110037MP
9999125481
to the 11 code .. adn keep only the 11 code.
can anyone assist me.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.