BookmarkSubscribeRSS Feed
PA8
Fluorite | Level 6 PA8
Fluorite | Level 6


proc sort data=old.Ex1_Missing_Weeks out=ds_com; by SUBJECT RecordPosition Ref_Date1_ Ref_Date2_ EXSTDT ; where EXSTDT ne ''; run;

proc sort data=new.Ex1_Missing_Weeks out=ds_new; by SUBJECT RecordPosition Ref_Date1_ Ref_Date2_ EXSTDT; run;

data ds_new;
set ds_new;
if RecordPosition=. then RecordPosition=0;
subject=strip(subject);
run;

proc sort data=ds_new; by SUBJECT RecordPosition Ref_Date1_ Ref_Date2_ EXSTDT; run;

data ds_com_;
set ds_com;
length EXSTDT1 $10.;
if Ref_Date0_ ne . then
Ref_Date0_n =put(Ref_Date0_ ,yymmdd10.);
else Ref_Date0_n='';
if Ref_Date1_ ne . then
Ref_Date1_n =put(Ref_Date1_ ,yymmdd10.);
else Ref_Date1_n='';
if Ref_Date2_ ne . then
Ref_Date2_n =put(Ref_Date2_ ,yymmdd10.);
else Ref_Date2_n='';

if EXSTDT not in ('','None') then do;
mm=scan(EXSTDT,1,'/');
dd=scan(EXSTDT,2,'/');
if length(EXSTDT) =8 then
EXSTDT1=strip(substr(EXSTDT,5))|| '-0'||substr(EXSTDT,1,1) || '-0'|| substr(EXSTDT,3,1) ;
else if length(EXSTDT) =9 and length(mm)=1 then
EXSTDT1=strip(substr(EXSTDT,6))|| '-0'||substr(EXSTDT,1,1) || '-' ||substr(EXSTDT,3,2) ;
else if length(EXSTDT) =9 and length(dd)=1 then
EXSTDT1=strip(substr(EXSTDT,6))|| '-' ||substr(EXSTDT,1,2) || '-0'|| substr(EXSTDT,4,1) ;
else if length(EXSTDT)=10 then
EXSTDT1=substr(EXSTDT,7)|| '-' || substr(EXSTDT,1,2) ||'-'|| substr(EXSTDT,4,2) ;
end;
else EXSTDT1=EXSTDT;
subject=strip(subject);
DM_Comments=strip(DM_comments);
if RecordPosition=. then RecordPosition=0;
drop mm dd EXSTDT Ref_Date1_ Ref_Date2_ Ref_Date0_;
run;

proc sort data=ds_com_ out=ds_com1(rename=(EXSTDT1=EXSTDT Ref_Date0_n=Ref_Date0_ Ref_Date1_n=Ref_Date1_ Ref_Date2_n=Ref_Date2_));
by SUBJECT RecordPosition Ref_Date1_n Ref_Date2_n EXSTDT1 DM_Comments; run; *EXSTDT1=EXSTDT;

proc sort data=ds_com1; by SUBJECT RecordPosition Ref_Date1_ Ref_Date2_ EXSTDT; run;


data fin merge.Ex1_Missing_Weeks;
merge ds_new(in=a) ds_com1(in=b);
by SUBJECT RecordPosition Ref_Date1_ Ref_Date2_ EXSTDT;
if a;
run;

 

9 REPLIES 9
Reeza
Super User
Do you want to add some context to this? And maybe the log to illustrate the error? Or is it a logical erorr versus syntax error?
We don't have the data so a wall of code without a clearly formed question isn't something we can answer.

Some good details on how to ask a good question are here:
https://stackoverflow.com/help/how-to-ask
ballardw
Super User

With this code it appears you are doing something, possibly unnatural, with date values.

if Ref_Date0_ ne . then
Ref_Date0_n =put(Ref_Date0_ ,yymmdd10.);
else Ref_Date0_n='';
if Ref_Date1_ ne . then
Ref_Date1_n =put(Ref_Date1_ ,yymmdd10.);
else Ref_Date1_n='';
if Ref_Date2_ ne . then
Ref_Date2_n =put(Ref_Date2_ ,yymmdd10.);
else Ref_Date2_n='';

if EXSTDT not in ('','None') then do;
mm=scan(EXSTDT,1,'/');
dd=scan(EXSTDT,2,'/');
if length(EXSTDT) =8 then
EXSTDT1=strip(substr(EXSTDT,5))|| '-0'||substr(EXSTDT,1,1) || '-0'|| substr(EXSTDT,3,1) ;
else if length(EXSTDT) =9 and length(mm)=1 then
EXSTDT1=strip(substr(EXSTDT,6))|| '-0'||substr(EXSTDT,1,1) || '-' ||substr(EXSTDT,3,2) ;
else if length(EXSTDT) =9 and length(dd)=1 then
EXSTDT1=strip(substr(EXSTDT,6))|| '-' ||substr(EXSTDT,1,2) || '-0'|| substr(EXSTDT,4,1) ;
else if length(EXSTDT)=10 then
EXSTDT1=substr(EXSTDT,7)|| '-' || substr(EXSTDT,1,2) ||'-'|| substr(EXSTDT,4,2) ;
end;
else EXSTDT1=EXSTDT;

99.9% of the time I see something that starts by Putting a date value into a character version and then tearing it apart tells me that someone does not know the functions Year, Month, Day exist to extract numeric values. All that EXSTDT with substr looks like someone does not know that all you need to do to change the appearance of a date value is to use a format.
The following code will write a date to the results window in just a few of the date formats that SAS provides.

data example;
   file print;
   date= '15DEC2020'd;
   put "date9. format" date= date9.;
   put "mmyydd10. format" date= mmddyy10.;
   put "yymon. format" date= yymon.;
   put "yyq. format" date= yyq.;
   put "yymmdd10. format" date= yymmdd10.;
   put "mmddyyd10. format" date= mmddyyd10.;
   put "mmddyyc10. format" date= mmddyyc10.;
   put "mmddyyb10. format" date= mmddyyb10.;
   put "mmddyyp10. format" date= mmddyyp10.;
run;   

If you want to sort, compare or display dates then the way is to use the SAS date numeric value and use the desired format. The Format of the numeric value means nothing when it comes to merging or comparing unless you have malformed dates with non-integer values.

 

And if the exercise was because one set had date in character form it is almost certainly easier to change the character version to numeric and avoid all that string manipulation.

 

Now: Which variable from which data set are you "unable to bring values" while merging?

If both data sets have a variable of the same name that is NOT on the BY statement then that is the intended behavior of Merge. One of the uses of merge is to replace values.

 

In the following code I expect the values for Y from the data set Two to replace the Y values in data set One;

data one;
  input x y z;
datalines;
1 1 1 
2 2 2
3 3 3
;

data two;
  input x y q;
datalines;
1 11 44
2 22 22
3 33 33
;

data example;
   merge one two;
   by x;
run;

So if you are "missing" one of the variables you may need to rename it to avoid this overwrite behavior.

Which is very easy to do:

data example2;
   merge one  two(rename=(y=y2));
   by x;
run;
PA8
Fluorite | Level 6 PA8
Fluorite | Level 6

Thank you so much for taking your time and explaining in detail. The issue is however not obvious to me as there are no Errors/Warnings/Notes written to the Log window.

 

I need bring over values from one particular column called DM_Comments in dataset ds_com1(in=b), as this is a listing which helps carry over the past comments and running with fresh data. So each time DM team just needs to look at the new records with missing DM_Comments and not review the whole output all over again.

Hope that helps understand the rationale for the listing.

 

I am doing the similar merging for several other listings and the values in the column get carried over. I also used Strip() function just to remove any unwanted spaces in the By variables. Also tried with Proc Sql but same output.

 

I have also attached the Log herewith.

 

 

 

ballardw
Super User

I think you may just need to examine your result a bit closer:

From your log:

26551  data fin merge.Ex1_Missing_Weeks;
26552  merge ds_new(in=a) ds_com1(in=b);
26553  by SUBJECT ;*RecordPosition Ref_Date1_ Ref_Date2_ EXSTDT;
26554  if a;
26555  run;

NOTE: There were 16976 observations read from the data set WORK.DS_NEW.
NOTE: There were 16211 observations read from the data set WORK.DS_COM1.

If this is supposed to be a one-to-one match then from the number of observations in the source data sets then 765 observations do not have a match in the DS_Com1 data. If all those appear at the top of the data set you may just think the values didn't come in if you did not scroll down far enough.

Or if there are no actual matches at all then the "if a;" means that none of the Ds_com1 data would be available.

 

Try removing or commenting out the "if a;" and see if your output data set jumps to 16976+16211 =    33187 observations (or some number significantly larger than 16976) that will tell you either no matches or very few.

 

PA8
Fluorite | Level 6 PA8
Fluorite | Level 6

Thanks again !

It is not supposed to be a one to one match since one of the dataset contains output from last month and more records have been entered since then. Here are some details on the datasets -

 

1. Old Dataset from which Comments needs to be carried over - Ds_com1 -

Subject Recordpos  EXDSTXT   DM_COMMENTS   EXSTDT   REF_DATE0_  REF_DATE1_  REF_DATE2_

101         1                15.5            Completed Dose     2018-10-12  2018-10-16    2018-10-16       2018-10-16

101         2                10.5            Incomplete Dose     2018-10-12  2018-10-16    2018-10-16       2018-10-16

 

2. New Dataset with fresh data which does not contain Comments - Ds_new -

Subject Recordpos  EXDSTXT     EXSTDT         REF_DATE0_  REF_DATE1_  REF_DATE2_

101         1                15.5               2018-10-12    2018-10-16    2018-10-16       2018-10-16

101         2                10.5               2018-10-16    2018-10-20    2018-10-20       2018-10-20

101         3                15.5               2018-10-18    2018-10-22    2018-10-22       2018-10-22

 

3. Expected Dataset after merge - Fin/Mer - 

Subject Recordpos  EXDSTXT   DM_COMMENTS   EXSTDT      REF_DATE0_  REF_DATE1_  REF_DATE2_

101         1                15.5            Completed Dose     2018-10-12   2018-10-16    2018-10-16       2018-10-16

101         2                10.5            Incomplete Dose     2018-10-16   2018-10-20    2018-10-20       2018-10-20

101         3                15.5                                            2018-10-18   2018-10-22    2018-10-22       2018-10-22

 

So, for the old matching records DM_comments column should have data carried over while for the new records it should be blank.

 

ballardw
Super User

Data.

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.

 

Make subsets of the final two data sets that you merge with ONE record each that should match.

Use the link above to generate a data step of each of those two one record data set.

Copy that code and paste it into a text box opened with the </>. This is critical. The message boxes on this forum will reformat text and often does so in a manner that the generated data steps will not run.

Then we can test the data.

 

Note: showing example data from either views, proc print output or similar can be influenced by things like formats so that you think the values are the same but the underlying values are not actually equal.

PA8
Fluorite | Level 6 PA8
Fluorite | Level 6

Thanks once again for your help! I figured the issue was with another program used to convert CSV into a sas dataset. Because of which some hidden spaces were getting saved in the column values used for merging. I recreated the dataset and it solved the issue.

Shmuel
Garnet | Level 18

Missing a column can be either because of dropping it unintentionally or because of renaming it.

You did not mention which column you miss. Could it be that you searched with mistyping the variable name ?

 

Check the DROP and RENAME statements in your code.

PA8
Fluorite | Level 6 PA8
Fluorite | Level 6

Hello,

The column is present in the merged dataset but with missing values or No values. However DM team has entered data in it so it should be present with values/data. Also, this column is only present in one dataset so overwriting is also not the possibility.

Thank you !

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 782 views
  • 0 likes
  • 4 in conversation