BookmarkSubscribeRSS Feed
Kenj11
Calcite | Level 5

I am trying to correct the master dataset using the Correct dataset. The correct dataset was produced after the double-entry. The second column is the Correct dataset indicates the name of the variable that needs to be corrected. The problem about the code below is the missing. Some of the entries that need to be corrected is the missing. If missing is the correct entry then the master dataset should be updated as such. However, when I transpose the Correct dataset, additional empty cells are created, which SAS can't distinguish the different between the correct missing vs missing entries resulted from the transpose. I need a code that only updates the variables listed in the Correct dataset. Thank you in advance.

data master;

  input ID $ var1 $ var2 $;

  datalines;

  123 Yes Red

  124 No Blue

  125 Yes Yellow;


  data correct ;

  input ID $ variable $ correct $;

  datalines;

  123 var1 No

  123 var2 Orange

  124 var2 Red

  125 var1 99;

proc sort data=Correct;by ID;run;

proc transpose data=correct out=cwide (drop=_name_);

  by ID;

  var correct;

  id variable;

run;

proc contents data =cwide;run ;


Data Final;

  update master cwide;

  by ID;

run;

proc print data=final; title "Corrected";run;

10 REPLIES 10
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10

Did you execute the SAS code piece and analyze/compare the outcome?  The resulting WORK.FINAL dataset is correct, with the replaced column/variable values from the WORK.CORRECT file as would be expected.  With an UPDATE request, SAS does not manipulate any columns for which there is a blank / missing value identified in the UPDATE statement (second argument / file).

Scott Barry

SBBWorks, Inc.

Kenj11
Calcite | Level 5

Right. This is a  "correct missing vs missing entries" question. The var1 for 125 should be empty in the Final dataset but it is not because after transposing Correct dataset, SAS creates the missing field for  Var1 for 124 and Var2 for 125. When I update the Master with the transposed dataset, SAS can't differentiate the 'correct' missing (=Var1 for 125) and other missing entries created by SAS (Var1 for 124 and Var2 for 125).

The Final dataset should show that Var1 for 125 is empty. I hope this clear up some confusion. Sorry about that.

Steelers_In_DC
Barite | Level 11

I'd add a step to identify missing.  Not sure that I would have come up with the same solution from the start but after looking at the transpose that makes sense to keep.  Although I enjoyed the smiley faces that didn't really answer my previous question.  Below I am loading the file as CSV, so the blank comes through and is clearly missing due to the comma.

This will work, not sure if it suits your purposes.

data master;

infile datalines dsd;

  input ID $ var1 $ var2 $;

  datalines;

123,Yes,Red

124,No,Blue

125,Yes,Yellow

;

data correct;

infile datalines dsd;

  input ID $ variable $ correct $;

  datalines;

123,var1,No

123,var2,Orange

124,var2,Red

125,var1,

;

data correct_II;

set correct;

if missing(correct) then correct = 'missing';

run;

proc sort data=Correct;by ID;run;

proc transpose data=correct_II out=cwide (drop=_name_);

  by ID;

  var correct;

  id variable;

run;

data final;

update master cwide;

by id;

if var1 = 'missing' then call missing(var1);

if var2 = 'missing' then call missing(var2);

run;

Ksharp
Super User

What do you mean by "correct missing vs missing entries" ?

Do you mean if any one of var1 and var2 in CORRECT is missing , the updated should be stopped ?

Code: Program

data master;
  input ID $ var1 $ var2 $;
  datalines;
  123 Yes Red
  124 No Blue
  125 Yes Yellow
  ;

  data correct ;
  input ID $ variable $ correct $;
  datalines;
  123 var1 No
  123 var2 Orange
  124 var2 Red
  125 var1 99
  ;

proc sort data=Correct;by ID;run;
proc transpose data=correct out=cwide (drop=_name_);
  by ID;
  var correct;
  id variable;
run;

Data Final;
  merge master cwide(rename=(var1-var2=_var1-_var2));
  by ID;
  if cmiss(of _var1-_var2)=0 then do;var1=_var1;var2=_var2;end;
  drop _:;
run;

Xia Keshan

Kenj11
Calcite | Level 5

"correct missing vs missing entries". Sorry for the confusion.

Steelers_In_DC
Barite | Level 11

If something comes through in the correct dataset as missing how is it shown,  How do you know something should be 'correctly' missing?

Kenj11
Calcite | Level 5

Right. When you transpose the Correct dataset it will get:

**I modified the original Correct dataset -- For 125, the var1 should be empty.

The problem is that when I transpose the Correct dataset, transposed dataset will automatically create the missing field in var1 for 124 and var 2 for 125 (marked with :smileycry:) -- var1 (marked with :smileygrin: ) should be empty for 125 as it is "correctly' missing. However when I update using this transposed Correct dataset, SAS won't update the Var1 to missing as SAS can't tell the difference between :smileygrin: and :smileycry:.

ID          Var1      Var2

123     No         Orange

124     :smileycry:         Red

125      :smileygrin:              :smileycry:

I hope this helps to clear up things.


Kenj11
Calcite | Level 5

*I modified the original Correct dataset -- For 125, the var1 should be empty.

The problem is that when I transpose the Correct dataset, transposed dataset will automatically create the missing field in var1 for 124 and var 2 for 125 (marked with :smileycry:) -- var1 (marked with :smileygrin: ) should be empty for 125. However I am only interested to update ones from the Correct dataset --meaning the Final dataset should show that var1 for 125 is empty.

ID          Var1      Var2

123     No         Orange

124     :smileycry:         Red

125      :smileygrin:              :smileycry:

data master;

  input ID $ var1 $ var2 $;

  datalines;

  123 Yes Red

  124 No Blue

  125 Yes Yellow;


  data correct ;

  input ID $ variable $ correct $;

  datalines;

  123 var1 No

  123 var2 Orange

  124 var2 Red

  125 var1  .  ; ** This should be empty. (Character variable --  I didn't know what to put other than .)

proc sort data=Correct;by ID;run;

proc transpose data=correct out=cwide (drop=_name_);

  by ID;

  var correct;

  id variable;

run;

proc contents data =cwide;run ;


Data Final;

  update master cwide;

  by ID;

run;

sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10

Clearly UPDATE processing doesn't work with such desired behavior, so you will want to consider assigning some "placeholder" unique-string data-value in WORK.CORRECT that would represent "empty" (which would not appear in your data normally), then in the final DATA step to create WORK.FINAL, adjust the data-value(s) back to blank/missing after the BY statement.

Scott Barry

SBBWorks, Inc.

Tom
Super User Tom
Super User

If you want your transactions to be able to convert the value to missing then you need to set the value to ._ in the transaction dataset.

data master;

  input ID $ var1 $ var2 $;

datalines;

  123 Yes Red

  124 No Blue

  125 Yes Yellow

;;;;

data trans ;

  infile cards truncover ;

  if 0 then set master;

  input id= var1= var2= ;

datalines;

id=123 var1= No

id=123 var2= Orange

id=124 var2= Red

id=125 var1= _

;;;;

data Final;

  update master trans;

  by ID;

run;

proc print;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1252 views
  • 0 likes
  • 5 in conversation