Greetings all,
I have the following data set stored as a .txt file. How can I use only one DATA step to produce an output that only contains one observation for each variable? My ultimate goal is to have the output as the attached picture. I know I am currently missing a few other variables, but how can I create those variables from the existing ones? This is what I have so far.
data testing;
infile datalines firstobs=2;
input name $ 1-15 latest_sales_date:date11. person_total_sales 28-32;
format date date11.;
datalines;
1234567890123456789012345678901234567890
David Wong 1/Aug/2018 13200
Brian Leung 15/Sep/2018 23450
10/Sep/2018 33000
Mary Chan 3/Jul/2018 45600
20/Aug/2018 37800
1/Sep/2018 21500
30/Aug/2018 42000
John Tam 12/Sep/2018 35000
run;
given your sample data @blowsnow__ this should work.
data testing;
infile datalines firstobs=2;
input name $ 1-15 latest_sales_date date11. person_total_sales 28-32;
format latest_sales_date date11.;
/*1234567890123456789012345678901234567890*/
datalines;
David Wong 1/Aug/2018 13200
Brian Leung 15/Sep/2018 23450
10/Sep/2018 33000
Mary Chan 3/Jul/2018 45600
20/Aug/2018 37800
1/Sep/2018 21500
30/Aug/2018 42000
John Tam 12/Sep/2018 35000
run;
data temp;
set testing;
retain holder;
if not missing(name) then holder=name;
run;
proc sort data=temp out=temp;
by holder latest_sales_date;
run;
data want(drop=holder hdate person_sales_total);
retain days person_sales_total 0 hdate ;
set temp;
by holder latest_sales_date;
running_total + person_total_sales;
Person_sales_total + person_total_sales;
if first.holder then hdate = latest_sales_date;
if last.holder then days = (latest_sales_date - hdate);
name=holder;
person_total_sales = person_sales_total;
if last.holder then
do;
output;
person_total_sales = 0;
end;
run;
Hi @blowsnow__ is your expectation one datastep or one pass of the dataset?
Plz clarify as within a datastep , we can have multiple passes
that changes what I have already posted. It doesn't meet the requirements since the requirements were not fully stated.
You just need to read the name value into a dummy column and then save it into the real Name variable when it is not empty. Use RETAIN the carry the old values of NAME forward.
Here is sketch of what you need to do.
input dummy $ 1-15 ...... ;
name=coalescec(dummy,name);
retain name;
But even after running this program, I still have multiple observations for each person. How can my output shows only one name; one Latest_Sales_Date number that is the largest among others; and one Total_Sales sum of all the associated sales amount; for each respective salesperson?
data Sales_Total ;
infile datalines firstobs=2 ;
length Dummy Name $15 Latest_Sales_Date Total_Sales 8;
input dummy $15. Latest_Sales_Date date11. Total_Sales ;
retain name;
name=coalescec(dummy,name);
drop dummy;
datalines;
1234567890123456789012345678901234567890
David Wong 1/Aug/2018 13200
Brian Leung 15/Sep/2018 23450
10/Sep/2018 33000
Mary Chan 3/Jul/2018 45600
20/Aug/2018 37800
1/Sep/2018 21500
30/Aug/2018 42000
John Tam 12/Sep/2018 35000
run;
Greetings all,
I am trying to read a data set into the following table.
When I try to run my program, however, the Name column contains more than one observation for each person. What can I do to make the output contains only one name for each person? And the same goes for the date column, how can I only select the most recent date to show in my output table? Here is what I have so far.
data Sales_Total ;
infile datalines firstobs=2 ;
length Dummy Name $15 Latest_Sales_Date Total_Sales 8;
input dummy $15. Latest_Sales_Date date11. Total_Sales ;
retain name;
name=coalescec(dummy,name);
drop dummy;
datalines;
1234567890123456789012345678901234567890
David Wong 1/Aug/2018 13200
Brian Leung 15/Sep/2018 23450
10/Sep/2018 33000
Mary Chan 3/Jul/2018 45600
20/Aug/2018 37800
1/Sep/2018 21500
30/Aug/2018 42000
John Tam 12/Sep/2018 35000
run;
Why did you start a new post for the same question. refer to your last post and follow through with your issue that is the same topic same question in the same post please.
@blowsnow__ prior post same question
Note you also don't have all of the original requirements listed for the same question listed here.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.