BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SixUnder
Obsidian | Level 7

proc import datafile="c:/data.csv" out=datain replace dbms=csv;
  getnames=no;
  datarow=4;
  guessingrows=40;
run;

data need (drop=var2 var3);
  set datain (rename=(var1=geo_id));
  if substr(var2,1,89)='Estimate; Female: - Worked full-time, year-round in the past 12 months: - With income: -';
  if substr(reverse(strip(var2)),1,7) eq 'erom ro' then salary=100000;
  else if substr(reverse(strip(var2)),1,7) eq 'ssol ro' then salary=2499;
  else salary=input(substr(var2,index(var2,'to ')+4),comma8.);
  count=input(var3,8.);
run;

proc report data = datfile.dat NOWINDOWS HEADLINE;
    title 'Auglaize County sorted Data';
    column Wages Full time Part time Population;
run;

LIBNAME datafile "c file location";

proc report data = datafile.dat NOWINDOWS;
    title 'Auglaize County sorted Data';
run;

PROC SORT DATA=auto OUT=auto2 ;
  BY foreign ;
RUN ;
 
PROC PRINT DATA=auto2 ;
RUN ;

@SixUnder wrote:

I am going to try and create the data step, the input and possibly the variables and go from there. I will come back here and post once i get that far. Whos knows, you are all much more advanced than I am.


close as i have been so far..

mkeintz
PROC Star

I think your program looks pretty straightforward and clean.

 

One comment - on the use of the REVERSE function, which you are using to get substrings on the right-hand side of some text. You can take advantage of the LENGTH function instead, which will make their intention more apparent:

 

I.e. instead of

  if=substr(reverse(strip(var2)),1,7) eq 'erom ro' then salary=100000;

  else if substr(reverse(strip(var2)),1,7) eq 'ssol ro' then salary=2499;

  else salary=input(substr(var2,index(var2,'to ')+4),comma8.);

 

you can use

  if substr(salary,length(salary)-6)='or more' then salary=100000;

  else if substr(salary,length(salary)-6)='or less' then salary=2499;

  else salary=input(substr(var2,index(var2,'to ')+4),comma8.);

 

 

And the third line can be changed to:

  else salary=input(scan(vars,3,' '),comma8.);

which uses the SCAN function to retrieve the 3rd word of VAR2. No  need to find the letters "to" and then add 4.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
art297
Opal | Level 21

Yes, you're quite close, but your code has some likely errors. e.g.,

proc report data = datfile.dat NOWINDOWS HEADLINE;

You will be specifying the SAS dataset set that you created. Thus it should be more like:

proc report data = need NOWINDOWS HEADLINE;

 Same thing for your other proc report. And, in your proc sort, where did data 'auto' come from?

 

Finally, you mentioned that your specs called for you to build a report based on a sample of 20 records. That can be accomplished with a datastep like:

 

data subset;
 SampleSize = 20;
 Left = numberObs;
 do i = 1 to numberObs;
    set need point = i nobs = numberObs;
    if uniform(0) le SampleSize/Left then
    do;  /* select the observations */
       SampleSize + (-1); /* decrement needed by 1 */
       output;
    end; /* select the observations */
    Left +(-1); /* decrement remaining obs by 1 */
 end;
 stop;
run;

HTH,

Art, CEO, AnalystFinder.com

 

SixUnder
Obsidian | Level 7

What do you think of this? Any advise would be appreciated. Thanks. So sad on my part, i still have 3 more of these to do in the next monthSmiley Frustrated

 

data subset;
 SampleSize = 20;
 Left = numberObs;
 do i = 1 to numberObs;
    set need point = i nobs = numberObs;
    if uniform(0) le SampleSize/Left then
    do;  /* select the observations */
       SampleSize + (-1); /* decrement needed by 1 */
       output;
    end; /* select the observations */
    Left +(-1); /* decrement remaining obs by 1 */
 end;
 stop;
run;



 

*Identify SAS Library folder?
LIBNAME wk3 '<path>'?
*Import xls file to an SAS Dataset?
PROC IMPORT DATAFILE='<path>/<file>.xls' DBMS= XLS OUT=wk3.MedPop1 REPLACE?
RUN?
/*Create a dataset based off the original xls, but limit the data to the Female/ Full Time Population */
Data MedFem?
SET MedPop1 ?
Where Sex = 'F' AND Status ='FT'?
RUN?
/*Creates a raw 'txt' file to hold the data */
DATA _NULL_?
SET MedFem?
FILE '<path>/<file>.txt' PRINT?
PUT SEX $ POPULATION Salary_Low Salary_High?
RUN?
/* Sorts the data by Salary_High in descending order (highest to lowest salary */
PROC SORT data = MedFem?
BY DESCENDING Salary_High?
RUN?
/*Creates a raw 'txt' file to hold the data */
DATA _NULL_?
SET MedFem?
FILE '<path>/<file>.txt' PRINT?
PUT SEX $ POPULATION Salary_Low Salary_High?
RUN?
/* Sorts the data from Highest to lowest population #'s per salary */
PROC SORT data = MedFem?
BY DESCENDING Population?
RUN?
/*Creates a raw 'txt' file to hold the data */
DATA _NULL_?
SET MedFem?
FILE '<path>/<file>.txt' PRINT?
PUT SEX $ POPULATION Salary_Low Salary_High?
RUN?
mkeintz
PROC Star

 A couple of notes on the program as you have constructed it:

 

  1. You could put the "set ... point="  INSIDE the inner do group.  No need to read the data if the record is not going to be kept.
  2. You don't always need the outer loop to go from 1 all the way to numberOBS.  Once samplesize reaches zero you can stop the loop. So you could try  "do i=1 to numberobs while(sampsize>0);"
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
SixUnder
Obsidian | Level 7
Data Calculation;
infile '.csv' DSD Missover firstobs=2;
Input Gender $ Wages_Low Wages_High Estimated Max_Earnings;

Proc Print Data=Calculation;
run;


then a program for sort 2


Data Calculation;
infile '.csv' DSD Missover firstobs=2;
Input Gender $ Wages_Low Wages_High Estimated Max_Earnings;

Proc Sort Data=Calculation Out=Sort2;
By Descending Max_Earnings;
Run;

and sort 3...


Data Calculation;
infile '.csv' DSD Missover firstobs=2;
Input Gender $ Salary_Low Salary_High Estimate Max_Earnings;

Proc Sort Data=Calculation Out=Sort3;
Key Estimated / descending;
Run;


Would that make 3 tables, showing the requested information?


If i make my excel file like this...


Gender- Salary Low End- Salary High End- Esitmates- Maximum Earnings

female            1                    1                        1                       1

female           5                    7                         6                        8


etc. down to the 20 requested


Will that work?
SixUnder
Obsidian | Level 7

Boom!, my work did it! Now only 3 more this month, haha!

Reeza
Super User

Why did you import the data 3 times?

SixUnder
Obsidian | Level 7

It is actually 3 different programs, i just didnt separate them in the code. My apologies, on to the next problem!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 23 replies
  • 3045 views
  • 1 like
  • 6 in conversation