BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jimbarbour
Meteorite | Level 14

@SASdevAnneMarie,

 

I thought about it a bit more during my lunch time, and I am now able to get the data for the 9th row.  I think it would be important for you to use this latest program because I changed my streaming definition from $1 to $CHAR1.  A definition of $1 can result in some data oddities.  I should have used $CHAR1 all along, but I didn't think about it until later.

 

Here is the latest pre-edit program.  You will want to run this program before you run the import program:

%LET	Cmnt						=	*;
%LET	LRECL						=	65535;
%LET	Limit						=	999999999;

FILENAME	Elig_In		'R:\users\jbarbou3\Marie_Eligibles_c.csv_20200910 (2).csv';

FILENAME	Elig_Out	'R:\users\jbarbou3\Marie_Eligibles_c.csv_20200910 (2)_SAS_Pre_Edit.csv';

DATA	_NULL_;
	RETAIN	Carriage_Return			'0D'x;
	RETAIN	Line_Feed				'0A'x;
	RETAIN	Double_Quote			'"';
	RETAIN	Semi_Colon				';';
	RETAIN	Space					' ';
	RETAIN	In_Quotes				0;
	RETAIN	Record_Cnt				1;

	%IF	%BQUOTE(&Cmnt)				=	%BQUOTE()			%THEN
		%DO;
			IF	_N_					>	&Limit				THEN
				DO;
					PUTLOG	"NOTE-  ";
					PUTLOG	"NOTE:  Stopping "	_N_=;
					STOP;
				END;
		%END;

	INFILE	Elig_In
		RECFM						=	N
		;

	FILE	Elig_Out
		LRECL						=	&LRECL
		RECFM						=	N
		;

	ARRAY	Chars	[5]	$1	_TEMPORARY_;

	INPUT	Char	$CHAR1.;

&Cmnt	IF	_N_						=	1					THEN
			PUTLOG	"NOTE:  Starting Run";

	LINK	Load_Array;
	LINK	Check_Chars;

	******;
	RETURN;
	******;

	**********;
	Load_Array:
	**********;
		Chars[5]					=	Chars[4];
		Chars[4]					=	Chars[3];
		Chars[3]					=	Chars[2];
		Chars[2]					=	Chars[1];
		Chars[1]					=	UPCASE(Char);
	******;
	RETURN;
	******;

	***********;
	Check_Chars:
	***********;
		IF	Chars[5]				=	'V'					AND
			Chars[4]				=	'I'					AND
			Chars[3]				=	'D'					AND
			Chars[2]				=	'E'					AND
			Chars[1]				=	Line_Feed			THEN
			DO;
				&Cmnt	PUTLOG	"NOTE-  *** VIDE and Line feed detected *** "				_N_= Record_Cnt= Chars[5]= Chars[4]= Chars[3]= Chars[2]= Chars[1]= Char=;
				In_Quotes			=	0;
				PUT	Carriage_Return		$1.;
				PUT	Line_Feed			$1.;
				Record_Cnt			+	1;
			END;
		ELSE
		IF	Char					=	Double_Quote		THEN
			DO;
				&Cmnt	PUTLOG	"NOTE-  *** Double quote detected *** "						_N_= Record_Cnt= Char=;
				IF	In_Quotes								THEN
					DO;
						&Cmnt	PUTLOG	"NOTE-  *** In_Quotes set to false *** "			_N_= Record_Cnt= Char=;
						In_Quotes	=	0;
					END;
				ELSE
					DO;
						&Cmnt	PUTLOG	"NOTE-  *** In_Quotes set to true  *** "			_N_= Record_Cnt= Char=;
						In_Quotes	=	1;
					END;
				PUT	Double_Quote		$1.;
			END;
		ELSE
		IF	Char					=	Line_Feed			THEN
			DO;
				&Cmnt	PUTLOG	"NOTE-  *** Stand alone line feed detected *** "			_N_= Record_Cnt= Char=;
				IF	In_Quotes								THEN
					DO;
						&Cmnt	PUTLOG	"NOTE-  ### Chg embedded line feed to space ### "	_N_= Record_Cnt= Char=;
						PUT	Space		$1.;
					END;
				ELSE
					DO;
						&Cmnt	PUTLOG	"NOTE-  %%% Change line feed to semicolon %%% "		_N_= Record_Cnt= Char=;
						PUT	Semi_Colon	$1.;
					END;
			END;
		ELSE
		IF	Char					=	Carriage_Return		THEN
			DO;
				&Cmnt	PUTLOG	"NOTE-  *** Stand alone carriage return detected *** "		_N_= Record_Cnt= Char=;
				IF	In_Quotes								THEN
					DO;
						&Cmnt	PUTLOG	"NOTE-  ### Change embedded CR to space ### "		_N_= Record_Cnt= Char=;
						PUT	Space		$1.;
					END;
				ELSE
					DO;
						&Cmnt	PUTLOG	"NOTE-  @@@ Chg carriage return to semicolon @@@ "	_N_= Record_Cnt= Char=;
						PUT	Semi_Colon	$1.;
					END;
			END;
		ELSE
		IF	Char					=	Semi_Colon			THEN
			DO;
				&Cmnt	PUTLOG	"NOTE-  Stand alone semi colon detected "					_N_= Record_Cnt= Char=;
				IF	In_Quotes								THEN
					DO;
						&Cmnt	PUTLOG	"NOTE-  ^^^ Chg embedded semi colon to space^^^ "	_N_= Record_Cnt= Char=;
						PUT	Space		$1.;
					END;
				ELSE
					DO;
						&Cmnt	PUTLOG	"NOTE-  Retaining semi colon "						_N_= Record_Cnt= Char=;
						PUT	Semi_Colon	$1.;
					END;
			END;
		ELSE
			DO;
				*&Cmnt	PUTLOG	"NOTE-  Other character detected "							_N_= Record_Cnt= Char=;
				PUT	Char				$1.;
			END;
	******;
	RETURN;
	******;
RUN;

After you run the above pre-edit program, then run the below import program.  I changed a couple of things from your original import program, so you will want to use this program.  Among other things, I increased the LRECL to prevent trunction.  I also increased the size of some of the variables like fraisGestionCommentaire to $1024.  These variables were $256 and were truncating a lot of data.  Even now, some of the data is getting truncated, so you may want to increase them to an even larger size.

%LET	LRECL		=	65535;

FILENAME	Elig	'R:\users\jbarbou3\Marie_Eligibles_c.csv_20200910 (2)_SAS_Pre_Edit.csv';

data EUROPERF_IMP1_SAS;
	INFILE	Elig
		LRECL		=	&LRECL
		TERMSTR		=	CRLF
		DLM			=	';'
		FIRSTOBS	=	2
		RECFM		=	V
		MISSOVER
		DSD 
		;

	/*	56 Columns	*/
	informat	CodeIsin	$12.;
	informat	Libelle	$50.;
	informat	nature	$12.;
	informat	categAMF	$32.;
	informat	categEPF	$50.;
	informat	zoneGeo	$40.;
	informat	dureeMinReco $8.;
	informat	affectationResultat	$40.;
	informat	fiscaliteText	$50.;
	informat	deviseComptable	$3.;
	informat	dateCloture	$32.;
	informat	dateOuverture	$32.;
	informat	promoteur	$50.;
	informat	adressePromoteur	$80.;
	informat	steGestion	$70.;
	informat	adresseSteGestion	$80.;
	informat	steAdmDelegue	$80.;
	informat	steFinAdmDelegue	$80.;
	informat	depositaire	$80.;
	informat	adresseDepositaire	$80.;
	informat	commissaireCpte	$50.;
	informat	souscription	$100.;
	informat	rachat	$50.;
	informat	fraisGestion	$70.;
	informat	datePerformance	$12.;
	informat	perf12mois	BEST32.;
	informat	perf36mois	$5.;
	informat	perf60mois	$5.;
	informat	fraisGestionCommentaire	$1024.;
	informat	objectifGestion	$1024.;
	informat	souscripteurs	$1024.;
	informat	orientationPlacement	$1024.;
	informat	dateClotureExercice	$1024.;
	informat	periodeVL	$1024.;
	informat	sitePublication	$1024.;
	informat	conditionSouscripRachat	$1024.;
	informat	dateValidation		$12.;
	informat	minimumRisqueAction	$5.;
	informat	dateAgrementAMF		$12.;
	informat	Tfe	$20.;
	informat	perfAm1	$8.;
	informat	perfAm2	$8.;
	informat	perfAm3	$8.;
	informat	perfAm4	$8.;
	informat	perfAm5	$8.;
	informat	perfAm6	$8.;
	informat	perfAm7	$8.;
	informat	perfAm8	$8.;
	informat	UnitedureeMinReco	$7.;
	informat	perfYTD	$5.;
	informat	volat_12mois	$5.;
	informat	frequence_text	$30.;
	informat	prospectus_url	$50.;
	informat	jour_valorisation $8.;
	informat	risque	$6.;
	informat	VIDE	$4.;

	/*	56 Columns	*/
	format 	CodeIsin	$12.;
	format 	Libelle	$50.;
	format 	nature	$12.;
	format 	categAMF	$32.;
	format 	categEPF	$50.;
	format 	zoneGeo	$40.;
	format 	dureeMinReco $8.;
	format 	affectationResultat	$40.;
	format 	fiscaliteText	$50.;
	format 	deviseComptable	$3.;
	format 	dateCloture	$32.;
	format 	dateOuverture	$32.;
	format 	promoteur	$50.;
	format 	adressePromoteur	$80.;
	format 	steGestion	$70.;
	format 	adresseSteGestion	$80.;
	format 	steAdmDelegue	$80.;
	format 	steFinAdmDelegue	$80.;
	format 	depositaire	$80.;
	format 	adresseDepositaire	$80.;
	format 	commissaireCpte	$50.;
	format 	souscription	$100.;
	format 	rachat	$5.;
	format 	fraisGestion	$70.;
	format 	datePerformance $12.;
	format 	perf12mois	BEST32.;
	format 	perf36mois	$5.;
	format 	perf60mois	$5.;
	format 	fraisGestionCommentaire	$1024.;
	format 	objectifGestion	$1024.;
	format 	souscripteurs	$1024.;
	format 	orientationPlacement	$1024.;
	format 	dateClotureExercice	$1024.;
	format 	periodeVL	$1024.;
	format 	sitePublication	$1024.;
	format  conditionSouscripRachat	$1024.;
	format 	dateValidation		$12.;
	format 	minimumRisqueAction	$5.;
	format 	dateAgrementAMF		$12.;
	format 	Tfe	$20.;
	format 	perfAm1	$8.;
	format 	perfAm2	$8.;
	format 	perfAm3	$8.;
	format 	perfAm4	$8.;
	format 	perfAm5	$8.;
	format 	perfAm6	$8.;
	format 	perfAm7	$8.;
	format 	perfAm8	$8.;
	format 	UnitedureeMinReco	$7.;
	format 	perfYTD	$5.;
	format 	volat_12mois	$5.;
	format 	frequence_text	$30.;
	format 	prospectus_url	$50.;
	format 	jour_valorisation	$8.;
	format 	risque	$6.;
	format	VIDE	$4.;

	/*	56 Columns	*/
	input			
		CodeIsin	$ 	
		Libelle	$	
		nature	$	
		categAMF	$	
		categEPF	$	
		zoneGeo	$	
		dureeMinReco $ 		
		affectationResultat	$	
		fiscaliteText	$	
		deviseComptable	$	
		dateCloture	$	
		dateOuverture	$	
		promoteur	$	
		adressePromoteur	$	
		steGestion	$	
		adresseSteGestion	$	
		steAdmDelegue	$	
		steFinAdmDelegue	$	
		depositaire	$	
		adresseDepositaire	$	
		commissaireCpte	$	
		souscription	$	
		rachat	$	
		fraisGestion	$	
		datePerformance	$
		perf12mois	
		perf36mois	$	
		perf60mois $		
		fraisGestionCommentaire	$	
		objectifGestion	$	
		souscripteurs	$
		orientationPlacement	$	
		dateClotureExercice	$	
		periodeVL	$	
		sitePublication	$	
		conditionSouscripRachat	$	
		dateValidation	$	
		minimumRisqueAction	$	
		dateAgrementAMF	$	
		Tfe	$	
		perfAm1	$	
		perfAm2	$	
		perfAm3	$	
		perfAm4	$	
		perfAm5	$	
		perfAm6	$	
		perfAm7	$	
		perfAm8	$	
		UnitedureeMinReco $		
		perfYTD	$	
		volat_12mois $		
		frequence_text	$	
		prospectus_url	$	
		jour_valorisation	$	
		risque 	$	
		VIDE $;
run;

Here are the results; I think they look good.

Fr_Data_CRLF_And_Delimiters_Fix_2020-09-11_06-58-07.jpg

 

There is still one problem in the data in record number 513.  There is an extra delimiter before "risque".  The extra delimiter is causing a mis-alignment in the data.  You can just delete this one by hand.  

Fr_Data_Mis_Aligned_Data_2020-09-11_06-58-07.jpg

 

Even though the topic is already marked as "solved," I hope you will see this latest program and that it will be useful to you.

 

Regards,

 

Jim

SASdevAnneMarie
Barite | Level 11
Thank you very much Jim!
The solution is perfect. Sorry again to embarrassing you with the bad data:(
Best regards,
Marie
jimbarbour
Meteorite | Level 14

@SASdevAnneMarie,

 

You are welcome, and I am glad I was able to solve it.   No need to apologize for the bad data.  The person who created such a mess should apologize!  I hope my latest program will be useful to you for some time to come.

 

Jim

SASdevAnneMarie
Barite | Level 11

Hello Jim,

 

I tried to replace dlm=';'||'0D0A'x by dlm=';'||'0A'x in my code but it doesn't work. 😞

I made a request for data correction.
Thank you very much for your help!

Marie

jimbarbour
Meteorite | Level 14

@SASdevAnneMarie 

 

Sorry to take so long to reply.  I don't know how, but I missed your question.

 

The delimiter should be a semi-colon only.  Do not add a '0A'x to the delimiter.  The code should be as shown below.  I have included both the pre-edit and he main import programs in one set of SAS code.  This is how they should be run.

 

Jim

%LET	Cmnt						=	*;
%LET	LRECL						=	65535;
%LET	Limit						=	999999999;

FILENAME	Elig_In		'R:\users\jbarbou3\Marie_Eligibles_c.csv_20200910 (2).csv';

FILENAME	Elig_Out	'R:\users\jbarbou3\Marie_Eligibles_c.csv_20200910 (2)_SAS_Pre_Edit.csv';

DATA	_NULL_;
	RETAIN	Carriage_Return			'0D'x;
	RETAIN	Line_Feed				'0A'x;
	RETAIN	Double_Quote			'"';
	RETAIN	Semi_Colon				';';
	RETAIN	Space					' ';
	RETAIN	In_Quotes				0;
	RETAIN	Record_Cnt				1;

	%IF	%BQUOTE(&Cmnt)				=	%BQUOTE()			%THEN
		%DO;
			IF	_N_					>	&Limit				THEN
				DO;
					PUTLOG	"NOTE-  ";
					PUTLOG	"NOTE:  Stopping "	_N_=;
					STOP;
				END;
		%END;

	INFILE	Elig_In
		RECFM						=	N
		;

	FILE	Elig_Out
		LRECL						=	&LRECL
		RECFM						=	N
		;

	ARRAY	Chars	[5]	$1	_TEMPORARY_;

	INPUT	Char	$CHAR1.;

&Cmnt	IF	_N_						=	1					THEN
			PUTLOG	"NOTE:  Starting Run";

	LINK	Load_Array;
	LINK	Check_Chars;

	******;
	RETURN;
	******;

	**********;
	Load_Array:
	**********;
		Chars[5]					=	Chars[4];
		Chars[4]					=	Chars[3];
		Chars[3]					=	Chars[2];
		Chars[2]					=	Chars[1];
		Chars[1]					=	UPCASE(Char);
	******;
	RETURN;
	******;

	***********;
	Check_Chars:
	***********;
		IF	Chars[5]				=	'V'					AND
			Chars[4]				=	'I'					AND
			Chars[3]				=	'D'					AND
			Chars[2]				=	'E'					AND
			Chars[1]				=	Line_Feed			THEN
			DO;
				&Cmnt	PUTLOG	"NOTE-  *** VIDE and Line feed detected *** "				_N_= Record_Cnt= Chars[5]= Chars[4]= Chars[3]= Chars[2]= Chars[1]= Char=;
				In_Quotes			=	0;
				PUT	Carriage_Return		$1.;
				PUT	Line_Feed			$1.;
				Record_Cnt			+	1;
			END;
		ELSE
		IF	Char					=	Double_Quote		THEN
			DO;
				&Cmnt	PUTLOG	"NOTE-  *** Double quote detected *** "						_N_= Record_Cnt= Char=;
				IF	In_Quotes								THEN
					DO;
						&Cmnt	PUTLOG	"NOTE-  *** In_Quotes set to false *** "			_N_= Record_Cnt= Char=;
						In_Quotes	=	0;
					END;
				ELSE
					DO;
						&Cmnt	PUTLOG	"NOTE-  *** In_Quotes set to true  *** "			_N_= Record_Cnt= Char=;
						In_Quotes	=	1;
					END;
				PUT	Double_Quote		$1.;
			END;
		ELSE
		IF	Char					=	Line_Feed			THEN
			DO;
				&Cmnt	PUTLOG	"NOTE-  *** Stand alone line feed detected *** "			_N_= Record_Cnt= Char=;
				IF	In_Quotes								THEN
					DO;
						&Cmnt	PUTLOG	"NOTE-  ### Chg embedded line feed to space ### "	_N_= Record_Cnt= Char=;
						PUT	Space		$1.;
					END;
				ELSE
					DO;
						&Cmnt	PUTLOG	"NOTE-  %%% Change line feed to semicolon %%% "		_N_= Record_Cnt= Char=;
						PUT	Semi_Colon	$1.;
					END;
			END;
		ELSE
		IF	Char					=	Carriage_Return		THEN
			DO;
				&Cmnt	PUTLOG	"NOTE-  *** Stand alone carriage return detected *** "		_N_= Record_Cnt= Char=;
				IF	In_Quotes								THEN
					DO;
						&Cmnt	PUTLOG	"NOTE-  ### Change embedded CR to space ### "		_N_= Record_Cnt= Char=;
						PUT	Space		$1.;
					END;
				ELSE
					DO;
						&Cmnt	PUTLOG	"NOTE-  @@@ Chg carriage return to semicolon @@@ "	_N_= Record_Cnt= Char=;
						PUT	Semi_Colon	$1.;
					END;
			END;
		ELSE
		IF	Char					=	Semi_Colon			THEN
			DO;
				&Cmnt	PUTLOG	"NOTE-  Stand alone semi colon detected "					_N_= Record_Cnt= Char=;
				IF	In_Quotes								THEN
					DO;
						&Cmnt	PUTLOG	"NOTE-  ^^^ Chg embedded semi colon to space^^^ "	_N_= Record_Cnt= Char=;
						PUT	Space		$1.;
					END;
				ELSE
					DO;
						&Cmnt	PUTLOG	"NOTE-  Retaining semi colon "						_N_= Record_Cnt= Char=;
						PUT	Semi_Colon	$1.;
					END;
			END;
		ELSE
			DO;
				*&Cmnt	PUTLOG	"NOTE-  Other character detected "							_N_= Record_Cnt= Char=;
				PUT	Char				$1.;
			END;
	******;
	RETURN;
	******;
RUN;

FILENAME	Elig	'XXX\Marie_Eligibles_c.csv_20200910 (2)_SAS_Pre_Edit.csv';

data EUROPERF_IMP1_SAS;
	INFILE	Elig
		LRECL		=	&LRECL
		TERMSTR		=	CRLF
		DLM			=	';'
		FIRSTOBS	=	2
		RECFM		=	V
		MISSOVER
		DSD 
		;

	/*	56 Columns	*/
	informat	CodeIsin	$12.;
	informat	Libelle	$50.;
	informat	nature	$12.;
	informat	categAMF	$32.;
	informat	categEPF	$50.;
	informat	zoneGeo	$40.;
	informat	dureeMinReco $8.;
	informat	affectationResultat	$40.;
	informat	fiscaliteText	$50.;
	informat	deviseComptable	$3.;
	informat	dateCloture	$32.;
	informat	dateOuverture	$32.;
	informat	promoteur	$50.;
	informat	adressePromoteur	$80.;
	informat	steGestion	$70.;
	informat	adresseSteGestion	$80.;
	informat	steAdmDelegue	$80.;
	informat	steFinAdmDelegue	$80.;
	informat	depositaire	$80.;
	informat	adresseDepositaire	$80.;
	informat	commissaireCpte	$50.;
	informat	souscription	$100.;
	informat	rachat	$50.;
	informat	fraisGestion	$70.;
	informat	datePerformance	$12.;
	informat	perf12mois	BEST32.;
	informat	perf36mois	$5.;
	informat	perf60mois	$5.;
	informat	fraisGestionCommentaire	$1024.;
	informat	objectifGestion	$1024.;
	informat	souscripteurs	$1024.;
	informat	orientationPlacement	$1024.;
	informat	dateClotureExercice	$1024.;
	informat	periodeVL	$1024.;
	informat	sitePublication	$1024.;
	informat	conditionSouscripRachat	$1024.;
	informat	dateValidation		$12.;
	informat	minimumRisqueAction	$5.;
	informat	dateAgrementAMF		$12.;
	informat	Tfe	$20.;
	informat	perfAm1	$8.;
	informat	perfAm2	$8.;
	informat	perfAm3	$8.;
	informat	perfAm4	$8.;
	informat	perfAm5	$8.;
	informat	perfAm6	$8.;
	informat	perfAm7	$8.;
	informat	perfAm8	$8.;
	informat	UnitedureeMinReco	$7.;
	informat	perfYTD	$5.;
	informat	volat_12mois	$5.;
	informat	frequence_text	$30.;
	informat	prospectus_url	$50.;
	informat	jour_valorisation $8.;
	informat	risque	$6.;
	informat	VIDE	$4.;

	/*	56 Columns	*/
	format 	CodeIsin	$12.;
	format 	Libelle	$50.;
	format 	nature	$12.;
	format 	categAMF	$32.;
	format 	categEPF	$50.;
	format 	zoneGeo	$40.;
	format 	dureeMinReco $8.;
	format 	affectationResultat	$40.;
	format 	fiscaliteText	$50.;
	format 	deviseComptable	$3.;
	format 	dateCloture	$32.;
	format 	dateOuverture	$32.;
	format 	promoteur	$50.;
	format 	adressePromoteur	$80.;
	format 	steGestion	$70.;
	format 	adresseSteGestion	$80.;
	format 	steAdmDelegue	$80.;
	format 	steFinAdmDelegue	$80.;
	format 	depositaire	$80.;
	format 	adresseDepositaire	$80.;
	format 	commissaireCpte	$50.;
	format 	souscription	$100.;
	format 	rachat	$5.;
	format 	fraisGestion	$70.;
	format 	datePerformance $12.;
	format 	perf12mois	BEST32.;
	format 	perf36mois	$5.;
	format 	perf60mois	$5.;
	format 	fraisGestionCommentaire	$1024.;
	format 	objectifGestion	$1024.;
	format 	souscripteurs	$1024.;
	format 	orientationPlacement	$1024.;
	format 	dateClotureExercice	$1024.;
	format 	periodeVL	$1024.;
	format 	sitePublication	$1024.;
	format  conditionSouscripRachat	$1024.;
	format 	dateValidation		$12.;
	format 	minimumRisqueAction	$5.;
	format 	dateAgrementAMF		$12.;
	format 	Tfe	$20.;
	format 	perfAm1	$8.;
	format 	perfAm2	$8.;
	format 	perfAm3	$8.;
	format 	perfAm4	$8.;
	format 	perfAm5	$8.;
	format 	perfAm6	$8.;
	format 	perfAm7	$8.;
	format 	perfAm8	$8.;
	format 	UnitedureeMinReco	$7.;
	format 	perfYTD	$5.;
	format 	volat_12mois	$5.;
	format 	frequence_text	$30.;
	format 	prospectus_url	$50.;
	format 	jour_valorisation	$8.;
	format 	risque	$6.;
	format	VIDE	$4.;

	/*	56 Columns	*/
	input			
		CodeIsin	$ 	
		Libelle	$	
		nature	$	
		categAMF	$	
		categEPF	$	
		zoneGeo	$	
		dureeMinReco $ 		
		affectationResultat	$	
		fiscaliteText	$	
		deviseComptable	$	
		dateCloture	$	
		dateOuverture	$	
		promoteur	$	
		adressePromoteur	$	
		steGestion	$	
		adresseSteGestion	$	
		steAdmDelegue	$	
		steFinAdmDelegue	$	
		depositaire	$	
		adresseDepositaire	$	
		commissaireCpte	$	
		souscription	$	
		rachat	$	
		fraisGestion	$	
		datePerformance	$
		perf12mois	
		perf36mois	$	
		perf60mois $		
		fraisGestionCommentaire	$	
		objectifGestion	$	
		souscripteurs	$
		orientationPlacement	$	
		dateClotureExercice	$	
		periodeVL	$	
		sitePublication	$	
		conditionSouscripRachat	$	
		dateValidation	$	
		minimumRisqueAction	$	
		dateAgrementAMF	$	
		Tfe	$	
		perfAm1	$	
		perfAm2	$	
		perfAm3	$	
		perfAm4	$	
		perfAm5	$	
		perfAm6	$	
		perfAm7	$	
		perfAm8	$	
		UnitedureeMinReco $		
		perfYTD	$	
		volat_12mois $		
		frequence_text	$	
		prospectus_url	$	
		jour_valorisation	$	
		risque 	$	
		VIDE $;
run;
SASdevAnneMarie
Barite | Level 11
Thank you very much, it’s clear!
Last question please : I would like to make a test, replace all CR by LF. How I can do that via Notepad ++. I mean I don’t know how to represent CR and LF. Do I need to use “0A”x and “0D”x?
Thank you!
jimbarbour
Meteorite | Level 14

@SASdevAnneMarie,

 

Generally speaking, I don't think you should replace CR with LF.  You are welcome to try it, but I don't think that will work with your data.  At least in a Windows environment, the CR and LF are meant to work together to indicate the end of a record.  The little pre-edit program that I wrote for you should be able to prepare the data for you.  You should not have to do any work with Notepad++, at least not on the data I have seen so far.

 

However, Notepad++ is a good tool, and at some point, the little pre-edit program that I wrote may encounter data that it doesn't know how to handle in which case you may need to do some manual editing before running the pre-edit program.

 

It's very easy to see the CR and LF characters in Notepad++.

 

  1. Click on "View"
  2. Click on "Show Symbol"
  3. Check "Show All Charaters"

Thereafter, you should be able to see a CR for all carriage return characters and a LF for all line feed characters.

Notepad++_CRLF_ScreenShot_2020-09-15_12-00-58.jpg

 

Jim

SASdevAnneMarie
Barite | Level 11
Hello Jim,
Thank you very much!
Ideally I need the file with « ; » for data delimitation and CRLF for observation delimitation?
jimbarbour
Meteorite | Level 14

@SASdevAnneMarie,

 

Yes, your data should have a semi-colon as the data delimiter and a CRLF as the observation delimiter. It is however good practice to enclose text in double quotes, like the following:

"Fixe 0.00 %";"Fixe 0.01 %"

The semi-colon is the delimiter, but Fixe 0.00 % and Fixe 0.01 % are enclosed in double quotes.

 

There should be no LF characters by themselves alone in the data.  The LF character should only be used with a CR; they should always be used together (CRLF); they should always indicate "end of observation."

 

Jim

 

SASdevAnneMarie
Barite | Level 11

Hello Jim,

 

My problem is that I don’t know how to control the futur file.

Normaly, after correction, I’ll have only the « CRLF » in the end of observation.

For making a test, I would like to put in Notepad ++/Find (« Rechercher » : ‘0A’x) : just to see that I don’t have the soul « LF », but it doesn’t work (the example is attached):

 

Maybe you know how to check if there are soul « LF ».

 

Thank you!

jimbarbour
Meteorite | Level 14

@SASdevAnneMarie,

 

For Notepad++, you will want to select Mode étendu.

 

When you want to search for a line feed (LF), use \n.

When you want to search for a carriage return (CR), use \r.

When you want to search for both together, use \r\n.

 

Fr_Data_Notepad++_Extended_Mode_2020-09-16_07-01-49.jpg

 

Jim

SASdevAnneMarie
Barite | Level 11
Super!
Thank you very much!
SASdevAnneMarie
Barite | Level 11
Thank you very much!
I’ll try to do that!
Do I need to do in Notepad++ just:
Find: ('0D0A'x)/Replace: '^^^'... I’m so on?
jimbarbour
Meteorite | Level 14

@SASdevAnneMarie,

 

No, you do not need to use Notepad++.  I wrote you a small Data step that you need to run before your main Data step.  The small Data step will take care of the bad Line Feed characters for you.  

 

Jim

SASdevAnneMarie
Barite | Level 11
Thank you!
I’ll try to run this code!

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 52 replies
  • 3233 views
  • 19 likes
  • 5 in conversation