Topic: ^M end of line characters - Win to Unix FTP

ABlipInContinuity

^M end of line characters - Win to Unix FTP
« on: 16 July, 2008, 03:01:11 pm »
Hi,

I'm using SmartFTP to transfer files from Win to Unix.

It doesn't matter which transfer method I select (Auto/Binary/ASCII), I'm getting "^M" characters at the end of each line within the files I transfer.

I know this is to do with the different ways line endings are handled in Windows and Unix. It's quite easy to remove them in Vi post-transfer. How do I stop them from appearing in the first place?

I'm just going to try the Windows command line FTP client.

Cheers
Daniel

Re: ^M end of line characters - Win to Unix FTP
« Reply #1 on: 16 July, 2008, 03:08:37 pm »
That's what ASCII mode is for. It should automatically remove them when transferring from Win -> UNIX and add them when going the other way.

Binary mode doesn't perform any conversion at all.

Auto is application specific, so could do anything.
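
To make the difference concrete, here is a rough sketch of what ASCII mode does at each end. It is only an illustration, not the code of any real FTP client or server, and the names to_wire and from_wire are made up for this example. The idea is that the sender turns its local line endings into CRLF for the wire, and the receiver turns CRLF into whatever its own system uses.

```python
CRLF = b"\r\n"

def to_wire(data: bytes, local_eol: bytes) -> bytes:
    """Sender side of an ASCII-mode transfer: local line endings become CRLF."""
    if local_eol == CRLF:
        return data  # Windows text already uses CRLF
    return data.replace(local_eol, CRLF)

def from_wire(data: bytes, local_eol: bytes) -> bytes:
    """Receiver side of an ASCII-mode transfer: CRLF becomes the local line ending."""
    return data.replace(CRLF, local_eol)

windows_file = b"one\r\ntwo\r\n"

# Win -> UNIX in ASCII mode: the ^M (CR) bytes are dropped on arrival.
on_unix = from_wire(to_wire(windows_file, CRLF), b"\n")
print(on_unix)  # b'one\ntwo\n'

# UNIX -> Win in ASCII mode: the CR bytes are put back.
back_on_windows = from_wire(to_wire(on_unix, b"\n"), CRLF)
print(back_on_windows == windows_file)  # True

# Binary mode: the bytes arrive untouched, so the ^M characters survive.
binary_copy = windows_file
```

If the ^M characters still show up after an ASCII-mode transfer, the client probably didn't really switch modes.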
"Yes please" said Squirrel "biscuits are our favourite things."

Hello, I am Bruce

Re: ^M end of line characters - Win to Unix FTP
« Reply #2 on: 16 July, 2008, 03:09:40 pm »
The auto/binary/text option on FTP switches between 7-bit and 8-bit bytes (basic text files only need a 7-bit byte, and this used to provide a time saving in transfer).

The problem with DOS and Windows text files is that they have an extra character (byte) at every new line (the ctrl-M). FTP won't get rid of the extra byte.

You might be able to pipe the file through a filter that removes the extra characters prior to or after transfer.
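
A filter of that kind could look something like the sketch below. The function name crlf_to_lf is invented for this example, and it reads the whole input into memory, so it is only meant for ordinary text files.

```python
import io

def crlf_to_lf(src, dst):
    """Copy binary stream src to dst, turning each CRLF pair into a bare LF."""
    data = src.read()
    dst.write(data.replace(b"\r\n", b"\n"))

# Demonstration on in-memory data standing in for a Windows text file.
windows_text = io.BytesIO(b"first line\r\nsecond line\r\n")
unix_text = io.BytesIO()
crlf_to_lf(windows_text, unix_text)
print(unix_text.getvalue())  # b'first line\nsecond line\n'
```

To use it in a real pipeline, pass sys.stdin.buffer and sys.stdout.buffer as the two streams.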

Re: ^M end of line characters - Win to Unix FTP
« Reply #3 on: 16 July, 2008, 03:24:26 pm »
Quote:
The auto/binary/text option on FTP switches between 7-bit and 8-bit bytes (basic text files only need a 7-bit byte, and this used to provide a time saving in transfer).

Nope. ascii mode is still 8-bit.

Quote:
The problem with DOS and Windows text files is that they have an extra character (byte) at every new line (the ctrl-M). FTP won't get rid of the extra byte.

No, that's exactly what ascii mode is designed to do.

If you don't believe me then create a 256 byte file containing all 256 different byte values and transfer it about:-

Code:
$ od -x data
0000000 0100 0302 0504 0706 0908 0b0a 0d0c 0f0e
0000020 1110 1312 1514 1716 1918 1b1a 1d1c 1f1e
0000040 2120 2322 2524 2726 2928 2b2a 2d2c 2f2e
0000060 3130 3332 3534 3736 3938 3b3a 3d3c 3f3e
0000100 4140 4342 4544 4746 4948 4b4a 4d4c 4f4e
0000120 5150 5352 5554 5756 5958 5b5a 5d5c 5f5e
0000140 6160 6362 6564 6766 6968 6b6a 6d6c 6f6e
0000160 7170 7372 7574 7776 7978 7b7a 7d7c 7f7e
0000200 8180 8382 8584 8786 8988 8b8a 8d8c 8f8e
0000220 9190 9392 9594 9796 9998 9b9a 9d9c 9f9e
0000240 a1a0 a3a2 a5a4 a7a6 a9a8 abaa adac afae
0000260 b1b0 b3b2 b5b4 b7b6 b9b8 bbba bdbc bfbe
0000300 c1c0 c3c2 c5c4 c7c6 c9c8 cbca cdcc cfce
0000320 d1d0 d3d2 d5d4 d7d6 d9d8 dbda dddc dfde
0000340 e1e0 e3e2 e5e4 e7e6 e9e8 ebea edec efee
0000360 f1f0 f3f2 f5f4 f7f6 f9f8 fbfa fdfc fffe
0000400
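
For anyone wanting to repeat the experiment, a file like the one dumped above could be produced with a few lines of Python. This is just one way of doing it; the helper name make_test_file is invented here, and the file is written to a temporary directory rather than next to your real data.

```python
import tempfile
from pathlib import Path

def make_test_file(path: Path) -> None:
    """Write a 256-byte file holding each byte value 0x00..0xff exactly once."""
    path.write_bytes(bytes(range(256)))

with tempfile.TemporaryDirectory() as tmp:
    data_path = Path(tmp) / "data"
    make_test_file(data_path)
    contents = data_path.read_bytes()

print(len(contents))  # 256
print(contents[0x0a], contents[0x0d])  # 10 13 (the LF and CR bytes)
```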

FTPing this from a Windows machine to a Solaris box:-

ftp> ascii
200 Type set to A.
ftp> put data data.ascii
200 PORT command successful.
150 Opening ASCII mode data connection for data.ascii.
226-WARNING! 1 bare linefeeds received in ASCII mode
   File may not have transferred correctly.
226 Transfer complete.

Whereas in binary mode it doesn't complain:-

ftp> bin
200 Type set to I.
ftp> put data data.bin
200 PORT command successful.
150 Opening BINARY mode data connection for data.bin.
226 Transfer complete.
ftp: 256 bytes sent in 0.00Seconds 256000.00Kbytes/sec.

Over on the UNIX machine the two files (data.ascii and data.bin) are exactly the same: the ^M in the file didn't precede a linefeed and therefore didn't need to be removed, although the server did warn about the bare linefeed. All 8 bits are there in the ASCII-mode file too; the top bit hasn't been stripped out.
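
The same result can be checked without an FTP server. The sketch below only imitates what the receiving end is assumed to do in ASCII mode (collapse CRLF pairs to LF); it is not the Solaris server's actual code.

```python
import re

data = bytes(range(256))  # one of each byte value, like the test file

# A "bare" linefeed is an LF that is not preceded by a CR.
bare_lfs = len(re.findall(rb"(?<!\r)\n", data))
print(bare_lfs)  # 1

# A UNIX receiver in ASCII mode collapses CRLF pairs to LF.
# This file contains no CRLF pair (its CR is followed by 0x0e), so nothing changes.
received = data.replace(b"\r\n", b"\n")
print(received == data)  # True

# No byte has lost its top bit either.
print(max(received))  # 255
```

The single bare linefeed matches the count in the server's 226-WARNING message.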

Now if I get the files back from the UNIX machine back onto my Windows box:-

ftp> get data.ascii data.towin_ascii
200 PORT command successful.
150 Opening ASCII mode data connection for data.ascii (256 bytes).
226 Transfer complete.
ftp: 257 bytes received in 0.00Seconds 257000.00Kbytes/sec.
ftp> bin
200 Type set to I.
ftp> get data.ascii data.towin_bin
200 PORT command successful.
150 Opening BINARY mode data connection for data.ascii (256 bytes).
226 Transfer complete.
ftp: 256 bytes received in 0.00Seconds 256000.00Kbytes/sec.

The data.towin_bin is exactly the same as the original file as expected.

You'll notice that in ASCII mode it has grabbed 257 bytes because it's translated the LF (0x0a) into CRLF (0x0d 0x0a):-

Code:
$ od -x data.towin_ascii
0000000 0100 0302 0504 0706 0908 0a0d 0c0b 0e0d
0000020 100f 1211 1413 1615 1817 1a19 1c1b 1e1d
0000040 201f 2221 2423 2625 2827 2a29 2c2b 2e2d
0000060 302f 3231 3433 3635 3837 3a39 3c3b 3e3d
0000100 403f 4241 4443 4645 4847 4a49 4c4b 4e4d
0000120 504f 5251 5453 5655 5857 5a59 5c5b 5e5d
0000140 605f 6261 6463 6665 6867 6a69 6c6b 6e6d
0000160 706f 7271 7473 7675 7877 7a79 7c7b 7e7d
0000200 807f 8281 8483 8685 8887 8a89 8c8b 8e8d
0000220 908f 9291 9493 9695 9897 9a99 9c9b 9e9d
0000240 a09f a2a1 a4a3 a6a5 a8a7 aaa9 acab aead
0000260 b0af b2b1 b4b3 b6b5 b8b7 bab9 bcbb bebd
0000300 c0bf c2c1 c4c3 c6c5 c8c7 cac9 cccb cecd
0000320 d0cf d2d1 d4d3 d6d5 d8d7 dad9 dcdb dedd
0000340 e0df e2e1 e4e3 e6e5 e8e7 eae9 eceb eeed
0000360 f0ef f2f1 f4f3 f6f5 f8f7 faf9 fcfb fefd
0000400 00ff
0000401

0a has been replaced with 0d0a (once you get used to the switched byte-pair output of od -x).
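
If the byte-pair swapping is hard to follow, this sketch reproduces the dump. It assumes od -x was run on a little-endian machine, where each pair of bytes is printed as one 16-bit word with the second byte first. The function name od_x_words is made up for this example.

```python
import struct

def od_x_words(data: bytes) -> list[str]:
    """Format data as little-endian 16-bit hex words, the way od -x shows it."""
    if len(data) % 2:
        data += b"\x00"  # an odd final byte is padded with zero
    words = struct.unpack(f"<{len(data) // 2}H", data)
    return [f"{w:04x}" for w in words]

original = bytes(range(256))
# ASCII-mode transfer back to Windows: each LF becomes CR LF.
to_windows = original.replace(b"\n", b"\r\n")

print(len(to_windows))  # 257
print(" ".join(od_x_words(to_windows[:16])))
# 0100 0302 0504 0706 0908 0a0d 0c0b 0e0d
print(od_x_words(to_windows)[-1])  # 00ff
```

The word 0a0d is the bytes 0d 0a in file order, i.e. CR followed by LF.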

Hello, I am Bruce

Re: ^M end of line characters - Win to Unix FTP
« Reply #4 on: 16 July, 2008, 03:36:39 pm »
Quote:
The auto/binary/text option on FTP switches between 7-bit and 8-bit bytes (basic text files only need a 7-bit byte, and this used to provide a time saving in transfer).

Quote:
Nope. ascii mode is still 8-bit.

You are right.  I am misremembering stuff from about 15 years ago (unix-to-unix transfer of a binary file in ascii mode didn't work, but not for that reason).  Ignore me.

Re: ^M end of line characters - Win to Unix FTP
« Reply #5 on: 16 July, 2008, 03:40:16 pm »
Quote:
You are right. I am misremembering stuff from about 15 years ago (unix-to-unix transfer of a binary file in ascii mode didn't work, but not for that reason). Ignore me.

<old man mode>
uucp, bang paths, shar files. I remember the days...
</omm>

Re: ^M end of line characters - Win to Unix FTP
« Reply #6 on: 16 July, 2008, 03:43:10 pm »
LOL we had this same conversation a week or two ago, probably between me and Greenbank, and I more or less said the same things about misremembering things!

We need an FAQ. ;D
Actually, it is rocket science.
 

Re: ^M end of line characters - Win to Unix FTP
« Reply #7 on: 16 July, 2008, 04:18:57 pm »
I just used to have a macro in Emacs that stripped the dreaded ^M
I think you'll find it's a bit more complicated than that.