
Technik News

The blog about IT, mobile communications & Internet

Discussion of the uuencode/uudecode standard

May 28, 2006 by Harald Puhl

This file is a short discussion of the uuencode/uudecode 'standard': how they work, pitfalls for them, and variants on the theme.

Here's how UUENCODE and UUDECODE do their thing: The basic idea is to convert binary data to plain ASCII data, so it can be mailed safely over standard mail systems. You start with three 8-bit bytes (24 bits) and convert them to four 6-bit values (also 24 bits). Printable ASCII text has values between 32 and 126 (all of the 'control' characters are between 0 and 31, and 127 is DEL), so you need to add 32 to every value as an offset. A 6-bit value is at most 63, so after adding 32 the result is at most 95, still well under 128.

The encoding process goes like this: you take the high-order 6 bits from the first input byte (Ibyte1 for short), shift them down by two places, and they become the 6 bits of the first output byte (Obyte1)…you have to remember to mask off the top two bits, which is done by doing a logical AND with 63, which is the lower six bits all *on* and the top two *off*. Then you add 32 to force the result to be at least 32. Now Obyte2 is made from the lower 2 bits of Ibyte1 and the top 4 bits of Ibyte2. Obyte3 is the lower 4 bits of Ibyte2 and the top 2 bits of Ibyte3. Obyte4 is the lower 6 bits of Ibyte3. Again, mask off the top two bits of each output byte and add 32.
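The bit shuffling described above can be sketched in a few lines of Python. This is only an illustration; the function name `encode_cell` is made up here and does not come from any particular uuencode source.

```python
def encode_cell(b1: int, b2: int, b3: int) -> str:
    """Turn three 8-bit input bytes into four printable characters."""
    six_bit_values = [
        (b1 >> 2) & 63,                # high-order 6 bits of Ibyte1
        ((b1 << 4) | (b2 >> 4)) & 63,  # low 2 bits of Ibyte1 + top 4 bits of Ibyte2
        ((b2 << 2) | (b3 >> 6)) & 63,  # low 4 bits of Ibyte2 + top 2 bits of Ibyte3
        b3 & 63,                       # low 6 bits of Ibyte3
    ]
    # Adding 32 moves every value into the printable range 32..95.
    return "".join(chr(v + 32) for v in six_bit_values)


print(encode_cell(*b"Cat"))  # prints 0V%T
```

Note that three zero bytes come out as four spaces, which matters for the trailing-space problem discussed further down.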

The first byte of each encoded output line tells you how many bytes will be output when decoded. This count byte is also encoded; the 'M' you see at the beginning of almost all lines is ASCII character 77…subtracting 32 gets you 45. This 45 refers to the count of DECODED bytes, which come in 3-byte cells, so this means that there are 15 cells on a line that begins with 'M', or 4 * 15 = 60 encoded bytes on that line. Remember that you can't have less than a whole cell, even though you may need only 1 or 2 bytes of the 3-byte decoded cell.

Now to discuss a few common variations on the uuencode 'standard.' One of the earliest things that people noticed was that some mailers like to strip off any trailing spaces on lines of mail text. Seems like a good idea, no? You're saving storage space by trimming off useless data, right? Well…unfortunately, trailing spaces are perfectly valid in uuencoded data. I have seen several interesting ways around this problem. The first is to replace the space character (ASCII value = 0x20, or 32 decimal) with the ` character (it's called a grave, pronounced "grahhhve"), which has an ASCII value of 0x60 (96 decimal). Under the standard decoding procedure, described above, the grave decodes the same way as a space:

for a space: (0x20 - 0x20) & 0x3F = (0x00) & 0x3F = 0x00
for a grave: (0x60 - 0x20) & 0x3F = (0x40) & 0x3F = 0x00

or in binary format:
space = 00100000
grave = 01100000

space - 32 = 00000000
grave - 32 = 01000000
(grave - 32) & 63 = 01000000 & 00111111 = 0

So you see that both decode to zero. The difference between them, of course, is that the grave is not seen by the mailer as a space, so it does not get truncated!
The other way around this truncation problem is to place some non-space character at the end of every line…then the trailing spaces don’t trail any more, so they’re not trimmed! It really doesn’t matter which character gets put there, so long as it is *not* a space. A normal decoder will never see this character, because it stops when the correct number of bytes is encountered, as described in the earlier discussion. Some decoders do choke on this, but they’re pretty rare.
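To make the count byte and the two truncation workarounds concrete, here is a rough Python sketch of a line encoder and decoder. The names `encode_line`, `decode_line`, `use_grave` and `guard` are invented for this example, and re-padding a truncated line with spaces in the decoder is an extra assumption of the sketch, not something every decoder does.

```python
def encode_line(chunk: bytes, use_grave: bool = True, guard: str = "") -> str:
    """Encode up to 45 bytes as one line: count character, then 4-character cells."""
    if len(chunk) > 45:
        raise ValueError("a line carries at most 45 decoded bytes")
    padded = chunk + b"\0" * (-len(chunk) % 3)  # always emit whole 3-byte cells
    values = [len(chunk)]                       # the count comes first
    for i in range(0, len(padded), 3):
        b1, b2, b3 = padded[i:i + 3]
        values += [b1 >> 2, ((b1 << 4) | (b2 >> 4)) & 63,
                   ((b2 << 2) | (b3 >> 6)) & 63, b3 & 63]
    text = "".join(chr(v + 32) for v in values)
    if use_grave:
        text = text.replace(" ", "`")  # workaround 1: never emit a space
    return text + guard                # workaround 2: non-space guard character


def decode_line(line: str) -> bytes:
    """Decode one line, using the count character to know where to stop."""
    count = (ord(line[0]) - 32) & 63
    n_cells = (count + 2) // 3                         # whole cells needed
    body = line[1:1 + 4 * n_cells].ljust(4 * n_cells)  # assumption: re-pad stripped spaces
    out = bytearray()
    for i in range(0, len(body), 4):
        v = [(ord(ch) - 32) & 63 for ch in body[i:i + 4]]
        out += bytes([(v[0] << 2 | v[1] >> 4) & 255,
                      (v[1] << 4 | v[2] >> 2) & 255,
                      (v[2] << 6 | v[3]) & 255])
    return bytes(out[:count])  # drop padding; the guard character is never read


print(encode_line(b"Cat", guard="a"))  # prints #0V%Ta
print(decode_line("#0V%Ta"))           # prints b'Cat'
```

Because the decoder only reads as many cells as the count character calls for, the trailing 'a' is simply ignored.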

Something I said earlier is not strictly true; the space and the grave do not *always* decode to zero; if you have a decoder that forms a look-up table for decoding, and neglects to include both the space and the grave in the table, then you’re hosed.
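A table-driven decoder can sidestep that pitfall by registering both characters explicitly. The following is a minimal sketch; `STANDARD` and `build_decode_table` are names chosen for this illustration only.

```python
# Standard alphabet: value v is written as chr(v + 32), i.e. space through underscore.
STANDARD = "".join(chr(v + 32) for v in range(64))


def build_decode_table(alphabet: str = STANDARD) -> dict[str, int]:
    """Map each encoded character to its 6-bit value."""
    table = {ch: value for value, ch in enumerate(alphabet)}
    # Accept both spellings of zero; a missing one would raise KeyError later.
    table.setdefault(" ", 0)
    table.setdefault("`", 0)
    return table


table = build_decode_table()
print(table[" "], table["`"], table["M"])  # prints 0 0 45
```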

Speaking of look-up tables, there's another variant of uuencode/uudecode called xxencode/xxdecode. Xxencode was invented to avoid problems in a conversion routine between the ASCII world and the EBCDIC world (used primarily by IBM mainframes). It seems that a bug in the conversion routine incorrectly maps some ASCII characters onto other characters in EBCDIC. In a fit of right-thinking, Phil Howard realized that uuencode/uudecode were just a fancy form of a look-up table, and set about finding a character set that does *not* get munged by the conversion routine. He then modified some uuencode source to handle the new character set, and *presto* the problem was solved. The rest of the code remains basically unchanged, but all encoding/decoding is done via the look-up table instead of via the "get a value between 0 and 63 and then add 32" method that uuencode uses. You can always recognize xxencoded files because the lines start with 'h' instead of the uuencode 'M'.
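As a sketch of the look-up-table idea, the encoder below takes the alphabet as a parameter. The xxencode alphabet shown is the one commonly documented for that format; the function name `encode_line_with` is made up for this example.

```python
XX_ALPHABET = "+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
UU_ALPHABET = "".join(chr(v + 32) for v in range(64))  # same as "add 32"


def encode_line_with(alphabet: str, chunk: bytes) -> str:
    """Encode up to 45 bytes as one line, looking each 6-bit value up in alphabet."""
    if len(chunk) > 45:
        raise ValueError("a line carries at most 45 decoded bytes")
    padded = chunk + b"\0" * (-len(chunk) % 3)
    values = [len(chunk)]  # the count value is encoded through the table too
    for i in range(0, len(padded), 3):
        b1, b2, b3 = padded[i:i + 3]
        values += [b1 >> 2, ((b1 << 4) | (b2 >> 4)) & 63,
                   ((b2 << 2) | (b3 >> 6)) & 63, b3 & 63]
    return "".join(alphabet[v] for v in values)


full_line = bytes(range(45))
print(encode_line_with(UU_ALPHABET, full_line)[0])  # prints M
print(encode_line_with(XX_ALPHABET, full_line)[0])  # prints h
print(encode_line_with(XX_ALPHABET, b"Cat"))        # prints 1Eq3o
```

The count of 45 lands on index 45 of whichever alphabet is in use, which is why full lines start with 'M' in one format and 'h' in the other.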

There's yet another variant of uudecode that I've seen. Once people realized that this was all just a look-up table problem, it wasn't long before encoders started appearing that included the entire table before the data, like this:

table
`!"#$%&'()*+,-./0123456789:;<=>?
@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]~_
begin 666 ROGER.GIF
M1TE&.#=A@`)>`9,``````*JJ`*H`J@"JJO]5557_5:JJJ@``JE55_U5550"Ja
M`*H``%7___]5____5?___RP`````@`)>`0,$_A#(2:N]..O-N_]@*(YD:9YHa
MJJYLZ[YP+,]T;=]XKN]\[__`H'!(+!J/R*1RR6RV#-"H=$JM6J_8K';+[7J_a
MX+!X3"Z;S~BT>LUN%][PN'Q.K]OO~+Q~S~_7H00&#H$$#@8$#X6'@(N!AX.`a

Then your decoder has an absolute reference for which characters are where in the table! Again, if the standard character set is used, this shouldn't present any problems, because most uudecoders skip over everything before the 'begin' line before they start decoding.

Note that in the example above, the encoder used *both* methods of avoiding the trailing-space-truncation problem, by using grave instead of space, and by placing the 'a' at the end of every line. Kind of redundant, but I suppose it's better than nothing. Note also that this example would *not* decode under a standard decoder that knew nothing about 'table' statements; the tilde (~) specified as the second-to-last character in the table is *not* the same as the caret (^) that should be there! This example is actual data that someone posted to the net, which drove everybody nuts.
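Here is a small sketch of how a table-aware decoder might read such a header, and why the tilde trips up a decoder that ignores it. `read_table` is a hypothetical helper, and it assumes the table always occupies exactly the two lines after 'table'.

```python
def read_table(lines: list[str]) -> dict[str, int]:
    """Build a character-to-value map from the two lines that follow 'table'."""
    start = lines.index("table") + 1
    alphabet = "".join(lines[start:start + 2])
    if len(alphabet) != 64:
        raise ValueError("expected 64 table characters")
    return {ch: value for value, ch in enumerate(alphabet)}


header = [
    "table",
    "`!\"#$%&'()*+,-./0123456789:;<=>?",
    "@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]~_",
]
table = read_table(header)
print(table["~"])            # prints 62, the value the encoder meant
print((ord("~") - 32) & 63)  # prints 30, what a table-unaware decoder computes
print("^" in table)          # prints False
```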

Now I've probably got you wondering about that 'begin' line in the example. This line tells the decoder where to start decoding. It also tells the decoder which mode to use for opening the output file (for UNIX systems), and what to call the output file. Most encoders will write 644 as the mode, which gives read/write privileges to the 'owner', but only read access to the 'group' and to 'others.' Note that in the example, the mode is 666, so read/write privileges are given to *everybody.* Dangerous!
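The mode on the 'begin' line is an octal UNIX permission value. A quick way to see what a given mode means is to parse the line and let Python's `stat.filemode` spell it out; `parse_begin` is a name made up for this sketch.

```python
import stat


def parse_begin(line: str) -> tuple[int, str]:
    """Split 'begin <mode> <name>' into an integer mode and a file name."""
    keyword, mode_text, name = line.split(" ", 2)
    if keyword != "begin":          # case-sensitive on purpose
        raise ValueError("not a begin line")
    return int(mode_text, 8), name  # the mode is written in octal


mode, name = parse_begin("begin 666 ROGER.GIF")
print(name, stat.filemode(stat.S_IFREG | mode))  # prints ROGER.GIF -rw-rw-rw-
print(stat.filemode(stat.S_IFREG | 0o644))       # prints -rw-r--r--
```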

There is also a corresponding 'end' line at the end of the data. Without one, decoders would not know where data stops and other trailer info begins.

We now come to the last topic: multi-part files. Many mailers cannot handle files larger than some maximum value (some are as low as 32 kbytes!), so a common practice is to cut the uuencoded output into smaller chunks which can be mailed. At the other end, the chunks are re-combined and then decoded. The problem here is that many people use newsreaders/posters that automatically append a signature to every post, which means that the resulting concatenated file will have bad data in it! This is why you must trim off the headers & trailers by hand. Unfortunately, there is no standard yet for a uuencoder/decoder which recognizes multi-part files. Recently I've seen some people put "cut here" lines in their files like this:

BEGIN----------------CUT HERE-----------------
…
END------------------CUT HERE-----------------

so that simple programs can be written which recognize these markers. Again, normal decoders will have no problem with this, because the check for the 'begin' marker is case-sensitive; 'BEGIN' is different from 'begin.'
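One of those "simple programs" might look roughly like the sketch below. It assumes every part wraps its payload in a BEGIN/END pair of "CUT HERE" lines as shown above; the function names and the sample data are invented for illustration.

```python
def strip_part(lines: list[str]) -> list[str]:
    """Keep only the lines between the BEGIN and END 'CUT HERE' markers."""
    kept, inside = [], False
    for line in lines:
        if "CUT HERE" in line and line.startswith(("BEGIN", "END")):
            inside = line.startswith("BEGIN")  # BEGIN opens, END closes
            continue
        if inside:
            kept.append(line)
    return kept


def join_parts(parts: list[list[str]]) -> list[str]:
    """Concatenate the payloads of all parts, in the order given."""
    return [line for part in parts for line in strip_part(part)]


part1 = ["Subject: pic 1/2", "BEGIN----CUT HERE----", "begin 644 cat.txt",
         "END------CUT HERE----", "-- ", "my signature"]
part2 = ["Subject: pic 2/2", "BEGIN----CUT HERE----", "#0V%T", "`", "end",
         "END------CUT HERE----", "-- ", "my signature"]
print(join_parts([part1, part2]))
# prints ['begin 644 cat.txt', '#0V%T', '`', 'end']
```

The mail headers and signatures fall outside the markers and are dropped, so the joined result can go straight into an ordinary decoder.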

That’s about it for uuencode and uudecode.

Free File Hosting – Bandwidth

May 2, 2006 by Franz Hieber

Bandwidth is a term that many people get confused about. Essentially, the bandwidth shows how much information has been downloaded from the server (a computer where the files are stored). Bandwidth is shown in MB and sometimes even in GB (for large websites and files that are being downloaded often). For full websites, bandwidth can go pretty quickly since every single file and image must be downloaded for every single page that is visited by every individual user. Each image and file on a page is generally small, but having them downloaded by many people throughout the day can build up very quickly.

Bandwidth is purchased with a web hosting package. Bandwidth isn't free; it costs money to be allotted a certain amount of bandwidth each month. Once the bandwidth limit is reached, the files and images hosted on the server for the website can no longer be downloaded. Essentially, they are inaccessible either until more bandwidth is purchased or until the next billing cycle (since bandwidth is purchased for a specified time period, generally a month).

File hosting websites tend to use up massive amounts of bandwidth. Bandwidth is not only used when people visit the website but also is used every single time a person downloads a file or image that has been uploaded to the file hosting website. For instance, when a person uploads a picture and then goes to show other people, bandwidth is used up every single time another person views that image. That means if you put the link to that image on another page or forum, the file hosting web site is having its bandwidth used up every single time that page is loaded since the image is being downloaded along with the rest of the page.

Since bandwidth isn't free, file hosting websites have to place limits on the number of times an uploaded file or image can be downloaded and also on the size of the file or image uploaded. Items that are very large require more bandwidth each and every time they are downloaded. Also, if files could be downloaded an unlimited number of times, people would abuse this and post them all over the place in forums, and this could lead to serious bandwidth issues. File hosting websites allot as much as they can and try to be very flexible with the issue, but bandwidth isn't free.
