The weird-looking
addresses above take advantage of several things many people don't know
about the structure of a valid URL. There's a little more to Internet
addressing than commonly meets the eye; there are conventions which
allow for some interesting variations in how an Internet address is
expressed. These tricks are known to the spammers and scammers, and
they're used freely in unsolicited mails. You'll also see them in
ad-related URLs and occasionally on web pages where the writer hopes to
avoid recognition of a linked address for whatever reason. Now, I'm
making these tricks known to you. Read on, and you'll soon be very hard
to fool. (Note: Depending on your browser type and its version, some of
the oddly-formatted URLs on this page may not work. Also if you're on a
LAN and using a proxy [gateway] for Internet access, many of them are
unlikely to work. Also, fear not; this page does not exploit the
"Dotless IP Address" vulnerability of some IE versions.)
HOW IT'S DONE
First take note of the "@" symbol that appears
amid all those numbers. In actual fact, everything between "http://" and
"@" is completely irrelevant! Just about anything can go in there and
it makes no difference whatsoever to the final result. This feature is
actually used for authentication. If a login name and/or password is
required to access a web page, it can be included here and login will be
automatic. But if the page requires no authentication, the
authentication text is in effect ignored by both browser and server.
This presents interesting possibilities for confusing the unsuspecting
user. How about this one: Http://www.playboy.com@3484559912/obscure.htm
If you didn't know better, you might think this page were at playboy.com!
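As a rough illustration (not part of the original page), Python's standard urllib.parse module splits an address of this kind into its login text and its real host. The variable names are only illustrative, and a given browser may treat the address differently.

```python
from urllib.parse import urlsplit

# The deceptive address from the text: everything before "@" is login text.
url = "http://www.playboy.com@3484559912/obscure.htm"
parts = urlsplit(url)

decoy = parts.username      # text before the "@", treated as a login name
real_host = parts.hostname  # text after the "@", where the request really goes

print("decoy text:", decoy)
print("real host: ", real_host)
```

The part a casual reader notices ends up in the login field; the host is the number after the "@".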
By the way, the @ symbol can be represented
by its hex code %40 to further confuse things; this works for the IE
browser, but not for Netscape. All right, so what about that long number
after the "@"? How does 3484559912 get you to www.pc-help.org?
In actual fact, the two are equivalent to one
another. This takes a little explaining so follow me carefully here.
The first thing you need to know (most Net
users know this), is that Internet names translate to numbers called IP
addresses. An IP address is normally seen in "dotted decimal" format.
www.pc-help.org translates to 207.178.42.40.
Numeric IP addresses are generally
unrecognizable to people. That's why we use names for network locations
in the first place.
Merely using an IP address, in its usual
dotted-decimal format, in place of the name is commonly done and can be
quite effective at leaving the human reader in the dark.
But there are other ways to express that same number. The alternate formats are:
* "dword", meaning double word, because it
consists essentially of two binary "words" of 16 bits each (32 bits in
all); it is expressed in decimal (base 10);
* "octal", meaning it's expressed in base 8; and
* "hexadecimal" (hexa = 6 plus deci = 10), meaning it's expressed in base 16.
The dword equivalent of 207.178.42.40 is
3484559912. Its octal and hexadecimal equivalents are also illustrated
below.
Okay, so what about the rest of the URL?
Here's how all that gibberish on the right works:
Individual characters of a URL's path and
filename can be represented by their numbers in hexadecimal form. Each
hex number is preceded by a "%" symbol to identify the following two
numbers/letters as a hexadecimal representation of the character. The
practical use for this is to make it possible to include spaces and
unusual characters in a URL. But it works for all characters and can
render perfectly readable text into a complete hash.
In my example, I have interspersed hex
representations with the real letters of the URL. It simply spells out
"/obscure.htm" in the final analysis:
/ o %62 s %63 ur %65 %2e %68 t %6D
/ o b   s c   ur e   .   h   t m
The letters used in the hex numbers can be either upper or lower case.
The "slashes" in the address cannot be represented in hex; nor can the
IP address be rendered this particular way. But everything else can be.
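To check what a hex-laden path really says, one option is Python's standard urllib.parse.unquote, which turns each %XX code back into its character. This is a minimal sketch; the variable names are assumptions of the example.

```python
from urllib.parse import unquote

# The mixed plain/hex path from the example in the text.
obscured_path = "/o%62s%63ur%65%2e%68t%6D"

# unquote() replaces every %XX with the character that has that hex code;
# upper- and lower-case hex digits are both accepted.
plain_path = unquote(obscured_path)

print(plain_path)
```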
HEXADECIMAL CHARACTER CODES
Hex character codes are simply the hexadecimal
(base 16) numbers for the ASCII character set; that is, the
number-to-letter representations which comprise virtually all computer
text.
For most people, the conversion is probably
best done with a chart. The best ASCII-to-hex chart I have ever seen is
on the website of Jim Price: http://www.jimprice.com/jim-asc.htm. Jim
explains the ASCII character set wonderfully well, and provides a wealth
of handy charts.
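If a chart isn't handy, the same lookup can be sketched in a few lines of Python. The helper name to_hex_codes is hypothetical, and the sketch assumes plain ASCII text.

```python
def to_hex_codes(text):
    """Return text with every character replaced by its %XX hex code."""
    # encode() yields the ASCII number of each character;
    # {:02X} prints that number as two upper-case hex digits.
    return "".join("%{:02X}".format(code) for code in text.encode("ascii"))

encoded = to_hex_codes("obscure")
print(encoded)
```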
MORE ON DOTTED-DECIMAL IPS
Here's another address for this page: http://463.434.298.552/obscure.htm
Normally, the four IP numbers in a standard
dotted-decimal address will all be between 0 and 255. In fact they must
translate to an 8-bit binary number (ones and zeroes), which can
represent a quantity no higher than 255.
But the way this number is handled by some
software often allows for a value higher than 255. The program uses only
the 8 right-hand digits of the binary number, and will drop the rest if
the number is too large.
This means you can add multiples of 256 to
any or all of the 4 segments of an IP address, and it will often still
work. In my tests, it was limited to 3 digits per number; values over
999 didn't work.
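Dropping everything but the 8 right-hand binary digits is the same as taking the remainder after dividing by 256. Here is a small Python sketch of that reduction; fold_octets is a hypothetical helper, and it only mimics the lenient behavior described above rather than any particular program.

```python
def fold_octets(address):
    """Reduce each dotted number to its low 8 bits (remainder mod 256)."""
    return ".".join(str(int(part) % 256) for part in address.split("."))

folded = fold_octets("463.434.298.552")
print(folded)
```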
CONVERTING AN IP ADDRESS TO DWORD FORMAT
Here's a way to do this with very simple math.
Multiply the numbers of the IP address by the
following fixed values (which are powers of 256), then add the results.
The worked figures here are for the address 206.159.40.2:
3456106496 = 206 x 16777216 (256^3)
  10420224 = 159 x    65536 (256^2)
     10240 =  40 x      256 (256^1)
         2 =   2 x        1 (256^0)
__________
3466536962
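The multiply-and-add procedure can be sketched in Python as follows. The function name dotted_to_dword is hypothetical; the address used is this page's, 207.178.42.40.

```python
def dotted_to_dword(address):
    """Convert a dotted-decimal IP address to its dword (32-bit) number."""
    a, b, c, d = (int(part) for part in address.split("."))
    # Each position is worth a power of 256, highest on the left.
    return a * 256**3 + b * 256**2 + c * 256**1 + d * 256**0

dword = dotted_to_dword("207.178.42.40")
print(dword)
```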
Now, there is a further step that can make
this address even more obscure. You can add to this dword number any
multiple of the quantity 4294967296 (256^4), and it will still work.
This is because when the sum is converted to its basic digital form, the
last 8 hexadecimal digits will remain the same. Everything to the left
of those 8 hex digits is discarded by the IP software and therefore
irrelevant.
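The discarding of the extra digits can be imitated in Python with a remainder operation. This is only a sketch of the arithmetic; the multiplier 7 is an arbitrary choice for the example.

```python
DWORD_RANGE = 256**4          # 4294967296, one more than the largest 32-bit value

dword = 3484559912            # dword form of 207.178.42.40
padded = dword + 7 * DWORD_RANGE   # a longer, stranger-looking number

# Keeping only the low 32 bits (the last 8 hex digits) recovers the original.
recovered = padded % DWORD_RANGE

print(padded, "->", recovered)
```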
OCTAL IP ADDRESSES
As if all this weren't enough, an IP address can
also be represented in octal form - base 8. The URL for this page with
its IP address in octal form looks like this:
http://0317.0262.052.050/obscure.htm
Note the leading zeroes. They're necessary to
convey to your browser the fact that this is an octal number. Any
number of leading zeroes can be added to any or all of the numbers in
the address.
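For those who would rather not do the octal conversion by hand, here is a short Python sketch. Both helper names are hypothetical.

```python
def dotted_to_octal(address):
    """Write each dotted-decimal number in base 8 with a leading zero."""
    return ".".join("0{:o}".format(int(part)) for part in address.split("."))

def octal_to_dotted(address):
    """Read each number as base 8; extra leading zeroes are harmless."""
    return ".".join(str(int(part, 8)) for part in address.split("."))

octal_form = dotted_to_octal("207.178.42.40")
round_trip = octal_to_dotted("000317.0262.052.050")

print(octal_form)
print(round_trip)
```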
I'll spare you a detailed description of
octal conversion. For those who can't figure it out, there's a nifty
URLomatic at www.samspade.org that will do it for you.
There is yet another obscure way to express
an IP address. Using the method outlined above, calculate the
hexadecimal number for 207.178.42.40. That number (CFB22A28) can be
expressed as an IP address in this manner:
0xCF.0xB2.0x2A.0x28
The "0x" designates each number as a hex quantity. The dots can be
omitted, and the entire hex number preceded by 0x:
0xCFB22A28
And additional arbitrary hex digits can be added to the left of the
"real" number:
0x9A3F0800CFB22A28
Some browsers (Netscape 3.x and 4.x for instance) won't work with hex
IPs; but for IE users, this page's URL can be:
http://0xCF.0xB2.0x2A.0x28/obscure.htm
or:
http://0xCFB22A28/obscure.htm
or:
http://0x9A3F0800CFB22A28/obscure.htm
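To turn any of these hex forms back into a familiar dotted-decimal address, something like the following Python sketch would do. The function name hex_host_to_dotted is hypothetical, and the sketch assumes the lenient handling described above, in which only the last 8 hex digits count.

```python
import ipaddress

def hex_host_to_dotted(host):
    """Convert a hex IP (dotted or undotted, with 0x prefixes) to dotted decimal."""
    if "." in host:
        # Dotted form: each piece is one number written in hex.
        return ".".join(str(int(part, 16)) for part in host.split("."))
    # Undotted form: keep only the low 32 bits (last 8 hex digits).
    value = int(host, 16) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(value))

for host in ("0xCF.0xB2.0x2A.0x28", "0xCFB22A28", "0x9A3F0800CFB22A28"):
    print(host, "->", hex_host_to_dotted(host))
```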
IN SUM
URLs can be obscured at least three ways:
1) Meaningless or deceptive text can be added after the "http://" and before an "@" symbol.
2) The domain name can be expressed as an IP
address, in dotted-decimal, dword, octal or hexadecimal format; and all
of these formats have variants.
3) Characters in the URL can also be expressed as hexadecimal (base 16) numbers.