A Directory Traversal Attack is an attack that allows an attacker to traverse, or move through, one or more forbidden directories to gain access to restricted files. The attack is possible due to improper validation or configuration by either the programmer or the server itself.
In the context of web security, Directory Traversal Attacks come in two flavors: attacks against the web server and attacks against the application code.
Web Server Level Directory Traversal Attack
Directory Traversal Attacks targeted at web servers are not very common these days, because most web server vendors have recognized the potential of these attacks and have provided patches for their servers. But since not everybody runs the latest web server software, and some do not apply patches promptly, it is still possible to find web servers on the internet that remain vulnerable to such attacks.
To illustrate, consider the following simple example:
http://www.somewebsite.com/../password.txt
Most web servers today would block such an attempt, but before most web servers became immune, the attack below was successful:
http://www.somewebsite.com/%2e%2e%2fpassword.txt
Ideally a web server must not serve documents outside the webroot folder, but in the above case the web server fails to block the escaped representation of the ../ sequence. ../ tells the server to step one level up, ../../ tells the server to step two levels up, and so on. If you unescape the %2e%2e%2f sequence you get ../. The web server failed to account for escaped characters, and as a result the attack succeeded. There are many ways to encode traversal sequences. Listed below are some:
Normal URL Encoding
%2e = .
%2f = /
%5c = \
%25 = %
As you can see, to URL encode a character you write the % character followed by two hexadecimal digits, and this is the encoded representation of the character.
%2e%2e%2f = ../
%2e%2e%5c = ..\
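The encodings above can be reproduced with Python's `urllib.parse.unquote`, which performs one round of percent-decoding. This is a small sketch; `percent_encode` is a helper written here purely for illustration:

```python
from urllib.parse import unquote

def percent_encode(s):
    # Write each character as "%" followed by its byte value in hex.
    return "".join("%{:02x}".format(b) for b in s.encode("ascii"))

print(percent_encode("../"))   # %2e%2e%2f
print(unquote("%2e%2e%2f"))    # ../
print(unquote("%2e%2e%5c"))    # ..\
```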
Double Encoding
As you saw above, when a web server encounters the % character in a URL, it knows an encoded character follows, and it restores the original character by reading the two hexadecimal digits after the % sign. Double encoding involves encoding the % sign itself to fool the server. Example:
%2f = /
%252f = %2f (double encoding)
%25252f = %252f (triple encoding)
Each time you want to increase the level of encoding, you encode the % sign itself.
%252e = . (Double URL encoding)
%252f = / (Double URL encoding)
%255c = \ (Double URL encoding)
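The layers above can be peeled one at a time with Python's `urllib.parse.unquote`, which performs exactly one round of percent-decoding (a minimal sketch):

```python
from urllib.parse import unquote

double = "%252f"    # encodes "%2f", which in turn encodes "/"
triple = "%25252f"

print(unquote(double))            # %2f   (one decode peels one layer)
print(unquote(unquote(double)))   # /
print(unquote(triple))            # %252f
```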
Now, what is the advantage of double encoding? Double-encoding attacks work when a URL passes through two layers of decoding: one layer that performs the security check and another that does not. It is essential that both layers understand URL encoding, otherwise the attack will not work.
Example:
http://www.somesite.com/%252e%252e%252fpasswords.txt
The web server decodes the above URL one time and gets the below URL:
http://www.somesite.com/%2e%2e%2fpasswords.txt
The web server checks the URL for the ../ sequence. If the sequence exists, it rejects the request. But in the above URL the ../ sequence does not exist; only its encoded form does. The system wrongly trusts the URL and sends it to another subsystem for processing. The subsystem, of course, decodes the URL one more time and gets this:
http://www.somesite.com/../passwords.txt
The attack works!
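The two-layer failure just described can be sketched in Python. The handler below is hypothetical; it decodes once before the check, and a second subsystem decodes again before use:

```python
from urllib.parse import unquote

def vulnerable_handler(url_path):
    # Layer 1 (front end): decode once, then run the security check.
    once = unquote(url_path)
    if "../" in once:
        return "rejected"
    # Layer 2 (back end): decodes again before touching the filesystem.
    final = unquote(once)
    return "serving " + final

print(vulnerable_handler("%2e%2e%2fpasswords.txt"))        # rejected
print(vulnerable_handler("%252e%252e%252fpasswords.txt"))  # serving ../passwords.txt
```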
Exhaustive Decoding
The solution to double encoding is to perform the decoding more than once, repeating it as many times as required until no encoded characters remain. Example:
Received URL
http://www.somesite.com/%25252525252f
First Level Decoding:
http://www.somesite.com/%252525252f
Second Level Decoding:
http://www.somesite.com/%2525252f
Third Level Decoding:
http://www.somesite.com/%25252f
Fourth Level Decoding:
http://www.somesite.com/%252f
Fifth Level Decoding:
http://www.somesite.com/%2f
Sixth Level Decoding:
http://www.somesite.com//
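A minimal sketch of exhaustive decoding in Python: keep decoding until the string stops changing. The round cap is a defensive assumption added here to guard against pathological input, not part of any standard:

```python
from urllib.parse import unquote

def fully_decode(s, max_rounds=10):
    # Decode repeatedly until a fixed point is reached.
    for _ in range(max_rounds):
        decoded = unquote(s)
        if decoded == s:
            return s
        s = decoded
    return s

print(fully_decode("%25252525252f"))   # /  (six layers peeled)
```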
Unicode Encoding
You saw above how normal and double encoding work. There is one more way to encode a URL: Unicode. For a Unicode encoding attack to work, two conditions are required: first, the web server must be configured to accept Unicode URLs; second, the web server must have an overlong-UTF-8 decoding bug.
Consider the below URL:
http://www.somesite.com/..%c0%afpassword.txt
Notice the %c0%af code at the end. This is a malformed UTF-8 representation of the / character. In UTF-8, characters can be represented in one or more bytes, so a character that can be represented in one byte can also be represented in two bytes. According to the UTF-8 specification, every character must be represented in its shortest form only, which means that if a character can be represented in either one or two bytes, the two-byte representation must be regarded as improper and illegal.
In normal conditions, the / character is UTF-8 encoded with just one byte: 00101111, or 2F in hexadecimal. But it can also be encoded using two bytes: 11000000 10101111, or C0 AF in hexadecimal. We call the two-byte representation an overlong representation because it is longer than required. The overlong representation is malformed and illegal. Any system receiving an overlong representation must reject it, disregard it, or normalize the character. Normalizing the character means transforming it into its shortest form: the one-byte representation.
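A minimal check of this rule with Python's strict UTF-8 decoder, which rejects the overlong two-byte form of / (C0 AF) while accepting the one-byte form (2F):

```python
# Overlong two-byte encoding of "/": 0xC0 0xAF.
overlong = b"\xc0\xaf"
try:
    overlong.decode("utf-8")
    print("accepted (broken decoder)")
except UnicodeDecodeError:
    print("rejected as malformed")   # the spec-compliant outcome

print(b"\x2f".decode("utf-8"))       # /  (the shortest, legal form)
```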
An earlier version of IIS did not follow the specification. IIS tried to block ../ sequences that lead to directories above the webroot by looking for ../ sequences in the URL. When IIS received the %c0%af sequence, it compared it to the one-byte representation of the / character. They did not match, so IIS assumed that ..%c0%af was not equal to ..%2f, considered the URL safe, and allowed the request to pass. Later in processing, %c0%af was decoded back to /, and the attack succeeded. The mistake made by the web server is that the URL was not fully decoded before passing through the filter: the filter ran first on the malformed URL, and only afterwards was the URL decoded. The URL should be decoded first and then passed to the filter. At the very least, the filter should have normalized the overlong %c0%af to its shortest form before doing any comparisons.
Possible UTF-8 Encodings:
%c1%1c
%c0%af
%c1%9c
Application Level Directory Traversal Attack
Everything mentioned above about web server directory traversal applies equally to scripts that accept paths as parameters. To illustrate, consider the example below:
http://www.somesite.com/?page=home.php
As you can see, the application will try to load or include a page based on a query string parameter. If the application does not perform proper input validation, the URL below will disclose the password file on a Unix system:
http://www.somesite.com/?page=../../../../etc/passwd
If the web developer tries to filter the page parameter by looking for the ../ sequence, all the encoding methods discussed above can be used here as well.
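Such a naive filter can be sketched in Python. The `naive_filter` function below is hypothetical; it shows why matching on a literal ../ misses the encoded forms entirely:

```python
from urllib.parse import unquote

def naive_filter(page_param):
    # Developer's check: reject a literal "../".
    # This misses every encoded form discussed above.
    return "../" not in page_param

payload = "%2e%2e%2f" * 4 + "etc/passwd"        # encoded ../../../../etc/passwd
print(naive_filter("../../../../etc/passwd"))   # False (blocked)
print(naive_filter(payload))                    # True  (slips through)
print(unquote(payload))                         # ../../../../etc/passwd
```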
Protecting Against Directory Traversal Attacks
To protect against Directory Traversal Attacks, normalize all file request parameters to their fully decoded form before applying filter rules; this may require multiple rounds of decoding. For web server level attacks there is little you can do other than making sure the server is well patched and up to date.
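The application-level defense can be sketched as follows: decode exhaustively, then canonicalize the path and verify it still lies under the webroot. This is a minimal sketch assuming a POSIX filesystem; `WEBROOT` and `safe_resolve` are illustrative names, not part of any framework:

```python
import os
from urllib.parse import unquote

WEBROOT = "/var/www/html"   # hypothetical webroot for this sketch

def safe_resolve(page_param, max_rounds=10):
    # Step 1: decode repeatedly until no encoded characters remain.
    for _ in range(max_rounds):
        decoded = unquote(page_param)
        if decoded == page_param:
            break
        page_param = decoded
    # Step 2: canonicalize, then verify the result stays under the webroot.
    full = os.path.normpath(os.path.join(WEBROOT, page_param))
    if os.path.commonpath([WEBROOT, full]) != WEBROOT:
        return None   # traversal attempt: refuse to serve
    return full

print(safe_resolve("home.php"))                       # /var/www/html/home.php
print(safe_resolve("%252e%252e%252fetc%252fpasswd"))  # None
```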