Wget and long filenames [Vulnerability]

A pen-testing course I took recently made me realize that security matters in software design far more than I had appreciated. Fifteen years as a professional programmer, plus undergraduate and graduate degrees in computer science, taught me only the basics of security. It took getting into the weeds and actually learning exploits to appreciate how painful security flaws can be. In some cases the flaw is introduced on purpose: a feature is added to solve a problem and ends up creating a security risk. I guess in those cases the feature was judged more important than the potential risk.

In this post I want to highlight one flaw that I recently came across while working on a practice virtual machine. In no way am I claiming discovery of the flaw; on the contrary, the box creator deliberately baked it into the machine for others to find. I am writing about it because I couldn’t find anything online that discusses this particular gotcha.

History

Wget had a couple of serious vulnerabilities identified recently. One of them allowed an arbitrary filename change when the resource being downloaded was redirected to a different scheme. In practice, if an HTTP or HTTPS resource returned a redirect to an FTP resource, the file was saved under the FTP resource's name [1]. This has since been fixed and is not an issue for Wget 1.18+. That said, there is a slightly different, intended feature that causes a similar issue in modern Wget.

The Exploit

Many filesystems limit the length of a filename, so Wget may be asked to save a file whose name exceeds that limit, which would crash the process. To work around this, users were advised to use the -O parameter when downloading a single file. However, that only covered the single-file use case and did not help users who were recursively downloading data from a website. A bug was reported [2] and resolved by a code change [3]. The code essentially checks a compile-time configuration value to determine the maximum filename length allowed. If the filename is longer, the code truncates it. On many Linux systems, this length is 240 characters.
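
To make the effect of that truncation concrete, here is a minimal Python sketch. It is a simplified model under stated assumptions, not Wget's actual code: the 240-character limit is simply the figure quoted above, the function names truncate_filename and craft_name are made up for illustration, and the real implementation may reserve extra room or compute the limit differently.

```python
# Simplified model of the truncation described above; NOT Wget's actual code.
MAX_NAME_LEN = 240  # assumed limit, taken from the figure quoted in this post


def truncate_filename(name: str, max_len: int = MAX_NAME_LEN) -> str:
    """Keep only the first max_len characters of a filename (hypothetical helper)."""
    return name if len(name) <= max_len else name[:max_len]


def craft_name(real_ext: str, decoy_ext: str, max_len: int = MAX_NAME_LEN) -> str:
    """Build a name that ends in decoy_ext but truncates to end in real_ext."""
    # Pad so that real_ext finishes exactly at the truncation boundary.
    padding = "A" * (max_len - len(real_ext))
    return padding + real_ext + decoy_ext


requested = craft_name(".php", ".gif")
saved = truncate_filename(requested)

# What a naive extension check on the URL would see:
print("requested ends with .gif:", requested.endswith(".gif"))
# What would end up on disk in this simplified model:
print("saved ends with .php:", saved.endswith(".php"))
print("lengths:", len(requested), "->", len(saved))
```

The point of the sketch is only that a check performed on the requested name and the name that is finally written can disagree once the name is long enough to be cut.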

The following is an example of how Wget can be made to save a file with a different extension than the one present in the URL. This can be used against websites that call Wget to download resources from the internet but are not aware of this limit and therefore do not handle it before passing the request to Wget.