00950: File uploads fail with filenames containing special characters (umlauts)

Summary: File uploads fail with filenames containing special characters (umlauts)
Created: 2007-07-14 16:01
Status: Open
Category: Bug
Assigned:
Priority: 3
Version: 2.1.27
OS: 4.4.7

Description: When uploading files, the upload fails if the original filename contains special characters and the file is not renamed after uploading.

The PmWiki installation on which this bug was found is not administered by me, and I do not have server access.

Server: Apache/1.3.33 (Darwin) mod_jk/1.2.6 DAV/1.0.3 mod_ssl/2.8.24 OpenSSL/0.9.7l PHP/4.4.7

Test location: http://www.design.udk-berlin.de/PmWiki/PmWiki

Error message:

Warning: move_uploaded_file(uploads/PmWiki.PmWiki/üdk_lögö.gif) [function.move-uploaded-file]: failed to open stream: Invalid argument in /Library/WebServer/Documents/designwiki/scripts/upload.php on line 198

Warning: move_uploaded_file() [function.move-uploaded-file]: Unable to move '/var/tmp/phpDHksyH' to 'uploads/PmWiki.PmWiki/üdk_lögö.gif' in /Library/WebServer/Documents/designwiki/scripts/upload.php on line 198
PmWiki can't process your request

?cannot move uploaded file to uploads/PmWiki.PmWiki/üdk_lögö.gif

We are sorry for any inconvenience.
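
A possible workaround until this is fixed, as a minimal sketch only: reduce the uploaded name to plain ASCII before it is written to disk. This is not what PmWiki's upload.php does. The helper name ascii_upload_name and the set of allowed characters are assumptions made for illustration, and Python is used here only to show the idea.

```python
import re
import unicodedata


def ascii_upload_name(name):
    """Return an ASCII-only version of an upload filename (hypothetical helper)."""
    # Decompose accented letters into base letter + combining mark (ü -> u + U+0308).
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop everything without an ASCII representation, including the combining marks.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace any remaining character outside a conservative whitelist with "_".
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", ascii_only)
    # Avoid returning an empty or dot-only name.
    return cleaned.strip(".") or "upload"


print(ascii_upload_name("üdk_lögö.gif"))  # udk_logo.gif
```

Letters that have no decomposition (for example ß) are simply dropped by this sketch, so a real fix would probably need a transliteration table or proper handling of the filesystem's encoding instead.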

Found some "literature" about the issue of file uploading with UTF-8 in HTTP headers. I think it might help if you take a closer look.

http://stackoverflow.com/questions/186840/interesting-utf-8-yahoo-file-download-headers

http://www.ietf.org/rfc/rfc2183.txt

snip:

The correct answer to this is in RFC 2231: Asterisks ("*") are reused to provide the indicator that language and character set information is present and encoding is being used. A single quote ("'") is used to delimit the character set and language information at the beginning of the parameter value. Percent signs ("%") are used as the encoding flag, which agrees with RFC 2047. Specifically, an asterisk at the end of a parameter name acts as an indicator that character set and language information may appear at the beginning of the parameter value. A single quote is used to separate the character set, language, and actual value information in the parameter value string, and a percent sign is used to flag octets encoded in hexadecimal. For example:

Content-Type: application/x-stuff;
   title*=us-ascii'en-us'This%20is%20%2A%2A%2Afun%2A%2A%2A
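
To make the quoted rule concrete, here is a minimal sketch of splitting and decoding such an extended parameter value, and of building one for the filename from this report. The function names decode_ext_value and encode_ext_value are hypothetical, the sketch ignores RFC 2231 continuations (title*0*, title*1*, ...), and it says nothing about what PHP or PmWiki actually do with the header.

```python
from urllib.parse import quote, unquote


def decode_ext_value(value):
    """Split an extended value charset'language'percent-encoded-text and decode it."""
    # The first two single quotes delimit charset and language; the rest is the text.
    charset, language, encoded = value.split("'", 2)
    # Percent signs flag octets in hexadecimal; decode them using the declared charset.
    text = unquote(encoded, encoding=charset or "us-ascii")
    return charset, language, text


def encode_ext_value(text, charset="utf-8", language=""):
    """Build an extended value for a parameter whose name ends in an asterisk."""
    # Encode to bytes first so every non-ASCII octet is percent-encoded.
    return f"{charset}'{language}'{quote(text.encode(charset), safe='')}"


print(decode_ext_value("us-ascii'en-us'This%20is%20%2A%2A%2Afun%2A%2A%2A"))
# ('us-ascii', 'en-us', 'This is ***fun***')
print(encode_ext_value("üdk_lögö.gif"))
# utf-8''%C3%BCdk_l%C3%B6g%C3%B6.gif
```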

CarlosAB December 10, 2008, at 01:04 PM