Opened 11 years ago

Last modified 11 years ago

#955 new defect

mplayer does not accept filenames outside the local codepage on Windows OS

Reported by: sylvain@… Owned by: reimar
Priority: normal Component: core
Version: HEAD Severity: normal
Keywords: Cc: mplayer@…, ulion2002@…, bertrand@…, rvm@…
Blocked By: Blocking:
Reproduced by developer: Analyzed by developer:

Description

How to reproduce:

  • Use Windows XP (Linux with UTF8 is NOT affected)
  • Rename a file to include special characters outside your local codepage (for example Japanese/Hebrew on a West European/US system). You can also put those characters in the path (they need not be in the filename itself).
  • Try to play the file

Result: You should get a file not found error.

The files do play in other media players. Tested only with audio files (ogg and mp3).

The mplayer binaries came from here:
http://www.paehl.com/open_source/?Convert_Tools:MPLAYER_MENCODER

Change History (19)

comment:1 Changed 11 years ago by compn

use dir /x to get the 8.3 filename then use mplayer fileba~1.mp3
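
For a front end that launches mplayer, the `dir /x` lookup can also be done programmatically. What follows is only a minimal sketch, assuming the front end already holds the path as UTF-16 and using the Win32 call `GetShortPathNameW`. The helper name `get_short_path` is hypothetical, and outside Windows the sketch simply copies the name through, since there are no 8.3 aliases there.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: writes the 8.3 alias of long_path into buf
 * (buflen is counted in wchar_t units). Returns 1 on success and
 * 0 on failure or when the buffer is too small. */
int get_short_path(const wchar_t *long_path, wchar_t *buf, size_t buflen)
{
    assert(long_path != NULL && buf != NULL);
#ifdef _WIN32
    {
        /* On success the return value is the length without the NUL;
         * if the buffer is too small it is the required size instead. */
        DWORD n = GetShortPathNameW(long_path, buf, (DWORD)buflen);
        return n > 0 && n < buflen;
    }
#else
    {
        /* No 8.3 aliases off Windows: pass the name through unchanged. */
        size_t len = wcslen(long_path);
        if (len >= buflen)
            return 0;
        wcscpy(buf, long_path);
        return 1;
    }
#endif
}
```

On Windows the file has to exist for the lookup to succeed, and the result still has to be converted to the local codepage before it is placed on mplayer's command line.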

comment:2 Changed 11 years ago by sylvain@…

Thanks for the answer. Isn't there a better solution however? Especially if you are implementing it in a media player like aTunes (www.atunes.org). After all, this would not solve anything in the official mplayer GUI and in many other media players using mplayer.

comment:3 Changed 11 years ago by compn

hey, i just posted a workaround... it's still a bug ;p

but i'm not sure if it's a windows bug or not...

files with strange characters also cannot be opened by some other programs, even notepad!

comment:4 Changed 11 years ago by sylvain@…

I'm no expert in this, but having run into it before I can tell you that the way Windows handles it is quite a mess. Internally it should work in some form of Unicode (which is why the files can play). As far as I understand the Microsoft documentation, there are two different APIs, and I think mplayer still uses the old one. You might find some info at:
http://www.microsoft.com/globaldev/getWR/steps/WRG_unicode.mspx

For the notepad issue, maybe support for Asian languages was not installed. It must be installed separately (I think it installs the needed fonts and so on).

comment:5 Changed 11 years ago by mplayer@…

  • Cc mplayer@… added

this problem also affects explorer.exe
there are also other names, like com7.dll,
that cannot be deleted with explorer

the file is dragged and dropped onto mplayer, so explorer passes the filename to mplayer
mplayer uses fopen(), but fopen() is ANSI and does not support chars outside
your code page

the problem is not easy to solve,
at least not without using the unicode version
of main(), wmain(), and using CreateFile or wfopen instead of fopen

but since a lot of stuff in mplayer is ANSI, I see this as problematic

comment:6 Changed 11 years ago by sylvain@…

Thanks for the response. I know I might sound a little naive, but the Linux versions do not currently seem to have this sort of problem with Unicode (all the Linux systems I have are UTF-8). I was just listening to a Japanese song, and the file I'm playing now has accents in its name. It works just fine. So, assuming the Windows and Linux versions both work the same way internally, isn't this more of an interface problem?

Or do you not want to break compatibility with Win9x, which I understand would happen when using the new APIs?

I'm sorry to nag you, but this is an issue that comes up frequently in our forum.

comment:7 Changed 11 years ago by reimar

Firstly, MPlayer is not ANSI, it is almost completely UTF-8 nowadays.
Which leads to the first problem: Microsoft neither uses nor supports UTF-8 for filenames. This will probably be the biggest problem, since fixing it will require loads of code changes and additional code only for Windows.
The second thing is, AFAIK the normal Windows command line (as in cmd.exe) does not support anything outside the local code page, so I think that any "fix" to this would be a GUI/drag-and-drop-specific hack.
I don't particularly want to discourage anyone from trying to find a solution, but so far I think that this depends on Microsoft providing Unicode support that is more than an ad-hoc hack...

comment:8 Changed 11 years ago by sylvain@…

On the command line, you are supposed to be able to use UTF-8 by running "chcp 65001". In theory, you should be able to switch the whole system to UTF-8, but in reality the latter ends in complete disaster ;-)

Keeping mplayer on the command line is probably the biggest obstacle to moving to Unicode, no? Otherwise it would be necessary to convert from UTF-16 to UTF-8 and use the new APIs, right? Anyway, I see there is a conflict here.

comment:9 Changed 11 years ago by mplayer@…

I'm not saying that mplayer is ANSI, I'm talking about the command line stuff

using int main(int argc, char *argv[])
on linux you can pass a UTF-8 filename, since it fits in the char type (as far as I know)

main() on win32 accepts only ANSI filenames (you can still pass a UTF-8 filename, but the subsequent open() or fopen() will fail)
there is wmain(), which accepts wchars, but then all filename
operations have to take wchar_t arguments

so every function that uses a filename from the cmd line would have to be replaced
by its wide version on win32

this may be solvable, but I don't know how much stuff needs to be
ifdef-ed, or whether that would be acceptable

compatibility with win9x is not a problem; MS provides a workaround
in unicows.dll

there shouldn't be any problems with using UTF-8 strings for subtitles
or similar, even on win32

I'll look at the source to find out how much stuff really needs
to be changed

Reimar, I would like to know whether such a solution would be acceptable
for inclusion in the main tree; I don't want to keep
it as a custom patch

Regards

comment:10 Changed 11 years ago by mplayer@…

msdn says that we can also get the wchar cmd line even if the program is started through main()
an example is here:
http://msdn2.microsoft.com/en-us/library/bb776391.aspx

this avoids changing all the argument parsing code

assuming that the array returned by CommandLineToArgvW()
has the same elements as argv, we can use the wchar array to pick out
only the filename and keep using argv for parsing the arguments
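
The idea above can be sketched roughly as follows. This is not MPlayer code: `get_utf8_argv` and `free_utf8_argv` are hypothetical names, and the sketch goes one step further than the comment by converting every argument to UTF-8 with `WideCharToMultiByte`, on the assumption that the rest of the program wants UTF-8 strings. On other platforms it only copies argv.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#include <shellapi.h> /* CommandLineToArgvW, link with shell32 */
#endif

/* Free a NULL-terminated argv produced by get_utf8_argv(). */
void free_utf8_argv(char **v)
{
    size_t i;
    if (!v)
        return;
    for (i = 0; v[i]; i++)
        free(v[i]);
    free(v);
}

/* Hypothetical helper: returns a malloc'ed, NULL-terminated copy of
 * the command line arguments in UTF-8, or NULL on failure.
 * *out_argc receives the argument count on success. */
char **get_utf8_argv(int argc, char **argv, int *out_argc)
{
    char **result;
    int i, n = argc;
#ifdef _WIN32
    /* Ignore the ANSI argv and re-read the command line as UTF-16. */
    wchar_t **wargv = CommandLineToArgvW(GetCommandLineW(), &n);
    if (!wargv)
        return NULL;
    (void)argv;
#endif
    assert(out_argc != NULL);
    result = calloc((size_t)n + 1, sizeof(*result));
    for (i = 0; result && i < n; i++) {
#ifdef _WIN32
        /* The first call returns the size in bytes, including the NUL. */
        int len = WideCharToMultiByte(CP_UTF8, 0, wargv[i], -1,
                                      NULL, 0, NULL, NULL);
        result[i] = len > 0 ? malloc((size_t)len) : NULL;
        if (result[i])
            WideCharToMultiByte(CP_UTF8, 0, wargv[i], -1,
                                result[i], len, NULL, NULL);
#else
        /* Elsewhere argv is copied through byte for byte. */
        result[i] = malloc(strlen(argv[i]) + 1);
        if (result[i])
            strcpy(result[i], argv[i]);
#endif
        if (!result[i]) {
            free_utf8_argv(result);
            result = NULL;
        }
    }
#ifdef _WIN32
    LocalFree(wargv);
#endif
    if (result)
        *out_argc = n;
    return result;
}
```

A caller would invoke this once at the top of main() and hand the returned array to the existing option parser in place of argv.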

comment:11 Changed 11 years ago by mplayer@…

sorry for spamming, but GetCommandLineW and CommandLineToArgvW
are missing on win9x (perhaps you cannot even have a foreign charset in filenames on win9x), so the functions can be picked up at run time using LoadLibrary/GetProcAddress
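
A run-time lookup of that kind might look like the sketch below. The function name `wide_argv_if_available` is hypothetical; the sketch assumes `CommandLineToArgvW` is exported by shell32.dll and `GetCommandLineW` by kernel32.dll, and it returns NULL wherever either one cannot be resolved (including on non-Windows builds).

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: returns the UTF-16 argument vector if the wide
 * command line API can be resolved at run time, otherwise NULL.
 * On success the caller releases the result with LocalFree(). */
wchar_t **wide_argv_if_available(int *out_argc)
{
    assert(out_argc != NULL);
#ifdef _WIN32
    {
        typedef LPWSTR *(WINAPI *to_argvw_fn)(LPCWSTR, int *);
        typedef LPWSTR (WINAPI *get_cmdline_fn)(void);
        /* shell32 is deliberately left loaded for the process lifetime. */
        HMODULE shell32 = LoadLibraryA("shell32.dll");
        HMODULE kernel32 = GetModuleHandleA("kernel32.dll");
        to_argvw_fn to_argvw;
        get_cmdline_fn get_cmdline;

        if (!shell32 || !kernel32)
            return NULL;
        to_argvw = (to_argvw_fn)GetProcAddress(shell32, "CommandLineToArgvW");
        get_cmdline = (get_cmdline_fn)GetProcAddress(kernel32, "GetCommandLineW");
        if (!to_argvw || !get_cmdline)
            return NULL; /* old system: fall back to the ANSI argv */
        return to_argvw(get_cmdline(), out_argc);
    }
#else
    return NULL; /* nothing to look up outside Windows */
#endif
}
```

Because nothing is linked statically against the wide entry points, the same binary would still start on a system that lacks them and simply keep using the ANSI argv.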

comment:12 Changed 11 years ago by reimar

Firstly on "chcp 65001", that reportedly breaks batch files.
Secondly about CommandLineToArgvW, that alone does not help much, you still need to convert everything to UTF-8, there is no way we can support UCS-2/UTF-16 strings.
Then all the functions to open a file all need to be replaced as well by something that converts the string back again into UCS-2/UTF-16 and use wfopen or whatever.
I really have some doubts that is possible without more code than acceptable.
Unless you are using Vista, why not just use the 8.3 compatibility filenames?
AFAIK you can even make Windows-Explorer use them when you double-click a file.

comment:13 Changed 11 years ago by ulion2002@…

  • Cc ulion2002@… added

Maybe we should replace the fopen and open calls with our own UTF-8 versions of fopen and open on mingw (win32?), so that filenames and parameters are always UTF-8.
Could this work around the problem in an easy way?
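
A wrapper along those lines could be sketched as follows. The names `fopen_utf8` and `utf8_to_wide` are hypothetical and not part of MPlayer; the sketch assumes the caller always passes UTF-8, converts with `MultiByteToWideChar` on Windows before calling `_wfopen`, and falls through to plain fopen() elsewhere.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

#ifdef _WIN32
#include <windows.h>

/* Convert a NUL-terminated UTF-8 string to a malloc'ed UTF-16 string.
 * Returns NULL if the input is not convertible or memory runs out. */
static wchar_t *utf8_to_wide(const char *s)
{
    /* The first call returns the size in wchar_t units, including the NUL. */
    int n = MultiByteToWideChar(CP_UTF8, 0, s, -1, NULL, 0);
    wchar_t *w = n > 0 ? malloc((size_t)n * sizeof(*w)) : NULL;
    if (w && !MultiByteToWideChar(CP_UTF8, 0, s, -1, w, n)) {
        free(w);
        w = NULL;
    }
    return w;
}
#endif

/* Hypothetical replacement for fopen() that takes a UTF-8 file name
 * on every platform. */
FILE *fopen_utf8(const char *name, const char *mode)
{
    assert(name != NULL && mode != NULL);
#ifdef _WIN32
    {
        wchar_t *wname = utf8_to_wide(name);
        wchar_t *wmode = utf8_to_wide(mode);
        FILE *f = (wname && wmode) ? _wfopen(wname, wmode) : NULL;
        free(wname);
        free(wmode);
        return f;
    }
#else
    /* File names are plain byte strings here, so UTF-8 passes through. */
    return fopen(name, mode);
#endif
}
```

This only covers the opening side; the file name still has to arrive in UTF-8 in the first place, and every existing fopen()/open() call site would need to be reviewed before being switched over.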

comment:14 Changed 11 years ago by reimar

We'd still have to get the file name in UTF-8 first, also replacing has to be done carefully, not mindlessly, to make sure there are no new bugs.
Before attempting such an intrusive change I'd first like some feedback why using the 8.3 short filenames is not a perfectly fine workaround.

comment:15 Changed 11 years ago by compn

(In reply to comment #14)

Before attempting such an intrusive change I'd first like some feedback why
using the 8.3 short filenames is not a perfectly fine workaround.

well using mplayer abcdef~1.avi will show abcdef~1.avi when you hit the I key to find out what file you are playing ;p

comment:16 Changed 11 years ago by bertrand@…

  • Cc bertrand@… added

For your information, we at Jajuk (http://jajuk.info/) have the same issue. Windows users have to rename some of their files to make mplayer (and jajuk) play them. A fix would be really welcome.

Regards,
Bertrand Florat, Jajuk admin

comment:17 Changed 11 years ago by sylvain@…

As this would be a Windows-only fix, why not use the Windows functions MultiByteToWideChar and WideCharToMultiByte to convert between UTF-8 and UTF-16 (the stumbling point, as I understand it):

http://msdn2.microsoft.com/en-us/library/ms776420(VS.85).aspx
http://msdn2.microsoft.com/en-us/library/ms776413(VS.85).aspx

The first reason I see for not using the workaround is that it has to be implemented in ALL Windows programs using mplayer. Currently two are reporting this problem to you. Also, I remind you once more that your own GUI is affected.
Second, for command line users this sucks, especially if you don't know the workaround.
Third, it is not consistent with other platforms.
Fourth, some (advanced) users will have switched off short file name generation, and I don't know how this will behave with mplayer.
Fifth, drag and drop does not work.

comment:18 Changed 11 years ago by sylvain@…

Is a resolution planned for the foreseeable future? User reports just don't stop. Otherwise I will try to implement the workaround, but only for aTunes.

comment:19 Changed 11 years ago by rvm@…

  • Cc rvm@… added

It seems there's another issue related to this problem. When using -sub-fuzziness to autoload subtitles from the movie's directory, mplayer won't autoload subtitles whose filenames contain characters outside the system charset.

And this time, I think there's no workaround for this, or is there?
