#2246 new enhancement

Using uchardet for encoding detection

Reported by:	Jehan	Owned by:	beastd
Priority:	normal	Component:	libass
Version:	unspecified	Severity:	normal
Keywords:		Cc:
Blocked By:		Blocking:
Reproduced by developer:	no	Analyzed by developer:	no

Description

mplayer/ffmpeg have very limited encoding detection, based on ENCA (basically Latin/Cyrillic/Chinese only). So when you pass, for instance, a Japanese or Korean subtitle file that is not in UTF-8 (from experience, maybe about half of them; UTF-8 is gaining ground but is still not the only encoding in use in many areas), the text shows up garbled and you have to specify the encoding yourself. That means you have to know which one it is, which is mostly found by trial and error. Run enca --list languages to see the languages supported by enca, and hence by mplayer/ffmpeg.

There are a few ports, in various languages, of the charset detection algorithm used in Mozilla Firefox. One with a C binding is uchardet: https://github.com/BYVoid/uchardet

mpv, the mplayer fork, now uses "uchardet" as its default encoding detection ("enca" is still available as an alternative but is no longer the default).
See: https://github.com/mpv-player/mpv/issues/908
and: https://github.com/mpv-player/mpv/pull/2193

I believe mplayer's encoding detection lives only in libass (otherwise this is the wrong component, I guess)? Could mplayer/ffmpeg/libass also add support for uchardet in order to improve support for Asian languages?
This would be great.
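
As a rough illustration of the kind of fallback chain such support would allow, here is a minimal sketch in C: accept the file as UTF-8 if it validates, otherwise ask a pluggable detector, otherwise use a configured default. All names here (is_valid_utf8, guess_subtitle_charset, charset_detector_fn, stub_detector) are hypothetical and are not taken from mplayer, libass or uchardet. The callback only stands in for a real detector; nothing in this sketch calls uchardet itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: returns a charset name, or NULL / "" when
 * the detector cannot decide. A real build might wrap a detection library
 * behind this signature. */
typedef const char *(*charset_detector_fn)(const unsigned char *data, size_t len);

/* Returns true if the buffer is well-formed UTF-8. Rejects stray
 * continuation bytes, truncated sequences, overlong forms, surrogates
 * and code points above U+10FFFF. */
static bool is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t need;            /* number of continuation bytes expected */
        unsigned long cp, min;  /* decoded code point and its minimum value */

        if (c < 0x80) { i++; continue; }
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else return false;

        if (len - i - 1 < need) return false;   /* truncated sequence */
        for (size_t k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += need + 1;
    }
    return true;
}

/* Placeholder detector used only for illustration: it always answers
 * "SHIFT_JIS". A real detector would inspect the bytes. */
static const char *stub_detector(const unsigned char *data, size_t len)
{
    (void)data; (void)len;
    return "SHIFT_JIS";
}

/* Fallback chain: valid UTF-8 first, then the detector (if any), then
 * the caller-supplied default. */
static const char *guess_subtitle_charset(const unsigned char *data, size_t len,
                                          charset_detector_fn detect,
                                          const char *fallback)
{
    assert(fallback != NULL);
    if (is_valid_utf8(data, len))
        return "UTF-8";
    if (detect != NULL) {
        const char *name = detect(data, len);
        if (name != NULL && name[0] != '\0')
            return name;
    }
    return fallback;
}
```

Checking UTF-8 validity first is cheap and keeps already-correct files away from any statistical guessing. The detector only sees input that is known not to be UTF-8, and an explicitly configured default remains the last resort.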
