SVN Error: Can't convert string from native encoding to 'UTF-8'

I’ve got a post-commit hook script that performs an svn update of a working copy when commits are made to the repository.

When users commit to the repository from their Windows machines using TortoiseSVN, they get the following error:

    post-commit hook failed (exit code 1) with output:
    svn: Error converting entry in directory '/home/websites/devel/website/guides/Images' to UTF-8
    svn: Can't convert string from native encoding to 'UTF-8':
    svn: Teneriffa-S?\195?\188d.jpg
    

    The file in question is Teneriffa-Süd.jpg; note the u with an umlaut. The site is German, so the files have German names.

    When executing an update on the working copy at the Linux command line, no errors are encountered. The error occurs only when the post-commit hook is executed via a commit from a Windows SVN client.

    Questions:

    1. Why would SVN try to change the encoding of a file?
    2. Are filenames allowed to contain chars that are outside the Windows standard ASCII ones?

    Update:

    It turns out that the filename displays correctly as Teneriffa-Süd.jpg when viewed from a Windows machine (via Samba), but when I view it on the Linux server where the file resides (using SSH and PuTTY) I get Teneriffa-Süd.jpg

11 Solutions collected from the web for “SVN Error: Can't convert string from native encoding to 'UTF-8'”

    1. It does not change the encoding of the file. It changes the encoding of the filename (to something that every client can hopefully understand).
    2. Allowed by whom? NTFS stores file names as 16-bit (UTF-16) code units, and Windows can expose them in various encodings depending on how you ask (it will try to convert them to the encoding you request). How you ask depends on the specific svn client you use. It sounds to me like a bug in TortoiseSVN.

    Edit to add:

    Ugh. I misunderstood the symptoms. The svn server stores everything in UTF-8 (and it seems that it did that successfully).
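
    One way to check that the stored name really is UTF-8: the numbers in "?\195?\188" from the error message are byte values in decimal, and printing those two bytes should give ü. A minimal sketch (bytes_from_decimal is a made-up helper name, not part of svn):

```shell
# Print the raw bytes for a list of decimal byte values.
# bytes_from_decimal is a hypothetical helper, not an svn command.
bytes_from_decimal() {
    for n in "$@"; do
        # printf understands octal escapes, so convert decimal to octal first
        printf "\\$(printf '%03o' "$n")"
    done
}

bytes_from_decimal 195 188   # the two bytes from "Teneriffa-S?\195?\188d.jpg"
echo
```

    The two bytes are c3 bc in hex, which is the UTF-8 encoding of ü, so a UTF-8 terminal should show that character.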

    The post-commit hook is the bit that fails to convert from UTF-8. If I understand you correctly, the post-commit hook on the server triggers an svn update to a shared drive (so the svn server starts an svn client against itself)? That means the configuration that needs fixing is the client's configuration on the server.
    Check LANG / LC_ALL in the environment that runs the svn server. As it happens, hooks are run in an empty environment (see Tip), so you should set the variable in the hook itself.

    See also this page for info on how svn handles localisation.
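
    As a rough sketch of what setting the variable in the hook itself could look like. The svn path, the working copy path (guessed from the error message in the question) and the locale name are all assumptions to adapt:

```shell
#!/bin/sh
# Sketch of a post-commit hook. Hooks start with an empty environment,
# so the locale and the full path to svn are both set explicitly.
SVN=/usr/bin/svn                            # assumed install location
WORKING_COPY=/home/websites/devel/website   # assumed, based on the path in the error
LC_ALL=en_US.UTF-8                          # assumed; any installed UTF-8 locale should do
export LC_ALL

if [ -x "$SVN" ] && [ -d "$WORKING_COPY" ]; then
    "$SVN" update --non-interactive "$WORKING_COPY"
else
    echo "post-commit: $SVN or $WORKING_COPY missing, update skipped" >&2
fi
```

    The locale has to exist on the server (locale -a lists the installed ones), otherwise the export has no useful effect.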

    Yet another example:

    $ svn update
    svn: Error converting entry in directory '.' to UTF-8
    svn: Can't convert string from native encoding to 'UTF-8':
    
    $ export LC_CTYPE=en_US.UTF-8
    
    $ svn update
    

    (… and all is fine now)

    If the error is:

    [abc@288832-web3 public_html]$ svn update
    svn: Error converting entry in directory 'images' to UTF-8
    svn: Valid UTF-8 data
    (hex: 46 65 6e 65 72 62 61 68)
    followed by invalid UTF-8 sequence
    (hex: e7 65 2b 46)
    

    Then print the valid bytes to see how the file name starts:

    [abc@288832-web3 public_html]$ printf "\x46\x65\x6e\x65\x72\x62\x61\x68\n"
    Fenerbah  
    

    (This means that the system has some file name starting with “Fenerbah” in that folder.)

    [abc@288832-web3 public_html]$ cd  images
    [abc@288832-web3 images]$ rm -rf Fenerbahçe+Forma+2.jpg
    

    So the name contains a special character (ç) stored as a byte sequence that is not valid UTF-8, which svn cannot convert.
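
    Deleting the file is not the only option. The byte e7 matches ç in ISO-8859-1 and similar single-byte encodings, so the name was probably created under a non-UTF-8 locale. A rough sketch for spotting such names before svn trips over them, assuming iconv is installed (both function names are made up for illustration):

```shell
# Succeeds if the string in $1 is valid UTF-8.
is_valid_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Print every entry in directory $1 (default: .) whose name is not valid UTF-8.
list_non_utf8_names() {
    for path in "${1:-.}"/*; do
        [ -e "$path" ] || continue      # skip the unexpanded glob in an empty directory
        name=${path##*/}
        is_valid_utf8 "$name" || printf '%s\n' "$name"
    done
}

# Usage: list_non_utf8_names images
```

    A name it reports can then be renamed to its UTF-8 form instead of being removed.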

    Put this in your post-commit hook:
    export LANG=xxxxx   # replace xxxxx with your locale

    Don’t forget to generate those locales on your system (as root).

    Example for Russian:

    locale-gen ru_RU.CP1251
    locale-gen ru_RU.UTF-8
    dpkg-reconfigure locales
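
    Before generating anything, it can help to check whether the locale is already installed. A small sketch, assuming the usual locale -a spelling (ru_RU.utf8 rather than ru_RU.UTF-8); the function names are hypothetical:

```shell
# Normalise a locale name for comparison: ru_RU.UTF-8 -> ru_ru.utf8
normalise_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g'
}

# Succeeds if the locale named in $1 appears in the list read from stdin.
locale_listed() {
    wanted=$(normalise_locale "$1")
    while IFS= read -r line; do
        [ "$(normalise_locale "$line")" = "$wanted" ] && return 0
    done
    return 1
}

# Usage: locale -a | locale_listed ru_RU.UTF-8 || echo "run locale-gen first"
```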
    
    1. It changes the encoding to a locale-neutral one in case someone with a different encoding checks it out.

    2. Of course. But it’s not “Windows” ASCII (Windows actually uses a legacy code page such as CP1252, or CP1251 for Cyrillic).

    The best way to fix this is to make sure that your system uses UTF-8 whenever possible (check $LANG).

    Just use the following line in your script before executing any svn command.
    Use the appropriate language code; in the following example I used Japanese:

    export LC_ALL=ja_JP.UTF8
    

    It seems that all LC_ variables need .UTF8 at the end. For example, I happened to have LC_ALL, LC_TIME, and LC_CTYPE defined. Setting LC_CTYPE alone did not solve the problem, so I had to set LC_ALL as well, and then it worked:

    LC_ALL=en_US.UTF-8
    LC_TIME=en_DK.UTF-8
    LC_CTYPE=en_US.UTF-8
    

    To avoid the problem in future, I copied the file to a different name, removed the old one from svn, added the new one, and sent a message to a collaborator asking them not to do this.
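
    The behaviour described here matches the POSIX precedence rules: a non-empty LC_ALL overrides LC_CTYPE, which in turn overrides LANG. That would explain why setting LC_CTYPE alone had no effect while LC_ALL was already defined. A small sketch of that lookup (effective_ctype is a made-up helper name):

```shell
# Print the locale that governs character encoding, following POSIX precedence:
# LC_ALL first, then LC_CTYPE, then LANG, then the default "C" locale.
effective_ctype() {
    printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}"
}

effective_ctype
```

    If this prints something without UTF-8 in it, svn will try to convert file names to that encoding instead.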

    I got a similar problem when running “svn add” on a directory, but the solution was different. I couldn’t see the “hex” digits using printf (actually no hex output was shown by svn), but this command allowed me to see the results, and fix it:

    LC_ALL=C svn add probealign
    

    I think, in general, sticking LC_ALL=C before your command allows you to see the offending files… and is a lot easier than pasting in a lot of \x72 stuff (which apparently may not be available).

    For information, I got this error (Can't convert string from native encoding to 'UTF-8') on commit with the Windows client TortoiseSVN when my repository URL was:

    http://x.x.x.x/svn/myrepos

    I changed the repository URL to:

    svn://x.x.x.x/myrepos

    and now all is perfect.

    I think this information will be useful to some.

    In my case, I had the setting in ~/.subversion/config as below

    log-encoding = ...

    Commenting it out fixed the problem.
