Archived: 2019/03/20 20:01:08 Posted: 2009-07-29 19:32:30 Tags: tomcat, solaris, character-encoding, diacritics

One of our clients bought an advertisement in a newspaper and printed his URL as http://www.website.com/publicité instead of "publicite" (without the accent)...

I'm trying to create the corresponding directory under Solaris and it doesn't seem to work. I captured the GET request, and the path actually requested is /publicit%C3%A9. We tried adding a directory with that literal name, but that doesn't work either.

Any idea how we could fix this problem?

We use Apache and Tomcat as our web container, with Java (and JSP).

Looks like it's a Solaris server and not Linux.

Perhaps you could use Apache's mod_rewrite to change it to publicite (no accent)?

Looks like the request has been URL-encoded (percent-encoded). The tricky thing is that the encoded bytes are outside standard ASCII, so I don't think they can be reliably decoded to the correct "é": the URL alone doesn't tell you which character encoding the client used (for example UTF-8 rather than a single-byte encoding).
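For what it's worth, the two bytes in the captured request, C3 A9, are exactly how UTF-8 encodes "é" (U+00E9). Here is a minimal bash sketch that percent-decodes the path and dumps the resulting bytes so you can see them. The urldecode function is an illustrative helper I made up for this post, not something shipped with Apache or Tomcat.

```shell
#!/usr/bin/env bash
# Minimal percent-decoder (illustrative helper, not part of Apache or Tomcat).
# Caveat: input containing literal backslashes would be misread by printf %b.
urldecode() {
    local escaped="${1//%/\\x}"   # "%C3%A9" becomes "\xC3\xA9"
    printf '%b' "$escaped"        # bash's printf %b expands \xHH to a raw byte
}

# Dump the decoded path as hex bytes; the last two are c3 a9,
# which is the UTF-8 encoding of "é".
urldecode '/publicit%C3%A9' | od -An -tx1
```

If those same two bytes were read as ISO-8859-1 instead, they would display as "Ã©", which is the usual symptom of a UTF-8/Latin-1 mix-up.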

Apparently there is no standard for encoding "é" either, so the URL you receive might differ for the same request from two different clients.

Good luck.

I'm trying to make the corresponding directory under Linux and it doesn't seem to work.

What exactly did you try, and how did it fail?

You could try this (in bash):

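cd /var/www/html   ## -- Change as needed.
## "publicité" as UTF-8 bytes: octal 303 251 = hex C3 A9 (printf is more portable than echo -en)
dname=$(printf 'publicit\303\251')
mkdir publicite                 ## the real, unaccented directory
ln -s publicite "$dname"        ## accented name is a symlink to it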

This is a simple version of Paul's idea to use rewrite.
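If it turns out that some clients percent-encode "é" as ISO-8859-1 (a single byte E9, sent as %E9) rather than UTF-8, the same symlink trick could cover both spellings. This is only a sketch under that assumption: it runs in a throwaway temp directory standing in for the document root, and the variable names are mine.

```shell
#!/usr/bin/env bash
# Hypothetical document root; a temp dir keeps the sketch harmless to run.
docroot=$(mktemp -d)
cd "$docroot" || exit 1

mkdir publicite                           # the real, unaccented directory

utf8_name=$(printf 'publicit\303\251')    # é as UTF-8: bytes C3 A9 (%C3%A9)
latin1_name=$(printf 'publicit\351')      # é as ISO-8859-1: byte E9 (%E9)

# Both accented spellings point at the same real directory.
ln -s publicite "$utf8_name"
ln -s publicite "$latin1_name"

ls -l
```

Note that Apache only follows the links if the directory allows it (Options FollowSymLinks or SymLinksIfOwnerMatch).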

BTW, I just created a directory "publicité" with no problem, both by pasting text from this page and with the commands above. Apache lists the empty directory fine in the browser (Firefox on Linux and Windows XP), although my English-configured Apache messed up the name in the listing:

Index of /xtra/publicité
[ICO]   Name	Last modified	Size	Description
[DIR]   Parent Directory	 	-
Apache/2.2.3 (CentOS) Server at localhost Port 80

And I'm seeing the same as you from the Apache access log: "GET /xtra/publicit%c3%a9/ HTTP/1.1"