Ivan Shmakov
2017-10-26 03:24:37 UTC
Since the switch to a new hoster, the files on
http://hendrikmaryns.name/antro.shtml are no longer downloadable,
due to garbling of utf8 filenames. How to solve this?
directory on the server. A solution that I expect to work for a
variety of cases would be to remove all the files with mangled
filenames and reupload them under proper ones.
If you have SSH (command-line) access, then, depending on the
tools available to you, you may be able to, say, run a Perl
script to rename them on the server.
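
If Perl is not at hand, the same renaming could be sketched in Python.
This rests on a guess: it assumes the names were mangled by reading
their UTF-8 bytes as ISO-8859-1 and storing the result as UTF-8 again,
which is a common failure but may well not be what happened here. The
function names and the directory argument are made up for illustration.

```python
import os


def unmangle(name):
    """Undo one round of UTF-8-read-as-ISO-8859-1 mangling, if possible.

    Returns the repaired name, or the name unchanged when it does not
    look mangled in this particular way (an assumption, see above).
    """
    try:
        return name.encode("iso-8859-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return name


def rename_mangled(directory, dry_run=True):
    """List (and, unless dry_run, perform) the renames in one directory."""
    planned = []
    for old in sorted(os.listdir(directory)):
        new = unmangle(old)
        # Skip names that are fine, and never overwrite an existing file.
        if new != old and not os.path.exists(os.path.join(directory, new)):
            planned.append((old, new))
            if not dry_run:
                os.rename(os.path.join(directory, old),
                          os.path.join(directory, new))
    return planned
```

Running it with dry_run=True first and reading the returned list seems
prudent, since a wrong guess about the mangling would otherwise rename
files to something worse.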
P.S. I just realize this is probably not the right newsgroup for
this. Please refer me to the proper place.
As for HTML, the page seems to claim HTML4 compliance, but using
“unencoded” UTF-8 in ‘href’ is something that is only allowed in
HTML5. Moreover, even there, spaces need to be encoded as %20,
unless I be mistaken. Cf.:
<a href="https://ru.wikipedia.org/wiki/%D0%9E%D0%BC%D0%BE%D0%BD_%D0%A0%D0%B0"
>(strict HTML4)</a>
<a href="https://ru.wikipedia.org/wiki/Омон_Ра" >(allowed in HTML5)</a>
(Although the browsers seem to be rather forgiving in this regard.)
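
For what it is worth, the encoded form need not be typed by hand. A
minimal sketch using Python's standard urllib.parse (my choice of tool,
not something the page in question uses) that produces the strict form
from the readable one:

```python
from urllib.parse import quote, unquote

# The readable form, as HTML5 tolerates it in href.
title = "Омон_Ра"

# quote() encodes the string as UTF-8 and percent-escapes every byte
# outside the unreserved set; "_" is left alone.
encoded = quote(title)
print(encoded)

# A space becomes %20 (not "+", which belongs to form encoding).
print(quote("Waar gaan we"))

# Round trip back to the readable form.
print(unquote(encoded))
```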
It does not appear to be a UTF-8 issue. This is how one of the URLs
http://hendrikmaryns.name/Antroposofie/Valentin%20Wember%20%E2%80%93%20Waar%20gaan%20we%20eigenlijk%20heen?.pdf
In plain text: Antroposofie/Valentin Wember – Waar gaan we eigenlijk
heen?.pdf
Note the “?” at the end. I doubt that is what is supposed to be
printed; it is a replacement character to some other value.
I’m unsure of what you mean by “replacement character” here, but
indeed, ‘?’ in a URI signifies the start of a ‘query’ portion,
so it has to be encoded as %3F. Cf.:
https://en.wikipedia.org/wiki/Main_page?
https://en.wikipedia.org/wiki/Main_page?action=history
https://en.wikipedia.org/wiki/Main_page%3F
https://en.wikipedia.org/wiki/Main_page%3Faction=history
(Then, it appears that the Wikimedia servers are slightly
misconfigured in that respect. Admittedly, this behavior may be
rather tricky to get right.)
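
The difference between the two forms can be seen without involving any
server. The following sketch, again using Python's urllib.parse as an
arbitrary tool, splits both variants and then escapes the filename from
the URI in question:

```python
from urllib.parse import quote, urlsplit

raw = urlsplit("https://en.wikipedia.org/wiki/Main_page?action=history")
escaped = urlsplit("https://en.wikipedia.org/wiki/Main_page%3Faction=history")

# Unescaped "?": the path stops there and the rest is the query.
print(raw.path, raw.query)

# Escaped "%3F": everything stays in the path and the query is empty.
print(escaped.path, escaped.query)

# Escaping the filename discussed above for use as a path segment.
segment = quote("Valentin Wember – Waar gaan we eigenlijk heen?.pdf")
print(segment)
```

Note that quote() escapes ‘?’ by default, so a link generated this way
would not be cut short at the question mark.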
That said, replacing ? with %3F in the URI above results in a
surprising 301 “permanent” redirect:
HTTP/1.1 301 Moved Permanently
Date: Thu, 26 Oct 2017 02:53:33 GMT
Server: Apache/2
Location: http://hendrikmaryns.name/Antroposofie/Valentin%20Wember%20%e2%80%93%20Waar%20gaan%20we%20eigenlijk%20heen.shtml?.pdf
Content-Length: 325
Keep-Alive: timeout=2, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1
Yet it still doesn’t explain why some other URIs may be
inaccessible; say:
http://hendrikmaryns.name/Antroposofie/Spirituele%20opgaven%20Belgi%C3%AB%20%E2%80%93%20Johan%20Steverlinck.pdf
HTTP/1.1 404 Not Found
Date: Thu, 26 Oct 2017 02:57:57 GMT
Server: Apache/2
Content-Length: 382
Keep-Alive: timeout=2, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1
--
FSF associate member #7257 np. Unforgettable — Illya Leonov