How can I remove the trailing slash from my site completely? #51
-
I know this is really Hugo's problem, but even though many people have requested or asked about it, the Hugo community's answer seems to be "no, we won't". I see a lot of posts asking how to remove the trailing slash from URLs, but none of them seem to have a usable solution.

First, I tried with NGINX configuration:

    server {
        server_name example.com;
        root /var/www/html/example.com/;
        index index.html;

        location ^~ /.build {
            deny all;
            return 403;
        }

        location / {
            try_files $uri $uri/index.html =404;
            rewrite ^/(.*)/$ /$1 permanent;
        }

        gzip on;
        gzip_disable "msie6";
        gzip_buffers 32 32k;
        gzip_vary on;
        gzip_comp_level 5;
        gzip_min_length 10240;
        gzip_proxied no-cache no-store private expired auth;
        gzip_types text/plain application/x-javascript text/xml text/css application/xml application/javascript;
    }

This redirects any slash-trailed URL to its non-slash-trailed counterpart, but Google Search Console keeps trying to index the slash-trailed URLs. I think that's because the URLs in the generated site are still slash-trailed. So I decided to remove the trailing slash from the URLs in the generated site files too. I created an npm script to do that:

    import { readFileSync, writeFileSync } from 'fs'
    import { parse } from 'node-html-parser'
    import glob from 'glob'

    function removeTrailingSlash(dirPath) {
      // Every HTML and XML file under the output directory
      const files = glob.sync(`${dirPath}/**/*.+(html|xml)`)
      // Absolute URLs wrapped in <link>, <guid> or <loc> (RSS feed and sitemap)
      const regexXML = /<(?:link|guid|loc)>(https?:\/\/[^<]+)<\/(?:link|guid|loc)>/g

      // Drop a single trailing slash, leaving "/" itself alone
      function rts(url) {
        if (url.length > 1 && url.endsWith('/')) {
          return url.slice(0, -1)
        }
        return url
      }

      files.forEach(file => {
        console.log(file)
        let content = readFileSync(file, 'utf8')
        if (file.endsWith('.html')) {
          const root = parse(content)
          root.getElementsByTagName('meta').forEach(m => {
            if (m.getAttribute('property') == 'og:url') {
              m.setAttribute('content', rts(m.getAttribute('content')))
            }
          })
          root.getElementsByTagName('a').forEach(a => {
            a.setAttribute('href', rts(a.getAttribute('href')))
          })
          content = root.toString()
        } else if (file.endsWith('.xml')) {
          content = content.replace(regexXML, (match, url) => {
            return match.replace(url, rts(url))
          })
        }
        writeFileSync(file, content)
      })
    }

    removeTrailingSlash('./public')

    export { removeTrailingSlash }

In theory, this should remove every trailing slash found in the HTML/XML files. But this breaks the generated site.

At this point, I'm really at a loss. What should I do to remove the trailing slash from my site, once and for all?
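
One thing that may be worth checking in the script above: `rts` is called on every `href`, and as far as I can tell an `<a>` without an `href` would hand it `undefined`, while links ending in `/#section` or `/?page=2` are skipped because they do not end with a slash. Below is a minimal sketch of a more defensive helper. The name `stripTrailingSlash` and the choice to leave bare origins such as `https://example.com/` untouched are assumptions for illustration, not something taken from Hugo or from the script.

```javascript
// Hypothetical helper (not part of the original script): removes one trailing
// slash from the path part of a URL and keeps any ?query or #fragment intact.
function stripTrailingSlash(url) {
  // Anchors without an href give undefined/null; pass those through untouched.
  if (typeof url !== 'string' || url === '') return url

  // Split at the first '?' or '#', if there is one.
  const cut = url.search(/[?#]/)
  const path = cut === -1 ? url : url.slice(0, cut)
  const suffix = cut === -1 ? '' : url.slice(cut)

  // Leave the site root ("/") and bare origins ("https://example.com/") alone.
  if (path === '/' || /^[a-z][a-z0-9+.-]*:\/\/[^/]+\/$/i.test(path)) return url

  return path.endsWith('/') ? path.slice(0, -1) + suffix : url
}

console.log(stripTrailingSlash('/posts/hello/'))       // → /posts/hello
console.log(stripTrailingSlash('/posts/hello/#intro')) // → /posts/hello#intro
console.log(stripTrailingSlash('/'))                   // → /
```

Whether this is what actually breaks the site is a guess; it only rules out one class of problem.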
Replies: 2 comments 5 replies
-
Hmm, I have no idea about this, so I can't provide any help. Template methods produce URLs with a trailing slash for pages. I don't think it's worth changing this.
-
I'll leave this for the record. I didn't mark this as the answer because there might be issues with modifying the HTML/XML files. At least it seems to be working on my site for now.

First, NGINX should be configured like this:

    server {
        location / {
            try_files $uri $uri/index.html $uri/ =404;
            rewrite ^/(.*)/$ /$1 permanent;
        }
    }

I wrote the following script:

    import { readFileSync, writeFileSync } from 'fs'
    import { load } from 'cheerio'
    import glob from 'glob'

    function removeTrailingSlash(dirPath) {
      // Every HTML and XML file under the output directory
      const files = glob.sync(`${dirPath}/**/*.+(html|xml)`)
      // Absolute URLs wrapped in <link>, <guid> or <loc> (RSS feed and sitemap)
      const regexXML = /<(?:link|guid|loc)>(https?:\/\/[^<]+)<\/(?:link|guid|loc)>/g

      // Drop a single trailing slash from anything longer than two characters
      function rts(url) {
        if (url.length > 2 && url.endsWith('/')) {
          return url.slice(0, -1)
        }
        return url
      }

      files.forEach(file => {
        console.log(file)
        let content = readFileSync(file, 'utf8')
        if (file.endsWith('.html')) {
          const $ = load(content)
          // Rewrite links, skipping anchors that have no href
          $('a').each(function(){
            const href = $(this).attr('href')
            if (href) {
              $(this).attr('href', rts(href))
            }
          })
          // Rewrite the Open Graph URL
          $('meta').each(function(){
            const property = $(this).attr('property')
            if (property && property === 'og:url') {
              $(this).attr('content', rts($(this).attr('content')))
            }
          })
          content = $.html()
        } else if (file.endsWith('.xml')) {
          content = content.replace(regexXML, (match, url) => {
            return match.replace(url, rts(url))
          })
        }
        writeFileSync(file, content)
      })
    }

I wish I didn't have to add another dependency to my project, but this is the only solution I have. I added that function to my publish script: right after the site is generated, it removes the trailing slashes from the HTML/XML files. But there are some caveats about it.

I'll update this whenever I update my configuration. This is so disappointing, because this would be easy to do if Hugo allowed non-slash-trailed URL generation globally...
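
To make the XML branch easier to reason about, here is a small standalone sketch of the same idea applied to a sitemap fragment. It is a variation rather than the exact code above: the backreference `\1` (so the closing tag must match the opening one) and the function name `stripSlashesInXml` are assumptions added for illustration, and the sample URL is made up.

```javascript
// Variation on the regex above: capture the tag name and require the same
// closing tag via the \1 backreference.
const TAGGED_URL = /<(link|guid|loc)>(https?:\/\/[^<]+)<\/\1>/g

// Hypothetical helper: strips one trailing slash from each matched URL and
// rebuilds the element from its captured parts.
function stripSlashesInXml(xml) {
  return xml.replace(TAGGED_URL, (match, tag, url) => {
    const trimmed = url.length > 2 && url.endsWith('/') ? url.slice(0, -1) : url
    return `<${tag}>${trimmed}</${tag}>`
  })
}

const sample = '<url><loc>https://example.com/posts/hello/</loc></url>'
console.log(stripSlashesInXml(sample))
// → <url><loc>https://example.com/posts/hello</loc></url>
```

Note that URLs stored in attributes rather than element text are not touched by this pattern, so those would still need separate handling.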
Yes, since the generated HTML includes the trailing-slash URLs, you may need to combine your script with the rewrite rules.
But I have no experience with this, so I can't provide help.
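
A rough way to check that the script and the rewrite rule agree is to run the rule's pattern over the links left in the generated HTML and see which ones would still be redirected. The sketch below assumes the JavaScript regex behaves like NGINX's PCRE pattern `^/(.*)/$` for ordinary paths; `redirectTarget` and `linksThatStillRedirect` are hypothetical names, and the sample paths are made up.

```javascript
// Same pattern as the NGINX rule `rewrite ^/(.*)/$ /$1 permanent;`
const REWRITE = /^\/(.*)\/$/

// Returns the path the rule would redirect to, or null if the rule does not match.
function redirectTarget(path) {
  const m = REWRITE.exec(path)
  return m ? `/${m[1]}` : null
}

// Filters a list of site-relative links down to those that would still get a 301.
function linksThatStillRedirect(hrefs) {
  return hrefs.filter(href => redirectTarget(href) !== null)
}

console.log(redirectTarget('/posts/hello/')) // → /posts/hello
console.log(redirectTarget('/'))             // → null
console.log(linksThatStillRedirect(['/about/', '/posts/hello', '/']))
// → [ '/about/' ]
```

If the list comes back empty for every internal link, crawlers following those links should no longer be sent through the redirect.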