Add absolute sitemap URL to robots.txt #1396

Open
philippsauer opened this issue Mar 27, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@philippsauer

philippsauer commented Mar 27, 2023

Actual Behavior

When using the default robots.txt, which the PWA generates via express-robots-txt, it is currently not possible to declare an absolute sitemap URL. An absolute URL is required for the sitemap entry to be usable by search engines. Projects are therefore forced to bypass the PWA feature and provide a static robots.txt file for each domain.

Example of absolute sitemap URL in robots.txt:

User-agent: *
Disallow: /search/
Disallow: /xyz
Sitemap: https://myshop.com/sitemap_pwa.xml

Expected Behavior

Implement a feature that allows developers to add absolute sitemap URLs when using the PWA with a multi-site configuration.
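One possible shape for this, as a rough sketch only: derive the absolute sitemap URL per request from the protocol and host the request arrived on. The helper name `absoluteSitemapUrl` and its parameters are hypothetical and are not part of the PWA or of express-robots-txt. In an Express handler the inputs would typically come from `req.protocol` and `req.hostname`, which may need the `trust proxy` setting to be correct behind a reverse proxy.

```typescript
// Hypothetical helper, not part of the PWA or express-robots-txt.
// Builds an absolute sitemap URL from the request's protocol and host.
function absoluteSitemapUrl(protocol: string, host: string, sitemapPath: string): string {
  // Strip leading slashes so exactly one slash separates host and path.
  const path = sitemapPath.replace(/^\/+/, '');
  return `${protocol}://${host}/${path}`;
}

// Example: the line that would be written into robots.txt for one domain.
const sitemapLine = `Sitemap: ${absoluteSitemapUrl('https', 'myshop.com', '/sitemap_pwa.xml')}`;
console.log(sitemapLine);
```

With a helper like this, each domain in a multi-site setup would get its own `Sitemap:` line without a static robots.txt per domain.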

Steps to Reproduce the Bug

See how express-robots-txt is used in server.ts.

AB#84812

@philippsauer philippsauer added the bug Something isn't working label Mar 27, 2023
@JoGro-code

// Import the required modules
import express, { Request, Response } from 'express';

// Create an instance of the Express app
const app = express();

// Define the configuration for each site
const siteConfigurations = [
  {
    domain: 'example.com',
    sitemap: 'https://example.com/sitemap.xml',
    disallow: ['/search/', '/xyz'],
  },
  // Add more site configurations here
];

// Define the custom handler for robots.txt
const serveRobotsTxt = (req: Request, res: Response) => {
  // req.hostname is the host of the incoming request, without the port
  const { hostname } = req;
  const siteConfig = siteConfigurations.find(config => config.domain === hostname);

  res.type('text/plain');
  if (siteConfig) {
    const { sitemap, disallow } = siteConfig;
    // Template literal: one Disallow line per path, then the absolute sitemap URL
    res.send(`User-agent: *\nDisallow: ${disallow.join('\nDisallow: ')}\nSitemap: ${sitemap}`);
  } else {
    // Default behavior if no configuration is found for the current site
    res.send('User-agent: *\nDisallow: ');
  }
};

// Register the custom handler for the path "/robots.txt"
app.get('/robots.txt', serveRobotsTxt);

// Start the server
app.listen(3000, () => {
  console.log('Server is running on port 3000');
});
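The string-building step in the handler above could also be pulled out into a pure function, which makes it testable without starting a server. The following is a minimal sketch under that assumption; `SiteRobotsConfig` and `renderRobotsTxt` are illustrative names, not existing PWA or express-robots-txt APIs.

```typescript
// Illustrative type mirroring the entries of siteConfigurations above.
interface SiteRobotsConfig {
  domain: string;
  sitemap: string; // absolute sitemap URL
  disallow: string[];
}

// Renders the robots.txt body for one site; falls back to "allow all"
// when no configuration matches the requested host.
function renderRobotsTxt(config?: SiteRobotsConfig): string {
  const lines: string[] = ['User-agent: *'];
  if (!config || config.disallow.length === 0) {
    // An empty Disallow value means nothing is blocked.
    lines.push('Disallow:');
  } else {
    for (const path of config.disallow) {
      lines.push(`Disallow: ${path}`);
    }
  }
  if (config) {
    lines.push(`Sitemap: ${config.sitemap}`);
  }
  return lines.join('\n') + '\n';
}

console.log(renderRobotsTxt({
  domain: 'example.com',
  sitemap: 'https://example.com/sitemap.xml',
  disallow: ['/search/', '/xyz'],
}));
```

Inside the handler, the call would then be `res.send(renderRobotsTxt(siteConfig))`. Unlike the inline version, this sketch also covers a site whose `disallow` list is empty.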
