Remove URN support #1930

Open
wants to merge 12 commits into
base: master
from
9 changes: 0 additions & 9 deletions doc/Programming-Guide/03_MajorComponents.dox
@@ -329,13 +329,4 @@ TODO: get RFCs linked from ietf
we have made almost all of the cachemgr information available
via SNMP.

-\section URNSupport URN Support
-\par
-We are experimenting with URN support in Squid version 1.2.
-Note, we're not talking full-blown generic URN's here. This
-is primarily targeted toward using URN's as an smart way
-of handling lists of mirror sites. For more details, please
-see (http://squid.nlanr.net/Squid/urn-support.html) URN Support in Squid
-.
-
*/
1 change: 0 additions & 1 deletion doc/debug-sections.txt
@@ -98,7 +98,6 @@ section 49 SNMP Interface
section 49 SNMP support
section 50 Log file handling
section 51 Filedescriptor Functions
-section 52 URN Parsing
section 53 AS Number handling
section 53 Radix Tree data structure implementation
section 54 Interprocess Communication
12 changes: 12 additions & 0 deletions doc/release-notes/release-7.sgml.in
@@ -34,6 +34,7 @@ The Squid-@SQUID_RELEASE@ change history can be <url url="https://github.com/squ
<item>Removed purge tool
<item>Remove deprecated languages
<item>Remove Ident protocol support
+<item>Remove URN protocol support
</itemize>

<p>Most user-facing changes are reflected in squid.conf (see further below).
@@ -123,6 +124,17 @@ in the position of what used to be a %ui record field.
<p>If necessary, an external ACL helper can be written to perform Ident transactions
and deliver the user identity to Squid through the **user=** annotation.

+<sect1>Removed URN protocol support
+
+<p>Squid URN resolution code has been neglected for a very long time and
+caused multiple security vulnerabilities. This feature was rarely used (if at
+all). Squid now treats URN as any unknown (to Squid) URI scheme, typically
+responding with an HTTP 400 (Bad Request) ERR_INVALID_URL.
+
+<p>If necessary, Squid handling of unknown (to Squid) URI schemes can be
+enhanced, and a similar feature can be implemented externally, using
+url_rewrite_program helpers or adaptation services.
+
<sect>Changes to squid.conf since Squid-@SQUID_RELEASE_OLD@
<p>
This section gives an account of those changes in three categories:
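Editorial sketch for the release note above: one way an external url_rewrite_program helper might redirect URN requests to a Trivial-HTTP (RFC 2169) resolver. It assumes Squid has first been enhanced to hand such URIs to helpers, which this PR does not do, and that helper concurrency is disabled so request lines carry no channel-ID. The resolver host `resolver.example` and the names `isUrn` and `replyFor` are hypothetical.

```cpp
#include <cctype>
#include <string>

// Hypothetical resolver endpoint; "/uri-res/N2L?" is the RFC 2169 request form.
static const std::string ResolverBase = "http://resolver.example/uri-res/N2L?";

// True if uri starts with "urn:" (URI scheme names are case-insensitive).
static bool isUrn(const std::string &uri)
{
    static const std::string prefix = "urn:";
    if (uri.size() < prefix.size())
        return false;
    for (std::string::size_type i = 0; i < prefix.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(uri[i])) != prefix[i])
            return false;
    }
    return true;
}

// Builds the helper reply for one request line of the form "URL [SP extras]".
std::string replyFor(const std::string &line)
{
    // the URL is everything up to the first space (or the whole line)
    const std::string uri = line.substr(0, line.find(' '));
    if (!isUrn(uri))
        return "ERR"; // tell Squid to leave the request unchanged
    return "OK status=302 url=\"" + ResolverBase + uri + "\"";
}
```

A real helper would call `replyFor()` for each line read from standard input and write each reply, followed by a newline, to standard output.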
20 changes: 2 additions & 18 deletions src/FwdState.cc
@@ -55,7 +55,6 @@
#include "ssl/PeekingPeerConnector.h"
#include "Store.h"
#include "StoreClient.h"
-#include "urn.h"
#if USE_OPENSSL
#include "ssl/cert_validate_message.h"
#include "ssl/Config.h"
@@ -388,19 +387,8 @@ FwdState::Start(const Comm::ConnectionPointer &clientConn, StoreEntry *entry, Ht
return;
}

-switch (request->url.getScheme()) {
-
-case AnyP::PROTO_URN:
-urnStart(request, entry, al);
-return;
-
-default:
-FwdState::Pointer fwd = new FwdState(clientConn, entry, request, al);
-fwd->start(fwd);
-return;
-}
-
-/* NOTREACHED */
+FwdState::Pointer fwd = new FwdState(clientConn, entry, request, al);
+fwd->start(fwd);
}

void
@@ -1272,10 +1260,6 @@ FwdState::dispatch()
Ftp::StartGateway(this);
break;

-case AnyP::PROTO_URN:
-fatal_dump("Should never get here");
-break;
-
case AnyP::PROTO_WHOIS:
whoisStart(this);
break;
2 changes: 1 addition & 1 deletion src/HttpRequest.cc
@@ -812,7 +812,7 @@ HttpRequest::manager(const CbcPointer<ConnStateData> &aMgr, const AccessLogEntry
char *
HttpRequest::canonicalCleanUrl() const
{
-return urlCanonicalCleanWithoutRequest(effectiveRequestUri(), method, url.getScheme());
+return urlCanonicalCleanWithoutRequest(effectiveRequestUri(), method);
}

/// a helper for handling PortCfg cases of FindListeningPortAddress()
8 changes: 0 additions & 8 deletions src/Makefile.am
@@ -455,8 +455,6 @@ squid_SOURCES = \
tunnel.cc \
tunnel.h \
typedefs.h \
-urn.cc \
-urn.h \
wccp.cc \
wccp.h \
wccp2.cc \
@@ -1944,8 +1942,6 @@ tests_testHttpRange_SOURCES = \
tools.h \
tests/stub_tunnel.cc \
tunnel.h \
-urn.cc \
-urn.h \
tests/stub_wccp2.cc \
wccp2.h \
wordlist.cc \
@@ -2331,8 +2327,6 @@ tests_testHttpRequest_SOURCES = \
tools.h \
tests/stub_tunnel.cc \
tunnel.h \
-urn.cc \
-urn.h \
tests/stub_wccp2.cc \
wccp2.h \
wordlist.cc \
@@ -2626,8 +2620,6 @@ tests_testCacheManager_SOURCES = \
tools.h \
tests/stub_tunnel.cc \
tunnel.h \
-urn.cc \
-urn.h \
tests/stub_wccp2.cc \
wccp2.h \
wordlist.cc \
1 change: 0 additions & 1 deletion src/adaptation/ecap/Host.cc
@@ -56,7 +56,6 @@ Adaptation::Ecap::Host::Host()
libecap::protocolHttps.assignHostId(AnyP::PROTO_HTTPS);
libecap::protocolFtp.assignHostId(AnyP::PROTO_FTP);
libecap::protocolWais.assignHostId(AnyP::PROTO_WAIS);
-libecap::protocolUrn.assignHostId(AnyP::PROTO_URN);
Contributor

This renders eCAP unable to deliver the capability claimed in the release notes: acting as a Trivial-HTTP Resolver gateway.
(AnyP::PROTO_UNKNOWN schemes are passed as the static string "unknown", not as the received scheme image.)

Contributor Author

Host application IDs being configured by this code are an optimization that speeds up string comparison for common cases. eCAP code should function correctly without that optimization. If it does not, it is an out-of-scope bug (in eCAP adapter or host application code).
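Editorial sketch of the point made in the reply above: a name comparison that uses host-assigned IDs as a fast path and falls back to comparing the textual image when an ID is missing. The `Name` struct and `sameName` function are hypothetical stand-ins, not libecap's actual classes or implementation.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a protocol name that may carry a host-assigned ID.
struct Name {
    explicit Name(std::string anImage): image(std::move(anImage)) {}
    Name(std::string anImage, const int anId): image(std::move(anImage)), hostId(anId) {}

    std::string image;         // textual form, e.g. "urn"
    std::optional<int> hostId; // set only if the host assigned an ID
};

// Compares two names: by ID when both have one, by image otherwise.
bool sameName(const Name &a, const Name &b)
{
    if (a.hostId && b.hostId)
        return *a.hostId == *b.hostId; // fast path: integer comparison
    return a.image == b.image;         // slow path: string comparison
}
```

Under this design, dropping an ID assignment only removes the fast path for that name; string comparison still gives the same answer.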

libecap::protocolWhois.assignHostId(AnyP::PROTO_WHOIS);
protocolIcp.assignHostId(AnyP::PROTO_ICP);
#if USE_HTCP
2 changes: 0 additions & 2 deletions src/adaptation/ecap/MessageRep.cc
@@ -149,8 +149,6 @@ Adaptation::Ecap::FirstLineRep::protocol() const
return libecap::protocolWais;
case AnyP::PROTO_WHOIS:
return libecap::protocolWhois;
-case AnyP::PROTO_URN:
-return libecap::protocolUrn;
case AnyP::PROTO_ICP:
return protocolIcp;
#if USE_HTCP
1 change: 0 additions & 1 deletion src/anyp/ProtocolType.h
@@ -32,7 +32,6 @@ typedef enum {
#if USE_HTCP
PROTO_HTCP,
#endif
-PROTO_URN,
PROTO_WHOIS,
PROTO_ICY,
PROTO_TLS,
85 changes: 14 additions & 71 deletions src/anyp/Uri.cc
@@ -325,11 +325,6 @@ AnyP::Uri::parse(const HttpRequestMethod& method, const SBuf &rawUrl)
if (scheme == AnyP::PROTO_NONE)
return false; // invalid scheme

-if (scheme == AnyP::PROTO_URN) {
-parseUrn(tok); // throws on any error
-return true;
-}
-
// URLs then have "//"
static const SBuf doubleSlash("//");
if (!tok.skip(doubleSlash))
@@ -531,48 +526,6 @@ AnyP::Uri::parse(const HttpRequestMethod& method, const SBuf &rawUrl)
}
}

-/**
-* Governed by RFC 8141 section 2:
-*
-* assigned-name = "urn" ":" NID ":" NSS
-* NID = (alphanum) 0*30(ldh) (alphanum)
-* ldh = alphanum / "-"
-* NSS = pchar *(pchar / "/")
-*
-* RFC 3986 Appendix D.2 defines (as deprecated):
-*
-* alphanum = ALPHA / DIGIT
-*
-* Notice that NID is exactly 2-32 characters in length.
-*/
-void
-AnyP::Uri::parseUrn(Parser::Tokenizer &tok)
-{
-static const auto nidChars = CharacterSet("NID","-") + CharacterSet::ALPHA + CharacterSet::DIGIT;
-static const auto alphanum = (CharacterSet::ALPHA + CharacterSet::DIGIT).rename("alphanum");
-SBuf nid;
-if (!tok.prefix(nid, nidChars, 32))
-throw TextException("NID not found", Here());
-
-if (!tok.skip(':'))
-throw TextException("NID too long or missing ':' delimiter", Here());
-
-if (nid.length() < 2)
-throw TextException("NID too short", Here());
-
-if (!alphanum[*nid.begin()])
-throw TextException("NID prefix is not alphanumeric", Here());
-
-if (!alphanum[*nid.rbegin()])
-throw TextException("NID suffix is not alphanumeric", Here());
-
-setScheme(AnyP::PROTO_URN, nullptr);
-host(nid.c_str());
-// TODO validate path characters
-path(tok.remaining());
-debugs(23, 3, "Split URI into proto=urn, nid=" << nid << ", " << Raw("path",path().rawContent(),path().length()));
-}
-
/// Extracts and returns a (suspected but only partially validated) uri-host
/// IPv6address, IPv4address, or reg-name component. This function uses (and
/// quotes) RFC 3986, Section 3.2.2 syntax rules.
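Editorial sketch: the NID rules quoted in the removed parseUrn() comment (2 to 32 characters, alphanumeric first and last, letters, digits, or "-" in between) expressed as a standalone check, such as an external helper might need once Squid no longer validates URNs. `isValidNid` is a hypothetical name, and the code uses only the standard library rather than Squid's CharacterSet or Tokenizer.

```cpp
#include <cctype>
#include <string>

// True if c is ALPHA / DIGIT (as classified in the default "C" locale).
static bool isAlphanum(const char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) != 0;
}

// Checks NID = (alphanum) 0*30(ldh) (alphanum), where ldh = alphanum / "-".
bool isValidNid(const std::string &nid)
{
    if (nid.size() < 2 || nid.size() > 32)
        return false; // NID is 2-32 characters long
    if (!isAlphanum(nid.front()) || !isAlphanum(nid.back()))
        return false; // first and last characters must be alphanumeric
    for (const char c : nid) {
        if (!isAlphanum(c) && c != '-')
            return false; // only letters, digits, and '-' are allowed
    }
    return true;
}
```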
@@ -695,23 +648,18 @@ AnyP::Uri::absolute() const

absolute_.append(getScheme().image());
absolute_.append(":",1);
-if (getScheme() != AnyP::PROTO_URN) {
-absolute_.append("//", 2);
-const bool allowUserInfo = getScheme() == AnyP::PROTO_FTP ||
-getScheme() == AnyP::PROTO_UNKNOWN;
-
-if (allowUserInfo && !userInfo().isEmpty()) {
-static const CharacterSet uiChars = CharacterSet(UserInfoChars())
-.remove('%')
-.rename("userinfo-reserved");
-absolute_.append(Encode(userInfo(), uiChars));
-absolute_.append("@", 1);
-}
-absolute_.append(authority());
-} else {
-absolute_.append(host());
-absolute_.append(":", 1);
+absolute_.append("//", 2);
+const bool allowUserInfo = getScheme() == AnyP::PROTO_FTP ||
+getScheme() == AnyP::PROTO_UNKNOWN;
Contributor

This needs to exclude URI with image() of "urn:" (which is now part of AnyP::PROTO_UNKNOWN) or we open a security vulnerability for sensitive data exfiltration.

Contributor

Also, these changes are what render ICAP and helpers unable to meet the release notes claimed capability of performing Trivial-HTTP Resolver gateway.

Contributor Author
@rousskov, Nov 26, 2024

> This needs to exclude URI with image() of "urn:" (which is now part of AnyP::PROTO_UNKNOWN) or we open a security vulnerability for sensitive data exfiltration.

PR code treats all unknown (to Squid) URI schemes the same. This code had received unknown non-URN schemes prior to PR changes. Thus, the "we open a vulnerability" assertion is false: Either that vulnerability existed before these changes, or these changes do not open it.

> Also, these changes are what render ICAP and helpers unable to meet the release notes claimed capability of performing Trivial-HTTP Resolver gateway.

That problem was flagged and addressed in another change request. If necessary, let's continue this part of the discussion there.
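Editorial sketch of the argument in this thread: once `PROTO_URN` is gone, a "urn" scheme image falls into the same bucket as any other scheme Squid does not recognize, so the userinfo rule shown in the diff applies to both alike. The `Scheme` enum and both functions are simplified, hypothetical stand-ins for `AnyP::ProtocolType` and the `allowUserInfo` condition, not Squid code.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Simplified stand-in for the known/unknown scheme split.
enum class Scheme { Http, Https, Ftp, Unknown };

// Maps a scheme image to a known scheme, or Unknown for anything else.
Scheme classify(std::string image)
{
    std::transform(image.begin(), image.end(), image.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (image == "http")
        return Scheme::Http;
    if (image == "https")
        return Scheme::Https;
    if (image == "ftp")
        return Scheme::Ftp;
    return Scheme::Unknown; // "urn" lands here, like any other unlisted scheme
}

// Mirrors the diff's rule: userinfo is kept only for FTP and unknown schemes.
bool allowsUserInfo(const Scheme scheme)
{
    return scheme == Scheme::Ftp || scheme == Scheme::Unknown;
}
```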


+if (allowUserInfo && !userInfo().isEmpty()) {
+static const CharacterSet uiChars = CharacterSet(UserInfoChars())
+.remove('%')
+.rename("userinfo-reserved");
+absolute_.append(Encode(userInfo(), uiChars));
+absolute_.append("@", 1);
}
+absolute_.append(authority());
absolute_.append(path()); // TODO: Encode each URI subcomponent in path_ as needed.
}

@@ -723,15 +671,15 @@ AnyP::Uri::absolute() const
* and never copy the query-string part in the first place
*/
char *
-urlCanonicalCleanWithoutRequest(const SBuf &url, const HttpRequestMethod &method, const AnyP::UriScheme &scheme)
+urlCanonicalCleanWithoutRequest(const SBuf &url, const HttpRequestMethod &method)
{
LOCAL_ARRAY(char, buf, MAX_URL);

snprintf(buf, sizeof(buf), SQUIDSBUFPH, SQUIDSBUFPRINT(url));
buf[sizeof(buf)-1] = '\0';

-// URN, CONNECT method, and non-stripped URIs can go straight out
-if (Config.onoff.strip_query_terms && !(method == Http::METHOD_CONNECT || scheme == AnyP::PROTO_URN)) {
+// CONNECT method and non-stripped URIs can go straight out
+if (Config.onoff.strip_query_terms && method != Http::METHOD_CONNECT) {
// strip anything AFTER a question-mark
// leaving the '?' in place
if (auto t = strchr(buf, '?')) {
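Editorial sketch of the simplified condition above: query terms are stripped (keeping the '?') when stripping is enabled and the method is not CONNECT, with no remaining URN special case. `cleanForLog` and its boolean parameters are hypothetical stand-ins for `Config.onoff.strip_query_terms` and the method check. The sketch uses `std::string` instead of Squid's static buffer.

```cpp
#include <string>

// Returns url with everything after the first '?' removed when stripping applies.
std::string cleanForLog(std::string url, const bool stripQueryTerms, const bool isConnect)
{
    if (stripQueryTerms && !isConnect) {
        const auto pos = url.find('?');
        if (pos != std::string::npos)
            url.erase(pos + 1); // keep the '?' itself, drop what follows
    }
    return url;
}
```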
@@ -814,10 +762,6 @@ urlIsRelative(const char *url)
void
AnyP::Uri::addRelativePath(const char *relUrl)
{
-// URN cannot be merged
-if (getScheme() == AnyP::PROTO_URN)
-return;
-
// TODO: Handle . and .. segment normalization

const auto lastSlashPos = path_.rfind('/');
@@ -962,7 +906,6 @@ urlCheckRequest(const HttpRequest * r)
/* does method match the protocol? */
switch (r->url.getScheme()) {

-case AnyP::PROTO_URN:
case AnyP::PROTO_HTTP:
return true;

9 changes: 2 additions & 7 deletions src/anyp/Uri.h
@@ -23,7 +23,6 @@ namespace AnyP

/**
* Represents a Uniform Resource Identifier.
-* Can store both URL or URN representations.
*
* Governed by RFC 3986
*/
@@ -138,8 +137,6 @@ class Uri
SBuf &absolute() const;

private:
-void parseUrn(Parser::Tokenizer&);
-
SBuf parseHost(Parser::Tokenizer &) const;
int parsePort(Parser::Tokenizer &) const;

@@ -192,9 +189,7 @@ operator <<(std::ostream &os, const Uri &url)
os << url.getScheme().image();
os << ":";

-// no authority section on URN
-if (url.getScheme() != PROTO_URN)
-os << "//" << url.authority();
+os << "//" << url.authority();

// path is what it is - including absent
os << url.path();
@@ -211,7 +206,7 @@ void urlInitialize(void);
/// call HttpRequest::canonicalCleanUrl() instead if you have HttpRequest
/// \returns a pointer to a local static buffer containing request URI
/// that honors strip_query_terms and %-encodes unsafe URI characters
-char *urlCanonicalCleanWithoutRequest(const SBuf &url, const HttpRequestMethod &, const AnyP::UriScheme &);
+char *urlCanonicalCleanWithoutRequest(const SBuf &url, const HttpRequestMethod &);
const char *urlCanonicalFakeHttps(const HttpRequest * request);
bool urlIsRelative(const char *);
char *urlRInternal(const char *host, unsigned short port, const char *dir, const char *name);
2 changes: 1 addition & 1 deletion src/anyp/UriScheme.h
@@ -25,7 +25,7 @@ using KnownPort = uint16_t;
/// validated/supported port number (if any)
using Port = std::optional<KnownPort>;

-/** This class represents a URI Scheme such as http:// https://, wais://, urn: etc.
+/** This class represents a URI Scheme such as http:// https://, wais:// etc.
* It does not represent the PROTOCOL that such schemes refer to.
*/
class UriScheme
2 changes: 1 addition & 1 deletion src/cf.data.pre
@@ -1261,7 +1261,7 @@ ENDIF
# destination TCP port (or port range) of the request [fast]
#
# Port 0 matches requests that have no explicit and no default destination
-# ports (e.g., HTTP requests with URN targets)
+# ports (e.g., HTTP requests with ICY, ICP, and HTCP targets).

acl aclname localport 3128 ... # TCP port the client connected to [fast]
# NP: for interception mode this is usually '80'
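Editorial sketch of the "port 0" wording in the hunk above: a request with neither an explicit port nor a scheme default port is matched as port 0. The `KnownPort` and `Port` aliases follow the context lines shown for src/anyp/UriScheme.h; `portForAcl` is a hypothetical function and not Squid's ACL code.

```cpp
#include <cstdint>
#include <optional>

using KnownPort = std::uint16_t;
using Port = std::optional<KnownPort>;

// Picks the port an ACL would compare against: the explicit port if present,
// else the scheme's default port, else 0 when neither exists.
KnownPort portForAcl(const Port &explicitPort, const Port &schemeDefault)
{
    if (explicitPort)
        return *explicitPort;
    return schemeDefault.value_or(0);
}
```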
4 changes: 2 additions & 2 deletions src/client_side_request.cc
@@ -1797,7 +1797,7 @@ ClientHttpRequest::setLogUriToRawUri(const char *rawUri, const HttpRequestMethod
// Should(!request);

// TODO: SBuf() performance regression, fix by converting rawUri to SBuf
-char *canonicalUri = urlCanonicalCleanWithoutRequest(SBuf(rawUri), method, AnyP::UriScheme());
+const auto canonicalUri = urlCanonicalCleanWithoutRequest(SBuf(rawUri), method);

absorbLogUri(AnyP::Uri::cleanup(canonicalUri));

@@ -1823,7 +1823,7 @@ ClientHttpRequest::setErrorUri(const char *aUri)
uri = xstrdup(aUri);
// TODO: SBuf() performance regression, fix by converting setErrorUri() parameter to SBuf
const SBuf errorUri(aUri);
-const auto canonicalUri = urlCanonicalCleanWithoutRequest(errorUri, HttpRequestMethod(), AnyP::UriScheme());
+const auto canonicalUri = urlCanonicalCleanWithoutRequest(errorUri, HttpRequestMethod());
absorbLogUri(xstrndup(canonicalUri, MAX_URL));

al->setVirginUrlForMissingRequest(errorUri);
2 changes: 0 additions & 2 deletions src/clients/Client.cc
@@ -488,8 +488,6 @@ purgeEntriesByHeader(HttpRequest *req, const char *reqUrl, Http::Message *rep, H
if (urlIsRelative(hdrUrl)) {
if (req->method.id() == Http::METHOD_CONNECT)
absUrl = hdrUrl; // TODO: merge authority-uri and hdrUrl
-else if (req->url.getScheme() == AnyP::PROTO_URN)
-absUrl = req->url.absolute().c_str();
else {
AnyP::Uri tmpUrl = req->url;
if (*hdrUrl == '/') {
1 change: 0 additions & 1 deletion src/error/forward.h
@@ -33,7 +33,6 @@ typedef enum {

/* DNS Errors */
ERR_DNS_FAIL,
-ERR_URN_RESOLVE,

/* HTTP Errors */
ERR_ONLY_IF_CACHED_MISS, /* failure to satisfy only-if-cached request */
1 change: 0 additions & 1 deletion src/icmp/pinger.cc
@@ -34,7 +34,6 @@
* This information is used in numerous ways:
\li - Sent in ICP replies so neighbor caches know how close
* you are to the source.
-\li - For finding the closest instance of a URN.
\li - With the 'test_reachability' option. Squid will return
* ICP_OP_MISS_NOFETCH for sites which it cannot ping.
*/