hq-go-url

hq-go-url is a Go (Golang) package for working with URLs. It provides robust tools for extracting URLs from text and parsing them into granular components.

- URL Extraction: Uses advanced regular expression patterns to scan and extract valid URLs from any text.
- URL Parsing: Extends the net/url parser to break URLs down into granular components.
To install hq-go-url, run:

```bash
go get -v -u go.source.hueristiq.com/url
```
Make sure your Go environment is set up properly; a recent Go 1.x release is recommended.
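A quick way to confirm the module resolves after installation is to build a tiny program that imports both subpackages used throughout this README. This is a minimal sketch; it assumes the constructors accept zero options, which their option-style usage below suggests.

```go
package main

import (
	"fmt"

	"go.source.hueristiq.com/url/extractor"
	"go.source.hueristiq.com/url/parser"
)

func main() {
	// Construct both types with default options just to confirm
	// the import paths resolve (assumes zero-option constructors).
	_ = extractor.New()
	_ = parser.New()

	fmt.Println("hq-go-url is ready to use")
}
```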
The extractor package lets you scan text and pull out URLs using advanced regex patterns.
```go
package main

import (
	"fmt"

	"go.source.hueristiq.com/url/extractor"
)

func main() {
	// Build an extractor that requires URLs to include a scheme.
	e := extractor.New(extractor.WithScheme())
	regex := e.CompileRegex()

	text := "Check these out: ftp://ftp.example.com, https://secure.example.com, and mailto:[email protected]."

	// Find every substring of the text that matches the URL pattern.
	urls := regex.FindAllString(text, -1)

	fmt.Println("Extracted URLs:")

	for _, u := range urls {
		fmt.Println(u)
	}
}
```
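The FindAllString call above suggests that CompileRegex returns a standard compiled regular expression, so the rest of Go's regexp API should work on it as well. A minimal sketch under that assumption, recovering match positions instead of just the matched strings:

```go
package main

import (
	"fmt"

	"go.source.hueristiq.com/url/extractor"
)

func main() {
	e := extractor.New(extractor.WithScheme())
	regex := e.CompileRegex()

	text := "Visit https://secure.example.com or ftp://ftp.example.com for details."

	// FindAllStringIndex reports the byte offsets of each match in the text.
	for _, loc := range regex.FindAllStringIndex(text, -1) {
		fmt.Printf("%q found at offset %d-%d\n", text[loc[0]:loc[1]], loc[0], loc[1])
	}
}
```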
You can customize how URLs are extracted by specifying URL schemes or hosts, or by providing custom regular expression patterns.
- Extract URLs with a scheme pattern:

  ```go
  e := extractor.New(
  	extractor.WithSchemePattern(`(?:https?|ftp)://`),
  )
  ```

  This configuration extracts URLs with http, https, or ftp schemes.

- Extract URLs with a host pattern:

  ```go
  e := extractor.New(
  	extractor.WithHostPattern(`(?:www\.)?example\.com`),
  )
  ```

  This configuration extracts URLs whose hosts match www.example.com or example.com.

These options can also be combined, as shown in the sketch below.
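A minimal sketch of combining the two patterns into one extractor, assuming the options can be passed together (this README's snippets only show them individually):

```go
package main

import (
	"fmt"

	"go.source.hueristiq.com/url/extractor"
)

func main() {
	// Restrict extraction to http/https/ftp URLs on example.com hosts
	// (assumes WithSchemePattern and WithHostPattern compose as independent options).
	e := extractor.New(
		extractor.WithSchemePattern(`(?:https?|ftp)://`),
		extractor.WithHostPattern(`(?:www\.)?example\.com`),
	)
	regex := e.CompileRegex()

	text := "Links: https://www.example.com/docs, ftp://example.com/files, https://other.org/page"

	for _, u := range regex.FindAllString(text, -1) {
		fmt.Println(u)
	}
}
```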
The parser package extends Go's net/url package to include a detailed domain breakdown.
```go
package main

import (
	"fmt"

	"go.source.hueristiq.com/url/parser"
)

func main() {
	p := parser.New()

	parsed, err := p.Parse("https://subdomain.example.com:8080/path/file.txt")
	if err != nil {
		fmt.Println("Error parsing URL:", err)

		return
	}

	// Standard net/url fields and methods, plus the added Domain breakdown.
	fmt.Printf("Scheme: %s\n", parsed.Scheme)
	fmt.Printf("Host: %s\n", parsed.Host)
	fmt.Printf("Hostname: %s\n", parsed.Hostname())
	fmt.Printf("Subdomain: %s\n", parsed.Domain.Subdomain)
	fmt.Printf("Second-Level Domain (SLD): %s\n", parsed.Domain.SecondLevelDomain)
	fmt.Printf("Top-Level Domain (TLD): %s\n", parsed.Domain.TopLevelDomain)
	fmt.Printf("Port: %s\n", parsed.Port())
	fmt.Printf("Path: %s\n", parsed.Path)
}
```
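Because the parsed result builds on net/url (as the Hostname() and Port() calls above show), the standard fields and methods are assumed to remain available alongside the added Domain breakdown. A minimal sketch under that assumption, reading query parameters the same way as with the standard library:

```go
package main

import (
	"fmt"

	"go.source.hueristiq.com/url/parser"
)

func main() {
	p := parser.New()

	parsed, err := p.Parse("https://shop.example.com/search?q=shoes&page=2")
	if err != nil {
		fmt.Println("Error parsing URL:", err)

		return
	}

	// Standard net/url behaviour (assumed): RawQuery and Query() work as usual.
	fmt.Printf("Raw query: %s\n", parsed.RawQuery)
	fmt.Printf("q: %s\n", parsed.Query().Get("q"))
	fmt.Printf("page: %s\n", parsed.Query().Get("page"))

	// Domain breakdown added by the parser package.
	fmt.Printf("Subdomain: %s\n", parsed.Domain.Subdomain)
	fmt.Printf("SLD: %s\n", parsed.Domain.SecondLevelDomain)
	fmt.Printf("TLD: %s\n", parsed.Domain.TopLevelDomain)
}
```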
You can customize how URLs are parsed by specifying a default scheme or providing custom TLDs.
- Parse URLs with a default scheme:

  ```go
  p := parser.New(parser.WithDefaultScheme("https"))
  ```

- Parse URLs with custom TLDs:

  ```go
  p := parser.New(parser.WithTLDs("custom", "custom2"))
  ```

A fuller example using the default-scheme option is sketched after this list.
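For instance, a default scheme lets scheme-less input be parsed as a full URL. A minimal sketch, assuming WithDefaultScheme applies the scheme before parsing (the exact behaviour may differ):

```go
package main

import (
	"fmt"

	"go.source.hueristiq.com/url/parser"
)

func main() {
	// Treat scheme-less input as HTTPS (assumption based on the option's name).
	p := parser.New(parser.WithDefaultScheme("https"))

	parsed, err := p.Parse("subdomain.example.com/path")
	if err != nil {
		fmt.Println("Error parsing URL:", err)

		return
	}

	fmt.Printf("Scheme: %s\n", parsed.Scheme) // e.g. https, if the default scheme is applied
	fmt.Printf("Host: %s\n", parsed.Host)
	fmt.Printf("Path: %s\n", parsed.Path)
}
```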
Contributions are welcome and encouraged! Feel free to submit Pull Requests or report Issues. For more details, check out the contribution guidelines.
A big thank you to all the contributors for your ongoing support!
This package is licensed under the MIT license. You are free to use, modify, and distribute it, as long as you follow the terms of the license. You can find the full MIT license text in the repository.