
hq-go-url

hq-go-url is a Go (Golang) package for working with URLs. It provides robust tools for extracting URLs from text and parsing them into granular components.

Features

  • URL Extraction: Utilizes advanced regular expression patterns to scan and extract valid URLs from any text.
  • URL Parsing: Extends the net/url parser to break URLs down into granular components.

Installation

To install hq-go-url, run:

go get -v -u go.source.hueristiq.com/url

Make sure your Go environment is set up properly (a recent Go release with module support is recommended).

Usage

Extraction

The extractor package lets you scan text and pull out URLs using advanced regex patterns.

package main

import (
    "fmt"

    "go.source.hueristiq.com/url/extractor"
)

func main() {
    e := extractor.New(extractor.WithScheme())

    regex := e.CompileRegex()

    text := "Check these out: ftp://ftp.example.com, https://secure.example.com, and mailto:[email protected]."

    urls := regex.FindAllString(text, -1)

    fmt.Println("Extracted URLs:")

    for _, u := range urls {
        fmt.Println(u)
    }
}

You can customize how URLs are extracted by specifying URL schemes or hosts, or by providing custom regular expression patterns.

  • Extract URLs with Schemes Pattern:

     e := extractor.New(
     	extractor.WithSchemePattern(`(?:https?|ftp)://`),
     )

    This configuration will extract URLs with http, https, or ftp schemes.

  • Extract URLs with Host Pattern:

     e := extractor.New(
     	extractor.WithHostPattern(`(?:www\.)?example\.com`),
     )

    This configuration will extract URLs that have hosts matching www.example.com or example.com.

Parsing

The parser package extends Go's net/url package to include detailed domain breakdown.

package main

import (
	"fmt"

	"go.source.hueristiq.com/url/parser"
)

func main() {
	p := parser.New()

	parsed, err := p.Parse("https://subdomain.example.com:8080/path/file.txt")
	if err != nil {
		fmt.Println("Error parsing URL:", err)

		return
	}

	fmt.Printf("Scheme: %s\n", parsed.Scheme)
	fmt.Printf("Host: %s\n", parsed.Host)
	fmt.Printf("Hostname: %s\n", parsed.Hostname())
	fmt.Printf("Subdomain: %s\n", parsed.Domain.Subdomain)
	fmt.Printf("Second-Level Domain (SLD): %s\n", parsed.Domain.SecondLevelDomain)
	fmt.Printf("Top-Level Domain (TLD): %s\n", parsed.Domain.TopLevelDomain)
	fmt.Printf("Port: %s\n", parsed.Port())
	fmt.Printf("Path: %s\n", parsed.Path)
}
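The Subdomain/SLD/TLD fields above come from splitting the hostname into its labels. As a naive stdlib sketch of that idea (real parsers consult the public suffix list so multi-label TLDs such as co.uk are handled correctly; this sketch does not):

```go
package main

import (
	"fmt"
	"strings"
)

// splitDomain naively splits a hostname into subdomain, second-level
// domain, and top-level domain by treating the last label as the TLD.
// This is an illustration only: it mishandles suffixes like "co.uk".
func splitDomain(hostname string) (sub, sld, tld string) {
	labels := strings.Split(hostname, ".")
	if len(labels) < 2 {
		return "", hostname, ""
	}

	tld = labels[len(labels)-1]
	sld = labels[len(labels)-2]
	sub = strings.Join(labels[:len(labels)-2], ".")

	return sub, sld, tld
}

func main() {
	sub, sld, tld := splitDomain("subdomain.example.com")

	fmt.Println(sub, sld, tld) // subdomain example com
}
```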

You can customize how URLs are parsed by specifying a default scheme or providing custom TLDs.

  • Parse URLs with a default scheme:

     p := parser.New(parser.WithDefaultScheme("https"))

  • Parse URLs with custom TLDs:

     p := parser.New(parser.WithTLDs("custom", "custom2"))
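A default-scheme option of this kind typically prepends the scheme to inputs that lack one before parsing. Here is a stdlib-only sketch of that idea, using a hypothetical withDefaultScheme helper (not the package's implementation):

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// withDefaultScheme is a hypothetical helper illustrating what a default
// scheme option generally does: prepend the scheme when the input has none.
func withDefaultScheme(raw, scheme string) string {
	if !strings.Contains(raw, "://") {
		return scheme + "://" + raw
	}

	return raw
}

func main() {
	parsed, err := url.Parse(withDefaultScheme("example.com/path", "https"))
	if err != nil {
		fmt.Println("Error parsing URL:", err)

		return
	}

	fmt.Println(parsed.Scheme) // https
	fmt.Println(parsed.Host)   // example.com
}
```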

Contributing

Contributions are welcome and encouraged! Feel free to submit Pull Requests or report Issues. For more details, check out the contribution guidelines.

A big thank you to all the contributors for your ongoing support!


Licensing

This package is licensed under the MIT license. You are free to use, modify, and distribute it, as long as you follow the terms of the license. The full license text is available in the repository.