Extendable whois parser written in Go.
This project is still under development and is not ready for production use. Any support is appreciated.
go get -u github.com/icamys/whois-parser
To try it out, copy and paste the following example into the Go Playground (don't forget to check the "imports" flag):
package main

import (
	"encoding/json"
	"fmt"

	whoisparser "github.com/icamys/whois-parser"
)

func main() {
	domain := "google.com"
	whoisRaw := "Domain Name: GOOGLE.COM"

	// whoisRecord is of Record type, see ./record.go
	whoisRecord := whoisparser.Parse(domain, whoisRaw)
	whois2b, _ := json.Marshal(whoisRecord)
	fmt.Println(string(whois2b))
}
- com
- ru
- net
- org
- ua
- ir
- in
- br
- tr
- vn (requires POST request with captcha) https://www.vnnic.vn/en/whois-information?lang=en
- uk
- au
- info
- co
- gr (requires POST request with captcha) https://grweb.ics.forth.gr/public/whois
- de
- io
- id
- ca
- by
- jp
- fr
- tw
- xn--p1ai (рф)
- me
- pl
- kz
- za
- mx
- it
- eu
- tv
- xyz
- es (has restriction by whitelist, requires IP registration)
- il
- th
- nl
- my (connect: Connection timed out with whois client)
- online
- biz
- pro
- ar
- us
- club
- edu
- pk (requires POST request) https://pk6.pknic.net.pk/pk5/lookup.PK
- cn
- su
- ch (Requests of this client are not permitted. Please use https://www.nic.ch/whois/ for queries.)
- cl
- co.jp
Before contributing any code, please check that the following commands produce no warnings or errors.
- Check cyclomatic complexity (15 is the maximum acceptable value):
  $ gocyclo -over 15 ./
- Run tests:
  # Use -count=1 to disable cache usage
  $ go test -count=1 ./...
- Lint code:
  $ golint ./...
Let's create a new parser for the TLDs .jp and .co.jp.
- Create a file named parser_jp.go in the root directory.
- Define the parser and register it:
package whoisparser

import "regexp"

// Defining new parser with regular expressions for each parsed section
var jpParser = &Parser{
	errorRegex: &ParseErrorRegex{
		NoSuchDomain:     regexp.MustCompile(`No match!`),
		RateLimit:        nil,
		MalformedRequest: regexp.MustCompile(`<JPRS WHOIS HELP>`),
	},
	registrarRegex: &RegistrarRegex{
		CreatedDate:    regexp.MustCompile(`(?i)\[Created on] *(.+)`),
		DomainName:     regexp.MustCompile(`(?i)\[Domain Name] *(.+)`),
		DomainStatus:   regexp.MustCompile(`(?i)\[Status] *(.+)`),
		Emails:         regexp.MustCompile(`(?i)` + EmailRegex),
		ExpirationDate: regexp.MustCompile(`(?i)\[Expires on] *(.+)`),
		NameServers:    regexp.MustCompile(`(?i)\[Name Server] *(.+)`),
		UpdatedDate:    regexp.MustCompile(`(?i)\[Last Updated] *(.+)`),
	},
	registrantRegex: &RegistrantRegex{
		Name:         regexp.MustCompile(`(?i)\[Registrant] *(.+)`),
		Organization: regexp.MustCompile(`(?i)\[Organization] *(.+)`),
	},
	adminRegex: &RegistrantRegex{
		ID: regexp.MustCompile(`(?i)\[Administrative Contact] *(.+)`),
	},
	techRegex: &RegistrantRegex{
		ID: regexp.MustCompile(`(?i)\[Technical Contact] *(.+)`),
	},
}

// Register newly created parser for the particular TLD
func init() {
	RegisterParser(".jp", jpParser)
}
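- Create a file named parser_co_jp.go in the root directory.
- The whois for .co.jp extends the whois for .jp, so we copy the .jp parser and extend the copy in the init() function:
package whoisparser

import "regexp"

// coJpParser is assigned in init() as a copy of jpParser
var coJpParser *Parser

func init() {
	// Copy jpParser by value: assigning the pointer directly would make
	// the overrides below change the .jp parser as well
	registrarRegex := *jpParser.registrarRegex
	parser := *jpParser
	parser.registrarRegex = &registrarRegex
	coJpParser = &parser

	// Extend coJpParser with additional regexes
	coJpParser.registrarRegex.CreatedDate = regexp.MustCompile(`\[Registered Date\] *(.+)`)
	coJpParser.registrarRegex.ExpirationDate = regexp.MustCompile(`\[State\] *(.+)`)
	coJpParser.registrarRegex.UpdatedDate = regexp.MustCompile(`\[Last Update\] *(.+)`)

	RegisterParser(".co.jp", coJpParser)
}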
- Write tests:
  - Create a whois fixture test/whois_co_jp.txt with a valid whois response
  - Write your parser tests in parser_co_jp_test.go
In some cases the whole address is laid out in a way that makes it more convenient and performant to parse with a single regular expression. For this purpose we use regex named groups.
Use the following regex group names for the address fields:
Field | Regex group name |
---|---|
Street | street |
StreetExt | streetExt |
City | city |
PostalCode | postalCode |
Province | province |
Country | country |
Let's take a look at an example.
- Suppose we have an address:
  Address:          Viale Del Policlinico 123/B
                    Roma
                    00263
                    RM
                    IT
- We can craft a regular expression as follows:
  (?ms)Registrant(?:.*?Address: *(?P<street>.*?)$.*?)\n *(?P<city>.*?)\n *(?P<postalCode>.*?)\n *(?P<province>.*?)\n *(?P<country>.*?)\n.*?Creat
  All address regex groups are optional. If a group name is missing from the expression, an empty string is assigned to the corresponding field.
- Now we assign our crafted regex to some parser structure and the address will be successfully parsed:
var itParser = &Parser{
	registrantRegex: &RegistrantRegex{
		Address: regexp.MustCompile(`(?ms)Registrant(?:.*?Address: *(?P<street>.*?)$.*?)\n *(?P<city>.*?)\n *(?P<postalCode>.*?)\n *(?P<province>.*?)\n *(?P<country>.*?)\n.*?Creat`),
	},
	// ...
}
  Parsing result:
{
  "registrant": {
    "street": "Viale Del Policlinico 123/B",
    "city": "Roma",
    "province": "RM",
    "postal_code": "00263",
    "country": "IT"
  }
}
- Note that if the Address field is set, then any other address regex fields will be ignored:
registrantRegex: &RegistrantRegex{
	Address: regexp.MustCompile(`(?ms)Registrant(?:.*?Address: *(?P<street>.*?)$.*?)\n *(?P<city>.*?)\n *(?P<postalCode>.*?)\n *(?P<province>.*?)\n *(?P<country>.*?)\n.*?Creat`),
	City:    regexp.MustCompile(`City (.*)`), // This regex will be ignored as Address is set
},