Whois Parser

From edegan.com


Project
Whois Parser
Project Information
Has title
Has owner Kunal Shah
Has start date
Has deadline date
Has keywords Tool
Has project status Complete
Has sponsor McNair Center
Has project output Tool
Copyright © 2019 edegan.com. All Rights Reserved.


Current Notes

Note: WHOIS is not an acronym but should be capitalized. It isn't here for legacy reasons.

The latest version of the script (based on v2 from Kunal) is in:

E:\tools\WhoisParser\WhoisParser.pl

Packages were updated on Father using PPM (as admin):

  • Net::WhoisNG
  • Net::Whois::Parser

Packages were installed on Mother:

  • cpanm Date::Manip
  • cpanm Net::WhoisNG --force
  • cpanm Net::Whois --force
  • cpanm Net::Whois::Parser

It was run on Father as:

perl WhoisParser.pl -file="DistinctIncubatorDomains.txt" -outfile="IncubatorWhois.txt"

Note that the Date::Manip functions were commented out in the version on Father, and that a map to '' was added to the join on line 174 because most records have nulls for most fields.
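The null handling described above can be illustrated with a minimal Python sketch. This is not the code from WhoisParser.pl: the function name `join_fields`, the tab separator, and the sample record are assumptions made for illustration only.

```python
def join_fields(fields, sep="\t"):
    """Join one record into a delimited line, writing missing values as ''.

    The tab separator is an assumption; the page does not state the delimiter.
    """
    return sep.join("" if value is None else str(value) for value in fields)


# Hypothetical record: only the domain and the creation date are known.
row = ["example.com", "2010-11-17", None, None]
print(repr(join_fields(row)))
```

Mapping missing values to empty strings keeps every row at the same number of columns, so sparse records still line up under the header.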

2016 Version

This wiki page is under Additional Links/WhoisParser

The whoisParser was written by Kunal Shah on March 20, 2016 and is located at:

repository: Web_Crawler
branch: shoeb_patch/whoisParser
directory: /WhoIsParser
file: whoisParser.pl

Location:

E:\McNair\Projects\Houston\WhoIsParser

To use this parser, copy the Perl program above into a directory, make that directory the current working directory (use the 'cd' command if needed), and run the following command. The directory must also contain the input file (see below).

perl WhoIsParser.pl -file=listofurls.txt -outfile=listofurls_processed.txt

NAME

WhoIs Parser - Retrieves and parses Whois information. Specifically, it takes a file with a column of domain names and populates the corresponding columns with information from the WhoIs API.

SYNOPSIS

perl whoisParser.pl -file=<file> [-outfile=<file>]

OPTIONS

   -file=<file>:           Name of the input file of domain names
   -outfile=<file>:        Name of the output file
   -h:                     Display help
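
For readers porting the tool, the option style above can be reproduced with a small Python sketch. It is only an illustration, not the Perl script's own argument handling: the program name `whois_parser_sketch`, the function `build_parser`, and the choice of `None` as the default for `-outfile` are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser that accepts the -file=<file> / -outfile=<file> style."""
    parser = argparse.ArgumentParser(
        prog="whois_parser_sketch",  # hypothetical name, not the real script
        description="Retrieve and parse Whois information for a list of domains.",
    )
    # Single-dash long options mirror the options listed above.
    parser.add_argument("-file", required=True,
                        help="name of the input file of domain names")
    # Defaulting to None is an assumption; the page does not say what the
    # Perl script does when -outfile is omitted.
    parser.add_argument("-outfile", default=None,
                        help="name of the output file")
    return parser  # argparse adds -h automatically


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(
    ["-file=listofurls.txt", "-outfile=listofurls_processed.txt"]
)
print(args.file, args.outfile)
```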

USAGE & FEATURES

Arguments:

A text file with a column of domain names

Returns:

A text file of the domain names with the next 12 columns populated with information pulled from the Whois API. A header naming each column is inserted as the first row of the file. The output columns are:

1. Domain Name
2. Creation Date
3. Expiration Date
4. Update Date
5. Registrant Name
6. Registrant Street
7. Registrant City
8. Registrant Postal Code
9. Registrant Country
10. Admin Street
11. Admin City
12. Admin Postal Code
13. Admin Country
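
As a rough illustration of where such columns come from, the sketch below queries a WHOIS server over TCP port 43 (the plain-text protocol of RFC 3912) and collects the "Key: Value" lines of the reply. It is a simplified Python stand-in, not the Perl modules the script uses; the function names and the first-value-wins rule are assumptions, and real replies differ from registry to registry.

```python
import socket


def query_whois(domain, server, port=43, timeout=10.0):
    """Send one WHOIS query (RFC 3912) and return the raw text reply.

    The caller must supply the server name; choosing the right server for
    a given domain is outside the scope of this sketch.
    """
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_whois_text(text):
    """Collect 'Key: Value' lines into a dict, keeping the first value per key."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        # Skip lines without a colon, and keys with no value.
        if sep and key and value and key not in fields:
            fields[key] = value
    return fields
```

A field such as "Registrant City" can then be looked up with `fields.get("Registrant City", "")`, which yields an empty column when the registry omits it.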

BUGS & FEEDBACK

Worked as expected on all example files. Please report any discovered bugs to Kunal.

Tested files:

Input: example_file.txt

Output: example_outfile.txt


Input Text:


WhoIs input file in Excel

http://1986ventures.com

http://2nd.md/

http://www.2ndsquare.com

http://www.32nddegree.com/

http://www.80legs.com

http://hotmailpasswordsupportnumber.info/

http://www.MidtownDelivery.com

http://accreu.com

http://www.actionfigurelabs.com

https://m.facebook.com/AddictivePerformance99

http://www.additech.com/

http://adknowledgents.wix.com/adknowledgents

http://www.rmudata.com

http://www.advancedcardiodr.com/

http://alwii.org

http://www.advancedseismic.com

http://www.AdvoWire.com

http://www.aggredyne.com

http://www.akrostechlabs.com/

http://www.aleedex.com

http://www.alertlogic.com/

http://www.aliceandlove.com

https://www.alignedsigns.com/ppcregistration6.htm

https://www.alliedwarranty.com/

http://none yet

http://www.alpheus.net
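
The input above holds full URLs, while a WHOIS lookup needs a host name. The page does not say how the script converts one to the other, so the following Python sketch is only one plausible approach; the name `host_from_url`, the stripping of a leading "www.", and the validation pattern are assumptions.

```python
import re
from urllib.parse import urlsplit

# Loose host-name check: dot-separated labels followed by an alphabetic TLD.
_HOST_RE = re.compile(
    r"^(?=.{1,253}$)([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)


def host_from_url(url):
    """Return the lower-cased host without a leading 'www.', or None if invalid."""
    host = urlsplit(url.strip()).hostname  # hostname is already lower-cased
    if host is None:
        return None
    if host.startswith("www."):
        host = host[4:]
    return host if _HOST_RE.match(host) else None


for url in ("http://www.MidtownDelivery.com", "http://2nd.md/", "http://none yet"):
    print(url, "->", host_from_url(url))
```

This sketch does not reduce a subdomain such as m.facebook.com to its registered domain; doing that reliably requires a public suffix list.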

Output Text:

WhoIs output file in Excel

Domain Name Creation Date Expiration Date Update Date Registrant Name Registrant Street Registrant City Registrant Postal Code Registrant Country Admin Street Admin City Admin Postal Code Admin Country

http://1986ventures.com 2013-09-12T09:25:51Z 12-sep-2016 Domain Admin C/O ID#10760, PO Box 16 Note - Visit PrivacyProtect.org to contact the domain owner/operator Note - Visit PrivacyProtect.org to contact the domain owner/operator Nobby Beach QLD 4218 AU C/O ID#10760, PO Box 16 Note - Visit PrivacyProtect.org to contact the domain owner/operator Note - Visit PrivacyProtect.org to contact the domain owner/operator Nobby Beach QLD 4218 AU

http://2nd.md/ 2010-11-17 2017-11-17

http://www.2ndsquare.com 2013-10-16T04:01:29Z 16-oct-2016 2015-10-16T20:38:12Z Sameer Khan 22215 Tower Terr San Antonio 78259 US 22215 Tower Terr San Antonio 78259 US

http://www.32nddegree.com/ 2008-02-18T18:45:15Z 18-feb-2020 Cutshall, Wes 1321 Upland Dr. Houston 77043 US 1321 Upland Dr. Houston 77043 US

http://www.80legs.com 2008-07-17T21:09:48Z 17-jul-2016 Shion Deysarkar 904 West Avenue Austin 78701 US 904 West Avenue Austin 78701 US

http://hotmailpasswordsupportnumber.info/

http://www.MidtownDelivery.com 2012-01-23T05:01:21Z 23-jan-2017 2015-01-05T05:24:56Z Jim Wiseheart 7655 S. Braeswood#21 Houston 77071 US 7655 S. Braeswood#21 Houston 77071 US

http://accreu.com 2011-05-05T00:11:53.000Z 05-may-2016 Oneandone Private Registration 701 Lee Road Suite 300ATTN Chesterbrook 19087 US 701 Lee Road Suite 300ATTN Chesterbrook 19087 US

http://www.actionfigurelabs.com 2011-02-18T17:40:24Z 18-feb-2017 Phillip Leech 2223 Willowby Dr Houston 77008 US 2223 Willowby Dr Houston 77008 US

https://m.facebook.com/AddictivePerformance99

http://www.additech.com/ 1997-01-24T05:00:00Z 25-jan-2018 Additech, Inc. 10925 Kinghurst Houston 77099 US 10925 Kinghurst Houston 77099 US

http://adknowledgents.wix.com/adknowledgents

http://www.rmudata.com 2000-04-13T17:09:54Z 13-apr-2017 PERFECT PRIVACY, LLC 12808 Gran Bay Parkway West Jacksonville 32258 US 12808 Gran Bay Parkway West Jacksonville 32258 US

http://www.advancedcardiodr.com/ 2012-04-17T14:12:09Z 17-apr-2022 2015-01-08T22:09:14Z Sharafat Hussain Advanced Cardiovascular Care Center800 Peakwood Drive, Suite 8C Houston 77090 US Advanced Cardiovascular Care Center800 Peakwood Drive, Suite 8C Houston 77090 US

http://alwii.org 2011-05-31T21:48:05Z Chi Mao 1917 Ashland St, 2nd FloorIn Select Specialty Hospital Houston 77008 US 1917 Ashland St, 2nd FloorIn Select Specialty Hospital Houston 77008 US

http://www.advancedseismic.com 2009-10-30T19:00:47Z 30-oct-2016 2015-10-31T11:28:22Z na na na na 88888 US na na 88888 US

http://www.AdvoWire.com 2013-07-13T08:43:39Z 13-jul-2018 2013-07-13T08:43:39Z Jason Pampell 6516 North Gessner Houston 77040 US 6516 North Gessner Houston 77040 US

http://www.aggredyne.com 2011-04-01T21:03:52Z 01-apr-2018 Robert C. Hux 10530 Rockley Rd.,Suite 150 Houston 77099 US 10530 Rockley Rd.,Suite 150 Houston 77099 US

http://www.akrostechlabs.com/ 2008-03-24T17:34:07Z 24-mar-2017 2015-03-24T01:54:15Z Registration Private DomainsByProxy.com14747 N Northsight Blvd Suite 111, PMB 309 Scottsdale 85260 US DomainsByProxy.com14747 N Northsight Blvd Suite 111, PMB 309 Scottsdale 85260 US

http://www.aleedex.com 2012-12-27T20:15:55Z 10-jun-2019 2013-06-14T09:54:17Z Farid Premani 10500 Reserve at Fountain Lake Stafford 77477 US 10500 Reserve at Fountain Lake Stafford 77477 US

http://www.alertlogic.com/ 2003-10-10T21:24:13Z 10-oct-2019 PERFECT PRIVACY, LLC 12808 Gran Bay Pkwy West Jacksonville 32258 US 12808 Gran Bay Pkwy West Jacksonville 32258 US

http://www.aliceandlove.com 2014-08-07T01:42:29Z 07-aug-2016 c/o WHOIStrustee.com Limited Riverside View Thornes Lane WF1 5QW GB Riverside View Thornes Lane WF1 5QW GB

https://www.alignedsigns.com/ppcregistration6.htm

https://www.alliedwarranty.com/ 2004-03-31T20:07:28Z 31-mar-2018 2014-03-16T04:17:39Z Registration Private DomainsByProxy.com14747 N Northsight Blvd Suite 111, PMB 309 Scottsdale 85260 US DomainsByProxy.com14747 N Northsight Blvd Suite 111, PMB 309 Scottsdale 85260 US

http://none yet

http://www.alpheus.net 2003-03-27T23:14:33Z 27-mar-2018 2016-03-28T11:22:05Z Alpheus Firstcall 1301 Fannin St.20th Floor Houston 77002 US 1301 Fannin St.20th Floor Houston 77002 US
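
The dates in the output above arrive in several formats (for example 2013-09-12T09:25:51Z, 12-sep-2016, and 2010-11-17). A minimal Python sketch of normalizing them to YYYY-MM-DD follows. It is not the Date::Manip code from the script; the function name `normalize_date` and the list of formats, which covers only the shapes visible in this sample, are assumptions.

```python
from datetime import datetime

# Formats observed in the sample output; other registries may use more.
_FORMATS = (
    "%Y-%m-%dT%H:%M:%S.%fZ",  # 2011-05-05T00:11:53.000Z
    "%Y-%m-%dT%H:%M:%SZ",     # 2013-09-12T09:25:51Z
    "%Y-%m-%d",               # 2010-11-17
    "%d-%b-%Y",               # 12-sep-2016 (month names depend on the locale)
)


def normalize_date(raw):
    """Return the date as YYYY-MM-DD, or '' if it is empty or unrecognized."""
    text = (raw or "").strip()
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next known format
    return ""


for value in ("2013-09-12T09:25:51Z", "12-sep-2016", "2010-11-17"):
    print(value, "->", normalize_date(value))
```

Returning an empty string for unrecognized input keeps the column present while leaving the decision about bad values to the caller.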

Summer 2018 Work

I used this parser after running my Google URL finder as detailed on http://mcnair.bakerinstitute.org/wiki/U.S._Seed_Accelerators#Finding_Company_URLs.

Type this in the command line:

perl whoisParser_v2.pl -file="inputfile" -outfile="outputfile"

Associated files can be found in:

E:\McNair\Projects\Accelerators\Summer 2018\url finder

The input file is allURLS.txt and the output file is whoisresults.txt.