-
Notifications
You must be signed in to change notification settings - Fork 0
imhsz/fake_identity_generator_for_whole_countries
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Fake Identity Generator Copyright (C) 2011-2017 Hibbard M. Engler (Bitcoin address 1ERDHsxtekdh5FAsxdb92PBFK7nnuwMkbp for gifts) This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. This is the source code that made the websites xdd.org, xdd2.org, and xdduk.org. It uses bar_database as the database server, information processor. Lanuages are C, Perl, snobol4, bash, awk. Development took 18 months,a nd the website was released in 2012. The idea was to make data for corporate honeypots, to help keep the real infomration safe. Then google started crawling the website. They at one point (2014) had 140 million entries in their search engine, which caused some confusion. We were getting 70,000 page views from users a day, and made some money on advertizing until google did a manual edit, and said that xdd doesn't fit with their guidelines. I also did xdduk.org for england, and wante to do Austrlia, Canada, China, and India. What happened when this was on google, is that people would not get everyone's real information. I was protecting the identity of everyone in the country. :) Right now, xdd.org is down for maintennace, but it is still running. There are many components that make this real cool including: rtr - a web server in C that does single thread assymetric blocking. It could handle about 400 page views a second, that with reading a huge 600 gb database. The database (bar_database) - binary search on bar delimited files (bar|delimited|files|). The file was memory mapped, and the drives are 3 raid 0 SSD's. Real fast. And of course, the final result - a complete set of people with reasonable names, addresses, phone numbers, Social security numbers, passport numbers, credit card numbers, all consistent with eachother, and believable , but entirely fallacious. Thats the best part. I did this with extreme do-diligance. Including queryiong with the FBI before doing it. I also was investigated by the Social Security Administration special services, but never was commited of a crime. So, please, use this for good, maybe a video game that needs a realistic fake town or faks country with fake peopl? maybe as the honeypot to alert companies of a break-in before they get to the real customer data, things like that. Don't use it for comitting fraud. I do not hand have never condoned doing that. Some people have tried using the fake names to commit crimes, and I worked with law enforcement in those cases. Blah Blah Blah. Hib Below is the readme when I first started the project. My idea was simple: 1. make billions of fake names - the ultamite honey pot. 2. Send out a net of leaders to the honeypot, and collect information. 3. sell the informtion of who used the fake data 4. Make sure that all data is externally verifiable - to prove no illintent. 5. generate a context based knowledge sytem using prime numbers. 6. Become the defacto place to verify documents. Fake names: states cities zip code zip cities area codes prefiex (exchanges) first names last names ssn death index feed into ssn death, first name, last name ssn formula name frequency and grouping analysis - to identify weights and occurrences of names. zp4 ---> streets for cities and states use streets and city names to find real addresses with real people - whitepages at first. But terms of use does not allow - so instead Do not use zp4 -- purchase an occupant list for the entire country. Then for each batch For each city for each zipcode for each street for each house for the number of people in each house pick a random last name if the last name is on this street in real life, try again pick a random first name -- with a distribution to go with the last name or the group more likely. if the first name, last name is in theis cipcode or city in real life, try again. pick a middle name. If the state uses the names to produce the drivers license, and there is a real person in the state with first, middle initial, and last name, try again pick a birth date generate a ssn based on the birth date, and a city close to the living in city. generate a drivers license, based on the state rules. Generate 2-3 fake credit card numbers generate a phone number within the area - not picking a number already used. record the information - and sign it so there is a paper trail back to the beginning. Now the data is generated. The next step is to make the honeypot. Make the honeypot look like an apache web server with files just there for browsing. Actually though, the server is running a cgi program. serving up folders - in actuality it is reading the database, producing the results on the fly. For each hit - record the date and time, and sign it. For each new ip address, or ip addresses over a month old, performa traceroute. Use cookies- if allowed, to gain other information Use SSL - and record response times for each packet - if possiblem Find out about web browser plugins and whatnot - to get a computer fingerprint. All this information will then be signed by the program - keeps evidence from being altered. All documents will be hash coded and converted to something like the ID system I developed: ID_1h324873223481284g_ID.<document_type> All documents will get a "key" - a key is a large prime number. Docuemnts can be accessed by the ID- in some cases, or by the key. Cross reference -- association of atoms are performed by multiplying keys together. If n keys are known, then to search through the keys takes n divisions-- modulo's actually. If you know two large numbers, you can use the sieve to quickly find out all the primes of those numbers. :) All content - be it words, etc, will have a set of numbers that make up the content. The earlier primes will allow arrangement of the value primes +- the prime number 2 is reserved for the hyperdocument. So If you know something, and someone else knows something, and you get the idea s together, then you know more than you could know apart. This means that sharing increases the knowledge of information. number : number Well, thats enough of that. I got real stressed out about this thing. I sent a letter to the FBI asking if generating fake nmes is legal or not. I will table this idea for a while, maybee talk to gary about it. My time is better spent looking into a document solutions. Time with 16gb memory: ha - 231m 89737163 hb - 4.43m 89737163 hc -j 1 431m 89737163 hd - 164m 127113427 he -j 2 - 250375111 hf - -j 4 750m 250375111 hg_ssn -j 4 135m 250375111 hi_dl -j 2 62m 250375111 hj -j 2 165m 250375111 hk -j 2 23m 250375111 hl -j 2 132m 250375111 export SET=0 export OWN=0 hm - 55 hn - 258 ho - time make -j 2 405 + more cause of everything and stuff transfer to machine over 1000-base-t encrypted drive - 137 minutes Edit c programs - set minimum date to current date edit filter_me.sh in hc... hd... and he... - and add new exlfza stuff time with 32gb memory: export SET=PUB5 export OWN=KCD5 # copy leads.txt into ~/common # edit make_birthdate.c and make_birthdate2.c and make_ssn_numbers.c and add_birthdate_birth_date.c ha - 303 + 35 hb - short hc - 13 hours 11 minutes = 700 minutes hd - he cd he* time bash save_new.sh - to make the next set
About
Generates a whole country - 300 million fake identies for the us. Also 35 million for the UK.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published
Languages
- HTML 96.9%
- C 1.4%
- Perl 0.7%
- Makefile 0.5%
- Shell 0.4%
- Other 0.1%