An unthinkable amount of credentials and other sensitive information gets dumped onto the web every minute. Hackers often paste the results of their attacks and exploits onto the searchable web. Others, such as white hat hackers and experienced penetration testers, send the sensitive information to parties that can manage it appropriately. And others collect or receive such information as soon as it becomes available.

One such party is Dump Monitor. Created in 2013 by security researcher Jordan Wright, Dump Monitor's Twitter account, dumpmon, posts links to pastes containing information from potential data breaches. You can read how Dump Monitor was created [here](http://raidersec.blogspot.com.au/2013/03/introducing-dumpmon-twitter-bot-that.html).

You could simply follow dumpmon on Twitter and check the links it tweets every couple of minutes. I think that's too time consuming...

Looking through sensitive information that is already publicly available is not wrongdoing. You did not obtain the information yourself, you did not publish it, and unless you use it for malicious purposes, there is nothing wrong with accessing it.

Looking through this type of information is often categorized as open source intelligence gathering (OSINT). According to the [White Paper](http://www.oss.net/dynamaster/file_archive/040320/fb893cded51d5ff6145f06c39a3d5094/OSS1997-02-33.pdf), open source intelligence:

_"is intelligence derived from public information - tailored intelligence which is based on information which can be obtained legally and ethically from public sources."_

Open sources for [intelligence](https://en.wikipedia.org/wiki/Open-source_intelligence):

- newspapers, radio, TV, magazines, etc.
- web-based: social networks, blogs, wikis, etc.
- public government records.
- geospatial information.
- deep web.
- and many others.

In this post I'm going to show you how to use Python to create an automated tool that watches dumpmon and downloads the information dumps it links to into local text files.

___

## Gathering Sensitive Information with Python

What you need:

- Python 3.4
- [tweepy](https://github.com/tweepy/tweepy)
- [Twitter API credentials](https://dev.twitter.com/)

Explanation of the algorithm:

- I authenticate with my Twitter credentials (I cannot read Twitter data through the API otherwise)
- I look up dumpmon on Twitter
- I get their 20 most recent tweets
- I retrieve the links and save their contents as local files

Here's the code:

```python
import re
import urllib.request  # "import urllib" alone does not load the request submodule

import tweepy
from tweepy import OAuthHandler

consumer_key = 'your twitter consumer key'
consumer_secret = 'consumer secret key'
access_token = 'your access token'
access_secret = 'your access secret'

# Authenticate against the Twitter API
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)

# Collect the first whitespace-delimited token of each tweet (the paste link)
urls = []
for event in api.user_timeline(screen_name='dumpmon', count=20):
    stri = event.text
    m = re.match(r'([^\s]+)', stri)
    if m:  # skip tweets with no leading token
        urls.append(m.group(0))

# Download each link into dump-0.txt, dump-1.txt, ...
i = 0
for url in urls:
    try:
        urllib.request.urlretrieve(url, 'dump-%s.txt' % i)
        i = i + 1
    except Exception:
        # Skip links that cannot be downloaded
        continue
```

Dump Monitor tweets all pastes and data breach dumps in a very standardized format.

<center>http://s11.postimg.org/po955xjw3/Collecting_Sensitive_Information_like_an_Ethical.jpg</center>

The above algorithm relies on this standardized format: it parses the link out of each tweet and appends it to a list:

```python
for event in api.user_timeline(screen_name='dumpmon', count=20):
    stri = event.text
    m = re.match(r'([^\s]+)', stri)
    if m:
        urls.append(m.group(0))
```

Then it accesses the links and saves them locally as text files.

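To make the link-parsing step easier to try without Twitter credentials, here is a minimal sketch that runs the same "first token" idea on plain strings. The helper name `first_link` and the sample tweet texts are assumptions made up for illustration; they are not real dumpmon tweets, and the extra check that the token starts with `http` is my addition, not part of the script above.

```python
import re

# Same idea as the script above: take the first run of non-whitespace characters.
FIRST_TOKEN = re.compile(r'\S+')


def first_link(tweet_text):
    """Return the leading token of a tweet if it looks like a URL, else None."""
    m = FIRST_TOKEN.match(tweet_text.strip())
    if m is None:
        return None
    token = m.group(0)
    # Extra safety check (an assumption, not in the original script)
    if token.startswith(('http://', 'https://')):
        return token
    return None


# Hypothetical tweet texts, only shaped like "link first, details after"
sample_tweets = [
    'http://example.com/raw/abc123 Emails: 42 Keywords: 0.11 #infoleak',
    'A tweet that does not start with a link',
    '',
]

links = [link for link in map(first_link, sample_tweets) if link is not None]
print(links)
```

Feeding `links` into the download loop instead of `urls` would skip tweets that do not begin with a link, rather than trying to download whatever the first word happens to be.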
Here's what I get after running the algorithm:

<center>http://s11.postimg.org/m23btadir/Collecting_Sensitive_Information_like_an_Ethical.jpg</center>

Some files contain plain-text (unhashed and unencrypted) credentials:

<center>http://s11.postimg.org/lqlvgix2r/Collecting_Sensitive_Information_like_an_Ethical.jpg</center>

___

## What you can Do - Be of Service!

Before giving you ideas, I have to say that this algorithm can be modified in numerous ways. One would be to have it look for specific 'keywords' (your email?) in these data-breaches and save only the files containing those keywords.

If you decide to use this algorithm, you should do it with good intentions in mind. You could look into the dumps and try to alert victims of the information/credential leak about what happened. You could play the investigator. Heck, you could even turn this into a paid job...

___

### <center>To stay in touch with me, follow @cristi</center>

#security #programming

___

[Cristi Vlad](http://cristivlad.com), Self-Experimenter and Author
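
As a postscript, the keyword idea mentioned above could be sketched roughly like this. It assumes the dumps were saved as `dump-*.txt` in the current directory, as the script does; the function names and the sample data are hypothetical and only for illustration.

```python
import glob


def contains_keyword(text, keywords):
    """Case-insensitive check: does the text mention any of the keywords?"""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def filter_dumps(dumps, keywords):
    """Given {filename: text}, return the sorted filenames that mention a keyword."""
    return sorted(name for name, text in dumps.items()
                  if contains_keyword(text, keywords))


def load_dumps(pattern='dump-*.txt'):
    """Read every saved dump matching the pattern into a {filename: text} dict."""
    dumps = {}
    for path in glob.glob(pattern):
        # Dumps can contain arbitrary bytes, so replace undecodable characters
        with open(path, encoding='utf-8', errors='replace') as handle:
            dumps[path] = handle.read()
    return dumps


# In-memory example with made-up content instead of real dump files
example = {
    'dump-0.txt': 'alice@example.com:hunter2',
    'dump-1.txt': 'nothing interesting here',
}
print(filter_dumps(example, ['Alice@Example.com']))
```

On real data you would call `filter_dumps(load_dumps(), ['you@example.com'])` and keep only the files it returns.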
author | cristi |
---|---|
permlink | collecting-sensitive-information-like-an-ethical-hacker |
category | programming |
json_metadata | {"tags":["programming","security"],"users":["cristi"],"image":["http://s11.postimg.org/po955xjw3/Collecting_Sensitive_Information_like_an_Ethical.jpg","http://s11.postimg.org/m23btadir/Collecting_Sensitive_Information_like_an_Ethical.jpg","http://s11.postimg.org/lqlvgix2r/Collecting_Sensitive_Information_like_an_Ethical.jpg"],"links":["http://raidersec.blogspot.com.au/2013/03/introducing-dumpmon-twitter-bot-that.html","http://www.oss.net/dynamaster/file_archive/040320/fb893cded51d5ff6145f06c39a3d5094/OSS1997-02-33.pdf","https://en.wikipedia.org/wiki/Open-source_intelligence","https://github.com/tweepy/tweepy","https://dev.twitter.com/","http://cristivlad.com"]} |
created | 2016-09-20 15:01:00 |
last_update | 2016-09-20 15:01:00 |
depth | 0 |
children | 4 |
last_payout | 2016-10-22 01:33:03 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 39.505 HBD |
curator_payout_value | 12.768 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 4,719 |
author_reputation | 128,305,218,872,904 |
root_title | "Collecting Sensitive Information like an Ethical Hacker" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 1,307,174 |
net_rshares | 34,178,541,015,035 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
val-b | 0 | 15,613,689,367,638 | 100% | ||
xeldal | 0 | 5,927,715,240,779 | 100% | ||
enki | 0 | 4,781,907,470,476 | 100% | ||
boy | 0 | 7,465,395,282 | 100% | ||
bue-witness | 0 | 9,065,657,972 | 100% | ||
bunny | 0 | 1,627,717,745 | 100% | ||
bue | 0 | 133,921,996,380 | 100% | ||
mini | 0 | 3,999,518,299 | 100% | ||
moon | 0 | 513,602,101 | 100% | ||
aizensou | 0 | 97,382,479,006 | 100% | ||
boombastic | 0 | 949,845,396,407 | 100% | ||
bingo-1 | 0 | 2,065,131,965 | 100% | ||
pfunk | 0 | 798,018,410,401 | 100% | ||
healthcare | 0 | 1,502,246,464 | 100% | ||
daniel.pan | 0 | 2,353,653,675 | 100% | ||
donkeypong | 0 | 2,756,745,437,489 | 100% | ||
alexgr | 0 | 46,149,175,082 | 100% | ||
helen.tan | 0 | 692,801,078 | 100% | ||
gavvet | 0 | 1,256,934,793,891 | 100% | ||
klye | 0 | 42,072,617,424 | 100% | ||
murh | 0 | 702,119,490 | 33.01% | ||
thecryptofiend | 0 | 54,706,592,465 | 100% | ||
justtryme90 | 0 | 79,927,659,056 | 100% | ||
coinbitgold | 0 | 139,291,260,872 | 100% | ||
tee-em | 0 | 4,918,468,049 | 100% | ||
the.bot | 0 | 3,675,507,224 | 100% | ||
johnbradshaw | 0 | 8,387,702,043 | 100% | ||
ericvancewalton | 0 | 601,608,147,802 | 100% | ||
rubybian | 0 | 93,524,873,782 | 100% | ||
furion | 0 | 112,153,984,075 | 100% | ||
bitshares101 | 0 | 14,686,999,556 | 100% | ||
alexbeyman | 0 | 112,822,580,397 | 100% | ||
biophil | 0 | 40,748,766,701 | 100% | ||
diana.catherine | 0 | 37,916,324,163 | 100% | ||
luisucv34 | 0 | 798,108,628 | 100% | ||
the-future | 0 | 3,259,904,507 | 100% | ||
sauravrungta | 0 | 38,150,798,552 | 100% | ||
chloetaylor | 0 | 9,276,956,359 | 100% | ||
trisnawati | 0 | 673,487,302 | 100% | ||
alex.chien | 0 | 1,640,791,796 | 100% | ||
logic | 0 | 10,916,790,568 | 100% | ||
jasonstaggers | 0 | 46,131,059,394 | 100% | ||
uwe69 | 0 | 549,943,306 | 5% | ||
jed78 | 0 | 2,666,995,414 | 32% | ||
merej99 | 0 | 5,271,777,430 | 100% | ||
ullikume | 0 | 3,744,304,451 | 100% | ||
kurtrohlandt | 0 | 5,092,430,077 | 100% | ||
future24 | 0 | 748,202,948 | 100% | ||
driv3n | 0 | 53,039,284,810 | 70% | ||
cristi | 0 | 16,233,290,996 | 100% | ||
scaredycatguide | 0 | 16,762,443,558 | 100% | ||
chrismarketing | 0 | 56,389,966 | 100% | ||
bitcalm | 0 | 47,532,281,494 | 100% | ||
kyriacos | 0 | 34,104,632,618 | 100% | ||
lemouth | 0 | 8,167,905,567 | 100% | ||
neptun | 0 | 84,074,278,516 | 100% | ||
mada | 0 | 13,781,709,523 | 100% | ||
andrewawerdna | 0 | 26,454,936,908 | 100% | ||
claudia | 0 | 433,377,257 | 100% | ||
naquoya | 0 | 4,505,166,401 | 100% | ||
pranisa | 0 | 61,375,917 | 100% | ||
sammie | 0 | 172,703,552 | 100% | ||
funkywanderer | 0 | 3,049,227,622 | 100% | ||
plotbot2015 | 0 | 1,594,127,707 | 100% | ||
anomaly | 0 | 182,155,123 | 100% | ||
ola1 | 0 | 103,202,980 | 100% | ||
adriantoma | 0 | 158,418,481 | 100% | ||
inspiring | 0 | 115,377,453 | 100% | ||
mikerano | 0 | 99,521,145 | 100% | ||
aurorax | 0 | 52,861,617 | 100% | ||
bolle | 0 | 145,699,863 | 100% |
@crisit, good stuff. A little scary how easily accessible people's information is in the digital age. This is something that could be a job, we have real life police. Digital police for hire.
author | scaredycatguide |
---|---|
permlink | re-cristi-collecting-sensitive-information-like-an-ethical-hacker-20160920t151022512z |
category | programming |
json_metadata | {"tags":["programming"],"users":["crisit"]} |
created | 2016-09-20 15:11:03 |
last_update | 2016-09-20 15:11:03 |
depth | 1 |
children | 1 |
last_payout | 2016-10-22 01:33:03 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 195 |
author_reputation | 983,507,426,757,770 |
root_title | "Collecting Sensitive Information like an Ethical Hacker" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 1,307,253 |
net_rshares | 0 |
@cristi :) yeah, information is in plain sight.
author | cristi |
---|---|
permlink | re-scaredycatguide-re-cristi-collecting-sensitive-information-like-an-ethical-hacker-20160920t160347282z |
category | programming |
json_metadata | {"tags":["programming"],"users":["cristi"]} |
created | 2016-09-20 16:03:45 |
last_update | 2016-09-20 16:03:45 |
depth | 2 |
children | 0 |
last_payout | 2016-10-22 01:33:03 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 47 |
author_reputation | 128,305,218,872,904 |
root_title | "Collecting Sensitive Information like an Ethical Hacker" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 1,307,763 |
net_rshares | 0 |
I wish I knew as much as you do! Great work Cristi!
author | the-future |
---|---|
permlink | re-cristi-collecting-sensitive-information-like-an-ethical-hacker-20160920t174909796z |
category | programming |
json_metadata | {"tags":["programming"]} |
created | 2016-09-20 17:49:09 |
last_update | 2016-09-20 17:49:09 |
depth | 1 |
children | 1 |
last_payout | 2016-10-22 01:33:03 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 56 |
author_reputation | 64,560,224,887,999 |
root_title | "Collecting Sensitive Information like an Ethical Hacker" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 1,308,767 |
net_rshares | 0 |
hey, thanks! but relative to others, I'm nobody ;)
author | cristi |
---|---|
permlink | re-the-future-re-cristi-collecting-sensitive-information-like-an-ethical-hacker-20160920t180303026z |
category | programming |
json_metadata | {"tags":["programming"]} |
created | 2016-09-20 18:03:03 |
last_update | 2016-09-20 18:03:03 |
depth | 2 |
children | 0 |
last_payout | 2016-10-22 01:33:03 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 50 |
author_reputation | 128,305,218,872,904 |
root_title | "Collecting Sensitive Information like an Ethical Hacker" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 1,308,900 |
net_rshares | 0 |