Seems incredibly prone to false positives. For example, I can guarantee that I'm not the AdmiralAsshat on Reddit, Gmail, or Twitter, because that username was already taken by the time I tried to sign up for them.
You always need to balance Precision and Recall. For something like this, you want to be exhaustive.
I've worked on search engines, and depending on who is using them, that balance gets struck differently on the ROC curve. For legal matters, for example, they want every record that might match. For ad-hoc (Google-style) queries, nobody reads the second page, so you care more about Precision @ 20 (or really, at 3).
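To make the trade-off concrete, here is a minimal sketch of the two metrics. The function names and the toy result list are my own illustration, not taken from any particular search engine.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k returned results that are actually relevant."""
    top_k = ranked[:k]
    if not top_k:
        return 0.0
    return sum(1 for doc in top_k if doc in relevant) / len(top_k)


def recall(ranked, relevant):
    """Fraction of all relevant documents that were returned at all."""
    if not relevant:
        return 0.0
    returned = set(ranked)
    return sum(1 for doc in relevant if doc in returned) / len(relevant)


# Hypothetical data: six results returned, four documents truly relevant.
ranked = ["a", "x", "b", "y", "z", "c"]
relevant = {"a", "b", "c", "d"}

p_at_3 = precision_at_k(ranked, relevant, 3)  # top 3 are a, x, b
r = recall(ranked, relevant)                  # a, b, c found; d missed
```

A legal-discovery user mostly cares that `r` is as close to 1 as possible, while an ad-hoc searcher mostly cares about `p_at_3`.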
> You always need to balance Precision and Recall. For something like this, you want to be exhaustive.
Respectfully disagree. You're right in principle if you build a tool like that for yourself. But since this is Open Source, you have to take into account that people who don't understand that will use the tool as well and then use that as "evidence" in whatever arguments they're having with someone.
> Respectfully disagree. You're right in principle if you build a tool like that for yourself. But since this is Open Source, you have to take into account that people who don't understand that will use the tool as well and then use that as "evidence" in whatever arguments they're having with someone.
Respectfully disagree with you respectfully disagreeing - this line of thinking can be used to argue against almost any information sharing.
i.e. Should we stop governments releasing statistics that might be misinterpreted by an uninformed press? Should we stop open access to medical journals because untrained readers might use them for incorrect medical advice? Should we stop companies releasing public annual reports, because investing consumers that are untrained in reading financial documents might misinterpret them?
I'm just stating that I think the data is likely a priori garbage (without a lot of sanitizing work), in the hope that that message spreads faster than the use of the tool. I think having that discussion is both important and preferable to censorship.
Actually, there is a recent ~"law"? (discussed on HN) that essentially forbids publishing research that may be misinterpreted by the pronoun people (i.e., any research can be banned).
Hopefully they read one sentence about what the tool does before using it. If they don't, their arguments will be weak due to their own sloppiness in research, which doesn't seem like a new problem.
the thing is, people who want to use this for defamation don't have an incentive to sanitize the data - the message might stick around regardless of whether it has any merit (we've seen this happening even at the most publicly scrutinized national level, imagine how much fewer defenses smaller players have)
Defamation is definitely one problem with this tool. Once there is “evidence”, it can be (ab)used without consequence by someone, because they can simply point at the tool output. If the output is usually “good enough”, it will provide an alibi for their actions: “I didn’t make a mistake, the computer did.”
Nobody who bothers to check what the tool does will be swayed by this. If someone wants to make shit up, and their audience will not bother to spend 30 seconds looking at their "sources", they could just write a script that outputs whatever they wanted and then point at that script.
You are putting a lot of faith in people to do that diligence. Most people will take the “evidence” at face value. And yes, people do write scripts to generate material to defame or hurt others.
You're making my point. People take evidence at face value, so a program like Maigret makes no difference. Why bother using Maigret when you can just make stuff up?
Somewhat offtopic, I'm curious about this "nobody reads the 2nd page" meme...
I find myself reading pages 2-5 quite often, because page 1 just didn't give enough results, and I doubt I'm that much in a minority?
(I'm talking about actual generalist searches, not people that use a global search engine as a replacement to bookmarks or directly searching, for instance, Wikipedia.)
I imagine that each additional page presentation is exponentially less likely to be reached and considered a 'reputable' result by someone normal.
If I'm looking for a specific issue I'll sometimes try out things as deep as 10 or more pages of search results if nothing on the first 100 ish hits selects the issue, but then only if I can't think of any keyword variations to use that might get me a better result match. I don't expect the average user to go even remotely that far.
That's not how statistics work. I consider myself a frequent visitor of the second results page, but even for me the CTR of the results on the second page is < 1 %, because I almost always find the thing I want on the first page.
The question ought to be "conditional on not finding the result on the first page, how likely is the user to go to the second page versus balk, or re-try a different query?"
I'm fairly confident that number is higher than 1 %, but I don't have the data.
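A quick sketch of the difference between the two numbers, using made-up counts (I don't have real data, as said). It assumes everyone who goes to page two was unsatisfied with page one.

```python
def page_two_rates(searches, satisfied_on_page_one, went_to_page_two):
    """Return (overall rate, rate conditional on page one failing).

    Assumption: only users who were not satisfied on page one
    ever continue to page two.
    """
    unsatisfied = searches - satisfied_on_page_one
    overall = went_to_page_two / searches
    conditional = went_to_page_two / unsatisfied if unsatisfied else 0.0
    return overall, conditional


# Hypothetical numbers: 1000 searches, 990 answered on page one,
# 5 of the remaining 10 continue to page two.
overall, conditional = page_two_rates(1000, 990, 5)
```

With these invented counts the overall rate is half a percent, while the conditional rate is one in two, so both "nobody reads page 2" and "I read page 2 a lot" can be true at once.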
Yup; for common/short usernames it's all too common for other folks to register using the same identifier.
The end result seems to be that this tool decides you're interested in "dating", "porn", "stocks" and tags you with a "ru" country code - despite not owning any of the accounts that the determination has been based off of.
Isn't it obvious just from reading one sentence about what it does? As it clearly says, it's based on username only, so multiple people's results may be mixed in the dossier if they share a username. We don't need a disclaimer for every thing that a moment's thought can reveal.
The people that use this tool to dig up dirt on someone will love to see a mixed bag, because it could be useful to cast shade on their actual target. This tool will be abused on purpose.
Have you ever been part of a civil lawsuit? The lawyers will absolutely lie through their teeth to make their cases. They will introduce any “evidence” that they think will help them. Would you really trust a jury or a judge to understand and believe that this report is not accurate? Having been through the process, let me tell you straight up: you should not put your faith in the justice system.
Not sure where the rant about the justice system came from. You are making my point — people willing to lie through their teeth can just make up any report they like, the existence or accuracy of Maigret changes nothing.
I intentionally steal nicknames I've seen. I'm an asshole as well, so sometimes it works out. I have 13 different nicknames so far that I use/have used since 1997, though I tend to rotate between all of them regularly. I DO hope they try to use "AI" to track me. That will be fun.
I look at it differently, even if someone isn’t going around posting racist/horrible things, people and tastes change over time. I’ve been a part of fandoms that are now seen as cringey or toxic. I’ve also grown up more as a person and I look back at a lot of my old comments as sophomoric. I write differently, and my opinions on things have changed as well. I’ve had people dig through my post history on sites like Reddit to try and find a “gotcha” based on some remark I made years ago.
All in all, I personally feel like it is a good thing to cycle through usernames throughout life.
> I’ve been a part of fandoms that are now seen as cringey or toxic.
—BuyMyBitcoins
Joking aside, there's definitely value to rotating usernames frequently. I've started using random strings on various sites because I really don't see an up side (for me) to being trackable from site to site and definitely across time. (I use very long random strings for my banking usernames because I don't trust them to have enough bits of entropy in their passwords.)
I'm dropping some hot takes on hacker news tonight, so I agree. Although the idea that you can stay anonymous is probably naive. Between database leaks and AI text analysis, good luck.
It’s not so much of a benefit, as that some of us simply don’t care what others think. Actions speak louder than words and all that.
An Internet comment from 10 years ago might cost you a job, or it might cost a friendship. But at least I’m not living a facade about being a perfect and flawless individual, and that helps me sleep better at night.
Seems like the solution is built-in to their strategy. If you get accused of being racist because a racist on twitter uses the same handle, just make a new handle and start over.
Some annoying people seem to have used my gmail to sign up for things like Twitch. I have all the three factor authentication stuff set up, so I don't think they can get in to click the "verify" links, but I wonder if this sort of tool would be able to verify that the account was... verified. Probably not.
My Gmail is first.last name. I get very sensitive documents for a lawyer with my same name who resides in Texas. We've actually had some decent conversations over the years.
I have an extremely common (for German-speaking countries) first-last name combination and I have firstlast@gmail.com (not used anymore, but I check it around once a week). I get so many confidential documents or personal photos.
The most hilarious/sad was the insurance provider domcura which advertised how they got some kind of award for their great processes, yet writing to 2 or 3 different service emails that they are sending me confidential documents resulted in nothing until I wrote to their data protection officer.
It depends, probably Max Müller or Hans Müller, just anything Müller. Unlike the US, we don’t normally use a real name as placeholder name [0], the options there would be Max or Erika Mustermann (literally example or pattern man), for an average person it’s Otto Normalverbraucher (Otto Average-Consumer) and Lieschen Müller.
Holy shit. I never made the connection with Otto Normalverbraucher. Thanks for making me click with this little piece of German trivia on the day German history was made! :)
There was a John Smith who was briefly the leader of the Labour Party in the UK (he sadly died very soon after becoming leader from a heart attack), and I remember him saying that before he became well-known, when he checked into hotels with his real name the receptionists often eyed him suspiciously, thinking it was a false name to cover up the fact he was checking in with his mistress.
It was a receipt for a medical purchase; at first I thought I was getting scammed. What tipped me off was the email was sent to firstnamelastname@gmail.com and NOT firstname.lastname@gmail.com. That was the day I realized Google would even do that.
I ended up using the phone number in the email to contact the person and forwarded the email. And yes, they had my first and last name :)
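For anyone who wants to check whether two Gmail addresses land in the same inbox, here is a minimal sketch. It only covers the dot and letter-case behaviour for gmail.com; the function name is my own, and other providers are deliberately left untouched because many of them do treat dots as significant.

```python
def canonical_gmail(address):
    """Collapse dot and case variants of a gmail.com address.

    Non-Gmail addresses keep their local part unchanged.
    """
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    domain = domain.lower()
    if domain == "gmail.com":
        # Gmail ignores dots and letter case in the local part.
        local = local.replace(".", "").lower()
    return f"{local}@{domain}"


same_inbox = (
    canonical_gmail("Firstname.Lastname@gmail.com")
    == canonical_gmail("firstnamelastname@gmail.com")
)
```

So a sender who forgets the dot (or adds extra ones) still reaches the same mailbox, which is how these misdirected receipts arrive.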
Yup! I ended up having to look him up via the state's bar association's website. Was able to get his bar association email, which was different than the Gmail. It's the only contact I have in Google Contacts because it happens about 3 times a year.
He got a congratulations email on a bmw purchase one time. Had a good discussion about cars, we are both gear heads.
I'm subscribed to so many informal church groups in Alabama originally but "I've" since moved to Texas. I tried quite hard to clear up the confusion but with virtually no action ever taken, so now I just observe.
From those emails, it's nice seeing how people support each other and apparently I'm in demand giving scripture classes.
But my e-mail address has been used by real people to subscribe to services in Sweden, Turkey and somewhere in South America. At least language helps to sort things.
I used to have a similar-enough email to a dentist office in Texas and would get a lot of patient scans / files as well. Reached out a dozen times before it finally stopped.
The annoying thing is, my email address is based on my name, which is fairly unique (no relation to my account name here), to the point where I'm nearly 100% certain that it is intentional (gotten from a leak of emails which have signed up for some service or another).
My wife has a firstname@gmail.com account and she gets a lot of other peoples’ emails including some rather sensitive stuff (bank statements, employment info, etc.)
Your current username at geronnimo mail dot kazooie? Fun fact: at email dot com is a functional AOL address, still recognized by all corporate services, even with completely made up user part. e.g. birdperson @ email dotcom
Side question: do people still believe that email scrapers are unable to parse “username at email dotcom” or “my username at gmail”? Or is it just a cultural thing?
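To show how little protection that spelling offers, here is a rough sketch of what a scraper might do. This is my own toy example, not code from any real scraper; it expects a single short candidate string rather than free-running prose, and it only handles the "at" / "dot" / "dotcom" spellings.

```python
import re

# Deliberately loose pattern; real address validation is more involved.
_ADDRESS = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def deobfuscate(candidate):
    """Turn 'user at host dotcom' style text into user@host.com, or None."""
    s = candidate.strip().lower()
    s = re.sub(r"\s*\bdot\s*com\b", ".com", s)  # "dotcom" or "dot com"
    s = re.sub(r"\s+dot\s+", ".", s)            # remaining " dot "
    s = re.sub(r"\s*(?:@|\bat\b)\s*", "@", s)   # " at " or " @ "
    return s if _ADDRESS.fullmatch(s) else None


first = deobfuscate("username at email dotcom")
second = deobfuscate("birdperson @ email dotcom")
```

Three substitutions are enough for the common cases, so the habit probably survives more as a convention than as a defence.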
I'm more surprised by the fact that people have used my standard username on not just one, but three porn sites, and that I am now a 31 year old camgirl from Thailand, a 36 year old man from Norfolk looking for love and a 31 year old man from South Africa.
> Maigret collect a dossier on a person by username only...
The About field:
> Collect a dossier on a person by username from thousands of sites
"on a person" seems to imply that they'd belong to someone in particular. Obviously if you have any experience of creating accounts you'd know that's unlikely to be the case, and it's not written in a promissory tone. But it does imply it.
Please don't steal. You may be making it difficult for the people you steal from.
I often use the same username(s) on multiple forums where people discuss similar topics because I want other people who visit the same forums to recognise me as being the same guy.
I've found this username taken a few times, and it has bugged me every time. While it is from Tolkien's Elvish, it is obscure and then misspelled (actually a portmanteau). I've had to add a prefix, such as "TheReal" to it.
An interesting extension to this tool would be to let you specify the interests and the site you want to sign up to, and it would generate a username that fits that profile and is available on the specific site.
I've tried out both today. As far as I can tell maigret generates nice reports whereas sherlock only gives you the URLs and you have to dig into those yourself.
Well, that is creepy as hell, but I guess it is obvious that such a tool could exist, and it is better not to have it exclusively in the hands of data-brokers, etc.
I wonder how hard it would be to add the functionality: go back to the email address that has registered these accounts, find any other names they've registered, and search off those. (EDIT: err, wait, I bet that's the "recursive" functionality they mention).
You have probably already clicked ”accept”[1] on one of those ”cookie notices” which actually make you give your consent to having your data processed and all your profiles and devices linked by hundreds, if not thousands of companies all over the world.
[1]: more accurately, you didn’t go through the list of all vendors and didn’t object to their ”legitimate interests”, if that was even an option.
I've started to just generate a 10-char alphanumerical string with the password generator feature and use that as the username on new accounts, starting to update them retroactively, too (by scrapping the old account and making a new one).
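If you don't have a password manager handy, something like this sketch does the same job. The lowercase-plus-digits alphabet is my own assumption (some sites treat usernames as case-insensitive), and `random_username` is just an illustrative name.

```python
import secrets
import string

# Assumed alphabet: lowercase letters and digits only.
ALPHABET = string.ascii_lowercase + string.digits


def random_username(length=10):
    """Return a random alphanumeric username from a CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


name = random_username()
```

Using `secrets` rather than `random` means the handle is not predictable from earlier ones, which matters if the point is to keep accounts unlinkable.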
Because it might be tied back to your email account or some other common denominator. Using different usernames is good for dox attacks like this but doesn't necessarily help you if your email is compromised. So you need different passwords (AND MFA!!!!)
But people are commonly lazy, and you'd find that most people who don't even consider resistance to deanonymization as a concern will just use the same name on many sites.
Speaking personally, I kind of want people to know I'm the same person in different places. I do realize this is a terrible idea from a privacy perspective, and probably a lot of other perspectives. But I think a bunch of people want to use the net this way.
It's only bad from a privacy perspective if you are sharing things you don't want linked together or are sharing things that can be unintentionally linked back to your real life. If you are the same username across multiple sites and you want those linked together then that's just branding.
Using the same username may allow you to manually poison your own profile to make your data less useful than a shadow profile generated through stylometry
I do this. I have a handle I've been using for 20+ years (not this one and nothing on HN) and I lie about stupid shit rather frequently.
I also create one-off usernames in addition to my 'permanent' name. That way everybody sees the permanent name and assumes I'm too stupid/ignorant to use different names, so nobody ever suspects the other names are me. (Using different emails because I do the same thing with email addresses and - when I can afford it - phone numbers).
> It's only bad from a privacy perspective if you are sharing things you don't want linked together or are sharing things that can be unintentionally linked back to your real life
You must live in a location where you have very little fear of political oppression. If you're an Iranian or something these days the situation is very different.
I use recognisable usernames in places (like here) where I'm fine with my comments being linked to my real name, and less obvious ones, without explicitly hiding it, in places where I'd prefer my comments don't show up in Google searches for my name (e.g. Reddit - if you trawl through my reddit profile, linking it to my name is easy, and it wouldn't be a problem if you found it, but I don't get a bunch of reddit comments when I Google my name), and then totally separate user names for anything I actually want to be anonymous for.
My situation is a bit unusual in that I almost certainly have a globally unique name (my last name is a corruption of an uncommon name in Norway; said corruption occurred two places independently, and there are less than 500 people with that last name in Norway, and probably about the same in the US, and to date I've seen no indication of anyone combining it with my first name), my first name plus last initial is also unusual enough that I only rarely can't get it as a username.
I've started doing this too. I use DuckDuckGo's email forwarder and they provide random email addresses to hide your real one. Every service I sign up with now has a random email address, and whatever the email address is, that becomes my username. Couple this with a password manager and it works well. If I ever lose access to the password manager I am screwed.
This is technologically impressive, but I was disappointed (but not particularly surprised) to see no mention of the ethics involved in using this sort of project - in either the GitHub issues or the README/docs.
This seems like a great tool for stalking people; particularly the recursive functionality (for tying identifiers together).
I'm not saying the world is worse-off due to the release of software like this - indeed, you could argue that publishing these sorts of OSINT tools allow folks to take a proactive and protective view of their own information and e.g. make changes to their own profiles/privacy settings.
But the question of ethics seems like something all-too-often glossed over in the infosec world. Software doesn't exist in a vacuum.
Software like this _already_ exists. Credit bureaus, advertising firms, data brokers, skip trace databases, three-letter agencies across the globe, and nearly all of the Big Tech companies do their business building centralized profiles of basically everyone and that certainly includes correlating online identities.
The only difference here is that one is more limited and open source.
There's a difference between being able to buy a gun at the gun store where your gun is registered to your name and you can only pick it up after a multi-day waiting period, and being able to pick up a gun at the free gun dispensary at the push of a button in a back alley with no supervision.
I realize this may be a failure to communicate because of a difference in shared values but I come from the perspective that harm reduction is worthwhile even if you can't fully eliminate the problem. That you can't prevent all crimes doesn't mean you shouldn't try to prevent crime.
Not to mention that the risk profile of having three-letter agencies come after you or having a random obsessive weirdo "collect a dossier" and share it with other obsessive weirdos who get a kick out of making your life miserable because they project their life failures on you is very different. Namely, if you're prominent enough to have an entire state agency try to destroy you, it's probably over an active decision you made (e.g. focusing your life on political activism) whereas for random weirdos to harass you and attempt to drive you into suicide you just need to show up on their radar long enough for them to start making up reasons to hate you.
So, the question is: are you more afraid of random nobodies online, or are you more afraid of already organized, already powerful people who had similar capability already?
Unless you're actively plotting to overthrow the government, commit violence against politicians or public property, or participate in major civil rights movements, you should be more worried about a random nobody funnelling all their built up hate and frustration into making your life miserable than a three letter agency, yes.
The difference is that the three letter agency has a budget, process and middle management so at the end of the day someone needs to justify why they're expending resources on making your life miserable whereas the random obsessive weirdo just needs to convince a bunch of other obsessive weirdos that you're a garbage person they can turn into their "lolcow" of the day/week/month/year.
The three letter agency is more likely to inconvenience you out of apathy than to actually try to destroy you intentionally. The random nobody is more likely to try to drive you into suicide for a thrill.
Definitely nobodies online. I don't think anything I've ever done or will do would put me in the interest radius of big agencies, government or private (outside of perhaps advertising and a few exceptions).
I have however, encountered death threats or obsessive/hostile behavior from people online who decided they have a beef with me for one reason or another. I'm much more inclined to be wary of some unstable individual having the means to find my identity (and therefore my location)- potentially using this to act on their emotions IRL.
I'm a nobody, and fiercely proud of it. And I pity the NSA employee who has to monitor my Internet usage, they're going to see an awful lot of Javadoc, programming blogs, websites about plane crashes (and I hope they read some AdmiralCloudberg on Medium for themselves), and websites about rocks and faultlines.
In reality, as mentioned by other commenters, I'm not anything they're interested in, at best I'm a datapoint that advertisers try to categorise into a demographic (male, 30 - 50, doesn't give a shit about cars, into tech).
And anyway, they're already doing it, but impersonally.
It's the person who is very personal that I'm not enthused about.
Not only are you happy to never challenge current power in any meaningful way, but you are sure you never will be. If there is a revolution tomorrow and the Trumpists are in charge, or the Marxists, or the Russians or whatever, you're still sure you will be fine with that order too and will never rock the boat.
You're also confident that if you do nothing wrong by them you have nothing to worry about.
Instead, you've been taught to fear the weak, the nobodies. The nameless savage in the night. As if they have any more reason to hate you than the powerful have.
Absolutely more worried about the "nobodies". I'm boring to big scary organizations. But individuals have a wide potential for varying levels of instability and vengeance. Especially online.
As a commentary on the conundrum, not so much your post.
I always think it’s strange when people make the argument that it’s good for harmful tools to be out in the open because people can/will change their behavior because of them.
99.999999% of people will never see this repo, never know things like this exist, and don’t visit sites like HN but can, and probably will be, affected by it in some way.
> I always think it’s strange when people make the argument that it’s good for harmful tools to be out in the open because people can/will change their behavior because of them.
> 99.999999% of people will never see this repo, never know things like this exist, and don’t visit sites like HN but can, and probably will be, affected by it in some way.
Yup, I do think you're right. And honestly, the folks who'd probably most benefit from being able to run one of these tools to empower themselves with data that can help them change their behaviour/privacy settings - they're probably actually the folks least likely to be able to install and run the tool.
I disagree entirely. As someone with a moderately public persona who is also in the LGBTQ+ community, having tools like this is extremely helpful, both to help me keep a grasp on information about myself and for helping less technically inclined people in my community protect themselves. It's all information that a motivated attacker could find easily with or without the tool, and there are a sufficient number of motivated attackers that any moderately public queer person will, at some point, be doxed and subjected to some level of internet harassment.
Yeah but this tool just lowered the barrier-to-entry for any motivated attacker. What before was an, oh, 20 hour slog through a bunch of sites looking for the same username, is now a 15 minute run of a script.
For an analogy, consider another tool, an app you can side-load and use to unlock any Prius made between 2010 and 2013. Is this beneficial? Certainly some affected Prius owners will be helped by its release. But other affected Prius owners will be (probabilistically) harmed. Even if you're one of the group that was helped (probably arguing the exploit was already well known prior to the tool's release, perhaps?) I think it is dishonest to assert that the release of the tool is "good" from a utilitarian sense.
I definitely see your point, but when there are, at minimum, dozens of people who are willing to undergo that 20 hour slog, I'm glad there's a tool that makes it take mere minutes to find OSINT leaks myself.
I'm not saying you're necessarily wrong in general. It's possible this is bad for the world. But I maintain that for myself and the people I care about, it's more useful than it is damaging.
I agree. I think this publication is a net-loss for the world, for sure. People will find and use this code--people who could/would never create it themselves.
It will be used for many reasons, but the stalking is the most obvious.
Not all of my usernames are as resistant as this one, unfortunately.
You have to squint really hard to see "many reasons". Look at the list of sites it checks. Porn / fetish sites, gambling URLs, photo hosting, hobby forums, etc. It's a tool for doxxing and stalking, and little else.
It's essentially like arguing that releasing open-source ransomware toolkit is beneficial. I mean, maybe it's your right, and one can make some strenuous arguments about how it helps the "defense", but really, it just makes it easier to be a terrible person on the internet.
I feel this is similar to Firesheep[1], a browser extension from a decade ago that put cookie session hijacking into the hands of everyday users and became wildly popular until its removal. Pre-Snowden it pushed various major sites to implement HTTPS encryption, including Facebook[2].
Granted there the solution was a rather simple one but I feel it's at least worthwhile for more to be conscientious of singular identities online and what info is disclosed publicly with them.
I think the argument is more that this tool by virtue of being easily accessible creates more malicious actors. Similar to how most people wouldn't steal a locked bike, but a larger portion would steal an unlocked one. It isn't that much harder to steal a locked bike vs an unlocked one, but the threshold is just a tiny bit higher so more people will attempt it. Conversely this tool lowers the threshold for stalking, so more people, who otherwise wouldn't, will use it maliciously. That isn't the fault of the tool or its developers, but it is something to be aware of when building any tool. When you release it, it may get abused by people for bad purposes.
Assuming your online identity is secure by obscurity is not a realistic option. If all it takes is a flashlight for everyone to become “vulnerable”, we need to just assume the light is always on. Someone handing out flashlights just highlights the bigger problem.
If this truly causes worry, adopt better opsec and create generative usernames for different sites. If not, assume anyone will easily be able to link your bowel issue subreddit comments to your LinkedIn profile.
If this wasn’t on GitHub, we wouldn’t be aware of it
The more availability this type of tool has, the less professional its users are. As a result, while it makes it easier for you to see what “they” can see about you, “they” become much more personal and bitchy than credit bureaus or ads companies who already have it. E.g. a credit bureau would never sell your comments or “private preferences” to your boss to step over you in career. It increases attack vectors enormously.
Is blissful ignorance better?
It is in this case, I believe. It’s like spreading free covert time-travel-enabled surveillance devices among general public. Someone will pat it onto your back just for dark fun.
The actual fix is to not reuse usernames. Does distributing password stuffing tools increase the ease of password stuffing? Yes. Is that the problem? No - people reusing passwords is the problem.
It doesn’t work retrospectively. You either start a new web life or live under a risk of accidentally exposing or linking to one of your already vulnerable accounts.
Also, if we don’t raise issues like this, the next actual fix will be “don’t reuse writing styles and vocabularies on different sites”.
This line of thinking reminds me of a Roko’s basilisk situation. Embrace it and be prepared vs. don’t mind this nonsense and just actively stop creating/spreading it. And if it spreads enough, make it a punishable offense like hacking/piracy (e.g. via “canary” links) for it to live only in underground. I hope this project will just fly under the public radar due to non-ease of use by an average person or for a similar reason.
This project is exceptionally minor compared to all the public hacking tools that exist. Are you advocating to start censoring any tool that can be used for malicious purposes?
Wrong approach imho. This means all the tools developed for ethical hacking are "bad" because they can be used for "bad". This is like saying we shouldn't sell knives because knives can be used to kill someone. It's just silly.
This tool could be used to teach people, e.g. in OSINT challenges, it could be used to gather information for a pentesting job, it can be used to teach people about best practices online etc ...
A knife has many uses, from cutting wood to stabbing people. This tool is very much specialized in the stabbing (not necessarily people, as you said, you could stab a ballistic dummy to see the damage it does). Do you really think a stab-only knife should be sold with no background checks, with no tracking of who buys it?
> This tool could be used to teach people, e.g. in OSINT challenges, it could be used to gather information for a pentesting job, it can be used to teach people about best practices online etc
Or just... look yourself up. Like most of the people on this HN thread have done.
We could debate where they are but surely there are limits? I wouldn't want everyone to have access to things like Pegasus (https://citizenlab.ca/2022/10/new-pegasus-spyware-abuses-ide...). You could argue Pegasus could be used to hack a phone of a child predator to save a life, just as you could argue this tool could be used to educate. Maybe for you this username tool falls outside of yours, but it looks more like a switchblade than a butter knife to me.
This is simply a demonstration in essential opsec: don't reuse handles across sites (especially uncommon ones), cycle handles often on high leak platforms (such as Hacker News), don't link your identities by creating profiles.
Anything you put online is perpetual and will be used against you; any innocent hobby or adolescent joke will become a major transgression at the right time in your life and for the right audience. Even slight grammatical idiosyncrasies in the words you type can be used to root you out by motivated parties.
Curate your official public profile like you are preparing to run for office; one day you just might. Anything else should be anonymous.
Ethics plays no part in this reality. The term 'ethics' is only ever used to justify the unethical, eg ethics committees find that it is fine to use embryos in science. It is about applying the veneer of morality over the top.
Ethical decisions are down to the individual.
Unfortunately, lots of people think ethics is what the law allows, or what government says, or even what teachers teach. Most individuals don't take the time to realise or uncover and then act upon the ethics they have innately - aka following one's heart/conscience.
Worse still, many paychecks depend upon the unethical, so don't expect any critical introspection or changes soon.
This is why I use many different usernames, I must have used maybe three dozen different ones by now. The only time I use the same username is for games because I want people to know what other games I play.
My opsec is pretty awful but I at least try to not be "simply type in the username I always use into google or some tool on GitHub" bad. I at least want somebody to have to spend a few hours or pay a data broker to reveal all my secrets, because my bet is that nobody cares about me enough to do that.
I would be willing to bet that, around the same time those tools become widely distributed (kind of like what stable diffusion did for AI artwork), adversarial tools that emulate or obfuscate writing styles will exist as well.
I guess analysing your writing style against thousands of billions of messages on the Internet just to link two of your accounts together is... overcomplicated?
I mean, of course it could work, but unless there is a very special interest in a particular account, we may need to wait some decades before a list of "linked accounts" containing it could emerge, right?
Comparing the options described in their readme files, maigret adds --html, --pdf, --tags beyond what sherlock supports. And, it adds documentation beyond a README file.
I took a look at my own user names and it's pretty obvious that the results here are mostly worthless.
So many people with empty blogs and the same user name. Unless the blog is auto-created, I don't see the point.
I do find all the .ru tags to be concerning. I can imagine out-of-touch security types at Paypal deciding to sanction me based on it without any recourse.
Yeah I've seen a similar tool fail hard on any wiki site that returns a placeholder page for any username whether or not it actually exists. I've never seen anything except 95% false positives, which is so many that after a few runs you just ignore the tool.
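One way to cut down on that kind of false positive, as a rough sketch and not a description of how Maigret or any other tool actually does it: query the site with a random control username that almost certainly doesn't exist, and only count a hit if the real username gets a different answer. Everything here is hypothetical, including the `fetch` callable (assumed to map a username to an HTTP status and page body) and the two stand-in sites used to exercise it.

```python
import secrets
from typing import Callable, Tuple

# Hypothetical interface: username -> (HTTP status, page body)
Fetch = Callable[[str], Tuple[int, str]]


def looks_like_real_profile(fetch: Fetch, username: str) -> bool:
    """Count a hit only if the site answers differently for a random control name."""
    control = "zz" + secrets.token_hex(12)  # almost certainly unregistered
    status, body = fetch(username)
    if status != 200:
        return False
    control_status, control_body = fetch(control)
    if control_status != 200:
        # The site distinguishes existing names from missing ones.
        return True
    # The site serves a page for any name: compare bodies with the names stripped out.
    return body.replace(username, "") != control_body.replace(control, "")


# Stand-in for a wiki that returns a placeholder page for every username.
def wiki_like(name: str) -> Tuple[int, str]:
    return 200, f"There is currently no text in the page User:{name}."


# Stand-in for a site that returns 404 for unknown usernames.
KNOWN = {"alice": "Alice's profile: 42 posts"}


def strict_site(name: str) -> Tuple[int, str]:
    if name in KNOWN:
        return 200, KNOWN[name]
    return 404, "not found"


if __name__ == "__main__":
    print(looks_like_real_profile(wiki_like, "alice"))
    print(looks_like_real_profile(strict_site, "alice"))
```

This costs one extra request per site, and it would still be fooled by pages that embed per-request content such as timestamps, so treat it as a filter rather than proof.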
It's mostly worthless but the recursive search via aliases can give some interesting insights. I ran it on my GitHub username and while the result ended up linking me to countries I have never been to (probably via a yandex account as the username is very common and a lot of the profiles were of other people) it did include my full name and cities I've lived in. It seems the most "dangerous" link in my case was Gravatar.
Interestingly, searching by e-mail gave the most garbage results, likely because most services don't expose the e-mail address directly. I guess it's an advantage that neither my usernames nor my legal name are particularly unique.
I gave it a try feeding it with one of my community nicknames. It was rather generic so I did expect some false positives. Color me surprised when it automatically started a second search iteration using my real name!
Turns out that one of the websites I'm registered to uses Gravatar to display profile pictures. I've naively been using the same WP account to set a picture for all my emails, believing that you couldn't get information about my other accounts using a simple picture.
I guess I was wrong, and I'll probably split my Gravatar account this evening.
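For anyone wondering why a picture can link accounts: a Gravatar avatar URL is built from a hash of the trimmed, lowercased e-mail address, so the same e-mail produces the same URL on every site that shows it. A minimal sketch using the older MD5 form of the hash (the function names are mine, and how a given tool gets from the hash to a real name is not shown here):

```python
import hashlib


def gravatar_hash(email: str) -> str:
    """Hash of the trimmed, lowercased address (legacy MD5 variant)."""
    normalized = email.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def gravatar_url(email: str) -> str:
    """Avatar URL that any site using Gravatar would embed for this address."""
    return f"https://www.gravatar.com/avatar/{gravatar_hash(email)}"


if __name__ == "__main__":
    # Same address with different case/whitespace gives the same URL,
    # so two profiles showing it can be tied to one e-mail.
    print(gravatar_url("Someone@Example.com "))
    print(gravatar_url("someone@example.com"))
```

Using a separate e-mail address (and so a separate hash) per identity is what breaks the link.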
I used to do what this tool does, but manually, when networking: find someone I wanted to network with, then find all of their online accounts to try to find an "in." Definitely very creepy, but if I called it networking, not stalking, then it's okay, right?
You can make it not feel creepy by using the research as a way to seed the conversation rather than replacing it. Ask good questions because you know [part of] the answer, but let the person decide how much they want to say in the moment.
An even better trick is to invite them for a podcast recording. Then all this research is suddenly very normal and expected.
>The Maigret database contains not only the original websites, but also mirrors, archives, and aggregators.
Deeply unsettling. Having had my share of encounters with people online taking personal interest in an unfriendly context, this is worrying. I don't need to run a search of my info to know to use different usernames, but an automated feature that lets just about anybody determined enough easily find not only linked accounts but also archived data is another matter. I'm glad I no longer use social media, or at least only infrequently and not the major ones, but the fact that someone can find archived information? Time to make some adjustments.
I think you're supposed to do this with Stylometrics[1]. So I guess this is sort of the greedy hacky approach to try and associate users across websites.
My operating assumption of Palantir as a company is that they have a very advanced system of a similar shape that tries to accomplish the same goal of linking accounts across services, but I have no insider knowledge.
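To give a feel for what the stylometric approach looks like in its simplest form, here is a toy sketch: compare character trigram counts of two writing samples with cosine similarity. This is my own illustration of one common baseline, not what any real system is known to use, and on short samples the score is very noisy.

```python
import math
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams after lowercasing and collapsing whitespace."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def style_similarity(sample_a: str, sample_b: str) -> float:
    return cosine_similarity(char_ngrams(sample_a), char_ngrams(sample_b))


if __name__ == "__main__":
    a = "Respectfully disagree. You're right in principle, imho."
    b = "Respectfully disagree with you, imho it's just silly."
    print(style_similarity(a, b))
```

A serious attempt would need far more text per account and features beyond raw n-grams, which is part of why the username-matching shortcut is so much cheaper.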
I've been fantasizing about something like this for years.
But my idea was more along the lines of supporting API keys for many different sites, taking your time to configure all these API settings for all these different "site scraping engines" and that would give you the activity of someone you might be tracing. Their comments, their posts and so forth.
I was thinking this would be great for snooping employers who want to know what their WFH employees are up to.
Funnily enough, I was even imagining support for Steam, with friend requests coming from these robots so that in rare cases you could even see live when an employee started playing a game.
Tested docker version: nice but the report is a long list, much like an "index" of the provided username present on a set of sites.
It should provide at least some numbers/graphs to "measure" your presence on the web, like, I don't know, the number of GitHub repositories, Instagram photos and so on.
A user with few Instagram photos but a lot of GitHub repositories is different from a guy with a lot of blog posts, Pinterest entries, Twitch videos... and so on.
This works pretty well but it's pretty basic. It doesn't for example list all HN comments and ascertain interests etc from there. Just the account info.
It's nice for an initial scan of a target but I expected a more comprehensive report. It's a good start but it could use a lot more.
Yes also with plus addressed E-Mail, or a catch-all E-Mail generator or if you use Firefox Relay etc a generator for these. But the username generator is just a random English word plus a number.
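The "random English word plus a number" scheme is easy to reproduce yourself. A small sketch, with a made-up word list and helper names that don't come from any particular password manager:

```python
import secrets

# Hypothetical word list; a real generator would draw from a much larger dictionary.
WORDS = ["maple", "harbor", "cobalt", "lantern", "thistle", "quartz"]


def random_username(words=WORDS) -> str:
    """Random word followed by a four-digit number, e.g. for a throwaway handle."""
    return f"{secrets.choice(words)}{secrets.randbelow(9000) + 1000}"


def plus_address(mailbox: str, site: str) -> str:
    """Build a plus-addressed variant of a mailbox, one per site."""
    local, domain = mailbox.split("@", 1)
    return f"{local}+{site}@{domain}"


if __name__ == "__main__":
    print(random_username())
    print(plus_address("me@example.com", "hn"))
```

Note that a plus address still contains the base mailbox, so it helps with filtering more than with unlinkability; a relay or catch-all alias hides more.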
Mai re-gret is reusing my first username for multiple accounts when I first joined the internet. I've tried to transfer accounts to separate work, personal, and anonymous emails, but nothing is ever really deleted.
Did something similar as an experiment a few years ago, except I used photos and name strings as fuzzy identifiers across social media profiles.
We also scraped individual reactions from social media apps to get a _very_ detailed profile on what they engaged with (like using the "Angry" reaction emoji when Trump said something stupid vs using the "Angry" reaction emoji when AOC said something stupid).
Never released it in the wild for obvious ethical reasons, but it was an interesting technical challenge. Also led to super interesting insights – like learning that videos and text links were watched by entirely different audiences on Facebook and Twitter [1]