
Back in the early 2000s I worked for a company that was hired (via intermediary law firms) by publicly listed companies to investigate pump and dump schemes on Yahoo message boards (you might call that the "Reddit of the early 2000s").

It involved a lot of web scraping Yahoo with Perl (specifically LWP) and then lots of analysis by humans with some help from automated tools. For example, if you plotted a histogram of each user's posts, you could clearly see when someone was at work (posted between 9am and 5pm with a drop off around noon) and at home (posts between 6pm and 2am with a peak around 10pm).
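To illustrate the histogram idea, here is a minimal Python sketch (not the original Perl tooling). It assumes you already have one user's post timestamps in their local time zone; the function names and the sample timestamps are made up for the example.

```python
from collections import Counter
from datetime import datetime


def hourly_histogram(timestamps):
    """Return a 24-element list: number of posts in each hour of the day."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


def work_vs_home_share(hist):
    """Fraction of posts made 9am-5pm versus 6pm-2am.

    The windows mirror the ones described above; real schedules vary.
    """
    total = sum(hist)
    if total == 0:
        return 0.0, 0.0
    work = sum(hist[9:17])                   # hours 9..16, i.e. 9am to 5pm
    home = sum(hist[18:24]) + sum(hist[0:2]) # hours 18..23 and 0..1, i.e. 6pm to 2am
    return work / total, home / total


# Hypothetical sample data for a single user.
posts = [
    datetime(2003, 5, 1, 9, 15),
    datetime(2003, 5, 1, 10, 40),
    datetime(2003, 5, 1, 14, 5),
    datetime(2003, 5, 2, 22, 30),
]

hist = hourly_histogram(posts)
work_share, home_share = work_vs_home_share(hist)
```

With enough posts, a dip around hour 12 and a peak around hour 22 would show up directly in `hist`.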

The analysts would often find a piece of information in a two-year-old post, e.g. "Go Cubs!", and another in a one-week-old post, e.g. "I just attended my 20 year UofI reunion", and quickly be able to narrow down who the person might be. Coupled with Lexis Nexis (which was just coming online at the time), we routinely narrowed it down to just one possible person.

Given that this was done back in the early 2000s using ancient servers (by today's standards) and basic statistical analysis with a lot of legwork, I would be surprised if there weren't companies also trying to find the people on Reddit today.



Hah... The guys on the scox message board automated this pretty well; there was a user named warmcat who even open sourced some of it.

The Reddit comments usually aren't long enough to be meaningful, and I think that there are now organised p&d teams, so that level of investigation isn't very useful anymore


You'd be surprised what you can get out of short Reddit comments if they're posting in multiple subs for long enough, see https://redditmetis.com/ for example.


> The Reddit comments usually aren't long enough to be meaningful, and I think that there are now organised p&d teams, so that level of investigation isn't very useful anymore

The Yahoo comments usually weren't that long either. The key point was that while each individual post gave you, on average, only a little bit of information, in the aggregate you got enough to narrow the source down to within 5 people. From there, you could usually narrow it down further using "legwork" etc.
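A rough way to see why the aggregate matters, as a Python sketch: if each clue independently keeps only some fraction of the candidate pool, the fractions multiply. The population size and the fractions below are invented for illustration, and independence between clues is an assumption that real data will not fully satisfy.

```python
import math


def bits_to_identify(population):
    """Bits of information needed to single out one person in a population."""
    return math.log2(population)


def remaining_candidates(population, fractions):
    """Expected candidates left after applying each clue.

    Each fraction is the share of the pool consistent with one clue.
    Assumes the clues are independent of each other.
    """
    remaining = float(population)
    for fraction in fractions:
        remaining *= fraction
    return remaining


# Hypothetical numbers: a pool of one million possible posters and four weak clues.
pool = 1_000_000
clue_fractions = [0.1, 0.01, 0.05, 0.1]

needed_bits = bits_to_identify(pool)
gained_bits = sum(-math.log2(f) for f in clue_fractions)
left = remaining_candidates(pool, clue_fractions)
```

In this made-up case the four clues leave roughly five candidates, even though no single clue is very revealing on its own.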


With something like Robinhood, where you're trading, presumably they have your bank details (and phone details), so pinning your trades on an IRL person is a given.


> Coupled with Lexis Nexis (which was just coming online at the time)

Lexis Nexis has been around a lot longer than that. By "online" do you mean web accessible?


Correct, web accessible is what I was referring to.


What was the result of the investigations?


But was P&D happening?


P&D happens with penny stocks, where a small amount of interest can yield very real change. You can't P&D with any big cap.

And the best part with P&D is that most everyone involved knew what was happening, but they played along hoping to get out before it peaks.

And it was very real. You could get the emails pimping whatever the OTC stock was and watch as it pumped up, pumped up, pumped up, and then crashed to nothing.


I had a lovely strategy of signing up with the stock promoters of Vancouver and South Florida and trying to short all of it. Worked nicely until someone published a paper on returns of spam stocks, at which point it died instantly.


I'm aware of this, but the parent said they did an investigation and I'm asking if there was any actual foul play going on or just suspected.



