UPDATE (February 4, 2024): This is the discussion about this project on HN: here. Please read in particular @dang's comment regarding the core assumption of this project: here. On a personal note, the number of Stories removed yesterday (Saturday, February 3, 2024) was the lowest ever recorded by the service, and that count includes 2 duplicate Stories. As a side note, always check whether a Story in the list is a duplicate: this is a perfectly legitimate reason for removal and unfortunately I have no way of detecting it automatically in the service!
The purpose of this project is to try to understand the type and scale of the moderation of the Hacker News Front Page.
NOTE: I love Hacker News. I try to read it every day. In the case of OnnxStream (here for example), 95% of the comments were helpful and intelligent. I also understand that moderating a site with huge traffic and where users are basically anonymous must be a very difficult task.
Returning to the purpose of this project, from what I have been able to see, the "public" (i.e. observable from the outside) moderation of the Front Page relies on two main tools: changing the title of a Story (which, intentionally or not, affects how its rank grows) or removing the Story outright.
Regarding the first type of moderation, an excellent site is already available that tracks changes to Story titles. Here instead I will focus on the second type.
For the reasons explained in the "Why?" section below, I have developed a small application that logs all the Stories that are removed from the Front Page, for personal use. I later discovered that there is no tool/website that provides this type of information and I decided to make it public here. It was a difficult decision but my rationale is: is it better to have more transparency or less transparency?
If you know of a tool/website similar to this, please let me know: I will archive this repo or set it to private.
A very positive outcome for this project would be a list similar to this one, available directly among the HN lists. Another would be for HN to notify a user when their Story is penalized on the Front Page, perhaps indicating the number of flags and/or the reason.
Why? (Feel free to skip this part.)
A friend of mine posted two Stories on Hacker News related to OnnxStream (31 days apart), the first related to SDXL Turbo support and the second related to TinyLlama and Mistral 7B support.
In the case of the first, the Story was near the top of the Front Page until its title was changed from "Stable Diffusion Turbo on a Raspberry Pi Zero 2 generates an image in 29 minutes" to "OnnxStream: Stable Diffusion XL 1.0 Base on a Raspberry Pi Zero 2". This effectively "killed" the Story. One user pointed out that the new title didn't reflect the spirit of the Story (thanks @practice9).
In the case of the second, the Story was in third place on the Front Page, less than an hour after the submission. In this case it was simply removed from the Front Page.
Having discovered this, and perplexed, I sent an email to the moderator. @dang, who was very kind and quick in his response, explained to me that the Story had been flagged by users even though it was not explicitly marked as [flagged], and that he could therefore only hypothesize about the reasons for the flags. His hypothesis was that (some?) users might be fed up with news related to LLMs.
While I have no reason to doubt Daniel's good faith, it's hard to believe that HN users would be tired of LLM-related news.
So I decided to develop a small console application to determine the frequency of this phenomenon (actually I was also motivated by the prospect of writing some C# code, after more than 2 years of complete abstinence). I subsequently discovered that there were no tools/websites that monitored this specific phenomenon and I therefore decided to make it public here.
Using the official HN API, the service fetches the top 90 Stories every minute and compares them with the top 30 Stories (i.e. the Front Page) fetched the previous minute. Every Story that was on the Front Page a minute earlier and is now missing from the top 90 is logged here. The assumption is that a Story cannot drop from the top 30 to a position beyond 90 in a single minute unless it has been explicitly removed. If a Story reappears on the Front Page, it is removed from this log. All Stories present in the second-chance pool are excluded from the log. Title and URL are those from when the Story first appeared in the top 30. The number of points and comments and the rank are those from when the Story was removed from the Front Page. The ID points to the news.social-protocols.org page for that Story, which provides a graph of the Story's position on the Front Page over time.
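The comparison described above can be sketched roughly as follows. This is not the service's actual code (which is a C# console application): it is a minimal Python illustration, and the names `find_removed`, `update_log` and `excluded_ids` are hypothetical. It assumes the Story IDs come from the `topstories.json` endpoint of the official HN API, and that the IDs of Stories in the second-chance pool are supplied by the caller, since how the service obtains them is not described here.

```python
import json
import urllib.request

# Endpoint of the official HN API that returns the ranked top story IDs.
TOP_STORIES_URL = "https://hacker-news.firebaseio.com/v0/topstories.json"
FRONT_PAGE_SIZE = 30  # Stories considered "on the Front Page"
WINDOW_SIZE = 90      # Stories fetched at every polling step


def fetch_top_story_ids(limit=WINDOW_SIZE):
    """Fetch the IDs of the current top Stories, in rank order."""
    with urllib.request.urlopen(TOP_STORIES_URL, timeout=10) as response:
        return json.load(response)[:limit]


def find_removed(previous_ids, current_ids, excluded_ids=frozenset()):
    """Return (story_id, previous_rank) pairs for Stories that were in the
    previous top 30 but are missing from the current top 90.

    excluded_ids is a hypothetical parameter holding the IDs that must
    never be logged (e.g. Stories in the second-chance pool).
    """
    current_window = set(current_ids[:WINDOW_SIZE])
    removed = []
    for rank, story_id in enumerate(previous_ids[:FRONT_PAGE_SIZE], start=1):
        if story_id not in current_window and story_id not in excluded_ids:
            removed.append((story_id, rank))
    return removed


def update_log(log, previous_ids, current_ids, excluded_ids=frozenset()):
    """Apply one polling step to the log, a dict of story_id -> last rank."""
    # A Story that is back on the Front Page is dropped from the log.
    for story_id in current_ids[:FRONT_PAGE_SIZE]:
        log.pop(story_id, None)
    # Stories that vanished from the whole 90-story window are added.
    for story_id, rank in find_removed(previous_ids, current_ids, excluded_ids):
        log[story_id] = rank
    return log
```

In a real run, `fetch_top_story_ids()` would be called once a minute, with the previous result kept as `previous_ids` for the next call to `update_log`. Comparing against 90 Stories rather than 30 is what separates a removal from ordinary rank decay: a Story that merely slides from position 10 to position 60 is still inside the window and is not logged.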
NOTE: always check whether a Story is a duplicate: this is a perfectly legitimate reason for removal and unfortunately I have no way of detecting it automatically in the service!
- 42198783 #22 11 points 4 comments -> Show HN: Shop on Amazon with Crypto
- 42200925 #3 5 points 0 comments -> Ford lost $3.7B on its EV sales
- 42199301 #13 364 points 98 comments -> Z-Library Helps Students to Overcome Academic Poverty, Study Finds
- 42198256 #14 104 points 30 comments -> /usr/bin/env -S uv run
- 42201068 #27 138 points 109 comments -> Boeing overcharged the U.S. Air Force 8,000% above market for soap dispensers
- 42194540 #21 209 points 36 comments -> Pipe Viewer – A Unix Utility You Should Know About
- 42203245 #28 6 points 4 comments -> Vercel acquires Grep
- 42200987 #25 29 points 12 comments -> Why one would use Qubes OS? (2023)
- 42197824 #25 107 points 125 comments -> New Calculation Finds we are close to the Kessler Syndrome [video]
- 42204713 #21 4 points 0 comments -> Las Vegas man who called 911 for help killed by police in his home
- 42205132 #17 6 points 2 comments -> Child safety org launches AI model trained on real child sex abuse images
- 42207804 #26 4 points 0 comments -> Microsoft tries to convince Windows 10 users with full-screen prompts
- 42208021 #17 11 points 3 comments -> Comparison of Claude Sonnet 3.5, GPT-4o, o1, and Gemini 1.5 Pro for coding
- 42210690 #1 -> Microsoft Copilot Customers Discover It Can Let Them Read HR Docs and CEO Emails
- 42211720 #13 10 points 0 comments -> Tesla has the highest rate of fatal accidents among all car brands, report shows
- 42208580 #12 121 points 51 comments -> Security researchers identify new malware targeting Linux
- 42211320 #16 80 points 34 comments -> Pidgin 3.0.0 Experimental 1 Announcement
- 42210022 #17 60 points 32 comments -> Broadcastarr: Stream web content through your Jellyfin instance
- 42211367 #19 80 points 7 comments -> Oppose the Patent-Troll-Friendly Prevail Act
- 42212920 #17 4 points 0 comments -> Mastodon's weaknesses and how to fix them
- 42182047 #29 30 points 40 comments -> Is Python That Slow?
- 42213629 #20 6 points 1 comments -> The Soviet scientist who survived a particle accelerator beam through his head
- 42215043 #24 18 points 11 comments -> Toddlers Shoot Three People Every Month in Texas
- 42215202 #27 8 points 0 comments -> Majority of people believe their devices spy on them to serve up ads
- 42215608 #30 6 points 0 comments -> Feds release options for Colorado River as negotiations between states stall
- 42217095 #20 17 points 7 comments -> Texas Opens Investigation into Conspiracy to Boycott Certain Social Platforms
- 42218757 #27 4 points 7 comments -> Microsoft's New PC Looks Just Like a Mac Mini but Serves a Whole New Purpose
- 42219036 #20 3 points 2 comments -> Pulling gold out of e-waste suddenly becomes super-profitable
- 42170144 #20 4 points 2 comments -> Who Really Wrote the Bible: The Story of the Scribes
- 42219646 #6 5 points 1 comments -> UK Farmers Trigger the Revolution – Politely
- 42219695 #22 29 points 7 comments -> Teslas Are Involved in More Fatal Accidents Than Any Other Brand, Study Finds
- 42220545 #13 8 points 0 comments -> Texas approves Bible-infused curriculum for public schools
- 42220719 #21 7 points 0 comments -> iFixit Shares M4 MacBook Pro Teardown
- 42220062 #27 36 points 25 comments -> How the ZX Spectrum became a 1980s icon
- 42221615 #10 6 points 0 comments -> Kent Overstreet restricted from participation in kernel development
- 42224891 #26 7 points 6 comments -> An Idaho County Will Publish Everyone's Ballots to Combat Mistrust
- 42224398 #24 51 points 40 comments -> Wordpress.org released Secure Custom Fields "PRO" version with ACF pro features
- 42226860 #13 8 points 2 comments -> Win for Internet freedom: Google must sell its Chrome browser
- 42226963 #4 5 points 4 comments -> Not Using Copilot
- 42227290 #12 14 points 10 comments -> PHP Is Legacy, in 2024
- 42227151 #20 20 points 40 comments -> Homeless people to be given cash in first major UK trial to reduce poverty
- 42189125 #12 11 points 5 comments -> Why Smart C Coders Love Lua
- 42222717 #16 145 points 42 comments -> 1 Dataset. 100 Visualizations
- 42227996 #19 4 points 0 comments -> C++ Standards Contributor Expelled for 'The Undefined Behavior Question'
- 42230240 #12 6 points 2 comments -> Comparison of Claude Sonnet 3.5, GPT-4o, o1, and Gemini 1.5 Pro for coding
- 42231007 #13 -> WebSockets cost us $1M on our AWS bill
- 42232430 #24 3 points 1 comments -> Semantic Transpiler Agent
- 42213485 #24 10 points 2 comments -> Slow Email (2008)
- 42234097 #6 12 points 7 comments -> I Stopped Using Kubernetes. Our DevOps Team Is Happier Than Ever
- 42235887 #20 3 points 0 comments -> Adding Bluesky-powered comments to any website in five minutes
- 42229299 #12 302 points 96 comments -> WireGuard: Beyond the most basic configuration
- 42228472 #12 240 points 5 comments -> Full LLM training and evaluation toolkit
- 42226953 #14 263 points 356 comments -> Charset="WTF-8"
- 42234174 #24 44 points 9 comments -> Raspberry Pi Pico 2 W on sale now at $7
- 42197918 #20 30 points 0 comments -> Dec(k)-Month 2: A Decker Game Jam
- 42234147 #8 98 points 11 comments -> Judge's Investigation into Patent Troll Results in Criminal Referrals
- 42212033 #7 9 points 2 comments -> Show HN: VR CPR app where the heart and lungs compress based on ur hand position
- 42232715 #19 29 points 14 comments -> Computing Industry Doesn't Care about Performance: how I made things faster
- 42232289 #12 139 points 60 comments -> Wildlife monitoring technologies used to intimidate and spy on women
- 42231754 #27 20 points 14 comments -> Worldtimeapp.com Easy Timezone Converter
- 42236574 #8 5 points 1 comments -> Bhutan Cashes Out $33.5B in Bitcoin, Still Holds $1.11B in BTC
- 42203290 #14 8 points 3 comments -> Code Search – Grep by Vercel
- 42171135 #18 4 points 0 comments -> The OSI Model Revisited (2023)
- 42237297 #20 5 points 0 comments -> Maths in Computer Science. What I wish I knew before starting university
- 42235762 #24 27 points 40 comments -> What happened when a city started accepting - not evicting - homeless camps
- 42235015 #13 11 points 0 comments -> A Non-Technical Guide to Interpreting SHAP Analyses
- 42237014 #28 4 points 0 comments -> Do Coding Boot Camps Make Sense in an A.I. World?
- 42238531 #14 22 points 35 comments -> Bluesky is breaking the rules in the EU
- 42239263 #29 6 points 2 comments -> Deno vs. Oracle: Canceling the JavaScript Trademark
- 42180746 #29 16 points 2 comments -> The heirloom tomato org chart [video]
- 42239319 #16 6 points 1 comments -> Ending Affirmative Action Harms Diversity Without Improving Academic Merit [pdf]
- 42240350 #25 8 points 2 comments -> Supreme Court wants US input on whether ISPs should be liable for users' piracy
- 42240316 #26 4 points 0 comments -> The next big arenas of competition
- 42240938 #26 4 points 0 comments -> Nvidia claims a new AI audio generator can make sounds never heard before
- 42188795 #29 -> Hurricane Watch: The Peter McNeeley Website
- 42240364 #24 27 points 32 comments -> Synapse still can't find its money
- 42241438 #24 4 points 0 comments -> Why Are All Tech Products Now Shit? [YouTube] [video]
- 42222773 #21 -> Apollo 68080: high performance 68k processor on FPGA
- 42183312 #18 13 points 0 comments -> Astra Dynamic Chunks: How We Saved by Redesigning a Key Part of Astra
- 42163477 #22 17 points 3 comments -> Making waves through the Wallace Line
- 42164058 #27 89 points 30 comments -> Transactional Object Storage?
- 42242825 #11 -> CEO fired 90% of his staff for missing a morning meeting
- 42243375 #26 20 points 40 comments -> Tesla appears to be building a teleoperations team for its robotaxi service
- 42244183 #14 28 points 0 comments -> Linux 6.13 KVM Eliminates an "Awful Idea", Many x86_64 Improvements
- 42213712 #19 32 points 11 comments -> New Comic Book: La BD de L'Avent, Le Lombard Publishing
- 42243689 #29 22 points 12 comments -> Your docs are your infrastructure
- 42245916 #19 15 points 6 comments -> Judge blocks Louisiana law that requires classrooms to display Ten Commandments
- 42246891 #10 7 points 0 comments -> Israel cracks down on Palestinian citizens who speak out against the war in Gaza
- 42245400 #28 28 points 40 comments -> Medicare Pays Different Prices for the Same Drug
- 42247001 #8 25 points 21 comments -> An update on Google's compliance with the DMA
- 42246596 #19 5 points 1 comments -> Qodo Merge integration with Jira — ensure code complies with ticket
- 42248380 #12 9 points 5 comments -> Why you should never kiss a baby on the head
- 42243500 #27 301 points 384 comments -> Lies we tell ourselves to keep using Golang (2022)
- 42211735 #18 4 points 2 comments -> A Guide to Server-Side Rendering
- 42248167 #9 61 points 2 comments -> Hats Off to NASA's Webb: Sombrero Galaxy Dazzles in New Image
- 42252041 #3 27 points 2 comments -> Reply on Bluesky and Decentralization
- 42250429 #22 37 points 14 comments -> Show HN: Clean Your Mac with a Script
- 42252904 #21 4 points 0 comments -> Why Rust and Its Memory Safety Lulls Developers into a False Sense of Security
- 42255043 #18 5 points 1 comments -> Raspberry Pi Compute Module 5
- 42255092 #22 9 points 1 comments -> SpaceX rocket explosion shredded the upper atmosphere