Saturday, April 9, 2022

Friday, April 8, 2022

Show HN: A poem inside HTTP response headers https://ift.tt/qBkrtO8

Show HN: A poem inside HTTP response headers https://ift.tt/GPLHQaS April 8, 2022 at 05:27PM

Show HN: I Made a Puzzle Game in HTML5 https://ift.tt/HGVxmre

Show HN: I Made a Puzzle Game in HTML5 https://eightcolors.net April 8, 2022 at 07:24PM

Show HN: Colludle – Collaborative Wordle Game https://ift.tt/iRMrqa8

Show HN: Colludle – Collaborative Wordle Game https://ift.tt/z7Uqyj3 April 8, 2022 at 07:07PM

Show HN: Programmatic – a REPL for creating labeled data https://ift.tt/dzHNJq9

Show HN: Programmatic – a REPL for creating labeled data

Hey HN, I’m Jordan, cofounder of Humanloop (YC S20), and I’m excited to show you Programmatic, an annotation tool for building large labeled datasets for NLP without manual annotation.

Programmatic is like a REPL for data annotation. You:

1. Write simple rules/functions that can approximately label the data
2. Get near-instant feedback across your entire corpus
3. Iterate and improve your rules

Finally, it uses a Bayesian label model [1] to convert these noisy annotations into a single, large, clean dataset, which you can then use for training machine learning models. You can programmatically label millions of datapoints in the time taken to hand-label hundreds.

What we do differently from weak supervision packages like Snorkel/skweak [1] is to focus on UI to give near-instantaneous feedback. We love these packages, but when we tried to iterate on labeling functions we had to write a ton of boilerplate code and wrestle with pandas to understand what was going on. Building a dataset programmatically requires you to grok the impact of labeling rules on a whole corpus of text. We’ve been told that the exploration tools and feedback make the process feel game-like and even fun (!!).

We built it because we see that getting labeled data remains a blocker for businesses using NLP today. We have a platform for active learning (see our Launch HN [2]), but we wanted to give software engineers and data scientists a way to build the datasets they need themselves and to make the best use of subject-matter experts’ time.

The package is free and you can install it now as a pip package [3]. It supports NER / span extraction tasks at the moment, and document classification will be added soon. To help improve it, we'd love to hear your feedback or any successes/failures you’ve had with weak supervision in the past.

[1]: We use an HMM for NER tasks and Naive Bayes for classification, using the two approaches given in the papers below:
Pierre Lison, Jeremy Barnes, and Aliaksandr Hubin. "skweak: Weak Supervision Made Easy for NLP." https://ift.tt/rCsUQqy (2021)
Alex Ratner, Christopher De Sa, Sen Wu, Daniel Selsam, Chris Ré. "Data Programming: Creating Large Training Sets, Quickly" https://ift.tt/NpztrfE (NIPS 2016)
[2]: Our Launch HN for our main active learning platform, Humanloop – https://ift.tt/puJhGLo
[3]: Can install it directly here https://ift.tt/OqgB267...

https://ift.tt/T1xHpaS April 8, 2022 at 05:05PM
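The write-rules, check-feedback, aggregate loop described in the post can be illustrated with a small, self-contained sketch. This is not Programmatic's API: the corpus, the four labeling functions, and the helper names (`coverage`, `aggregate`) are all hypothetical, and a plain majority vote stands in for the Bayesian label model (HMM / Naive Bayes) the post mentions.

```python
import re
from collections import Counter

# Hypothetical toy corpus; a real corpus would hold many more documents.
CORPUS = [
    "Dr. Alice Smith joined Acme Corp in 2019.",
    "Bob Jones left Globex Inc last year.",
    "The meeting is on Friday.",
]

# Each labeling function returns a list of (start, end, label) spans.

def lf_company_suffix(text):
    """Label 'Xxx Corp' / 'Xxx Inc' as ORG."""
    pattern = r"\b[A-Z][a-z]+ (?:Corp|Inc)\b"
    return [(m.start(), m.end(), "ORG") for m in re.finditer(pattern, text)]

def lf_after_joined_or_left(text):
    """Label two capitalized words after 'joined'/'left' as ORG."""
    pattern = r"\b(?:joined|left) ([A-Z][a-z]+ [A-Z][a-z]+)"
    return [(m.start(1), m.end(1), "ORG") for m in re.finditer(pattern, text)]

def lf_title_prefix(text):
    """Label two capitalized words after 'Dr. ' as PERSON."""
    pattern = r"\bDr\. ([A-Z][a-z]+ [A-Z][a-z]+)"
    return [(m.start(1), m.end(1), "PERSON") for m in re.finditer(pattern, text)]

def lf_two_capitalized(text):
    """Deliberately noisy rule: any two capitalized words are a PERSON."""
    pattern = r"\b[A-Z][a-z]+ [A-Z][a-z]+\b"
    return [(m.start(), m.end(), "PERSON") for m in re.finditer(pattern, text)]

LABELING_FUNCTIONS = [
    lf_company_suffix,
    lf_after_joined_or_left,
    lf_title_prefix,
    lf_two_capitalized,
]

def coverage(lf, corpus):
    """Fraction of documents in which the labeling function fires at least once."""
    return sum(1 for text in corpus if lf(text)) / len(corpus)

def aggregate(text, labeling_functions):
    """Combine noisy span votes by majority; tied spans are dropped (abstain)."""
    votes = {}
    for lf in labeling_functions:
        for start, end, label in lf(text):
            votes.setdefault((start, end), Counter())[label] += 1
    result = []
    for (start, end), counter in sorted(votes.items()):
        ranked = counter.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            continue  # tie between labels: no confident answer
        result.append((text[start:end], ranked[0][0]))
    return result

if __name__ == "__main__":
    for lf in LABELING_FUNCTIONS:
        print(f"{lf.__name__}: coverage={coverage(lf, CORPUS):.2f}")
    for text in CORPUS:
        print(text, "->", aggregate(text, LABELING_FUNCTIONS))
```

Here the noisy `lf_two_capitalized` rule mislabels company names as PERSON, but the two ORG rules outvote it. A probabilistic label model would instead weight each rule by its estimated accuracy rather than counting votes equally.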

Show HN: Disable now useless “What's new” page in Firefox https://ift.tt/k6Hd5on

Show HN: Disable now useless “What's new” page in Firefox

Firefox 99 started serving a "What's New" page that is an ad for Pocket instead of listing what's new in the browser. Another disappointment. Here's how to disable the now useless "What's New" page:

1. Go to about:config
2. Change the value of "browser.startup.homepage_override.mstone" to "ignore".

Bingo! One less page with ads. Thanks for nothing, Mozilla.

April 8, 2022 at 12:49PM
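For anyone who would rather persist the setting in a file than click through about:config, the same preference can be placed in a `user.js` file in the Firefox profile directory, which Firefox reads at startup. The sketch below is one possible way to do that. The helper names (`user_pref_line`, `add_pref`) are hypothetical, and the profile directory is something you need to locate yourself (about:profiles lists it).

```python
from pathlib import Path

# Preference name and value taken from the post above.
PREF_NAME = "browser.startup.homepage_override.mstone"
PREF_VALUE = "ignore"

def user_pref_line(name, value):
    """Format a string-valued preference the way user.js expects it."""
    return f'user_pref("{name}", "{value}");'

def add_pref(profile_dir, name=PREF_NAME, value=PREF_VALUE):
    """Append the preference to <profile_dir>/user.js unless it is already there."""
    user_js = Path(profile_dir) / "user.js"
    line = user_pref_line(name, value)
    existing = user_js.read_text(encoding="utf-8") if user_js.exists() else ""
    if line not in existing.splitlines():
        with user_js.open("a", encoding="utf-8") as fh:
            # Keep the new entry on its own line if the file lacks a trailing newline.
            if existing and not existing.endswith("\n"):
                fh.write("\n")
            fh.write(line + "\n")
    return user_js

if __name__ == "__main__":
    import sys
    # Usage: python set_pref.py /path/to/your/firefox/profile
    print("Wrote", add_pref(sys.argv[1]))
```

Running it twice against the same profile leaves a single entry, since the function checks for the line before appending. Restart Firefox afterwards for the file to be picked up.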

Show HN: Do You Know RGB? https://ift.tt/t8kUpbO

Show HN: Do You Know RGB? https://ift.tt/OWhvmMT June 24, 2025 at 01:49PM