Friday, June 14, 2024

Show HN: Real News or Satire? Test Your ability with this game https://ift.tt/lTg6KUu

Show HN: Real News or Satire? Test Your ability with this game Hey HN! I'm sharing a project I developed. It's a Tinder-like game where you swipe left or right to guess if a post is from Hacker News or a satire website. https://ift.tt/ZvWsr9l June 14, 2024 at 01:26AM

Show HN: XDeck – An ad-blocking client app for macOS, like TweetDeck https://ift.tt/SsbE2Fj

Show HN: XDeck – An ad-blocking client app for macOS, like TweetDeck Hi everyone, XDeck is a client app for macOS and a TweetDeck alternative, with ad blocking! I developed this for myself after feeling disappointed that TweetDeck has become a paid service. I hope you find it useful too. https://ift.tt/6Unxvfd June 13, 2024 at 09:21PM

Show HN: Paramount, an OSS Package for *Human* Evals of AI Customer Support https://ift.tt/RDVkyqA

Show HN: Paramount, an OSS Package for *Human* Evals of AI Customer Support Hey HN, Hakim here from Fini (YC S22). Fini is a startup founded by ex-Uber engineers, focusing on providing Automated Customer Support bots for Enterprises that have a high volume of support requests. Today, one of the largest use cases of LLMs is automating support. As the space has evolved over the past year, a need has emerged for evaluations of LLM outputs, and a sea of LLM Evals packages has been released. "LLM evals" refers to the evaluation of large language models, assessing how well these AI systems understand and generate human-like text. These packages have recently relied on "automatic evals," where algorithms (usually another LLM) automatically test and score AI responses without human intervention. In our day-to-day work, we have found that Automatic Evals are not enough to reach the 95% accuracy our Enterprise customers require. Automatic Evals are efficient, but they still often miss nuances that only human expertise can catch. Automatic Evals can never replace the feedback of a trained human who is deeply knowledgeable about an organization's latest product releases, knowledge base, policies and support issues. The key to solving this is to stop ignoring the business side of the problem and start involving knowledgeable experts in the evaluation process. That is why we are releasing Paramount - an Open Source package that brings human feedback directly into the evaluation process. By simplifying the step of gathering feedback, ML Engineers can pinpoint and fix accuracy issues (prompts, knowledge base issues) much faster. Paramount provides a framework for recording LLM function outputs (ground truth data) and facilitates human agent evaluations through a simple UI, reducing the time to identify and correct errors.
Developers can integrate Paramount with a Python decorator that logs LLM interactions into a database, followed by a straightforward UI for expert review. This process aids the debugging and validation phase of launching accurate support bots. https://ift.tt/mT91rPv June 13, 2024 at 11:50PM
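
The decorator-plus-database pattern described in the Paramount post can be sketched roughly as follows. This is not Paramount's actual API: the `record` decorator, the `recordings` table, the `review` helper and the `answer_ticket` function are all hypothetical names chosen for illustration, and an in-memory SQLite database stands in for whatever storage the real package uses.

```python
import functools
import json
import sqlite3
from datetime import datetime, timezone

# Hypothetical storage: an in-memory SQLite table (not Paramount's schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS recordings ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "recorded_at TEXT, function TEXT, inputs TEXT, output TEXT, "
    "verdict TEXT)"  # verdict stays NULL until a human reviews the row
)


def record(func):
    """Hypothetical decorator: log each call's inputs and output."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        conn.execute(
            "INSERT INTO recordings (recorded_at, function, inputs, output) "
            "VALUES (?, ?, ?, ?)",
            (
                datetime.now(timezone.utc).isoformat(),
                func.__name__,
                json.dumps({"args": args, "kwargs": kwargs}, default=str),
                json.dumps(result, default=str),
            ),
        )
        conn.commit()
        return result
    return wrapper


@record
def answer_ticket(question):
    # Stand-in for a real LLM call.
    return f"Canned reply to: {question}"


def review(row_id, verdict):
    """What a review UI might do when an expert rates a recorded answer."""
    conn.execute(
        "UPDATE recordings SET verdict = ? WHERE id = ?", (verdict, row_id)
    )
    conn.commit()
```

In this sketch, every call to `answer_ticket` adds one row to `recordings`, and a reviewer's judgment is attached later through `review`, so logging and human evaluation stay decoupled.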

Thursday, June 13, 2024

Show HN: We built an AI Copilot for end to end project development workflow https://ift.tt/9fUBwMb

Show HN: We built an AI Copilot for end to end project development workflow Omniflow creates, customizes and automates project workflows, from requirement creation, tech design and dev scheduling to release and more. All done in 3 minutes. https://omniflow.team/ June 13, 2024 at 11:36AM

Show HN: I created a tiny web crawler for Python https://ift.tt/CA4cQOi

Show HN: I created a tiny web crawler for Python https://ift.tt/0VPDTkK June 13, 2024 at 06:11AM

Wednesday, June 12, 2024

Store Your Bike in Style: Introducing a New Parking Option

By Jason Hyde

Inside our new Bikehangar at 4th & Minna streets.

There’s a new and affordable way to store your bike in San Francisco with an extra layer of security. Today, the SFMTA opened two eye-catching Bikehangars as part of a two-year pilot program. These bike storage lockers require a signup with the BikeLink system and have monitored access. This makes them even more secure than our short-term bike racks. They also feature designs from local artists. We'll share how to find and access the new Bikehangars -- and why this pilot marks a first for bike storage in the U.S. Where to find the Bikehangars...



Published June 06, 2024 at 05:30AM
https://ift.tt/AajHuZt

Show HN: Arewedownyet.com https://ift.tt/hXyYHBE

Show HN: Arewedownyet.com We've built this to quickly check the status of several popular services on a single status page. https://ift.tt/9taKvd0 June 11, 2024 at 11:32PM

Show HN: Tablr – Supabase with AI Features https://ift.tt/uZsg6oX

Show HN: Tablr – Supabase with AI Features https://www.tablr.dev/ June 30, 2025 at 04:35AM