Monday, February 19, 2024

Show HN: The History Chronicle – daily historical facts in newspaper form https://ift.tt/HU3IsE5

Show HN: The History Chronicle – daily historical facts in newspaper form https://ift.tt/8ucvxrG February 19, 2024 at 01:56AM

Show HN: Caps-log (Captain's log) – A small TUI journaling tool https://ift.tt/ZUNBqKp

Show HN: Caps-log (Captain's log) – A small TUI journaling tool Caps-log is a compact TUI (Text-based User Interface) journaling application crafted in C++ and leveraging the FTXUI library for its terminal interface. It allows users to save daily log entries as simple markdown files, making it an appealing tool for those who prefer working within a terminal environment. The interface is designed with a calendar feature that stands out by marking the days associated with a log entry. Furthermore, it can accentuate days based on specific 'tags' or 'sections' identified in the logs, which are either markdown list items starting with '*' or level one headers. In addition to these features, caps-log includes password protection for your entries and offers a somewhat 'hacky' (for now) method for remote storage. This is achieved by integrating with a pre-configured Git repository, enabling remote storage via a git remote. https://ift.tt/6Fq8msd February 19, 2024 at 01:07AM
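The tag and section detection described above can be sketched in a few lines of Python. This is only an illustration, not caps-log's actual C++ implementation: the function name and the sample entry are hypothetical, and it assumes a tag is any list item starting with '*' and a section is any level-one header starting with a single '#'.

```python
import re


def extract_tags_and_sections(markdown_text):
    """Return (tags, sections) found in one log entry.

    Assumption: a tag is a list item starting with '*', and a section
    is a level-one header (a single '#' followed by whitespace).
    """
    tags, sections = [], []
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("* "):
            tags.append(stripped[2:].strip())
        elif re.match(r"#\s+\S", stripped):
            # '##' and deeper headers do not match, because the second
            # character must be whitespace.
            sections.append(stripped.lstrip("#").strip())
    return tags, sections


# Hypothetical log entry, for illustration only.
entry = "# Work\n* meeting\nNotes...\n# Gym\n* cardio\n## Not a level-one header\n"
tags, sections = extract_tags_and_sections(entry)
```

A calendar view could then highlight a day whenever a chosen tag or section appears in that day's entry.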

Show HN: Domino Fit – Domino Tiling Puzzle https://ift.tt/eyfFzQE

Show HN: Domino Fit – Domino Tiling Puzzle Domino Fit is a domino tiling puzzler I spent a lot of time both making and playing. It's like sudoku but with a geometric angle: the sum of the dots must match the row and column numbers. It's running on Betsy, the server under my couch, and was made with TypeScript. I'm proud of it. I hope you give it a shot, and I appreciate any thoughts or criticisms! https://ift.tt/C6bUqZV February 18, 2024 at 10:56PM
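The row and column rule can be checked with a short Python sketch. The board representation here is an assumption made for illustration (a grid of dot counts per cell, with 0 for an empty cell), not the game's own data model, and the function name is hypothetical.

```python
def sums_match(grid, row_targets, col_targets):
    """Check that the dots in every row and column add up to its target.

    Assumption: grid[r][c] holds the number of dots on the domino half
    covering that cell, or 0 if the cell is empty.
    """
    row_sums = [sum(row) for row in grid]
    col_sums = [sum(col) for col in zip(*grid)]
    return row_sums == list(row_targets) and col_sums == list(col_targets)


# Two horizontal dominoes on a 2x3 board: 1|2 in the top row and
# 3|1 in the bottom row, leaving two cells empty.
grid = [
    [1, 2, 0],
    [0, 3, 1],
]
solved = sums_match(grid, row_targets=[3, 4], col_targets=[1, 5, 1])
```

This only verifies the sums; a full solver would also need to check that each domino covers two adjacent cells.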

Sunday, February 18, 2024

Show HN: I Built an Open Source API with Insanely Fast Whisper and Fly GPUs https://ift.tt/pf0KNHv

Show HN: I Built an Open Source API with Insanely Fast Whisper and Fly GPUs Hi HN! Since the launch of JigsawStack.com we've been trying to dive deeper into fully managed AI APIs built and fine-tuned for specific use cases. Audio/video transcription was one of the more basic things, and we wanted the best open source model, which at this point is OpenAI's Whisper large v3, based on the number of languages it supports and its accuracy. The thing is, the model is huge and requires tons of GPU power to run efficiently at scale. Even OpenAI doesn't provide an API for their best transcription model, offering only Whisper v2 at a pretty high price. I tried running the Whisper large v3 model on multiple cloud providers (Modal.com, Replicate, Hugging Face's dedicated interface) and it takes a long time to transcribe anything: about 30mins for 150mins of audio, and this doesn't include the machine startup time for on-demand GPUs. Keep in mind that at JigsawStack we aim to return any heavy computation in under 25s (or 2mins for async cases) and any basic computation in under 2s. While exploring Replicate, I came across this project https://ift.tt/oeyRZ8J by Vaibhav Srivastav, which optimises the hell out of the Whisper large v3 model with a variety of techniques like batching and FlashAttention 2. This reduces computation time by almost 30x; check out the amazing repo for more stats! Open source wins again!! First we tried Replicate's dedicated on-demand GPU service to run this model, but that did not help: the cold startup/boot time of a GPU alone made the benefits of the optimised model pretty useless for our use case. Then we tried Hugging Face and Modal.com and got the same results: with an A100 80GB GPU, we were seeing an average of ~2mins of startup time to load the machine and model image. It didn't make sense for us to keep an always-on GPU running due to the crazy high cost. At this point I was inches away from giving up.
The next day I got an email from Fly.io: "Congrats, Yoeven D Khemlani has GPU access!" I had totally forgotten that Fly started providing GPUs, and I'm a big fan of their infra reliability and ease of deployment. We also run a bunch of our GraphQL servers for JigsawStack on Fly's infra! I quickly picked up some Python and Docker by referring to a bunch of other GitHub repos and Fly's GPU tutorials, then wrote the API layer with the optimised version of Whisper large v3 and deployed it on Fly's GPU machines. And wow, the results were pretty amazing: the startup time of the machine was ~20 seconds on average, compared to ~2mins on the other providers, with all the performance benefits of the optimised Whisper. I've added some more stats in the GitHub repo. The more interesting thing to me is cost. Based on 10mins of audio:
OpenAI Whisper v2 API -> $0.06/10mins
Insanely Fast Whisper large v3 API on Fly GPU (cold startup) -> ~$0.029/10mins
Insanely Fast Whisper large v3 API on Fly GPU (warm startup) -> ~$0.011/10mins
(Note: these are rough estimates I made by taking averages after running 5 rounds each.) If you want to run this on any other GPU provider, you can, as long as it supports Docker. We'll be optimising this more over the next few days, specifically for Fly's infrastructure, allowing for globally distributed instances of Whisper, and will soon be providing a fully managed API on JigsawStack.com. Stay tuned! https://ift.tt/e4R6rjJ February 18, 2024 at 01:48PM
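To put the per-10-minute figures above on a common footing, here is a small Python sketch that scales them to an hour of audio and compares each option to the OpenAI price. It assumes cost scales linearly with audio length, which the post does not claim; the rates are the post's own rough estimates, and the dictionary keys and function name are made up for this example.

```python
# Rates quoted in the post, in USD per 10 minutes of audio (rough estimates).
COST_PER_10_MIN = {
    "OpenAI Whisper v2 API": 0.06,
    "Fly GPU (cold startup)": 0.029,
    "Fly GPU (warm startup)": 0.011,
}


def cost_for(minutes, rate_per_10_min):
    """Estimated cost for `minutes` of audio, assuming linear scaling."""
    return minutes / 10 * rate_per_10_min


baseline = COST_PER_10_MIN["OpenAI Whisper v2 API"]
for name, rate in COST_PER_10_MIN.items():
    hourly = cost_for(60, rate)
    ratio = baseline / rate  # how many times cheaper than the OpenAI rate
    print(f"{name}: ${hourly:.3f} per hour of audio, {ratio:.1f}x vs. OpenAI rate")
```

On these numbers the cold-start case is roughly half the OpenAI price and the warm case roughly a fifth, though the real ratio depends on how often a machine has to cold start.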

Show HN: Programming is easier than you think https://ift.tt/8kNyhsf

Show HN: Programming is easier than you think https://ift.tt/keO8R67 February 17, 2024 at 10:23PM

Show HN - tool that converts image receipts to Excel https://ift.tt/CWngTmG

Show HN - tool that converts image receipts to Excel Hey, I'm excited to share my first project, Receipts2CSV, a web application designed to simplify bookkeeping by converting receipt images into CSV files. https://ift.tt/AglXmbC Problem Statement: Keeping track of expenses and managing receipts can be a tedious task, especially for small businesses and freelancers. Traditional methods involve manually entering data from receipts into spreadsheets, which is time-consuming and prone to errors. With Receipts2CSV, users can streamline this process by simply uploading images of their receipts and obtaining structured CSV files ready for import into accounting software. If you are lazy like I am, you could accumulate receipts in a single folder, re-run all the images every time, then remove duplicates and merge with the older CSV, to minimize looking through receipts on a monthly/annual basis. Questions for Validation: Do you find a receipt-image-to-CSV converter useful? Would you consider using such a tool for your bookkeeping needs? Considering the higher costs of AI models like GPT-4 Vision Preview, how are other indie hackers able to create and sustain free products like these? Do small products like these have a monetization market? If so, where do I begin? Curious to hear your candid thoughts about this web app. Should I explore it further or move on to the next idea? Feel free to share your thoughts, suggestions, or any additional features you'd like to see in the product! Thank you for your valuable input and support! https://ift.tt/6DHc12V February 17, 2024 at 11:35PM
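The "re-run everything, remove duplicates, merge with the older CSV" workflow mentioned above could look roughly like this in Python. It is a sketch under stated assumptions: the column names (date, vendor, total) and the function name are hypothetical, since the post does not describe the CSV layout Receipts2CSV produces.

```python
import csv
import io


def merge_receipts(old_csv, new_csv, key_fields=("date", "vendor", "total")):
    """Merge two CSV texts, keeping the first copy of each receipt.

    Assumption: two rows describe the same receipt when all key_fields
    match after stripping whitespace.
    """
    seen = set()
    merged = []
    for text in (old_csv, new_csv):
        for row in csv.DictReader(io.StringIO(text)):
            key = tuple(row[field].strip() for field in key_fields)
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged


# Hypothetical data: the new export repeats one receipt from the old file.
old = "date,vendor,total\n2024-01-05,Cafe,4.50\n"
new = "date,vendor,total\n2024-01-05,Cafe,4.50\n2024-02-01,Hardware Store,19.99\n"
rows = merge_receipts(old, new)
```

Matching on exact field values is the simplest choice; extraction from images may produce slightly different text for the same receipt, in which case a fuzzier key would be needed.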

Show HN: Polybar Module for Using AirPods https://ift.tt/fUihI8A

Show HN: Polybar Module for Using AirPods https://ift.tt/NVqYeot February 18, 2024 at 01:06AM

Show HN: Happy Coder – End-to-End Encrypted Mobile Client for Claude Code https://ift.tt/vt1BkI0

Show HN: Happy Coder – End-to-End Encrypted Mobile Client for Claude Code Hey all! Few weeks ago we realized AI models are now so good you d...