Growing India News: world news, nation news, our news, people's news, grow news, entertainment, fashion, movies, tech, automobile and many more.
Saturday, August 26, 2023
Show HN: Beating GPT-4 on HumanEval with a fine-tuned CodeLlama-34B https://ift.tt/2bm4nYs
Hi HN,

We have fine-tuned CodeLlama-34B and CodeLlama-34B-Python on an internal Phind dataset. The resulting models achieved 67.6% and 69.5% pass@1 on HumanEval, respectively. GPT-4 achieved 67%. To ensure result validity, we applied OpenAI's decontamination methodology to our dataset.

The CodeLlama models released yesterday demonstrate impressive performance on HumanEval.
- CodeLlama-34B achieved 48.8% pass@1 on HumanEval
- CodeLlama-34B-Python achieved 53.7% pass@1 on HumanEval

We have fine-tuned both models on a proprietary dataset of ~80k high-quality programming problems and solutions. Instead of code completion examples, this dataset features instruction-answer pairs, setting it apart structurally from HumanEval. We trained the Phind models over two epochs, for a total of ~160k examples. LoRA was not used; both models underwent native fine-tuning. We employed DeepSpeed ZeRO 3 and Flash Attention 2 to train these models in three hours using 32 A100-80GB GPUs, with a sequence length of 4096 tokens.

Furthermore, we applied OpenAI's decontamination methodology to our dataset to ensure valid results, and found no contaminated examples. The methodology is:
- For each evaluation example, we randomly sampled three substrings of 50 characters or used the entire example if it was fewer than 50 characters.
- A match was identified if any sampled substring was a substring of the processed training example.

For further insights on the decontamination methodology, please refer to Appendix C of OpenAI's technical report.

Presented below are the pass@1 scores we achieved with our fine-tuned models:
- Phind-CodeLlama-34B-v1 achieved 67.6% pass@1 on HumanEval
- Phind-CodeLlama-34B-Python-v1 achieved 69.5% pass@1 on HumanEval

Note on GPT-4

According to the official technical report in March, OpenAI reported a pass@1 score of 67% for GPT-4's performance on HumanEval. Since then, there have been claims reporting higher scores. However, there hasn't been any concrete evidence of an improvement in the model's coding abilities since then. These higher figures also lack the rigorous contamination analysis that the official statistic underwent, which makes them a less reliable comparison. As a result, we consider 67% as the pass@1 score for GPT-4.

Download

We are releasing both models on Huggingface for verifiability and to bolster the open-source community. We welcome independent verification of results.

Phind-CodeLlama-34B-v1: https://ift.tt/Qmw5nDN
Phind-CodeLlama-34B-Python-v1: https://ift.tt/VmWcrbA

We'd love to hear your thoughts!

Best,
The Phind Team

https://ift.tt/aWsm6SC August 26, 2023 at 03:38AM
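The substring-based decontamination check described in the post above can be sketched roughly as follows. This is only an illustration, not Phind's or OpenAI's actual code: the function names are hypothetical, and the `normalize` step (dropping everything except letters and digits) is an assumption about what "processed" means in the post.

```python
import random
import re


def normalize(text: str) -> str:
    # Assumption: "processed" text keeps only letters and digits.
    return re.sub(r"[^0-9A-Za-z]", "", text)


def sample_substrings(example: str, rng: random.Random,
                      n: int = 3, length: int = 50) -> list[str]:
    """Return n random substrings of `length` characters from the
    normalized example, or the whole example if it is shorter."""
    processed = normalize(example)
    if not processed:
        return []
    if len(processed) < length:
        return [processed]
    starts = [rng.randrange(len(processed) - length + 1) for _ in range(n)]
    return [processed[s:s + length] for s in starts]


def find_contaminated(eval_examples: list[str],
                      training_examples: list[str],
                      seed: int = 0) -> list[int]:
    """Return indices of evaluation examples for which any sampled
    substring occurs inside any normalized training example."""
    rng = random.Random(seed)
    processed_train = [normalize(t) for t in training_examples]
    flagged = []
    for i, example in enumerate(eval_examples):
        probes = sample_substrings(example, rng)
        if any(p in t for p in probes for t in processed_train):
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    evals = ["def add(a, b):\n    return a + b", "def mul(x, y): return x * y"]
    train = ["# solution\ndef add(a, b):\n    return a + b\n", "print('hello')"]
    print(find_contaminated(evals, train))
```

Because the substrings are sampled at random, a check like this can miss partial overlaps. Fixing the seed at least makes a given run reproducible.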
Show HN: Mail Memories – Export your email photos https://ift.tt/DIEmUzY
Hey HN, I’m Carlos, the maker behind Mail Memories ( https://ift.tt/Js80MyO ), a web app that helps you find and save photos from your (Gmail) email. The app connects with your email account, finds all the images you’ve received, and shows them in a gallery where you can view and download the ones you want to save. I made this out of curiosity, just to see what pictures had accumulated in my account since I first signed up for Gmail 18 years ago. I ended up finding photos of my grandmother and other family members, and of old friends and colleagues I’d completely forgotten about. I was surprised by what I found; I hope you will be too. Can’t wait to hear your thoughts. Demo: https://ift.tt/ZVSkeRP https://ift.tt/Js80MyO August 26, 2023 at 12:42AM
Friday, August 25, 2023
Show HN: Budget Zen – Simple, Encrypted Budgets and Expenses https://ift.tt/eIDVqJP
https://budgetzen.net August 25, 2023 at 02:57PM
Show HN: JSON Wrapper for React Native https://ift.tt/qjbiTZ9
https://ift.tt/1Ze6VCF August 25, 2023 at 10:56AM
Show HN: A simple web app to combat phone addiction https://ift.tt/l3xXeFp
When I'm stuck on coding something, I find myself reaching for my phone even if I don't have any particular reason to do so. Inspired by Calm's DoNothingFor2Minutes.com, which launched on HN 13 years ago [1], I made this simple web app to see if my friends and I could go an hour without touching our phones. It is surprisingly difficult. According to a 2022 survey [2], the average US adult picks up their phone 352 times per day, or approximately once every 2m43s while they're awake.

On browsers that support it (iOS 16.4+, most versions of Android Chrome), it uses the Screen Wake Lock API [3] to keep the page open, and falls back to nosleep.js [4] otherwise. From testing on my iPhone 14 Pro Max running iOS 16.6, battery life only went down 3 or 4 percentage points after an hour with the wake lock.

I made this as a web app so that a quick demo would be compatible across all mobile devices. As a native app, it could probably save more battery and would not need to keep the screen on. One caveat is that on iOS this will actually increase your Screen Time (although it will hopefully reduce usage in your other categories).

I currently only track time on page through Google Analytics 4. No other calls are made to a server, although if we actually wanted to verify that you kept the page open vs. JavaScript/inspector-system clock-fu, we could add a verified mode that pings the server every X minutes.

As a PWA, possibly due to an iOS/Mobile Safari quirk/bug [5], neither wake lock nor nosleep.js appears to work.

[1] https://ift.tt/OR2BGnU
[2] https://ift.tt/eoj5dIL
[3] https://ift.tt/7DH6YEj...
[4] https://ift.tt/utwsbW3
[5] https://ift.tt/JzUjsdp

https://ift.tt/GkAFsOl August 25, 2023 at 02:15AM
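The "once every 2m43s" figure in the post above can be checked with a little arithmetic. The sketch below assumes 16 waking hours per day; the post does not state that number, so treat it as an illustrative assumption. The function names are likewise hypothetical.

```python
PICKUPS_PER_DAY = 352  # survey figure quoted in the post
WAKING_HOURS = 16      # assumption: not stated in the post


def seconds_between_pickups(pickups_per_day: int, waking_hours: float) -> float:
    """Average gap between phone pickups during waking hours, in seconds."""
    return waking_hours * 3600 / pickups_per_day


def format_interval(seconds: float) -> str:
    """Render a duration as e.g. '2m43s', truncating fractional seconds."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m{secs:02d}s"


if __name__ == "__main__":
    gap = seconds_between_pickups(PICKUPS_PER_DAY, WAKING_HOURS)
    print(format_interval(gap))  # prints 2m43s
```

Under that assumption, 57,600 waking seconds divided by 352 pickups gives roughly 163.6 seconds, which matches the quoted interval.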
Show HN: Collie – A minimal RSS reader just for you https://ift.tt/p5EY2dg
Collie is a minimal RSS feed reader application that runs on your desktop. With Collie, you can subscribe to multiple RSS/Atom feeds to organize your own news feed, receive a real-time notification when a new item is added to a subscribed feed, and save items to read again or later.

All you need is a local machine and the Internet. No virtual machine, no cloud infrastructure, no always-on database, and no account registration with private information required.

I had been getting tech news from HackerNews, Lobsters, etc. on Twitter (it's X now, but I'll keep calling it Twitter anyway), but many of those accounts have been terminated due to changes in Twitter's API policy. I went from place to place: Bluesky, Mastodon, Slack, and newsletters. However, I couldn't settle anywhere. Social media services such as Bluesky and Mastodon had too many unnecessary features for a news feed. Slack RSS was good for getting the news in real time, but the notifications mixed in with those from other workspaces overwhelmed me. The newsletters gave me a lot of high-quality information, but not in real time.

Then I remembered Miniflux, the "minimalist and opinionated feed reader" that I had used in the past. It was the best option for my goal, but I had to either pay for the hosted version or keep a Docker machine running on my local computer, which did not have enough resources. Additionally, I didn't need a system that maintains multi-user sessions.

Eventually, I had no choice but to create my own application, and that's why I made Collie, the minimal RSS reader just for me.

https://ift.tt/PG0DQpd August 25, 2023 at 10:58AM
Show HN: Convert Text, PDF, Docs, Scan or Image to Speech https://ift.tt/YEjvURg
https://ift.tt/EFvc2tY August 24, 2023 at 11:13PM