Growing India News, world news, nation news, our news, people's news, grow news, entertainment, fashion, movies, tech, automobile and many more..
Sunday, December 19, 2021
Show HN: Release 0.8 of sbctl, Secure Boot key manager https://ift.tt/3qaQ7l8
Show HN: Release 0.8 of sbctl, Secure Boot key manager https://ift.tt/3J0Y70L December 18, 2021 at 10:38PM
Saturday, December 18, 2021
Show HN: All timeless articles posted on Hacker News, written 1321 to 2021 https://ift.tt/3e76sBI
Show HN: All timeless articles posted on Hacker News, written 1321 to 2021 https://ift.tt/3e7qXOB December 18, 2021 at 09:30PM
Show HN: ReleaseChurch – a fun-website to cast a prayer for your release https://ift.tt/3GVzd0Q
Show HN: ReleaseChurch – a fun-website to cast a prayer for your release https://ift.tt/3qaVcda December 18, 2021 at 07:06PM
Show HN: Searchall – search all major indexes on one page (with iframes) https://ift.tt/3q67KTg
Show HN: Searchall – search all major indexes on one page (with iframes) https://ift.tt/3skUmxd December 18, 2021 at 07:08AM
Show HN: Engula – A serverless storage engine in Rust for building databases https://ift.tt/33sVc0r
Show HN: Engula – A serverless storage engine in Rust for building databases https://ift.tt/3E1oxf3 December 17, 2021 at 01:17PM
Show HN: Type-level Lambda Calculus interpreter in TypeScript https://ift.tt/327yQBh
Show HN: Type-level Lambda Calculus interpreter in TypeScript https://ift.tt/3s6EWwm December 18, 2021 at 01:07AM
Show HN: A labelling tool to easily extract and label Wikipedia data https://ift.tt/3dYf5yC
Show HN: A labelling tool to easily extract and label Wikipedia data
Hi HN! I am Maria, solo founder of DataQA ( https://dataqa.ai/ ), a tool to search and label documents for various NLP tasks (e.g. entity extraction, entity linking, etc.).

I have worked as a data scientist and ML engineer for the better part of a decade, and over that time I have specialised mainly in applications involving natural language processing (NLP). One question has always been at the back of my mind: was my time well spent? Whenever I spent more time on feature engineering or trying different models, I wondered whether I would get a better return on investment by simply labelling more data.

I created DataQA to make exploring and labelling documents easier. It is open-source and ships with the Elasticsearch text search engine, which I have packaged as a Python package (this might be the topic of a future technical post), as well as a rules-based engine that pre-labels documents using NLP rules. It installs with a single pip command.

One of the key things I wanted to add to DataQA is an integration with Wikipedia. Even though Wikipedia is the largest living repository of human knowledge in the world, I have always found it difficult to process it and create structured datasets for my specific applications. Since wiki pages are long-form articles, it is important to divide the text into smaller chunks. A lot of the interesting data is also displayed in tables. With DataQA you can now upload a list of Wikipedia page URLs and the tool will extract the articles, process them and even parse the tables, so you can then label any entities you want. You can find a tutorial here: https://ift.tt/3GSdlDh... .

The open-source version of DataQA currently only supports CSV, but I have an enterprise version with premium features such as labelling of PDFs (with understanding of tables). If you're interested in a free trial, please contact me at contact@dataqa.ai :-).
December 17, 2021 at 11:12PM
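To illustrate the point above about dividing long-form articles into smaller chunks, here is a minimal Python sketch. It is not DataQA's implementation, which the post does not describe; the function name `chunk_text` and the `max_chars` parameter are hypothetical, and splitting on blank lines with a character budget is just one simple strategy among many.

```python
def chunk_text(text: str, max_chars: int = 500) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Paragraphs (separated by blank lines) are packed together while they fit.
    A single paragraph longer than max_chars is hard-split, which may cut
    through a word; a real tool would likely split on sentences instead.
    """
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        # Oversized paragraph: flush what we have, then slice it up.
        while len(para) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        # Try to pack the (remaining) paragraph into the current chunk.
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be indexed or labelled as its own small document, which keeps labelling units short even when the source article is very long.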
Show HN: Audiopipe – Pipeline for audio diarization, denoising and transcription https://ift.tt/wQkI7Jl
Show HN: Audiopipe – Pipeline for audio diarization, denoising and transcription Audiopipe is a one-liner for denoising, diarization and tra...
Show HN: An AI logo generator that can also generate SVG logos Hey everyone, I've spent the past 2 weeks building an AI logo generator, ...
Show HN: Snap Scope – Visualize Lens Focal Length Distribution from EXIF Data https://ift.tt/yrqHZtD
Show HN: Snap Scope – Visualize Lens Focal Length Distribution from EXIF Data Hey HN, I built this tool because I wanted to understand which...
Show HN: Federated IndieAuth Server implemented as a notebook https://ift.tt/32IC633 April 27, 2021 at 04:37PM