Growing India News, world news, nation news, our news, people's news, grow news, entertainment, fashion, movies, tech, automobile and many more..
Saturday, July 22, 2023
Show HN: Datalake for Computer Vision Projects https://ift.tt/6uKTr1Y
Show HN: Datalake for Computer Vision Projects Buddhika, Kelum, and Chong Han here. We are building a self-hosted data infrastructure platform for computer vision. Our community page is https://ift.tt/swfbVh8 In the past, we worked on a couple of high-scale computer vision projects in retail, farming, and hospitals in various capacities. These projects involved 2D object sections, 3D object tracking, and more advanced 3D perception. Like other CV Engineers, we observed a common factor during these projects: one needs a large volume of high-quality data to build a production-deployable CV system. Our biggest challenge was not having a robust data infrastructure to handle large volumes of data. Our S3 buckets were like a data swamp; we had so much raw image and video in storage buckets without tracking. Instead of working on CV, we had to develop tools for data operations. We understand that many of us have our own custom scripts and stitch them together to make things happen in the CV pipeline. However, it is brittle and cumbersome to maintain. We wanted to build a system on top of the cloud buckets such as S3 that store all file indexes, labels, metadata attributes, inference outputs, model training outcomes, and literally anything related to machine learning/computer vision. This makes it possible for us to search for anything and consume efficiently. This behaves as a DataLake (by the way, "DataLake" is an overused term). All other downstream processes in the CV pipeline can access data more efficiently via SDK and can also return data back to the Lake (e.g., training/inference outcomes). The reason we made it self-hosted is to address data security and privacy concerns. Since data is fundamental to AI, we believe that companies and organizations should have complete control over it. Currently, we support AWS, GCP, and Azure cloud buckets; soon, we will support local storage. We ship this as a Docker container so you can just install it on any VM or local server. The installation script will do all the configuration automatically. The Python SDK and documentation are available but not perfect yet. We’ve launched this under MIT and Elastic licenses so any developer can use it. Our goal is not to charge individual developers. We make money by charging a license fee for things like multiple users, multiple buckets, scalability with K8, and providing support. Give it a try: https://ift.tt/swfbVh8 Let us know what you think. July 22, 2023 at 04:45AM
Subscribe to:
Post Comments (Atom)
Show HN: Pocket2Linkding – Migrate from Mozilla Pocket to Linkding https://ift.tt/IwYJfju
Show HN: Pocket2Linkding – Migrate from Mozilla Pocket to Linkding With the Mozilla Pocket shutdown coming up in about two weeks, I thought ...
-
Show HN: An AI logo generator that can also generate SVG logos Hey everyone, I've spent the past 2 weeks building an AI logo generator, ...
-
Show HN: Snap Scope – Visualize Lens Focal Length Distribution from EXIF Data https://ift.tt/yrqHZtDShow HN: Snap Scope – Visualize Lens Focal Length Distribution from EXIF Data Hey HN, I built this tool because I wanted to understand which...
-
Show HN: Federated IndieAuth Server implemented as a notebook https://ift.tt/32IC633 April 27, 2021 at 04:37PM
No comments:
Post a Comment