Growing India News, world news, nation news, our news, people's news, grow news, entertainment, fashion, movies, tech, automobile and many more..
Saturday, May 24, 2025
Show HN: Advanced Chunking in JavaScript/TypeScript with Chonkie https://ift.tt/fJ6pF7j
Hi HN,

We’re Shreyash and Bhavnick. We built Chonkie, an open-source library for advanced chunking and embedding of text and code. It was previously Python-only, but we just released a TypeScript version: https://ift.tt/0x6eZ4K

Many AI projects in JS/TS (like those using Vercel's AI SDK or Mastra) rely on basic text splitters. But better chunking = better retrieval = better performance. That’s what Chonkie is built for.

Current native chunkers (in TS):
- Code Chunker – handles Python, TypeScript, etc.
- Recursive Chunker – rule-based, hierarchical splitting
- Token Chunker – split by token count (fully customizable)
- Sentence Chunker – split on sentence boundaries. Delimiters are customizable, so it works for multiple languages.

All chunkers support custom tokenizers, chunk overlap, delimiters, and more.

Coming soon in native TS (already available via the API client):
- Semantic Chunker – splits text wherever it detects a shift in meaning
- SDPM Chunker – merges semantically similar disjoint chunks
- Late Chunker – generates context-aware embeddings for each chunk
- Slumber Chunker – LLM-refined recursive chunks. Significantly reduces token usage (and thus cost) while maximizing chunk quality.
- Embeddings Refinery – embed chunks with any embedding model
- Overlap Refinery – create overlaps between consecutive chunks for better context preservation

Chonkie is free, open-source, and MIT licensed.

GitHub: https://ift.tt/0x6eZ4K

We’d love your feedback, ideas, or contributions. Thanks!

May 24, 2025 at 01:33AM
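To make the idea of token-count chunking with overlap concrete, here is a minimal TypeScript sketch. It is not Chonkie's API: the names `chunkByTokens`, `Chunk`, `Tokenizer`, and `whitespaceTokenizer` are hypothetical, and the whitespace tokenizer is an assumed stand-in for a real model tokenizer.

```typescript
// Illustrative sketch only; these names are hypothetical and not part of Chonkie.
type Tokenizer = (text: string) => string[];

interface Chunk {
  text: string;       // tokens of this chunk joined with single spaces
  tokenCount: number; // number of tokens in this chunk
  startToken: number; // index of the chunk's first token in the full token list
}

// Assumed stand-in tokenizer: splits on runs of whitespace.
const whitespaceTokenizer: Tokenizer = (text) =>
  text.split(/\s+/).filter((token) => token.length > 0);

// Splits text into windows of at most `chunkSize` tokens, where each window
// repeats the last `chunkOverlap` tokens of the previous one.
function chunkByTokens(
  text: string,
  chunkSize: number,
  chunkOverlap = 0,
  tokenize: Tokenizer = whitespaceTokenizer,
): Chunk[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  if (!Number.isInteger(chunkOverlap) || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new RangeError("chunkOverlap must be an integer in [0, chunkSize)");
  }

  const tokens = tokenize(text);
  const step = chunkSize - chunkOverlap; // how far each window advances
  const chunks: Chunk[] = [];

  for (let start = 0; start < tokens.length; start += step) {
    const windowTokens = tokens.slice(start, start + chunkSize);
    chunks.push({
      text: windowTokens.join(" "),
      tokenCount: windowTokens.length,
      startToken: start,
    });
    // Stop once a window has reached the end of the token list, so the
    // final chunk is not followed by a redundant overlap-only chunk.
    if (start + chunkSize >= tokens.length) break;
  }
  return chunks;
}

// Usage: 7 tokens, windows of 3 with 1 token of overlap.
const demoChunks = chunkByTokens("a b c d e f g", 3, 1);
console.log(demoChunks.map((chunk) => chunk.text));
```

Passing a different `tokenize` function is how a sketch like this would accommodate a custom tokenizer; the overlap simply reduces the step between consecutive windows so that neighbouring chunks share context.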