Stars
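An AI-native PPT (slides) generator based on nano banana pro🍌, moving toward "Vibe PPT": supports uploading any template image, uploading arbitrary materials with intelligent parsing, automatic PPT generation from a single sentence, an outline, or page descriptions, verbal edits to a specified region, and one-click export to an editable PPT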
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
[CVPR 2026 Highlight] MatAnyone 2: Scaling Video Matting via a Learned Quality Evaluator
Instant, Concurrent, Secure & Lightweight Sandbox for AI Agents.
Qianfan-VL: Domain-Enhanced Universal Vision-Language Models
A library for efficient similarity search and clustering of dense vectors.
Out-of-the-box DeepSeek OCR document parsing Web Studio
A comprehensive book on neural networks and large language models in NLP
ncnn android yolov8 realtime detection, segmentation, pose estimation, classification and obb
Document Artificial Intelligence
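This project trains and runs inference with layoutlmv3 on Chinese images. It mainly addresses three problems: 1. standardizing data into a usable training dataset format 2. modifying the tokenization for layoutlmv3-base-chinese 3. splitting text that exceeds length 512 and applying a sliding window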
This repository contains the official implementation of the research papers, "MobileCLIP" CVPR 2024 and "MobileCLIP2" TMLR August 2025
Open-source LLM knowledge platform: turn raw documents into a queryable RAG, an autonomous reasoning agent, and a self-maintaining Wiki.
A comprehensive list of awesome document image rectification papers.
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models
[ICCV 2025] The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate".
A lightweight LMM-based Document Parsing Model
Industry leading face manipulation platform
(Pattern Recognition) PyTorch implementation of “HTR-VT: Handwritten Text Recognition with Vision Transformer”
ICCV 2023 "Neural Video Depth Stabilizer" (NVDS) & TPAMI 2024 "NVDS+: Towards Efficient and Versatile Neural Stabilizer for Video Depth Estimation" (NVDS+)
Easy to use open source fast database for search | Good alternative to Elasticsearch | Drop-in replacement for E in the ELK stack
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
A lightweight data processing framework built on DuckDB and 3FS.
FlashMLA: Efficient Multi-head Latent Attention Kernels


