site:www.marktechpost.com

OpenAI Releases GPT-5.5, a Fully Retrained Agentic Model That Scores 82.7% on Terminal ...

OpenAI Releases GPT-5.5, a Fully Retrained Agentic Model That Scores 82.7% on Terminal-Bench 2.0 and 84.9% on GDPval ...

marktechpost

NVIDIA Open Sources Parakeet TDT 0.6B: Achieving a New Standard for Automatic Speech ...

At the heart of Parakeet TDT 0.6B’s appeal is its unmatched speed and transcription quality. The model can transcribe 60 minutes of audio in just one second, a ...

marktechpost

Sakana AI Introduces KAME: A Tandem Speech-to-Speech Architecture That Injects LLM ...

The fundamental tension in conversational AI has always been a binary choice: respond fast or respond smart. Real-time speech-to-speech (S2S) models — the kind that power natural-feeling voice ...

marktechpost

Meta Introduces Autodata: An Agentic Framework That Turns AI Models into Autonomous Data ...

The bottleneck in building better AI models has never been compute alone — it has always been data quality. Meta AI’s RAM (Reasoning, Alignment, and Memory) team is now addressing that bottleneck ...

marktechpost

Open Source

Mistral AI's latest release brings async cloud-based coding sessions, a new 128B flagship model, and an agentic Work mode to Le Chat — a meaningful step forward for developers building with AI agents.

marktechpost

DeepSeek Just Released a 3B OCR Model: A 3B VLM Designed for High-Performance OCR and ...

DeepSeek-AI released 3B DeepSeek-OCR, an end-to-end OCR and document parsing Vision-Language Model (VLM) system that compresses long text into a small set of vision tokens, then decodes those tokens ...

marktechpost

A Coding Guide on LLM Post Training with TRL from Supervised Fine Tuning to DPO and GRPO ...

In this tutorial, we walk through a complete, hands-on journey of post-training large language models using the powerful TRL (Transformer Reinforcement Learning) library ecosystem. We start from a ...

marktechpost

Qwen AI Releases Qwen-Scope: An Open-Source Sparse AutoEncoders (SAE) Suite That Turns LLM ...

Large language models are remarkably capable, yet frustratingly opaque. When a model misbehaves — generating responses in the wrong language, repeating itself endlessly, or refusing safe requests — AI ...

marktechpost

Asif Razzaq

Asif Razzaq is an AI Journalist and Cofounder of Marktechpost, LLC. He is a visionary, entrepreneur and engineer who aspires to use the power of Artificial Intelligence for good. Asif’s latest venture ...

marktechpost

Comparing the Top 6 OCR (Optical Character Recognition) Models/Systems in 2025

Optical character recognition has moved from plain text extraction to document intelligence. Modern systems must read scanned and digital PDFs in one pass, preserve layout, detect tables, extract key ...

marktechpost

DeepSeek AI Releases DeepSeek-V4: Compressed Sparse Attention and Heavily Compressed ...

DeepSeek-AI has released a preview version of the DeepSeek-V4 series: two Mixture-of-Experts (MoE) language models built around one core challenge: making one-million-token context windows practical ...

marktechpost

Build a Reinforcement Learning Powered Agent that Learns to Retrieve Relevant Long-Term ...

In this tutorial, we build a Reinforcement Learning–driven agent that learns how to retrieve relevant memories from a long-term memory bank. We start by constructing a synthetic memory dataset and ...
