OpenAI Releases Research Preview of ‘gpt-oss-safeguard’: Two Open-Weight Reasoning Models for Safety Classification Tasks

20 hours 1 minute ago

OpenAI has released a research preview of gpt-oss-safeguard, two open weight safety reasoning models that let developers apply custom safety policies at inference time. The models come in two sizes, gpt-oss-safeguard-120b and gpt-oss-safeguard-20b, both fine tuned from gpt-oss, both licensed under Apache 2.0, and both available on Hugging Face for local use. Why Policy-Conditioned Safety […]

The post OpenAI Releases Research Preview of ‘gpt-oss-safeguard’: Two Open-Weight Reasoning Models for Safety Classification Tasks appeared first on MarkTechPost.

Michal Sutter

How to Design an Autonomous Multi-Agent Data and Infrastructure Strategy System Using Lightweight Qwen Models for Efficient Pipeline Intelligence?

23 hours 41 minutes ago

In this tutorial, we build an Agentic Data and Infrastructure Strategy system using the lightweight Qwen2.5-0.5B-Instruct model for efficient execution. We begin by creating a flexible LLM agent framework and then develop specialized agents that handle different layers of data management, from ingestion and quality analysis to infrastructure optimization. We integrate these agents into an […]

The post How to Design an Autonomous Multi-Agent Data and Infrastructure Strategy System Using Lightweight Qwen Models for Efficient Pipeline Intelligence? appeared first on MarkTechPost.

Asif Razzaq

Ant Group Releases Ling 2.0: A Reasoning-First MoE Language Model Series Built on the Principle that Each Activation Enhances Reasoning Capability

1 day 7 hours ago

How do you build a language model that grows in capacity but keeps the computation for each token almost unchanged? The Inclusion AI team from the Ant Group is pushing sparse large models in a methodical way by releasing Ling 2.0. Ling 2.0 is a reasoning based language model family built on the idea that each […]

The post Ant Group Releases Ling 2.0: A Reasoning-First MoE Language Model Series Built on the Principle that Each Activation Enhances Reasoning Capability appeared first on MarkTechPost.

Asif Razzaq

How to Build Ethically Aligned Autonomous Agents through Value-Guided Reasoning and Self-Correcting Decision-Making Using Open-Source Models

1 day 22 hours ago

In this tutorial, we explore how we can build an autonomous agent that aligns its actions with ethical and organizational values. We use open-source Hugging Face models running locally in Colab to simulate a decision-making process that balances goal achievement with moral reasoning. Through this implementation, we demonstrate how we can integrate a “policy” model […]

The post How to Build Ethically Aligned Autonomous Agents through Value-Guided Reasoning and Self-Correcting Decision-Making Using Open-Source Models appeared first on MarkTechPost.

Asif Razzaq

IBM AI Team Releases Granite 4.0 Nano Series: Compact and Open-Source Small Models Built for AI at the Edge

1 day 23 hours ago

Small models are often blocked by poor instruction tuning, weak tool use formats, and missing governance. IBM AI team released Granite 4.0 Nano, a small model family that targets local and edge inference with enterprise controls and open licensing. The family includes 8 models in two sizes, 350M and about 1B, with both hybrid SSM […]

The post IBM AI Team Releases Granite 4.0 Nano Series: Compact and Open-Source Small Models Built for AI at the Edge appeared first on MarkTechPost.

Asif Razzaq

Microsoft Releases Agent Lightning: A New AI Framework that Enables Reinforcement Learning (RL)-based Training of LLMs for Any AI Agent

2 days 6 hours ago

How do you convert real agent traces into reinforcement learning RL transitions to improve policy LLMs without changing your existing agent stack? Microsoft AI team releases Agent Lightning to help optimize multi-agent systems. Agent Lightning is a open-sourced framework that makes reinforcement learning work for any AI agent without rewrites. It separates training from execution, […]

The post Microsoft Releases Agent Lightning: A New AI Framework that Enables Reinforcement Learning (RL)-based Training of LLMs for Any AI Agent appeared first on MarkTechPost.

Michal Sutter

Liquid AI Releases LFM2-ColBERT-350M: A New Small Model that brings Late Interaction Retrieval to Multilingual and Cross-Lingual RAG

2 days 21 hours ago

Can a compact late interaction retriever index once and deliver accurate cross lingual search with fast inference? Liquid AI released LFM2-ColBERT-350M, a compact late interaction retriever for multilingual and cross-lingual search. Documents can be indexed in one language, queries can be written in many languages, and the system retrieves with high accuracy. The Liquid AI […]

The post Liquid AI Releases LFM2-ColBERT-350M: A New Small Model that brings Late Interaction Retrieval to Multilingual and Cross-Lingual RAG appeared first on MarkTechPost.

Asif Razzaq

How Exploration Agents like Q-Learning, UCB, and MCTS Collaboratively Learn Intelligent Problem-Solving Strategies in Dynamic Grid Environments

3 days 4 hours ago

In this tutorial, we explore how exploration strategies shape intelligent decision-making through agent-based problem solving. We build and train three agents, Q-Learning with epsilon-greedy exploration, Upper Confidence Bound (UCB), and Monte Carlo Tree Search (MCTS), to navigate a grid world and reach a goal efficiently while avoiding obstacles. Also, we experiment with different ways of […]

The post How Exploration Agents like Q-Learning, UCB, and MCTS Collaboratively Learn Intelligent Problem-Solving Strategies in Dynamic Grid Environments appeared first on MarkTechPost.

Asif Razzaq

MiniMax Releases MiniMax M2: A Mini Open Model Built for Max Coding and Agentic Workflows at 8% Claude Sonnet Price and ~2x Faster

3 days 5 hours ago

Can an open source MoE truly power agentic coding workflows at a fraction of flagship model costs while sustaining long-horizon tool use across MCP, shell, browser, retrieval, and code? MiniMax team has just released MiniMax-M2, a mixture of experts MoE model optimized for coding and agent workflows. The weights are published on Hugging Face under […]

The post MiniMax Releases MiniMax M2: A Mini Open Model Built for Max Coding and Agentic Workflows at 8% Claude Sonnet Price and ~2x Faster appeared first on MarkTechPost.

Michal Sutter

Zhipu AI Releases ‘Glyph’: An AI Framework for Scaling the Context Length through Visual-Text Compression

3 days 19 hours ago

Can we render long texts as images and use a VLM to achieve 3–4× token compression, preserving accuracy while scaling a 128K context toward 1M-token workloads? A team of researchers from Zhipu AI release Glyph, an AI framework for scaling the context length through visual-text compression. It renders long textual sequences into images and processes […]

The post Zhipu AI Releases ‘Glyph’: An AI Framework for Scaling the Context Length through Visual-Text Compression appeared first on MarkTechPost.

Asif Razzaq

Meet Pyversity Library: How to Improve Retrieval Systems by Diversifying the Results Using Pyversity?

3 days 21 hours ago

Pyversity is a fast, lightweight Python library designed to improve the diversity of results from retrieval systems. Retrieval often returns items that are very similar, leading to redundancy. Pyversity efficiently re-ranks these results to surface relevant but less redundant items. It offers a clear, unified API for several popular diversification strategies, including Maximal Marginal Relevance […]

The post Meet Pyversity Library: How to Improve Retrieval Systems by Diversifying the Results Using Pyversity? appeared first on MarkTechPost.

Arham Islam

How to Build a Fully Interactive, Real-Time Visualization Dashboard Using Bokeh and Custom JavaScript?

3 days 21 hours ago

In this tutorial, we create a fully interactive, visually compelling data visualization dashboard using Bokeh. We start by turning raw data into insightful plots, then enhance them with features such as linked brushing, color gradients, and real-time filters powered by dropdowns and sliders. As we progress, we bring our dashboard to life with Custom JavaScript […]

The post How to Build a Fully Interactive, Real-Time Visualization Dashboard Using Bokeh and Custom JavaScript? appeared first on MarkTechPost.

Asif Razzaq

How to Build an Agentic Decision-Tree RAG System with Intelligent Query Routing, Self-Checking, and Iterative Refinement?

4 days 20 hours ago

In this tutorial, we build an advanced Agentic Retrieval-Augmented Generation (RAG) system that goes beyond simple question answering. We design it to intelligently route queries to the right knowledge sources, perform self-checks to assess answer quality, and iteratively refine responses for improved accuracy. We implement the entire system using open-source tools like FAISS, SentenceTransformers, and […]

The post How to Build an Agentic Decision-Tree RAG System with Intelligent Query Routing, Self-Checking, and Iterative Refinement? appeared first on MarkTechPost.

Asif Razzaq

Meet ‘kvcached’: A Machine Learning Library to Enable Virtualized, Elastic KV Cache for LLM Serving on Shared GPUs

5 days 4 hours ago

Large language model serving often wastes GPU memory because engines pre-reserve large static KV cache regions per model, even when requests are bursty or idle. Meet ‘kvcached‘, a library to enable virtualized, elastic KV cache for LLM serving on shared GPUs. kvcached has been developed by a research from Berkeley’s Sky Computing Lab (University of […]

The post Meet ‘kvcached’: A Machine Learning Library to Enable Virtualized, Elastic KV Cache for LLM Serving on Shared GPUs appeared first on MarkTechPost.

Asif Razzaq

5 Common LLM Parameters Explained with Examples

5 days 6 hours ago

Large language models (LLMs) offer several parameters that let you fine-tune their behavior and control how they generate responses. If a model isn’t producing the desired output, the issue often lies in how these parameters are configured. In this tutorial, we’ll explore some of the most commonly used ones — max_completion_tokens, temperature, top_p, presence_penalty, and […]

The post 5 Common LLM Parameters Explained with Examples appeared first on MarkTechPost.

Arham Islam

How to Build, Train, and Compare Multiple Reinforcement Learning Agents in a Custom Trading Environment Using Stable-Baselines3

5 days 19 hours ago

In this tutorial, we explore advanced applications of Stable-Baselines3 in reinforcement learning. We design a fully functional, custom trading environment, integrate multiple algorithms such as PPO and A2C, and develop our own training callbacks for performance tracking. As we progress, we train, evaluate, and visualize agent performance to compare algorithmic efficiency, learning curves, and decision […]

The post How to Build, Train, and Compare Multiple Reinforcement Learning Agents in a Custom Trading Environment Using Stable-Baselines3 appeared first on MarkTechPost.

Asif Razzaq

A New AI Research from Anthropic and Thinking Machines Lab Stress Tests Model Specs and Reveal Character Differences among Language Models

6 days 1 hour ago

AI companies use model specifications to define target behaviors during training and evaluation. Do current specs state the intended behaviors with enough precision, and do frontier models exhibit distinct behavioral profiles under the same spec? A team of researchers from Anthropic, Thinking Machines Lab and Constellation present a systematic method that stress tests model specs […]

The post A New AI Research from Anthropic and Thinking Machines Lab Stress Tests Model Specs and Reveal Character Differences among Language Models appeared first on MarkTechPost.

Asif Razzaq

How to Build a Fully Functional Computer-Use Agent that Thinks, Plans, and Executes Virtual Actions Using Local AI Models

6 days 17 hours ago

In this tutorial, we build an advanced computer-use agent from scratch that can reason, plan, and perform virtual actions using a local open-weight model. We create a miniature simulated desktop, equip it with a tool interface, and design an intelligent agent that can analyze its environment, decide on actions like clicking or typing, and execute […]

The post How to Build a Fully Functional Computer-Use Agent that Thinks, Plans, and Executes Virtual Actions Using Local AI Models appeared first on MarkTechPost.

Asif Razzaq

Google vs OpenAI vs Anthropic: The Agentic AI Arms Race Breakdown

6 days 17 hours ago

In this article we will analyze how Google, OpenAI, and Anthropic are productizing ‘agentic’ capabilities across computer-use control, tool/function calling, orchestration, governance, and enterprise packaging. Agent platforms, not only models, now define competitive advantage. Google is aligning Gemini 2.0 with an enterprise control plane on Vertex AI and a new ‘front door’ called Gemini Enterprise. […]

The post Google vs OpenAI vs Anthropic: The Agentic AI Arms Race Breakdown appeared first on MarkTechPost.

Michal Sutter
Checked
4 hours 4 minutes ago
Marktechpost
An Artificial Intelligence News Platform
Subscribe to Marktechpost feed