Unisami AI News

Alibaba’s Qwen team releases AI models that can control PCs and phones

January 27, 2025 | by AI

Alibaba’s Qwen Team Drops AI Models That Can CONTROL Your PC and Phone

Move Over, DeepSeek — Alibaba Just Dropped a Game-Changer

While DeepSeek might be stealing headlines this week, Alibaba’s Qwen team just unleashed a BEAST of AI innovation of its own. Meet Qwen2.5-VL, a family of AI models that doesn’t just analyze text and images: it CONTROLS your devices. Think of it as Alibaba’s answer to OpenAI’s Operator.

What Can Qwen2.5-VL Do? Let’s Break It Down

  • Parse Files & Analyze Videos: It can extract data from invoices, understand hours-long videos, and even count objects in images. Talk about multitasking!
  • Control PCs & Phones: Watch it launch apps, book flights, and switch tabs like a pro. It’s like having a personal assistant on steroids.
  • Beat the Competition: According to Alibaba’s benchmarks, Qwen2.5-VL outperforms OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini 2.0 Flash in video understanding, math, and document analysis.

“Qwen 2.5 VL is a Vision Language Model that can control your computer, similar to OpenAI’s Operator, extract structured information from charts, and more!”

Philipp Schmid, Technical Lead at Hugging Face

But Wait, There’s a Catch…

Being a Chinese-developed AI, Qwen2.5-VL has some restrictions. Ask it about sensitive topics like Taiwan’s autonomy or Xi Jinping’s “mistakes,” and you’ll hit a wall. China’s internet regulator ensures these models “embody core socialist values.” So, while it’s powerful, it’s also politically cautious.

Real-World Applications That’ll Blow Your Mind

In a demo shared on X, Qwen2.5-VL launched the Booking.com app and booked a flight from Chongqing to Beijing. On a Linux desktop, it controlled apps like a boss, though it’s still learning to handle complex tasks. But hey, Rome wasn’t built in a day!

“LMAO Qwen 2.5 VL can perform Computer Use, out of the box, taking on OpenAI Operator HEAD ON!”

Vaibhav (VB) Srivastav, Tech Enthusiast

Licensing: Open for Some, Restricted for Others

The smaller models, Qwen2.5-VL-3B and Qwen2.5-VL-7B, are available under a permissive license. But the flagship Qwen2.5-VL-72B? That’s under Alibaba’s custom license. If your company has over 100 million monthly active users, you’ll need Alibaba’s permission to deploy it commercially. Big players, take note!
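
For the developers in the room, here’s the licensing rule above as a quick Python sketch. It only encodes what this article says (the model names and the 100 million monthly active user threshold). The function name and structure are made up for illustration, and none of it replaces reading the actual license text.

```python
# Hypothetical helper: encodes the licensing split as described in this article.
# It is an illustration, not legal guidance and not an official Alibaba tool.

MAU_THRESHOLD = 100_000_000  # "over 100 million monthly active users"

# Model groupings as stated in the article.
PERMISSIVE_MODELS = {"Qwen2.5-VL-3B", "Qwen2.5-VL-7B"}
CUSTOM_LICENSE_MODELS = {"Qwen2.5-VL-72B"}


def needs_alibaba_permission(model: str, monthly_active_users: int) -> bool:
    """Return True if commercial deployment would need Alibaba's permission,
    according to the rule described in the article."""
    if model in PERMISSIVE_MODELS:
        return False
    if model in CUSTOM_LICENSE_MODELS:
        # Only companies strictly above the threshold need permission.
        return monthly_active_users > MAU_THRESHOLD
    raise ValueError(f"Unknown model: {model}")


if __name__ == "__main__":
    print(needs_alibaba_permission("Qwen2.5-VL-72B", 250_000_000))
    print(needs_alibaba_permission("Qwen2.5-VL-7B", 250_000_000))
```

So a startup with a few thousand users can ship the 72B model commercially without asking, while a platform with 250 million monthly users would need to talk to Alibaba first.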

Why This Matters

Qwen2.5-VL isn’t just another AI model. It’s a glimpse into the future of AI-powered productivity. Whether you’re analyzing documents, controlling devices, or booking flights, this model is built to take the busywork off your plate. And with Alibaba backing it, it has serious resources behind it.

Ready to test it out? Head over to Alibaba’s Qwen Chat app or download it from Hugging Face. The future of AI is here, and it’s called Qwen2.5-VL.

Image Credit: Anna Tarazevich on Pexels
