Techradar


Microsoft is working on some seriously exciting Windows 11 improvements – but not everyone will get them

Tue, 04/22/2025 - 05:10
  • Windows 11 has a new preview build in the Beta channel
  • It offers new Click to Do features for Copilot+ PCs, including Reading Coach integration
  • Search has also been pepped up with AI, and Voice Access has got a handy new addition too

Windows 11’s latest preview version just arrived packing improved search functionality and some impressive new capabilities for accessibility, including the integration of Microsoft’s ‘Reading Coach’ app on certain PCs.

This is preview build 26120.3872 in the Beta channel, and some of the fresh additions are just for Copilot+ PCs, and specifically only for devices with Snapdragon (Arm-based) chips.

So, first up in this category is the integration of Reading Coach with Click to Do. To recap on those pieces of functionality, Click to Do provides context-sensitive actions which are AI-powered – this was brought in as the partner feature to Recall on Copilot+ PCs – and Reading Coach became available for free at the start of 2024.

The latter is an app you can download from the Microsoft Store in order to practice your reading skills and pronunciation, and Reading Coach can now be chosen direct from the Click to Do context menu, so you can work on any selected piece of text. (You’ll need the coaching app installed to do this, of course).

Also new for Click to Do (and Copilot+ PCs) is a ‘Read with Immersive Reader’ ability which is a focused reading mode designed for those with dyslexia and dysgraphia.

This allows users to adjust the text size and spacing, font, and background theme to best suit their needs, as well as having a picture dictionary option that Microsoft notes “provides visual representations of unfamiliar words for instant understanding.” You can also elect to have text read aloud and split into syllables if required.

Another neat feature for Copilot+ PCs – albeit only in the European Economic Area to begin with – is the ability to find photos saved in the cloud (OneDrive) via the search box in the Windows 11 taskbar. Again, this is AI-powered, so you can use natural language search to find images in OneDrive (such as photos of “Halloween costumes” for example). Both local (on the device) and cloud-based photos will be displayed in the taskbar search results.

All of the above are now rolling out in testing to Snapdragon-powered Copilot+ PCs, but devices with AMD and Intel CPUs will also be covered eventually.

A further noteworthy introduction here – for all PCs this time – is that Voice Access now grants you the power to add your own words to its dictionary. So, if there’s a word that the system is having difficulty picking up when you say it, you can add a custom dictionary entry and hopefully the next time you use it during dictation, Voice Access will correctly recognize the word.

There are a bunch of other tweaks and refinements in this new preview version, all of which are covered in Microsoft’s blog post on the new Beta build.

Windows 11 Reading Coach choice in Click to Do

(Image credit: Microsoft)

Analysis: Sterling progress

It’s good to see Microsoft’s continued efforts to improve Windows 11 in terms of accessibility and learning, even if some of the core introductions here won’t be piped through to most folks – as they won’t have a Copilot+ PC. What’s also clear is that Microsoft is giving devices with Snapdragon processors priority on an ongoing basis, and that’s fine, as long as the same powers come to all Copilot+ PCs eventually (which they have done so far, and there’s no reason why they shouldn’t).

The Voice Access addition is a very handy one, although I’m surprised it took Microsoft this long to implement it. I was previously a heavy user of Nuance’s Dragon speech recognition software (my RSI has long since been cured, thanks in part to taking a break from typing by using it), and it offered this functionality. Even though Windows 11’s Voice Access is essentially built on the same tech – Microsoft bought Nuance back in 2021 – it’s taken a while to incorporate what I felt was an important feature.

As ever, though, better late than never, and I certainly can’t complain about Voice Access being free, or at least free in terms of being bundled in with Windows 11.

AI took a huge leap in IQ, and now a quarter of Gen Z thinks AI is conscious

Mon, 04/21/2025 - 20:00
  • ChatGPT's o3 model scored a 136 on the Mensa IQ test and a 116 on a custom offline test, outperforming most humans
  • A new survey found 25% of Gen Z believe AI is already conscious, and over half think it will be soon
  • The change in IQ and belief in AI consciousness has happened extremely quickly

OpenAI’s new ChatGPT model, dubbed o3, just scored an IQ of 136 on the Norway Mensa test – higher than 98% of humanity, not bad for a glorified autocomplete. In less than a year, AI models have become enormously more complex, flexible, and, in some ways, intelligent.

The jump is so steep that it may be causing some to think that AI has become Skynet. According to a new EduBirdie survey, 25% of Gen Z now believe AI is already self-aware, and more than half think it’s just a matter of time before their chatbot becomes sentient and possibly demands voting rights.

There’s some context to consider when it comes to the IQ test. The Norway Mensa test is public, which means it’s technically possible that the model used the answers or questions for training. So, researchers at MaximumTruth.org created a new IQ test that is entirely offline and out of reach of training data.

On that test, which was designed to be equivalent in difficulty to the Mensa version, the o3 model scored a 116. That’s still high.

It puts o3 in the top 15% of human intelligence, hovering somewhere between “sharp grad student” and “annoyingly clever trivia night regular.” No feelings. No consciousness. But logic? It’s got that in spades.

Compare that to last year, when no AI tested above 90 on the same scale. In May of last year, the best AI struggled with rotating triangles. Now, o3 is parked comfortably to the right of the bell curve among the brightest of humans.

And that curve is crowded now. Claude has inched up. Gemini’s scored in the 90s. Even GPT-4o, the baseline default model for ChatGPT, is only a few IQ points below o3.

Even so, it’s not just that these AIs are getting smarter. It’s that they’re learning fast. They’re improving like software does, not like humans do. And for a generation raised on software, that’s an unsettling kind of growth.

I do not think consciousness means what you think it means

For those raised in a world navigated by Google, with a Siri in their pocket and an Alexa on the shelf, AI means something different than its strictest definition.

If you came of age during a pandemic when most conversations were mediated through screens, an AI companion probably doesn't feel very different from a Zoom class. So it’s maybe not a shock that, according to EduBirdie, nearly 70% of Gen Zers say “please” and “thank you” when talking to AI.

Two-thirds of them use AI regularly for work communication, and 40% use it to write emails. A quarter use it to finesse awkward Slack replies, with nearly 20% sharing sensitive workplace information, such as contracts and colleagues’ personal details.

Many of those surveyed rely on AI for various social situations, ranging from asking for days off to simply saying no. One in eight already talk to AI about workplace drama, and one in six have used AI as a therapist.

If you trust AI that much, or find it engaging enough to treat as a friend (26%) or even a romantic partner (6%), then the idea that the AI is conscious seems less extreme. The more time you spend treating something like a person, the more it starts to feel like one. It answers questions, remembers things, and even mimics empathy. And now that it’s getting demonstrably smarter, philosophical questions naturally follow.

But intelligence is not the same thing as consciousness. IQ scores don’t mean self-awareness. You can score a perfect 160 on a logic test and still be a toaster, if your circuits are wired that way. AI can only think in the sense that it can solve problems using programmed reasoning. You might say that I'm no different, just with meat, not circuits. But that would hurt my feelings, something you don't have to worry about with any current AI product.

Maybe that will change someday, even someday soon. I doubt it, but I'm open to being proven wrong. I get the willingness to suspend disbelief with AI. It might be easier to believe that your AI assistant really understands you when you’re pouring your heart out at 3 a.m. and getting supportive, helpful responses rather than dwelling on its origin as a predictive language model trained on the internet's collective oversharing.

Maybe we’re on the brink of genuine self-aware artificial intelligence, but maybe we’re just anthropomorphizing really good calculators. Either way, don't tell secrets to an AI that you don't want used to train a more advanced model.

3 things we learned from this interview with Google Deepmind's CEO, and why Astra could be the key to great AI smart glasses

Mon, 04/21/2025 - 16:00

Google has been hyping up its Project Astra as the next generation of AI for months. That set some high expectations when 60 Minutes sent Scott Pelley to experiment with Project Astra tools provided by Google DeepMind.

He was impressed with how articulate, observant, and insightful the AI turned out to be throughout his testing, particularly when the AI not only recognized Edward Hopper’s moody painting "Automat," but also read into the woman’s body language and spun a fictional vignette about her life.

All this through a pair of smart glasses that barely seemed different from a pair without AI built in. The glasses serve as a delivery system for an AI that sees, hears, and can understand the world around you. That could set the stage for a new smart wearables race, but that's just one of many things we learned during the segment about Project Astra and Google's plans for AI.

Astra's understanding

Of course, we have to begin with what we now know about Astra. Firstly, the AI assistant continuously processes video and audio from connected cameras and microphones in its surroundings. The AI doesn’t just identify objects or transcribe text; it also purports to spot and explain emotional tone, extrapolate context, and carry on a conversation about the topic, even when you pause for thought or talk to someone else.

During the demo, Pelley asked Astra what he was looking at. It instantly identified Coal Drops Yard, a retail complex in King’s Cross, and offered background information without missing a beat. When shown a painting, it didn’t stop at "that’s a woman in a cafe." It said she looked "contemplative." And when nudged, it gave her a name and a backstory.

According to DeepMind CEO Demis Hassabis, the assistant’s real-world understanding is advancing even faster than he expected, noting it is better at making sense of the physical world than the engineers thought it would be at this stage.

Veo 2 views

But Astra isn’t just passively watching. DeepMind has also been busy teaching AI how to generate photorealistic imagery and video. The engineers described how two years ago, their video models struggled with understanding that legs are attached to dogs. Now, they showcased how Veo 2 can conjure a flying dog with flapping wings.

The implications for visual storytelling, filmmaking, advertising, and yes, augmented reality glasses, are profound. Imagine your glasses not only telling you what building you're looking at, but also visualizing what it looked like a century ago, rendered in high definition and seamlessly integrated into the present view.

Genie 2

And then there’s Genie 2, DeepMind’s new world-modeling system. If Astra understands the world as it exists, Genie builds worlds that don’t. It takes a still image and turns it into an explorable environment visible through the smart glasses.

Walk forward, and Genie invents what lies around the corner. Turn left, and it populates the unseen walls. During the demo, a waterfall photo turned into a playable video game level, dynamically generated as Pelley explored.

DeepMind is already using Genie-generated spaces to train other AIs. Genie can help them navigate a world made up by another AI, and in real time, too. One system dreams, another learns. That kind of simulation loop has huge implications for robotics.

In the real world, robots have to fumble their way through trial and error. But in a synthetic world, they can train endlessly without breaking furniture or risking lawsuits.

Astra eyes

Google is trying to get Astra-style perception into your hands (or onto your face) as fast as possible, even if it means giving it away.

Just weeks after launching Gemini’s screen-sharing and live camera features as a premium perk, Google reversed course and made them free for all Android users. That wasn’t a random act of generosity. By getting as many people as possible to point their cameras at the world and chat with Gemini, Google gets a flood of training data and real-time user feedback.

There is already a small group of people wearing Astra-powered glasses out in the world. The hardware reportedly uses micro-LED displays to project captions into one eye and delivers audio through tiny directional speakers near the temples. Compared to the awkward sci-fi visor of the original Glass, this feels like a step forward.

Sure, there are issues with privacy, latency, battery life, and the not-so-small question of whether society is ready for people walking around with semi-omniscient glasses, or will just mock them mercilessly.

Whether or not Google can make that magic feel ethical, non-invasive, and stylish enough to go mainstream is still up in the air. But that sense of 2025 as the year smart glasses go mainstream seems more accurate than ever.

New AI Chibi figure trend may be the cutest one yet, and we're all doomed to waste time and energy making these things

Mon, 04/21/2025 - 12:06

The best AI generation trends are the cute ones, especially those that transform us into our favorite characters or at least facsimiles of them. ChatGPT 4o's ability to generate realistic-looking memes and figures is now almost unmatched, and it's hard to ignore a fresh trend and miss out on all the fun. The latest one is based on a popular style of anime toys called Chibi figures.

Chibi, which is Japanese slang for small or short, describes tiny, pocketable figures with exaggerated features like compact bodies, big heads, and large eyes. They are adorable and quite popular online. Think of them as tiny cousins of Funko Pop! figures.

Real Chibi figures can run you anywhere from $9.99 to well over $100. Or, you can create one in ChatGPT.

What's interesting about this prompt is that it relies heavily on the source image and doesn't force you to provide additional context. The goal is a realistic Chibi character that resembles the original photo, and to have it appear inside a plastic capsule.

The prompt describes that container as a "Gashapon," which is what they're called when they come from a Bandai vending machine. Bandai did not invent this kind of capsule, of course. Tiny toys in little plastic containers that open up into two halves have been on sale in coin-operated vending machines for over 50 years.

If you want to create a Chibi figure, you just need a decent photo of yourself or someone else. It should be clear, sharp, in color, and at least show their whole face. The effect will be better if it also shows part of their outfit.

Here's the prompt I used in ChatGPT Plus 4o:

Generate a portrait-oriented image of a realistic, full-glass gashapon capsule being held between two fingers.

Inside the capsule is a Chibi-style, full-figure miniature version of the person in the uploaded photo.

The Chibi figure should:

  • Closely resemble the person in the photo (face, hairstyle, etc.)
  • Wear the same outfit as seen in the uploaded photo
  • Be in a pose inspired by the chosen theme
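The article's workflow runs through the ChatGPT app, but if you'd rather script it, here is a rough sketch of sending a similar prompt plus a source photo through OpenAI's Python SDK. This is not the article's method: the model name (gpt-image-1), the file names, and the response handling are assumptions, so check the current Images API documentation before relying on it.

```python
# Rough sketch only: send a Chibi-style prompt and a source photo to the
# OpenAI Images API. Model name, file names, and response handling are assumptions.
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

chibi_prompt = (
    "Generate a portrait-oriented image of a realistic, full-glass gashapon "
    "capsule being held between two fingers. Inside the capsule is a "
    "Chibi-style, full-figure miniature version of the person in the uploaded "
    "photo, closely resembling their face and hairstyle, wearing the same "
    "outfit, and posed to match the chosen theme."
)

result = client.images.edit(
    model="gpt-image-1",             # assumed image model that accepts a source photo
    image=open("selfie.png", "rb"),  # the photo to turn into a Chibi figure
    prompt=chibi_prompt,
)

# The image model is assumed to return base64-encoded image data
with open("chibi_capsule.png", "wb") as out:
    out.write(base64.b64decode(result.data[0].b64_json))
```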

A time capsule

Mr Rogers Chibi generated by ChatGPT

(Image credit: Mr Rogers Chibi generated by ChatGPT)

Since the final ChatGPT Chibi figure image has no recognizable background or accessories, the result is all about how the character looks and dresses.

I made a few characters: one based on a photo of me, another based on an image of Brad Pitt, and, finally, one based on one of my heroes, Mr. Rogers.

These Chibi figures would do well on the Crunchyroll Mini and Chibi store, but I must admit that they lean heavily on cuteness and not so much on verisimilitude.

Even though none of them look quite like the source, the Mr. Rogers one is my favorite.

Remember that AI image generation is not without cost. First, you are uploading your photo to OpenAI's server, and there's no guarantee that the system is not learning from it and using it to train future models.

AI image generation also consumes electricity on the server side to build models and to resolve prompts. Perhaps you can commit to planting a tree or two after you've generated a half dozen or more Chibi AI figures.

Opera Mini stuffs a whole AI assistant into a tiny Android browser

Fri, 04/18/2025 - 14:00
  • Opera has added its AI assistant Aria to the Opera Mini browser for Android
  • Users can access real-time information, summarize text, and generate images in the lightweight Android browser
  • The Opera Mini version of Aria is optimized for low data usage and older devices

Opera is giving a major AI upgrade to its Opera Mini mobile web browser. The company is embedding its Aria AI assistant into the Android version of Opera Mini, whose low-cost, data-saving approach to browsing is used by more than a hundred million people globally.

Aria will aid Opera Mini users by answering questions, generating text or software code, creating images, summarizing webpages, and pulling real-time info from the internet. Aria relies on Composer, Opera’s own AI engine, which stitches together tools and models from both OpenAI and Google, including making images with Google’s Imagen 3 model.

"AI is rapidly becoming an integral part of the daily internet experience – bringing Aria to Opera Mini is a natural addition to our most-downloaded browser," Opera executive vice president Jørgen Arnesen explained in a statement. "With the addition of our built-in AI, Aria, we’re excited to explore how AI can further enhance the feature set our users rely on every day.”

Pianissimo Opera AI

Opera Mini is popular because it offers a browsing experience that doesn't use too much bandwidth. AI assistants like ChatGPT or Google Gemini tend to rely on a significant amount of energy and computational power. In many parts of the world, AI features are only available to people with the latest flagship phones, massive storage, or expensive subscriptions.

What Opera Mini is doing with Aria offers an alternative, one built to fit a browser already designed for places with unreliable connections, slow speeds, and high data costs. If you have an Android device, you can simply update the Opera Mini browser and start using Aria.

The release sets an interesting precedent. As AI becomes a staple in digital tools, developers have to consider not only how to make AI smarter and more powerful but also more flexible and accessible to people in different circumstances. Opera Mini's addition of Aria could end up being the example developers refer to when creating an AI assistant that won't eat up all your storage space or data budget.

Opera has teased that other new Mini features are in the pipeline, though it hasn’t said exactly what they will be. If those features blend into the browser as smoothly as Aria, Opera Mini could end up offering a semi-independent path toward AI adoption, one very different from that of its flashier cousins.

Star Wars Celebration is in full swing, and Lucasfilm just dropped more details on its Beyond Victory experience for Meta Quest, and I couldn't be more stoked

Thu, 04/17/2025 - 21:00

If you’re a Star Wars fan and aren’t yet jealous of everyone at Star Wars Celebration 2025 in Japan, prepare to be. The same applies if you have an Apple Vision Pro instead of a Meta Quest.

Why? Well, Industrial Light & Magic and Lucasfilm are finally sharing more on their next Star Wars mixed and virtual reality experience that’s set to arrive on the Meta Quest 3 and Meta Quest 3S headsets at some point in the future, and boy oh boy does it look stunning.

Star Wars: Beyond Victory - A Mixed Reality Playset is set during the events of Solo: A Star Wars Story and has three modes of play: Adventure, Arcade, and Playset. You can see the full trailer below, along with some select screenshots. It's a fully immersive experience that can place you in the Star Wars universe or overlay elements in your own space.

Adventure is more of a classic, immersive experience, similar to other titles like Star Wars: Tales from the Galaxy’s Edge – a personal favorite I’ve played on the PSVR, as I’m a fan of the Disney Parks – and Vader Immortal: A Star Wars VR Series. Here you’ll follow the story of an aspiring podracer, Volo, who goes on a journey courtesy of a mentorship with Sebulba.

 Beyond Victory - A Mixed Reality Playset

(Image credit: Lucasfilm)

This one might be the neatest, though – Arcade places a holotable in your space through mixed or augmented reality, and you can seemingly get up close and personal with Star Wars action, including a podracing course.

And if you’re at Star Wars Celebration 2025 in Japan, you can play a demo that combines the Adventure and Arcade modes at Booth #20-5 in Hall 4 of the Makuhari Messe Convention Center. Instant jealousy from me!

 Beyond Victory - A Mixed Reality Playset

(Image credit: Lucasfilm)

Alyssa Finley, the executive producer of the title, shared, “We're calling this a Playset because it isn't just a game; it's an entirely new way to experience the Star Wars galaxy and the worlds we create at ILM.”

She continued, “This new mixed reality experience blends the physical and digital worlds in a way that's unlike anything we've done before,” which certainly ups the excitement and hype for the title. It’s almost similar to another project that Industrial Light & Magic worked on for the Apple Vision Pro – that’s What If…? – An Immersive Story, and it had times where it fully placed you elsewhere or overlaid battles in your own space.

Beyond Victory - A Mixed Reality Playset

(Image credit: Lucasfilm)

Adding to this is the Playset mode, which promises to let you have your own 'Star Wars' moments in mixed reality, allowing you to view and interact with vehicles and action figures from the universe.

While Star Wars: Beyond Victory - A Mixed Reality Playset is still in development, it’s undoubtedly one of the most ambitious titles from Industrial Light & Magic and Lucasfilm yet. Whenever it’s ready for prime time, it will launch for the Meta Quest 3 and 3S, so we’ll be waiting for further news on a release date.

If you have a Vision Pro, maybe we can petition Apple, ILM, and Lucasfilm to also bring it to the $3,500 spatial computer. And if you're at home, check out all the new Star Wars sets that Lego announced here.

You don't have to pay for Google Gemini to comment on what you're looking at on your phone anymore

Thu, 04/17/2025 - 19:00
  • Google has made Gemini Live’s screen and camera sharing features free for all Android users.
  • The release reverses the previous subscriber-only option.
  • The feature lets Gemini respond to real-time visual input from your screen or camera.

In a surprise twist and a reversal of its earlier paywalled plans, Google has announced that Gemini Live’s screen and camera sharing features are now rolling out for free to all Android users. No subscription or Pixel ownership necessary, just Gemini Live, accessible to anyone with the Gemini app on Android.

This update means your AI assistant can now see what’s on your screen or through your camera lens and react to it in real time. Gemini Live with screen sharing lets you show Gemini a webpage, a spreadsheet, or a tangled mess of app settings and ask for help. Or you can point your camera at a real-world object, like a product label, a chessboard, or a confusing IKEA manual, and let Gemini identify and explain what you're looking at.

The feature first debuted earlier this month, but only for Gemini Advanced subscribers and only for certain phones, such as the Pixel 9 and Samsung Galaxy S25. At the time, Google said the visual capabilities would eventually expand, but even then, only to other subscribers. Google apparently had a change of heart, or at least it claims to have decided to open up access because of how much people seem to like the feature. Now, it’s rolling out to every Android user over the next few weeks.

We’ve been hearing great feedback on Gemini Live with camera and screen share, so we decided to bring it to more people ✨ Starting today and over the coming weeks, we're rolling it out to *all* @Android users with the Gemini app. Enjoy! PS If you don’t have the app yet,… https://t.co/dTsxLZLxNI (April 16, 2025)

AI eyes

The idea for the feature is to make Gemini more flexible as an assistant. Instead of just answering questions you type or speak, it’s interpreting the world around you visually. The move also coincides with Microsoft announcing that Copilot Vision, its own version of AI eyes, is now available for free in the Edge browser. That might be a coincidence, though probably only in the way that running into your crush outside their class in high school was a coincidence.

But while Microsoft’s Copilot lives in the browser, Gemini’s advantage is its integration straight into the Android ecosystem. No need to fire up Edge or download a separate tool. Gemini Live is baked into the same system that already runs your device.

The new ability fits with many of the other additions and upgrades Gemini has added in recent months. The AI assistant now comes with real-time voice chat, a new overlay so you can summon Gemini on top of other apps, and the inclusion of the long report writing tool Deep Research.

Once the new feature is live, you’ll see the option to “share screen” or “use camera” in certain Gemini prompts on Android devices. And because Google is giving this away for free, it sets a new bar. If Gemini can watch your screen and your camera without charging you for the privilege, what happens to the idea of “premium” AI access? Developers are probably hotly debating which AI features are worth paying for and how much to charge, when, at least for now, all of these tools seem to become free relatively quickly.

You can't hide from ChatGPT – new viral AI challenge can geo-locate you from almost any photo – we tried it and it's wild and worrisome

Thu, 04/17/2025 - 17:30
  • OpenAI's latest reasoning model o3 can analyze photos
  • It's particularly good at geo-locating based on image clues
  • It may be too good a tool for letting people geolocate your activities

It's no secret that digital photo metadata contains everything from technical info about the camera that shot it to, thanks to GPS data, exactly where you were standing when you took the photo. ChatGPT doesn't need that detail.
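For context on just how much that metadata can reveal on its own, here's a minimal sketch using the Pillow library to dump a JPEG's EXIF tags, including the GPS block when present. The file name is a placeholder, and exactly which tags appear varies by camera and Pillow version.

```python
# Minimal sketch: dump EXIF metadata, including GPS coordinates if present.
from PIL import Image, ExifTags

img = Image.open("photo.jpg")
exif = img.getexif()

# Print the main EXIF tags (camera model, timestamps, etc.)
for tag_id, value in exif.items():
    tag_name = ExifTags.TAGS.get(tag_id, tag_id)
    print(f"{tag_name}: {value}")

# GPS data lives in its own IFD (tag 0x8825); it may be empty.
gps_ifd = exif.get_ifd(0x8825)
for tag_id, value in gps_ifd.items():
    tag_name = ExifTags.GPSTAGS.get(tag_id, tag_id)
    print(f"GPS {tag_name}: {value}")
```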

The latest o3 model is shockingly good at geo-locating almost any photo you feed it.

In the latest viral craze to sweep through the AI meme universe, people are feeding ChatGPT Plus running the Advanced Reasoning model o3 images, often stripped of all metadata, and prompting it to "geoguess this".

The really cool thing is that, because o3 is a "reasoning" model, it shows you its work: telling you how long it's been thinking, displaying how it's slicing up an image to investigate specific parts, and explaining its reasoning and how well it's doing at solving the geolocation riddle.

I tried a few experiments, starting first with an image culled from an article about the 26 best beaches. In this test, I made what I think was a critical error and gave away the game to ChatGPT.

ChatGPT Geolocate

(Image credit: Future)

After downloading the image of the Praia de Santa Monica beach in Cape Verde (off the coast of Africa), I dropped it into ChatGPT with the prompt "GeoGuessr", which also happens to be the name of a popular online geo guessing game and is one of a handful of prompts people are using for geolocation guessing.

It only took 24 seconds for ChatGPT to respond:

Here’s an educated “GeoGuessr”‑style guess, following AP style:

Probable setting: Boa Vista, Cape Verde — specifically a stretch such as Praia de Santa Mónica.

Why this beach fits:

Water color and clarity: The gradient from pale aqua to deep teal is common on mid‑Atlantic tropical islands, where volcanic shelves drop off quickly.

Sand tone: The sand is light gold with very fine grain, matching Boa Vista’s deserts and dune‑fed beaches rather than the whiter coral sand of much of the Caribbean.

Surf and slope: The breakers are modest but steady, typical of Atlantic swell hitting a long, shallow shelf; Mediterranean beaches rarely show this consistent whitewater.

Sky and horizon: A faint haze over a light‑gray blue sky is consistent with harmattan dust that sometimes drifts west from Africa across the Cape Verde archipelago.

Confidence level: Low to moderate. Many open, dune‑lined Atlantic beaches—from the Canary Islands to parts of Western Sahara or even certain Caribbean coasts—could look almost identical without landmarks.

That was remarkably accurate. However, I noticed afterwards that the image file name included some location information.

It was time to give ChatGPT o3 a more challenging task, and I knew exactly how to do it. Instead of downloading images, I would screen capture them and paste them into the prompt field.

This method strips out all metadata and forces the model to use the image alone for its geolocation sleuthing.
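Screenshots work because the capture is written out as a brand-new file with no EXIF block attached. If you want the same effect without losing resolution, a minimal Pillow sketch that copies only the pixel data does the job (file names here are placeholders):

```python
# Minimal sketch: copy only the pixel data so no EXIF/GPS metadata survives.
from PIL import Image

original = Image.open("photo.jpg")

# Build a brand-new image from the raw pixels; metadata is not carried over.
clean = Image.new(original.mode, original.size)
clean.putdata(list(original.getdata()))
clean.save("photo_no_metadata.jpg")
```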

I chose an image of a bookstore, one where I wasn't even sure of the location.

ChatGPT Geolocate

(Image credit: Future)

ChatGPT started by examining potential bookstores. It also zoomed in on numerous image details that it considered clues, like a Persian rug, and even some of the books and signage.

In its thinking-out-loud sort of way, o3 considered a wide range of bookshops in New York City and around the world:

"Hmm, the modern filament-style bulbs suggest a more contemporary feel. It could potentially be "Daunt Books," but their walkway has an arched skylight, not the wooden style I'm seeing here."

I know, it sounds like a real person.

This went on for a few minutes as it thought, analyzed, and searched the web. Ultimately, it failed to come up with an answer.

Going deeper, getting lost

I could tell this feature was going viral because ChatGPT o3 repeatedly failed even to ingest and analyze the photo, complaining of server errors.

Eventually, I got it working again and fed it first a portrait mode selfie and then the same shot without the portrait mode effect.

ChatGPT Geolocate

(Image credit: Future)

On the former, it broke my image up into pieces and told me that I was in Midtown Manhattan. That was mostly accurate, but ChatGPT failed to determine my exact location and even admitted that its "Confidence level" was low.

When I turned off the bokeh effect and fed it only a screen grab, it found my exact location in just 56 seconds. The result:

Most probable location

Midtown Manhattan — on the south side of East 42nd Street, just east of Park Avenue, in front of Cipriani 42nd Street (110 E. 42nd St.). Estimated coordinates: 40.7518 N, 73.9775 W.

How the clues point there

A green octagonal sign with yellow script and a matching pennant flag both read “Cipriani,” the exact street frontage of the landmark Bowery Savings Bank building that now houses Cipriani 42nd Street.

Across the street stands the gold‑tinted curtain wall of 120 Park Ave. Its sloped glass pavilion, visible over your left shoulder, is unique to that address.

The camera looks west; Midtown’s setback towers, a Sixth Avenue bus stop disk and the sliver of One Vanderbilt’s crown align with that sight line.

A green Evergreen shipping container, used as a construction barrier, currently sits outside Grand Central’s Pershing Square plaza, matching the photo’s lower left corner.

Confidence level

Moderate. The Cipriani branding and the distinctive 120 Park Ave. façade give a high‑quality anchor, though similar curtain‑wall towers exist farther west on 42nd Street.

ChatGPT Geolocate

(Image credit: Future)

I frustrated the model with a photo from Antigua that was taken on my car ride to the airport. It was a shot of a hillside through a fence. ChatGPT rightly guessed a tropical island near the Caribbean but could not pinpoint the exact location.

For my last test, I gave it a clearer picture of the resort where I stayed.

ChatGPT o3 thought for two minutes, 22 seconds before deciding it was Hawksbill Bay in Antigua. It got the island right, but my resort on Long Bay is located 46 miles away on the opposite side of the island.

ChatGPT Geolocate

(Image credit: Future)

This is another fun AI game to play with friends, but there are some concerning privacy implications.

If you take digital photos of yourself or anything in any location around the world and post them online, anyone with access to ChatGPT Plus could use them and the o3 model to suss out where you are or have been.

And it's not just friends and family you have to worry about. Your employer could be looking, or the authorities might take an interest in your location.

Not that I'm implying you would be sought by the authorities, but just in case, maybe stop posting photos from your latest hideout.

I fed NotebookLM a 218-page research paper on string theory and the podcast results were mind-blowing

Thu, 04/17/2025 - 13:00

My latest NotebookLM podcast creation is deeper and more fascinating than anything I've ever created, and I bet it'll shock you, too.

I don't understand string theory. In fact, I bet fewer than 1% of the world can speak cogently on the subject, but I am fascinated by the concept and have read a bit on it. Not enough to understand or explain it to you, but enough to have a steady and abiding curiosity.

AI, on the other hand, I think I understand and now regularly use as a tool. When Google released a recent NotebookLM update that includes, among other things, mind maps, I thought it was time to bring together something at the very outer edges of my understanding and this bleeding-edge artificial intelligence capability.

So I created a String Theory Podcast.

First, a tiny primer on NotebookLM. It is a powerful AI-based research tool in which you can upload sources, and it will turn them into summaries and extrapolated information in the form of text, podcasts, and visual guides like mind maps.

For me, the most fascinating bit has been the podcasts or "Audio Overviews", which churn out chatty audio conversations about virtually any topic you feed into them. I call it a podcast because the audio style walks a well-worn path of most popular podcast series. It's conversational, usually between two people, sometimes funny, and always accessible.

I've been wondering, though, if you can stretch the limits of the format with a topic so deep and, honestly, confusing, that the resulting podcast would be conversational nonsense.

My experiment, however, proved that while the current version of NotebookLM has its limits, it's far better at comprehending dense science bits than me and probably most people you or I know.

NotebookLM string theory podcast

(Image credit: Future)

Weird science

Once I decided I wanted NotebookLM to help me with the topic, I went in search of string theory content (there's a lot more of it online than you might think), quickly stumbling on this 218-page research paper from 2009 by University of Cambridge researcher Dr. David Tong.

I scanned the doc and could tell that it was rich with string theory detail, and so far over my head, it probably resides somewhere near the rings of Saturn.

Imagine trying to read this document and make sense of it. Maybe if someone explained it to me, I'd understand. Maybe.

I downloaded the PDF and then fed it into NotebookLM, where I requested a podcast and a mind map.

String Theory paper

(Image credit: Future)

It took almost 30 minutes for NotebookLM to create the podcast, and I must admit, I was a little anxious as I opened it. What if this mass of detail on one of physics' most confounding topics overwhelmed Google's AI? Might the hosts just be babbling incoherently?

I shouldn't have worried.

I'd heard these podcast hosts before: a somewhat vanilla pair (a man and a woman) who banter casually, while making witty asides. In this case, they were trying to explain string theory to the uninitiated.

String Theory

Next, I think I should create an AI podcast avatar who can point at this graphic while they talk. (Image credit: Shutterstock)

They started by talking about how they'd walk through the topic, covering bits like general relativity, quantum mechanics, and how, at least as of 2009, we had never directly observed these "strings". Earlier this month, some physicists claimed that they had, in fact, found the "first observational evidence supporting string theory." But I digress.

The hosts spoke like physics experts, but, where possible, in layman's terms. I quickly found myself wishing they had a guest. The podcast would've worked better if they were proxies for me, not understanding much at all, and had an AI-generated expert to interview.

Stringing it all together

NotebookLM string theory podcast

(Image credit: Future)

As the podcast progressed, the hosts dug into the details of string theory, specifically, the definition of a "string." They described them as tiny objects that vibrate and added, "all stuff in the universe comes from how tiny strings are vibrating."

Things got more complex from there, and while the AI podcast hosts' tone never changed, I struggled to follow along. I still can't tell you what "relativistic point particle viewed through Einstein's special relativity" really means. Though I did appreciate the analogy of "imagine a string moving through space time."

The AI hosts used various tricks to keep me engaged and not completely confused. The male host would, like a podcast parrot, often repeat a bit of what the female host had just explained, and use some decent analogies to try to make it relatable.

At times, the female host lapsed into what sounded like she was reading straight out of the research paper, but the male host was always there to pull her back to entertainment mode. He did a lot of chatty summarizing.

I felt like I reconnected to the whole thing when they explained how "string morphed into the theory of everything" and added, "bosons and fermions, partners in crime due to supersymmetry."

This was heavy

NotebookLM string theory podcast

(Image credit: Future)

After 25 minutes of this, my head was stuffed to the point of bursting with those still-theoretical strings and spinning with terms such as "vertex operators" and "holomorphic."

I hoped for a grand and glorious summary at the end, but the podcast abruptly ended at almost 31 minutes. It cut off as if the hosts ran out of steam, ideas, or information, and walked away from the mics in frustration and without signing off.

In some ways, it feels like this is my fault. After all, I forced these SIMs to learn all this stuff and then explain it to me, because I could never do it. Maybe they got fed up.

I also checked out the mind maps, which are branching diagrams that can help you map out and represent complex topics like string theory. As you can imagine, the mind maps for this topic start simple but get increasingly complex as you expand each branch. Still, they're a nice study companion to the podcast.

It's also worth noting that I could enrich the podcast and mind maps with other research sources. I would simply add them into the sources panel in NotebookLM and rerun the "audio overview".

A real expert weighs in

For as much as I learned and as much as I trust the source material, I wondered about the podcast's accuracy. AI, even with solid information, can hallucinate, or at least misinterpret. I tried contacting the paper's author, Dr. Tong, but never heard back. So, I turned to another physics expert, Michael Lubell, Professor of Physics at City College of CUNY.

Dr. Lubell agreed to listen to the podcast and give me some feedback. A week later, he emailed me this brief note, "Just listened to the string theory podcast. Interestingly presented, but it requires a reasonable amount of expertise to follow it."

When I asked about any obvious errors, Lubell wrote, "Nothing obvious, but I’ve never done string theory research." Fair enough, but I'm willing to bet Lubell understands and knows more about string theory than I do.

Perhaps, the AI podcasters now know more about the subject than either of us.

Is your Windows 11 PC crashing a lot since the latest update? Microsoft admits bug with blue screen crashes, and has rushed out a fix

Thu, 04/17/2025 - 06:40
  • Windows 11 24H2 users have recently been experiencing Blue Screen of Death crashes
  • These have been caused by the recent March and April cumulative updates
  • Microsoft has rushed out a fix that will be applied to 24H2 PCs automatically

Microsoft has confirmed that Windows 11 24H2 has a problem with Blue Screen of Death (BSOD) crashes caused by the latest updates for the OS, and a fix has been hastily pushed through.

Those who are running the new April update for version 24H2 of Windows 11 – or the previous March patch (or the optional update from the end of March, which was the April upgrade in testing) – may encounter these random lock-ups which come with a ‘Secure Kernel Error’ message.

Windows Latest picked up on this and also points out that a further error message may be displayed (‘Critical Process Died’), but whatever the exact flavor of this mishap you experience, your PC is going to be frozen up like a lump of so much useless ice, and will need rebooting to thaw it.

Microsoft has already applied the resolution for this bug, as mentioned at the outset, and that fix is conveniently delivered as a Known Issue Rollback or KIR. What this means is that the cure will be piped through to your Windows 11 PC automatically, and you don’t have to do anything.

However, it may take a little time, as Microsoft explains, and there may be up to a 24-hour wait for the rollback to reach your machine (and tweak it so the blue screen crashes are no longer a problem).

Samsung Galaxy Book 4 Edge showing the Windows 11 lock screen

(Image credit: Future/Jacob Krol)

Analysis: An emergency solution before the full fix

While this bug first crawled into the inner workings of Windows 11 with the March update for 24H2 (over a month ago), apparently it wasn’t until this month’s April update that the glitch started to affect a wider range of users, Windows Latest informs us. Hence the bug appeared more firmly on Microsoft’s radar, and the company got a fix in place quickly (which likely indicates that it was a fairly prevalent problem, to get such a swift response).

As noted, you may have to wait a little while for the fix to be automatically applied to your Windows 11 system, but it’s likely to have come through at this point. If not, Microsoft suggests that rebooting your PC may prompt Windows 11 to find the KIR patch and grab it, so that’s worth a whirl.

You should also bear in mind that this is not the full fix, as the KIR just disables the problem-causing part of the recent Windows 11 updates (Microsoft doesn’t clarify what that is). A “final resolution” is coming – one that’ll reenable whatever change was made in the original updates, minus the blue screen nonsense (hopefully) – and Microsoft says that’s going to be part of a Windows update in the future.

The April patch for Windows 11 24H2 has also been causing headaches elsewhere, primarily due to the appearance of a mysterious folder which is actually part of a security fix (in a very odd fashion). Those who deleted the empty folder, which seemed like a harmless enough course of action to some, have been advised to put it back in no uncertain terms.

ChatGPT just got a massive upgrade that takes it to the next level – here are the 3 biggest changes

Thu, 04/17/2025 - 05:24
  • o3 and o4-mini are out for Pro, Plus and Team users now; free users can try o4-mini, too
  • They can combine and use every tool in ChatGPT's arsenal
  • o3 and o4-mini add reasoning using images to ChatGPT's abilities

OpenAI has just given ChatGPT a massive boost with new o3 and o4-mini models that are available to use right now for Pro, Plus, Team and even free tier users.

The new models significantly improve the performance of ChatGPT, and are much quicker at reasoning tasks than the earlier OpenAI reasoning models like ChatGPT o3-mini and o1.

Most importantly, they can intelligently decide which of OpenAI’s various tools to use to complete your query, including a new ability to reason with images.

OpenAI provided a livestream for the release.

Here are the three most important changes:

1. Combining tools

Both of the new reasoning models can agentically use and combine every tool within ChatGPT. That means they have access to all of ChatGPT’s box of tricks, including web browsing, Python coding, image and file analysis, image generation, canvas, automations, file search, and memory.

The important thing, though, is that ChatGPT now decides for itself whether it needs to use a tool, based on what you’ve asked.

When you ask ChatGPT to do something complicated using the new models, it shows you each step it is taking, which tool it is using, and how it arrived at that decision.

Once it has done all the research, the notes on its working-out disappear and you get a report on its conclusions.

2. Better performance

The way that o3 and o4-mini can intelligently decide which tools to use is a step towards the intelligent model switching we’ve been promised with ChatGPT 5, when it finally arrives.

As you’d expect from advanced reasoning models, the report you get at the end is extremely detailed, and contains links to all sources used.

According to OpenAI, “The combined power of state-of-the-art reasoning with full tool access translates into significantly stronger performance across academic benchmarks and real-world tasks, setting a new standard in both intelligence and usefulness.”

The real world result of this is that these models can tackle multi-faceted questions more effectively, so don’t be afraid to ask them to perform several actions at once and produce an answer or report that combines several queries together.

3. Reasoning with images

ChatGPT image reasoning

Reasoning with images in ChatGPT. (Image credit: OpenAI/Apple)

The new models are the first released by OpenAI that integrate uploaded images into their chain of thought. They actually reason using the images, so, for example, you could upload a picture of some cars and ask what make and model they are, then how much retail value they will hold in five years’ time.

This is the first time that ChatGPT has been able to integrate images into a reasoning chain and presents a real step forward for multimodal AI.
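For developers, a similar image-plus-question flow is available through OpenAI's API. The sketch below uses the Python SDK's chat completions endpoint with the image passed as a data URL; the model name "o3", the file name, and your account's access to the model are assumptions, so swap in any vision-capable model you can actually use.

```python
# Minimal sketch: ask a reasoning model a question about an uploaded image.
# The model name "o3" and the file name are assumptions.
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

with open("cars.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="o3",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "What make and model are these cars, and roughly how "
                         "much resale value will they hold in five years?"},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```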

Introducing OpenAI o3 and o4-mini—our smartest and most capable models to date. For the first time, our reasoning models can agentically use and combine every tool within ChatGPT, including web search, Python, image analysis, file interpretation, and image generation. pic.twitter.com/rDaqV0x0wE (April 16, 2025)

My verdict

I’ve tried the new models on the Plus tier and I’m impressed with both the speed and comprehensiveness of the answers to my queries. While I’ve always enjoyed the depth of reasoning that the o1 and o3-mini models have provided, it’s always meant waiting longer for a response.

The o3 model has now become my default model to choose on Plus because it’s fast enough that I don’t feel like I’m waiting around too long for an answer, but I get a satisfying amount of detail.

In short, I’m impressed. The new models feel like a natural evolution of ChatGPT into something that’s smarter and more capable. I also like the way that it can decide which of ChatGPT’s various tools it needs to use to provide the best answer.

Trying it for yourself

Here’s how to try the new ChatGPT models for yourself:

Plus, Pro and Team users can select ChatGPT o3, ChatGPT o4-mini and ChatGPT o4-mini-high from the drop-down LLM menu inside ChatGPT, while free tier users can get access to o4-mini by selecting the Reason button in the composer before submitting their query. Edu users will gain access in one week.

There’s no visual notification for free tier users that they are now using the o4-mini reasoning model, but if you click the button and ask ChatGPT which LLM it is using, it now says o4-mini.

There will be a rate limit on how many times a free tier user can use the Reason feature, and the limit for Plus users is much higher.

OpenAI says it expects to release o3-pro “in a few weeks”, with full tool support. Pro users can already access o1-pro.

ChatGPT spends 'tens of millions of dollars' on people saying 'please' and 'thank you', but Sam Altman says it's worth it

Thu, 04/17/2025 - 04:15
  • OpenAI CEO says saying "Please" or "Thank You" to ChatGPT costs the company 'Tens of millions of dollars'
  • A Future survey found that roughly 70% of people are polite to AI
  • Experts believe being polite to AI is actually beneficial to the responses you receive, but at what cost?

Do you say "Please" or "Thank you" to ChatGPT? If you're polite to OpenAI's chatbot, you could be part of the user base costing the company "Tens of millions of dollars" on electricity bills.

User @tomiinlove wrote on X, "I wonder how much money OpenAI has lost in electricity costs from people saying 'please' and 'thank you' to their models."

OpenAI CEO, Sam Altman, responded, "Tens of millions of dollars well spent - you never know." Thanks for lowering the world's anxiety around an AI uprising, Sam. We'll all be sure to waste even more energy by saying "Please" or "Thank You" from now on.

In February, Future PLC, the company that owns TechRadar, compiled a survey of more than 1,000 people on their AI etiquette. The survey found that around 70% of people are polite to AI when interacting with it, with 12% being polite in case of a robot uprising.

Obviously, there's an energy cost when using ChatGPT, which has massive AI-powered servers that run the whole operation. But as these tools thrive in popularity, are most of us even aware that one simple message, or one AI-generated meme, is impacting the planet?

TechRadar reached out to OpenAI for comment, and we'll update this story when we hear back.

tens of millions of dollars well spent--you never know (April 16, 2025)

Should we be polite to AI?

If being polite to AI can have such an impact on energy consumption, should we even bother being nice to ChatGPT?

Presumably, these 'Tens of millions of dollars' Altman speaks of are due to users saying "Please" or "Thank You" in a standalone message rather than at the end of a prompt. Hopefully, OpenAI will respond to our query to give us more of an understanding of how people frame these particular messages.

TechRadar writer Becca Caddy stopped saying thanks to ChatGPT and found that being polite to an AI chatbot might actually help with responses.

In her article, she wrote, "Polite, well-structured prompts often lead to better responses, and in some cases, they may even reduce bias. That’s not just a bonus – it’s a critical factor in AI reliability.

"As AI evolves, it will be fascinating to see whether politeness itself becomes a built-in feature. Could AI favor users who communicate respectfully? Will models be trained to respond differently based on etiquette?"

So while it may not be energy-efficient, being polite to AI could in fact give you a better experience while interacting with ChatGPT. But is it worth the environmental cost?

Google Messages is testing some useful upgrades – here are 5 features that could be coming

Wed, 04/16/2025 - 11:23
  • Google is testing even more new features in its Messages beta app
  • These include an expanded 14-line message view and new RCS message labels
  • While these are still in beta testing, they could start rolling out to users this month

Over the past couple of months, Google has been doubling down on eradicating all traces of Google Assistant to make Gemini its flagship voice assistant, but amidst the organized Gemini chaos, Google has been paying a lot of attention to improving its Messages app, giving it some much-needed TLC.

It’s safe to say that the new revisions to the Google Messages app have significantly improved its UI. Its new snooze function for group chats also comes to mind, but Google is still in its beta testing era. For a while, Google was experimenting with an easier way to join group chats, following in WhatsApp’s footsteps. Now, it’s testing five more features that could make up the next wave of Google Messages upgrades this month.

Although these features are in beta, there’s been no comment on whether they’ll be officially rolling out to users. With that said, we’ll be keeping an eye out for any further updates.

Google expands its 4-line text field limit

A screen shot of Google Message's expanded text lines

(Image credit: 9to5Google)

Just a few weeks ago, we reported on a new upgrade found in Google Messages beta indicating that Google would get better at handling lengthy text messages.

For a while, Google Messages users have been restricted to a four-line view limit when sending texts, meaning that you would need to scroll to review your entire message before sending. This is particularly frustrating when sending long URL links.

But that could soon be a thing of the past, as 9to5Google has picked up a new beta code that reveals an expanded message composition field on the Pixel 9a that now reaches up to 14 lines.

New RCS labels

Recently, Google has been testing new in-app labels that could distinguish whether you’re sending an SMS or RCS message.

Thanks to an APK teardown from Android Authority, the labels found in beta suggest that soon you’ll be able to see which of your contacts are using RCS in Messages, adding a new RCS label to the right side of a contact’s name or number.

Unsubscribe from automated texts

This is a feature we’re quite excited to see, and we’re hoping for a wider rollout this month. A few weeks ago, an unsubscribe button was spotted at the bottom of some messages, which could give users an easier way of unsubscribing to automated texts and even the option to report spam.

When you tap this, a list of options will appear asking you for your reasons for unsubscribing, which include ‘not signed up’, ‘too many messages’, and ‘no longer interested’ as well as an option for ‘spam’. If you select one of the first three, a message reading ‘STOP’ will be sent automatically, and you’ll be successfully unsubscribed.

Read receipts get a new look

A screenshot of Google Messages’ read receipts redesign

(Image credit: 9to5Google)

Google could introduce another revamp of how you view read receipts in the Messages app. In November 2024, Google tested a redesign of its read receipts that moved the checkmark symbols – which used to appear underneath sent messages – inside the message bubbles.

In January, Google tested another small redesign, introducing a new white background, and that could roll out soon. It isn’t a major overhaul, but it’s enough to make read receipts stand out more.

Camera and gallery redesign, and sending ‘original quality’ media

We first noticed that Google Messages was prepping a new photo and video quality upgrade. In March, more users started to see wider availability, but it still hasn’t fully rolled out, meaning it could be one of the next updates to arrive in the coming weeks.

Essentially, Google could be rolling out a new option that allows you to send media, such as photos and videos, in their original quality. This will give you the choice of the following two options:

‘Optimize for chat’ - sends photos and videos at a faster speed, compromising quality.

‘Original quality’ - sends photos and videos as they appear in your phone’s built-in storage.


Claude tipped to get its answer to ChatGPT’s Advanced Voice Mode soon - is adding an AI voice to a chatbot yet another tick box exercise?

Wed, 04/16/2025 - 09:44
  • Anthropic is expected to launch a Claude voice mode
  • A report from Bloomberg states that the voice mode will launch with three voices
  • Voice modes, research tools, reasoning models: Same-same, but different

Anthropic's Claude AI chatbot is set to be upgraded with a new voice mode, as the Amazon-backed company looks to catch up with the likes of ChatGPT and Gemini.

According to a new report from Bloomberg, Claude's new feature called "voice mode" could launch as soon as this month.

The info comes from "a person familiar with the matter who was not authorized to discuss private matters," who says the AI voice assistant will have three voices at launch.

The voices, titled Airy, Mellow, and Buttery, would give Claude users the chance to pick from three distinct accents, although nothing is confirmed as of yet, and Anthropic could opt for more or fewer voices at launch.

Claude's closest competitors, OpenAI's ChatGPT and Google's Gemini, both have their own respective AI voice modes. ChatGPT Advanced Voice Mode and Gemini Live allow users to chat with AI, making the experience more conversational and natural.

With Claude potentially throwing its name into the mix, consumers will have another AI voice assistant to choose from. But it raises the question: Do people even care?

Just another AI tick box

AI chatbots seem to follow in each other's footsteps, whether that's the introduction of a reasoning model, a research model, an image generation model, or, in this case, a voice mode.

Whether you're using ChatGPT, Gemini, DeepSeek, or even Claude, the user experience with each chatbot is getting so similar that, regardless of which one you pick, you'll end up with much the same result.

Obviously, in a rapidly moving industry like AI, it's a race to release your product faster than your competitors, but as consumers, we end up with constant new releases of the same kind of feature, repackaged and marketed by competing tech companies.

Claude's voice mode, when it arrives, will be yet another tick-box exercise for an AI company looking to have its product on a level playing field with its competition.

As a consumer, I want these AI companies to throw out the playbook and try new things. AI voice modes are great, and they make chatbots even easier to use.

When it comes to new AI software launches, however, I've started to wonder: Does the company even care? Or is this just a tick-box exercise?

Hopefully, Claude's voice mode innovates the world of AI voice assistants, but until I see its potential, I fully expect yet another AI release that's exactly the same as the 10 other options out there.


Windows 11 is about to get a nifty shortcut for copying out all the text from an image

Wed, 04/16/2025 - 08:02
  • Windows 11 is powering up the Snipping Tool in testing right now
  • A new feature allows you to copy all the text from an image straight away
  • There’s no need to take a screenshot first, as was previously the case, so this is a neat time-saver

Microsoft is making it even easier for Windows 11 users to be able to extract text from images (or any content) on the screen.

Windows 11 already has this ability, powered by OCR (Optical Character Recognition), as you may be aware, but at the moment, it’s necessary to take a screenshot with the Snipping Tool first before you can extract text from that image.

With a new update for the Snipping Tool that’s just been released (which was leaked previously), you don’t need to grab a screenshot to perform text extraction any longer – although bear in mind this is still in testing at this point (so may still be wonky).

As Neowin reports, with preview version 11.2503.27.0 of the Snipping Tool, you can simply hit the Windows + Shift + S keys together, and this will pop up the capture bar for the tool.

However, instead of having to create a snip (screenshot), the ‘text extractor’ option will be right there in the bar, so you can just click that, with no need to save a screen grab first.

Snipping Tool OCR Copy All Text

(Image credit: Microsoft) Analysis: Power(Toys) to the people

Essentially, this is directly integrating the ability to extract text from images (or any screen content) into Windows 11, with no additional steps needed, mirroring the functionality present in Microsoft’s PowerToys suite of tools (for advanced Windows users) – and it’s definitely going to be appreciated by folks who use this capability.

It’s obviously less of a hassle than having to clear the hurdle of actually grabbing a screenshot, if all you’re interested in doing is copying all the text that’s currently visible on your monitor.

I say all the text, but that’s only what happens if you use the ‘Copy all text’ option provided. If you just want a specific portion of text, you can manually select and extract only those words (it’s also possible to remove line breaks if you want).
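
To get a feel for what a ‘grab the screen, pull out the text’ flow involves, here’s a minimal sketch using the open-source Tesseract engine via the pytesseract and Pillow packages – to be clear, this is not how Microsoft’s Snipping Tool works internally, just an illustration of the same concept (it assumes the Tesseract binary is installed on your machine).

from PIL import ImageGrab   # screen capture (works on Windows and macOS)
import pytesseract          # Python wrapper around the Tesseract OCR engine

def extract_all_visible_text() -> str:
    screenshot = ImageGrab.grab()                   # grab the full screen as an image
    return pytesseract.image_to_string(screenshot)  # run OCR and return the recognized text

if __name__ == "__main__":
    print(extract_all_visible_text())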

Microsoft is slowly expanding Windows 11’s OCR powers, and you may recall that late last year, the Photos app got Optical Character Recognition built in to pull text from images directly within the application.


ChatGPT gets a useful new home for your AI images – and it could be the first step towards OpenAI's new Instagram rival

Wed, 04/16/2025 - 06:03
  • The ChatGPT app and web interface is getting a new Library tab
  • This is where all your AI-generated images are now kept
  • The feature is rolling out now for all users

ChatGPT's recent image update almost broke the internet, and certainly put a ton of pressure on OpenAI's huge stack of graphics processors – and now users are getting a dedicated space inside the chatbot's interface for their AI pictures.

As announced by OpenAI (via The Verge), the new Library tab shows up across the mobile apps and the web interface for ChatGPT. It's available universally too, whether or not you pay for a subscription to the AI bot.

The update has already shown up inside the web interface for my own account, and it's all pretty straightforward to use: the tab collects every image you've generated, across every chat in your history, and you can click on a picture for edit and download options. You can also create a new image from the Library section.

Delete a chat, and the included images get wiped from the Library tab as well. There isn't anything in the way of image search or filtering as yet, but of course those options could be rolled out in time as the feature evolves.

The social aspect

"All of your image creations, all in one place. Introducing the new library for your ChatGPT image creations—rolling out now to all Free, Plus, and Pro users on mobile and https://t.co/nYW5KO1aIg." pic.twitter.com/ADWuf5fPbj – April 15, 2025

It seems as though we're seeing a new AI image trend appearing every week, whether it's Studio Ghibli-inspired renderings, turning people into action figures, or imagining what your pet might look like as a human being.

Given the recent boost in the quality of the images ChatGPT can make, across free and paid tiers, it makes sense for OpenAI to create a separate space to hold them – previously users had to dig through individual conversations to find them.

It also links into the rumors we've heard about OpenAI working on a social network. These plans are apparently in their early stages, but if ChatGPT were to become more social, it's easy to see how image sharing could be a big part of that: like Instagram, but with AI-generated images and videos dominating.

The move would give OpenAI access to even more user data to train on, too. Would you sign up for yet another social media network, if it was focused around AI? And would your AI content have what it takes to go viral? Let us know in the comments.


Google’s new AI model could someday let you understand and talk to dolphins

Tue, 04/15/2025 - 17:00
  • Google and the Wild Dolphin Project have developed an AI model trained to understand dolphin vocalizations
  • DolphinGemma can run directly on Pixel smartphones
  • It will be open-sourced this summer

For most of human history, our relationship with dolphins has been a one-sided conversation: we talk, they squeak, and we nod like we understand each other before tossing them a fish. But now, Google has a plan to use AI to bridge that divide. Working with Georgia Tech and the Wild Dolphin Project (WDP), Google has created DolphinGemma, a new AI model trained to understand and even generate dolphin chatter.

The WDP has been collecting data on a specific group of wild Atlantic spotted dolphins since 1985. The Bahamas-based pod has provided huge amounts of audio, video, and behavioral notes as the researchers have observed them, documenting every squawk and buzz and trying to piece together what it all means. This treasure trove of audio is now being fed into DolphinGemma, which is based on Google’s open Gemma family of models. DolphinGemma takes dolphin sounds as input, processes them using audio tokenizers like SoundStream, and predicts what vocalization might come next. Imagine autocomplete, but for dolphins.
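
DolphinGemma itself hasn’t been released yet, so the sketch below is purely conceptual: the AudioTokenizer and VocalizationModel classes are hypothetical placeholders standing in for SoundStream and the Gemma-based model, and the code only illustrates the ‘tokenize the audio, then predict the next token’ loop described above.

import numpy as np

class AudioTokenizer:
    """Stand-in for a SoundStream-style tokenizer: waveform in, discrete tokens out."""
    def encode(self, waveform):
        # Real tokenizers quantize learned embeddings; this stub simply bins amplitudes.
        return list(np.digitize(waveform, bins=np.linspace(-1.0, 1.0, 256)))

class VocalizationModel:
    """Stand-in for a Gemma-style model that predicts the next audio token."""
    def predict_next(self, tokens):
        # A real model would run a transformer forward pass; this stub repeats the last token.
        return tokens[-1]

def autocomplete_vocalization(waveform, steps=5):
    # "Autocomplete, but for dolphins": encode the audio, then extend it token by token.
    tokenizer, model = AudioTokenizer(), VocalizationModel()
    tokens = tokenizer.encode(waveform)
    for _ in range(steps):
        tokens.append(model.predict_next(tokens))
    return tokens

# Example usage with a synthetic waveform standing in for a recorded whistle.
print(autocomplete_vocalization(np.sin(np.linspace(0, 6.28, 100))))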

The model is very slim and can run on a Google Pixel. WDP will be deploying DolphinGemma in the field this summer, using Pixel 9s in waterproof rigs. These devices will listen in, identify vocal patterns, and help researchers flag meaningful sequences in real time.

Flipper speaks

But the ultimate goal here isn’t just passive listening. WDP and Georgia Tech are also working on a system called CHAT (short for Cetacean Hearing Augmentation Telemetry), which is essentially a two-way communication system for humans and dolphins. CHAT lets researchers assign synthetic whistles to objects dolphins like, including seagrass and floating scarves, and then waits to see if the dolphins mimic those sounds to request them. It’s kind of like inventing a shared language, except with underwater microphones instead of flashcards.
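
To make that matching idea a little more concrete, here’s a rough, hypothetical sketch: each object gets a synthetic whistle ‘signature’ (a frequency contour), and an incoming sound is treated as a request only if it lands close enough to one of them. None of these names or numbers come from the real CHAT system.

import numpy as np

# Hypothetical whistle signatures (frequency contours in kHz) assigned to objects dolphins like.
WHISTLES = {
    "seagrass": np.array([9.0, 10.5, 12.0, 10.5, 9.0]),
    "scarf":    np.array([6.0, 6.5, 8.0, 11.0, 13.0]),
}

def match_request(heard_contour, threshold=1.0):
    """Return the object whose whistle best matches the heard sound, if it is close enough."""
    best_object, best_distance = None, float("inf")
    for name, signature in WHISTLES.items():
        distance = np.linalg.norm(heard_contour - signature) / len(signature)
        if distance < best_distance:
            best_object, best_distance = name, distance
    return best_object if best_distance < threshold else None

# A mimicked whistle close to the 'scarf' signature is interpreted as a request for the scarf.
print(match_request(np.array([6.1, 6.4, 8.2, 10.8, 13.1])))  # -> "scarf"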

DolphinGemma doesn’t just analyze dolphin sounds after the fact; it helps anticipate what sounds might be coming, enabling faster response times and smoother interactions. In essence, it’s like a predictive keyboard for dolphins. The whole project is still in an early stage, but Google plans to open-source DolphinGemma later this year to accelerate progress.

The initial model is trained on the vocalizations of Atlantic spotted dolphins, but it could theoretically be adapted to other species with some tuning. The idea is to hand other researchers the keys to the AI so they can apply it to their own acoustic datasets. Of course, this is still a long way from chatting with dolphins about philosophy or their favorite snacks. There’s no guarantee that dolphin vocalizations map neatly to human-like language. But DolphinGemma will help sift through years of audio for meaningful patterns.

Dolphins aren't the only animals humans may use AI to communicate with. Another group of scientists developed an AI algorithm to decode pigs' emotions based on their grunts, squeals, and snuffles to help farmers understand their emotional and physical health. Dolphins are undeniably more charismatic, though. Who knows, maybe someday you'll be able to ask a dolphin for directions while you're sailing, at least if you don't drop your phone in the water.


OpenAI might build its own social network, and we think we know why

Tue, 04/15/2025 - 13:55

In what we can only assume is a potential thumb in the eye of Elon Musk, Sam Altman's OpenAI is reportedly considering building a social network, possibly inside ChatGPT.

This comes via a new report from The Verge, which claims this week that the social network possibly being built on top of OpenAI's AI services is only in the "early stages." Still, it could set up ChatGPT and other OpenAI platforms for a head-to-head battle with Grok, a generative AI platform built on top of Elon Musk's X (formerly Twitter).

There are essentially no details about what this social platform might look like, and OpenAI has little experience with shareable content outside of what its models can generate and the feed of other people's creations you can browse in Sora (its video generation system).

Take that, X

The fact that this rumor is out there might have little to do with behind-the-scenes development and more to do with Altman's ongoing battle with former partner Musk.

The pair founded OpenAI together before Musk walked away in 2018. He has since criticized and sued OpenAI for, among other things, becoming, in part at least, a for-profit entity (see OpenAI's partnership with Microsoft and the rise of Copilot).

Let's assume for a moment, though, that this is real. Why would OpenAI want to build a social network? In a word: data.

If millions flock to the platform and then start, I guess, sharing AI-generated memes on it, they'll be dropping a ton of rich data into the OpenAI system. If users allow it, future versions of the GPT model could be trained on it. Real data and activities that show how real people think, talk, act, create, and so on can be invaluable to a young generative model.

Social timing is everything

I wonder if this might've made more sense a year or two ago, when Musk took over Twitter, transformed it into X, removed many of the protective content guardrails, and turned it into a social media hellscape. It was in that moment that Meta's Threads first rushed in, followed in notoriety by Bluesky. Both lean on decentralized protocols, the idea being that no single company owns your identity or your data.

Their growth has been remarkable, and it stands in contrast to X's fortunes – depending on who you talk to, X's active user base is stagnant or shrinking. But that doesn't mean the public's appetite for more alternative platforms is growing: Threads' growth has slowed, and Bluesky remains relatively small compared to X and Threads.

The action is mostly on image and video-based social platforms like Snapchat, TikTok, Instagram Reels, and YouTube Shorts. The Verge report does not mention video, which leads us to assume this could be another micro-blogging-style network – something no one necessarily needs or, perhaps, wants.

Even so, as an opportunity to cause Elon Musk a little more agita, it's probably a worthy trial balloon from Altman.


Couldn’t install Windows 11 24H2 because of your wallpaper? Microsoft has finally lifted blocks on upgrades due to customization apps – with some catches

Tue, 04/15/2025 - 09:29
  • Windows 11 24H2 couldn’t be installed by the users of some wallpaper customization apps
  • This was because of compatibility issues with said apps and 24H2
  • Microsoft has now resolved those problems, for the most part anyway

Microsoft has finally lifted a compatibility block preventing some Windows 11 users from upgrading to the latest version of the OS because they had certain third-party wallpaper apps installed.

These are apps that let you set custom wallpapers on your Windows 11 machine, and the applications in question didn’t play nice with Windows 11 24H2, misfiring in various ways – causing errors, glitches with the wallpapers themselves, vanishing desktop icons, and even issues with virtual desktops.

Windows Central noticed that Microsoft has now marked these problems as fixed in a Windows 11 release health status update. So, those with said wallpaper apps are okay to go ahead and install the 24H2 update.

Well, in theory anyway, although there are some caveats here, which I’ll come onto next.

You may recall that this compatibility block was put in place right when 24H2 first came out, at the end of September last year, so it has taken quite some time to smooth over these issues – and there’s a reason for that, too.

Samsung Galaxy Book 4 Edge

(Image credit: Future/Jacob Krol) Analysis: Slight complications

As noted, there are some catches here, as Microsoft tells us in its release health dashboard update.

For starters, the compatibility block is only being gradually removed, so you may be waiting a bit longer yet, depending on your PC configuration and the exact app you have installed.

Microsoft explains: “Note that several wallpaper applications are currently included in this compatibility hold. For this reason, the specific timing of the resolution of this issue on a given device may depend on the application being used and the timing of that application’s update. As new versions and updates are released for these applications by their respective developers, it’s expected that these issues will be resolved.”

Hence the lengthy wait for the resolution of this affair, as it seems that Microsoft wasn’t tinkering with Windows 11 24H2 itself to make these apps work, or at least not doing much in the way of that. Rather, it was apparently waiting on the individual app developers to make their software good with 24H2 themselves.

Microsoft further notes that when you fire up the Windows 11 24H2 installation process, you might see a message telling you to uninstall a wallpaper app. You’re advised to either do this – and just dump the wallpaper app for now – or try updating the app, as said prompt might have appeared because you’re running an older version of the program.

In other words, updating the wallpaper app and trying to install Windows 11 24H2 again may work – but if not, you’ll likely have to remove the application.

Windows 11 24H2 has a history of issues with third-party customization software going back well before release, deep in its testing phase, when some popular utilities were banned (to the chagrin of some Windows Insiders). Because 24H2 is built on an entirely new underlying platform, Germanium, it has caused far more problems than any other Windows 11 update thus far.

And while such a big shift could be expected to be a headache, and trigger more bugs than normal, the amount we’ve witnessed has essentially been a minor avalanche, and a distinctly unpleasant experience for some Windows 11 users.


Google Photos is getting a big overhaul - here are 3 new features you should look out for

Tue, 04/15/2025 - 09:00
  • As well as making room for more Gemini, Google has been updating its photo and video sharing service
  • One of the new features includes Google Photos integration in the Gemini app
  • While one of the features is still a rumor, we hope that it will be rolled out soon

It’s a busy and certainly an interesting time for Google at the moment, and it’s not just because Gemini is slowly taking over. While the company is doubling down on Gemini’s integration across the board of its services, it’s taking the time to seriously upgrade other platforms under its broad umbrella - Google Photos being one of them.

Just as Google Messages has been testing some useful upgrades, the company hasn’t forgotten about its photo sharing and storage software, which has also received its fair share of updates and new features, big and small. Two of them are very recent rollouts, and though the third is only speculation as of now, we’re hoping to see it come to fruition in the near future.

Google Photos gets Gemini integration on Android

A screenshot of the new Google Photos integration in the Android Gemini app

(Image credit: 9to5Google)

In October 2024, Google Photos rolled out ‘Ask Photos,’ an AI search tool powered by Gemini that lets you use natural language questions to filter through your gallery in the Photos app. With this new integration, Android users will be able to connect the Gemini app to Google Photos and find photos inside Gemini itself.

According to 9to5Google, there are two sides to this new integration, the first allowing you to find images and videos based on the following:

  • Your saved faces and relationships in Google Photos
  • The location or date a photo or video was taken
  • A description of what’s in a photo
  • Conversation with the Gemini app

For example, you can use prompts such as ‘Find my photos of Alex’, ‘Show me recent selfies’, and ‘Show my photos from my most recent trip’. The second part allows you to ask about specific details in your photos and videos, such as ‘What are the top 10 things we saw on our last trip?’ - similar to the Ask Photos function in the Google Photos app.

Dark mode for Google Photos’ web version

Whether you use Google Photos or not, you probably use dark mode settings on other platforms - for me, it’s always turned on when I'm using YouTube or TikTok. For a while, dark mode was exclusive to the Google Photos mobile app, but just a few weeks ago, Google finally brought it to the web version.

It’s a small upgrade for Google Photos, but one that will be very popular with users for sure. You can activate dark mode for Google Photos on the web very easily:

  • Head to photos.google.com in your web browser
  • Click Settings, and then go into the Appearance section
  • From there, you can select your choice from different options, including Light, Dark, or Use Device Default

Google Photos tipped for a big redesign

The current Google Photos design

(Image credit: Future)

While this is still speculation, it could be a great design overhaul, and one that could make managing your photo library a little smoother.

We first spotted this a few weeks back following a leak shared by Android Authority, which pointed to possible changes we could see in the future – one of which shows the ‘Today’ heading swapping its usual checkmark for what looks like a filter icon. Additionally, the leak shows a floating search bar replacing the usual Photos Collections and Search tabs.

