The chat app that millions of people have used to write term papers, computer code and fairy tales doesn’t just generate words. ChatGPT, OpenAI’s artificial intelligence-powered tool, can also analyze images — describing what’s in them, answering questions about them and even recognizing the faces of specific people. The hope is that, eventually, someone might upload a picture of a broken car engine or a mysterious rash and ChatGPT might suggest the solution.

What OpenAI doesn’t want ChatGPT to become is a facial recognition machine.

For the past few months, Jonathan Mosen has been among a select group of people with access to an advanced version of the chatbot that can analyze images. On a recent trip, Mr. Mosen, a chief executive of an employment agency who is blind, used the visual analysis to determine which bottles in a hotel room’s bathroom were shampoo, conditioner and shower gel. It went far beyond the performance of image analysis software he had used in the past.

“It told me the milliliter capacity of each bottle. It told me about the tiles in the shower,” said Mr. Mosen. “It described all of this in the way a blind person needs to hear it. And with one picture, I had exactly the answers I needed.”

For the first time, Mr. Mosen is able to “interrogate images,” he said. He gave an example: Text accompanying an image he came across on social media described it as “a woman with blonde hair looking happy.” When he asked ChatGPT to analyze the image, the chatbot said it was a woman in a dark blue shirt taking a selfie in a full-length mirror. He could ask follow-up questions, like what kind of shoes she was wearing and what else was visible in the mirror.

“It’s extraordinary,” said Mr. Mosen, 54, who lives in Wellington, New Zealand, and demonstrated the technology on a podcast he hosts about “living blindly.”

In March, when OpenAI announced GPT-4, the latest software model powering its AI chatbot, the company said it was “multimodal,” meaning it could respond to text and image prompts. While most users could converse with the bot only through words, Mr. Mosen was given early access to the visual analysis through Be My Eyes, a start-up that typically connects blind users with sighted volunteers and provides accessible customer service to corporate clients. Be My Eyes teamed up with OpenAI this year to test the chatbot’s “sight” before releasing the feature to the general public.

Recently, the app stopped giving Mr. Mosen information about people’s faces, saying they had been obscured for privacy reasons. He was disappointed, feeling that he should have the same access to information as a sighted person.

The change reflected OpenAI’s concern that it was building something with power it didn’t want to release.

The company’s technology can identify primarily public figures, such as people with a Wikipedia page, said OpenAI researcher Sandhini Agarwal, but it does not work as comprehensively as tools built to find faces on the internet, such as those from Clearview AI and PimEyes. The tool can recognize OpenAI’s chief executive, Sam Altman, in photos, Ms. Agarwal said, but not other people who work at the company.

Making such a feature publicly available would push the boundaries of what was generally considered acceptable practice by American technology companies. It could also cause legal problems in jurisdictions, such as Illinois and Europe, that require companies to get citizens’ consent to use their biometric information, including facial recognition.

Additionally, OpenAI was concerned that the tool would say things it shouldn’t about people’s faces, such as assessing their gender or emotional state. OpenAI is figuring out how to address these and other safety concerns before releasing image analysis widely, Ms. Agarwal said.

“We really want this to be a two-way conversation with the public,” she said. “If what we’re hearing is like, ‘We don’t actually want any of it,’ that’s something we’re very receptive to.”

Beyond gathering feedback from Be My Eyes users, the company’s nonprofit arm is also trying to come up with ways to get “democratic input” to help set rules for AI systems.

Ms. Agarwal said the development of visual analysis was not “unexpected,” because the model was trained by looking at images and text collected from the internet. She pointed out that celebrity facial recognition software already exists, such as a tool from Google. Google offers an opt-out for well-known people who don’t want to be recognized, and OpenAI is considering that approach.

Ms. Agarwal said OpenAI’s visual analysis could produce “hallucinations” similar to those seen with text prompts. “If you give it a picture of someone on the verge of being famous, it might hallucinate a name,” she said. “Like if I give it a picture of a famous tech C.E.O., it might give me a name of a different tech C.E.O.”

The tool once inaccurately described a remote control to Mr. Mosen, confidently telling him there were buttons on it that weren’t there, he said.

Microsoft, which has invested $10 billion in OpenAI, also has access to the visual analysis tool. Some users of Microsoft’s AI-powered Bing chatbot have seen the feature appear in a limited rollout; after uploading images to it, they received a message informing them that “privacy blur hides faces from Bing chat.”

Sayash Kapoor, a computer scientist and Ph.D. candidate at Princeton University, used the tool to decode a captcha, a visual security check meant to be intelligible only to human eyes. Even while cracking the code and recognizing the two obscured words it was given, the chatbot noted that “captchas are designed to prevent automated bots like me from accessing certain websites or services.”

“AI is just blowing through all the things that are supposed to separate humans from machines,” said Ethan Mollick, an associate professor who studies innovation and entrepreneurship at the University of Pennsylvania’s Wharton School.

Since the visual analysis tool suddenly appeared in Mr. Mollick’s version of the Bing chatbot last month — making him, without any notice, one of the few people with early access — he has not turned off his computer for fear of losing it. He gave it a photo of condiments in a refrigerator and asked Bing to suggest recipes using those ingredients. It came up with “whipped cream soda” and “creamy jalapeño sauce.”

Both OpenAI and Microsoft seem aware of the power – and potential privacy implications – of this technology. A Microsoft spokesperson said the company is not “sharing technical details” about the face blurring but is working “closely with our partners at OpenAI to support our shared commitment to the safe and responsible deployment of AI technologies.”
