The patient was a 39-year-old woman who came to the emergency department at Beth Israel Deaconess Medical Center in Boston. Her left knee had been hurting for several days. The day before, she had run a fever of 102 degrees. It was gone now, but she still had chills. And her knee was red and swollen.

What was the diagnosis?

On a recent steamy Friday, Dr. Megan Landon, a medical resident, presented this real-life case to a room full of medical students and residents. They were brought together to learn a skill that can be fiendishly difficult to teach – how to think like a doctor.

“Physicians are terrible at teaching other doctors how we think,” said Dr. Adam Rodman, an internist, medical historian and organizer of the event at Beth Israel Deaconess.

But this time, they could call on an expert for help to reach a diagnosis – GPT-4, the latest version of the chatbot released by the company OpenAI.

Artificial intelligence is transforming many aspects of the practice of medicine, and some medical professionals are using these tools to aid them in diagnosis. Doctors at Beth Israel Deaconess, a teaching hospital affiliated with Harvard Medical School, decided to investigate how chatbots could be used – and abused – for training future doctors.

Instructors like Dr. Rodman hope medical students can turn to GPT-4 and other chatbots for something akin to what doctors call a curbside consult — when they pull a colleague aside and ask for an opinion on a difficult case. The idea is to use a chatbot in the same way that doctors turn to each other for suggestions and insights.

For more than a century, doctors have been portrayed as detectives who gather clues and use them to find the culprit. But experienced doctors actually use a different method – pattern recognition – to figure out what’s wrong. In medicine, it’s called an illness script: signs, symptoms, and test results that doctors put together to tell a coherent story based on similar cases they know of or have seen themselves.

If the illness script doesn’t help, Dr. Rodman said, doctors turn to other strategies, such as assigning probabilities to the various diagnoses that might fit.

Researchers have tried for more than half a century to design computer programs to make medical diagnoses, but nothing has really succeeded.

Doctors say GPT-4 is different. “It will create something that is remarkably similar to an illness script,” Dr. Rodman said. In this way, he added, “it is fundamentally different from a search engine.”

Dr. Rodman and other doctors at Beth Israel Deaconess asked GPT-4 for possible diagnoses in difficult cases. In a study published last month in the medical journal JAMA, they found it fared better than most doctors on weekly diagnostic challenges published in the New England Journal of Medicine.

But, they learned, there is an art to using the program, and there are pitfalls.

Dr. Christopher Smith, the director of the internal medicine residency program at the medical center, said medical students and residents “definitely use it.” But, he added, “whether they learn anything is an open question.”

The worry is that they might rely on AI to make diagnoses just as they would rely on their phones’ calculator to do a math problem. That, Dr. Smith said, is dangerous.

Learning, he said, involves trying to figure things out: “That’s how we retain things. Part of learning is the struggle. If you outsource learning to GPT, that struggle is gone.”

At the meeting, students and residents split into groups and tried to figure out what was wrong with the patient with the swollen knee. They then turned to GPT-4.

The groups tried different approaches.

One group used GPT-4 to do an internet search, similar to the way one would use Google. The chatbot spat out a list of possible diagnoses, including trauma. But when the group members asked it to explain its reasoning, the bot was disappointing, explaining its choice by stating, “Trauma is a common cause of knee injury.”

Another group thought of possible hypotheses and asked GPT-4 to check them. The chatbot’s list lined up with the group’s: infections, including Lyme disease; arthritis, including gout, a type of arthritis that involves crystals in joints; and trauma.

GPT-4 added rheumatoid arthritis to the top possibilities, although it was not high on the group’s list. Gout, teachers later told the group, was unlikely for this patient because she was young and female. And rheumatoid arthritis could probably be ruled out because only one joint was inflamed, and for only a few days.

As a curbside consult, GPT-4 seemed to pass the test or, at least, to agree with the students and residents. But in this exercise, it offered no insights and no illness script.

One reason could be that the students and residents used the bot more like a search engine than like a curbside consult.

To use the bot properly, the teachers said, they should start by telling GPT-4 something like, “You are a doctor seeing a 39-year-old woman with knee pain.” Then they should list her symptoms before asking for a diagnosis and follow up with questions about the bot’s reasoning, as they would with a medical colleague.

That, the teachers said, is a way to exploit the power of GPT-4. But it’s also important to recognize that chatbots can make mistakes and “hallucinate” – provide answers with no basis in fact. Using it requires knowing when it is wrong.

“There’s nothing wrong with using these tools,” said Dr. Byron Crowe, an internal medicine physician at the hospital. “You just have to use them in the right way.”

He gave the group an analogy.

“Pilots use GPS,” Dr. Crowe said. But, he added, airlines “have a very high standard for reliability.” In medicine, he said, using chatbots “is very tempting,” but the same high standards should apply.

“It’s a great thinking partner, but it’s no substitute for deep mental expertise,” he said.

When the session was over, the teachers revealed the real reason for the patient’s swollen knee.

It turned out to be a possibility that each group considered, and that GPT-4 proposed.

She had Lyme disease.

Olivia Allison contributed reporting.
