Balancing diagnostic procedure alternatives can already feel like a delicate tightrope walk for practitioners. The reluctance of many professionals to apply potentially pejorative diagnostic labels in behavioral health, along with the complexity of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), often leads practitioners to limit themselves to a small repertoire of Current Procedural Terminology (CPT) codes. This selective approach, while familiar, may inadvertently narrow the breadth and depth of their diagnoses. Now a new performer has entered this high-wire act: Artificial Intelligence (AI). A model well suited to examining behavioral patterns and suggesting possible diagnostic categories is available through ChatGPT. This tool can craft text that mirrors human reasoning, weaving together the facts it is given to yield a range of possible DSM-5 diagnoses. The spotlight now shines on a pivotal question: at this juncture, does harnessing a ChatGPT diagnosis tip the scales of legality or ethics? Several researchers studying AI implementation in healthcare are calling for essential ethical guidelines to address the potential misuse of ChatGPT diagnosis in behavioral health.
Within the following discussion, I will summarize several recent scholarly articles delving into the application of ChatGPT diagnostics. I will also share a sample behavioral ChatGPT diagnosis obtained from ChatGPT Pro in response to a short behavioral symptom pattern. Then, I invite you, the reader, to share your thoughts about the legalities and ethics of using the response I obtained.
Recent Studies about ChatGPT Diagnosis
A medical study led by Mass General Brigham (2023) examined ChatGPT’s clinical decision-making capabilities, revealing 72% overall accuracy across medical specialties and 77% accuracy in final diagnoses. Published in the Journal of Medical Internet Research, the research assessed ChatGPT’s support across the entire patient care process, from initial evaluation to diagnosis and management. Corresponding author Marc Succi, MD, likened ChatGPT’s accuracy to that of a recent medical school graduate. The study simulated patient scenarios, with ChatGPT suggesting diagnoses based on initial data and making care decisions. The chatbot performed best in delivering final diagnoses (77% accuracy) but struggled with differential diagnoses (60%) and clinical management decisions (68%). These findings suggest potential for AI augmentation in diagnostics, but the authors stressed that further benchmarking and regulatory guidance are essential.
A similar 2023 study examined the capabilities of GPT-4 (Generative Pre-trained Transformer 4) in improving diagnostic accuracy for older patients with delayed diagnoses. The small study, led by researchers at the Division of Geriatrics, Queen Mary Hospital, assessed GPT-4’s performance in suggesting likely diagnoses and differential diagnoses based on patient medical histories. The results indicated that GPT-4 achieved 66.7% accuracy in primary diagnoses and 83.3% when differential diagnoses were included. The tool demonstrated its potential to offer diagnoses that clinicians might have overlooked, while highlighting the need for comprehensive clinical information and human oversight.
Turning to behavioral health, researchers have made significant progress there as well. For instance, a 2022 behavioral study highlighted the limits of any individual clinician’s experience and the need for increased diagnostic precision. Kishimoto and colleagues described the development of a large Japanese speech dataset intended to increase clinician objectivity, using natural language processing to quantify the linguistic features of mental and neurocognitive disorders and thereby support AI-assisted diagnosis (a minimal illustration of this idea follows).
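Kishimoto and colleagues’ full pipeline is far more sophisticated, but to make the idea concrete, here is a minimal, hypothetical Python sketch of quantifying surface-level linguistic features from a speech transcript. The specific features (lexical diversity, sentence length, filler rate) are illustrative assumptions, not drawn from their paper.

```python
# Hypothetical sketch: quantifying surface-level linguistic features of a
# speech transcript. The features below are illustrative assumptions and
# are not drawn from Kishimoto et al.'s actual pipeline.
import re
from collections import Counter

def linguistic_features(transcript: str) -> dict:
    """Return a few simple linguistic markers computed from a transcript."""
    sentences = [s for s in re.split(r"[.!?]+", transcript) if s.strip()]
    tokens = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(tokens)
    n = len(tokens)
    return {
        "token_count": n,
        # Lexical diversity: unique words / total words.
        "type_token_ratio": len(counts) / n if n else 0.0,
        "mean_sentence_length": n / len(sentences) if sentences else 0.0,
        # Rate of common hesitation fillers, a crude fluency proxy.
        "filler_rate": sum(counts[f] for f in ("um", "uh", "er")) / n if n else 0.0,
    }

if __name__ == "__main__":
    sample = "I, um, forgot where I put them. Um, the things. The keys, I mean."
    print(linguistic_features(sample))
```

In a real research setting, features like these would be validated against clinical outcomes before being used to inform any diagnostic tool.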
Lastly, Salih Tutun and colleagues (2023) published a paper examining how such datasets and tools can be developed ethically for clinicians. The Springer publication concludes with this sobering statement:
Due to the lack of adequate tools to help mental health professionals, this study was motivated by the urgent need to develop innovative tools that can help professionals make improved clinical diagnostic decisions. Our paper developed a DSS called Psikometrist that can replace traditional paper-based examinations, decreasing the possibility of missing data and significantly reducing cost and time needed by patients and mental health professionals. The findings show how AI-based tools can be utilized to efficiently detect and diagnose various mental disorders. In addition, the study discussed the ethical challenges faced during AI implementation in healthcare settings and outlined the ethical guidelines that should be integrated.
Giving ChatGPT Diagnosis a Try
Lacking the sophisticated tools described in the brief literature review above, and recognizing the possibility of significant bias in my sampling, I simply asked ChatGPT-4 two sequential questions about a specific behavior pattern, requesting possible DSM diagnoses. While the resulting brainstorming session below is hardly what I’d call a ChatGPT diagnosis, I will leave it to you to decide whether my brainstorming with ChatGPT helps broaden an everyday practitioner’s thinking about possible diagnoses and related areas needing further investigation.
For ethical reasons, any identifiable multicultural or diversity issues present in a real case would need to be considered in the analysis; for the same reasons, protected health information and other identifying data have been omitted here.
Prompt #1:
What are possible DSM diagnoses for someone who gets angry quickly, is easily wounded, drinks alcohol to the point of being unable to show up for work, has poor boundaries, is marginally employed, is in debt, and is unsuccessful in romantic relationships?
The 1st ChatGPT Response
The following response was obtained verbatim. As you read it, consider whether it would be helpful, legal, and ethical for you to enter a behavioral prompt (a question posed to ChatGPT) and receive this type of response:
“It appears that the individual in question may be exhibiting several symptoms that could potentially align with various Diagnostic and Statistical Manual of Mental Disorders (DSM) diagnoses. It’s important to note that making an accurate diagnosis requires a comprehensive assessment by a qualified mental health professional. However, based on the information provided, several potential DSM diagnoses could be considered:
- Intermittent Explosive Disorder (IED). Intermittent Explosive Disorder involves recurrent outbursts of anger that are disproportionate to the situation and often result in aggressive behavior or destruction of property. Individuals with IED struggle to control their anger impulses, and this may negatively impact their relationships and work life.
- Borderline Personality Disorder (BPD). BPD is characterized by emotional dysregulation, unstable relationships, poor self-image, and impulsive behaviors. Individuals with BPD may experience intense anger and have difficulty managing their emotions. Poor boundaries and struggles in romantic relationships align with BPD features.
- Alcohol Use Disorder (AUD). Excessive alcohol consumption, to the point of impairing daily functioning and causing absence from work, could indicate an Alcohol Use Disorder. This diagnosis would be based on criteria such as loss of control over drinking, unsuccessful attempts to cut down, and continued use despite negative consequences.
- Depressive Disorder. The individual’s difficulty in managing anger, poor boundaries, and struggles in relationships could be associated with a Depressive Disorder. Depression can manifest as irritability and impaired social functioning, impacting both work and personal life.
- Generalized Anxiety Disorder (GAD). Poor boundaries, being easily wounded, and difficulties in relationships might also be indicative of Generalized Anxiety Disorder. Chronic worry, heightened sensitivity to criticism, and interpersonal difficulties can be features of GAD.
- Unspecified Personality Disorder. The individual’s challenges in maintaining boundaries, unstable relationships, and difficulties in employment could suggest the presence of a personality disorder that does not neatly fit within one specific category.
It’s crucial to emphasize that these are preliminary considerations and not a substitute for professional evaluation. A comprehensive assessment by a licensed mental health practitioner is essential for accurate diagnosis and appropriate treatment. The provided descriptions should not replace personalized clinical advice, and any diagnostic conclusions should be made by a qualified professional after a thorough evaluation.”
ChatGPT-4
Prompt #2:
Which characteristics would need to be evaluated to make a differential diagnosis between the above possible diagnoses?
The 2nd ChatGPT Response
To differentiate between the potential diagnoses mentioned earlier (Intermittent Explosive Disorder, Borderline Personality Disorder, Alcohol Use Disorder, Depressive Disorder, Generalized Anxiety Disorder, and Unspecified Personality Disorder), a thorough assessment of various characteristics and factors is essential. It’s important to consider a combination of clinical interviews, self-report assessments, collateral information from family or close contacts, and possibly psychological testing.
Here are some key characteristics that need to be evaluated to make a more accurate differential diagnosis:
- Frequency and Intensity of Anger Episodes. For Intermittent Explosive Disorder, the frequency, intensity, and context of anger outbursts need to be evaluated. Determining whether the anger reactions are disproportionate and recurrent is crucial.
- Emotional Instability and Impulsivity. Borderline Personality Disorder is characterized by emotional instability, impulsive behaviors, and unstable relationships. Assessing the pattern of emotional shifts, impulsive actions, and unstable sense of self is important.
- Alcohol Consumption Patterns. For Alcohol Use Disorder, an assessment of the individual’s alcohol consumption patterns, including frequency, quantity, and impact on functioning, is necessary. Evaluation of withdrawal symptoms, unsuccessful attempts to quit, and continued use despite negative consequences is important.
- Presence of Depressive Symptoms. Depressive Disorder involves a range of symptoms beyond irritability, such as low mood, changes in appetite, sleep disturbances, and feelings of worthlessness. A thorough assessment of depressive symptoms is needed to consider this diagnosis.
- Excessive Worry and Anxiety. In Generalized Anxiety Disorder, persistent and excessive worry about a variety of topics is a hallmark. Evaluating the extent of worry, physical symptoms of anxiety, and impact on daily life is crucial.
- Impaired Interpersonal Functioning and Boundaries. Assessing the individual’s ability to establish and maintain healthy boundaries, as well as the quality of their relationships, can help differentiate between various disorders. An examination of interpersonal dynamics is particularly relevant for personality disorders.
- Duration and Chronicity. Evaluating the duration and chronicity of symptoms is important. Some disorders, like personality disorders, involve longstanding patterns of behavior, while others might have more episodic presentations.
- Family and Developmental History. Gathering information about the individual’s developmental history, family dynamics, early life experiences, and any history of trauma or adverse events can provide insights into potential underlying factors.
- Response to Treatment. Observing how the individual responds to interventions, therapy, or medication can help in refining the diagnosis over time. Different disorders may respond differently to specific treatments.
- Rule Out Medical and Substance-Related Factors. It’s important to rule out any medical conditions or substance use that could mimic or exacerbate psychological symptoms. When evaluating an individual with behavioral and emotional concerns, considering potential medical conditions is a crucial aspect of a comprehensive assessment. Given the individual’s symptoms, it’s advisable to conduct a thorough medical evaluation to rule out any underlying medical conditions that could be contributing to the symptoms. Medical conditions such as hormonal imbalances, thyroid issues, neurological disorders, or even medication side effects can impact mood and behavior.
In a comprehensive assessment, both psychological and medical factors should be taken into account. If an individual is experiencing significant psychological distress, it’s recommended to consult both a mental health professional and a medical doctor to ensure a comprehensive understanding of the situation and to consider all potential contributing factors. Remember that an accurate diagnosis often requires a multidimensional approach, and it’s not uncommon for individuals to have overlapping symptoms or co-occurring disorders. A licensed mental health professional with experience in assessment and diagnosis should guide the process, integrating various sources of information to arrive at the most appropriate diagnosis.
ChatGPT-4
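For readers who would like to reproduce this two-prompt exercise programmatically rather than through the ChatGPT web interface, below is a minimal sketch using OpenAI’s Python client. The model name is an assumption, and any output carries the same ethical caveats discussed throughout this article: it is a brainstorming aid, never a standalone diagnosis.

```python
# Illustrative sketch only: sending the same two sequential prompts through
# OpenAI's Python client (openai>=1.0). The model name is an assumption, and
# the output remains a brainstorming aid, never a standalone diagnosis.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

messages = [{
    "role": "user",
    "content": ("What are possible DSM diagnoses for someone who gets angry "
                "quickly, is easily wounded, drinks alcohol to the point of "
                "being unable to show up for work, has poor boundaries, is "
                "marginally employed, is in debt, and is unsuccessful in "
                "romantic relationships?"),
}]
first = client.chat.completions.create(model="gpt-4", messages=messages)
print(first.choices[0].message.content)

# Append the first answer so the follow-up question can refer back to
# "the above possible diagnoses" within the same conversation.
messages.append({"role": "assistant", "content": first.choices[0].message.content})
messages.append({
    "role": "user",
    "content": ("Which characteristics would need to be evaluated to make a "
                "differential diagnosis between the above possible diagnoses?"),
})
second = client.chat.completions.create(model="gpt-4", messages=messages)
print(second.choices[0].message.content)
```

Note that no protected health information appears in these prompts; any programmatic use with real client data would raise the PHI questions flagged in the discussion below.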
Discussion
Even a casual reading of the above experiment shows that the ChatGPT responses may help broaden a clinician’s thinking. From this point, the clinician can evaluate the suggested diagnoses and tailor the list to the particular client’s needs. The integration of AI into healthcare diagnostics, particularly ChatGPT, holds promise for augmenting clinical practice, but it comes with ethical and legal complexities. Such AI tools serve best as complements to, not replacements for, human expertise in diagnostic settings (Mass General Brigham, 2023; Zhang et al., 2022).
There is, therefore, an urgent need for regulatory frameworks to address legal ambiguities, such as liability when clinicians rely exclusively on ChatGPT diagnoses rather than using them as a brainstorming tool for considering alternatives. This article does not address the sharing of protected health information (PHI), a topic that is most certainly worthy of further examination.
The ethical dimensions of multiculturalism, diversity, and equity are critical in mental health diagnostics. As Zhuo et al. (2023) noted, language models like ChatGPT can “exhibit social prejudice and toxicity,” underscoring clinicians’ need to apply rigorous oversight. Sole reliance on AI risks exacerbating societal inequalities and introducing toxicity into clinical settings.
Caution & Moral Imperative
In no instance should output from currently available ChatGPT models be used verbatim, that is, “as is” from ChatGPT, as the sole source for rendering diagnoses. AI’s role in healthcare should be limited to augmenting clinical diagnostic procedures, particularly for complex diagnostics that demand a multi-faceted approach combining human expertise and machine learning algorithms. If in doubt about the appropriateness of using ChatGPT diagnosis in your clinical care or research, you are encouraged to seek the direction of your state or national associations and the legal advice of a qualified attorney. A Telehealth.org CME and CE course is also available for guidance.
Ensuring responsible use of ChatGPT diagnosis requires ongoing caution and scrutiny to mitigate its inherent biases and limitations. Ethical deployment in healthcare must remain a priority, with human judgment and robust regulatory guidelines at its core.
That’s my opinion. What’s yours?
References
- Mass General Brigham. (2023). Study on ChatGPT’s clinical decision-making capabilities. Journal of Medical Internet Research.
- Division of Geriatrics, Queen Mary Hospital. (2023). Study on GPT-4 in geriatric care.
- Zhang et al. (2022). Mental health diagnosis using AI. Frontiers.
- American Psychological Association. (2020). Guidelines for the practice of telepsychology.
- Zhuo, T. Y., Huang, Y., Chen, C., & Xing, Z. (2023). Red teaming ChatGPT via jailbreaking: Bias, robustness, reliability and toxicity. arXiv:2301.12867.