AI Language to predict diagnosis codes from Clinical Notes Free Text

rgiljohann, Mar 1 2024 (edited Mar 1 2024)

I have a data set containing four fields: FREE_TEXT (CLOB), SEQUENCE (NUMBER), ICD_CODE (VARCHAR2), and ICD_DESCRIPTION (VARCHAR2).

I am wondering how I can use Language AI to predict the ICD codes for any new FREE_TEXT input. Here is how the process works: when you go to the doctor, the doctor or nurse types notes into the system. Later, a coder reads those notes and assigns a diagnosis code (ICD_CODE) to each diagnosis. For example, the doctor or nurse might write "patient has diabetes", and the coder will later enter E11.9, which is the ICD_CODE for diabetes.

I am wondering how I could create a model to predict each diagnosis code based on the FREE_TEXT CLOB (32,000 characters of text).

It would be very beneficial if the model could also return a confidence index or probability score. Also, since we have the ICD_DESCRIPTION for each ICD_CODE (for example, ICD_CODE = E11.9, ICD_DESCRIPTION = 'Diabetes'), is it possible to have the model return the snippet of text in the FREE_TEXT column that led to that specific prediction? That would let someone double-check a given prediction for accuracy.
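
To make the requested output concrete, here is a minimal Python sketch of the shape such a result could take: one record per predicted code, with a score and the supporting snippet. It is purely illustrative and does not use Language AI or any Oracle API. The `Prediction` class, the `predict` function, the `ICD_DESCRIPTIONS` lookup, and the `window` parameter are all hypothetical names, and the matching is a naive search for the ICD_DESCRIPTION text rather than a trained model, so the confidence value is only a placeholder.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Prediction:
    icd_code: str         # e.g. "E11.9"
    icd_description: str  # e.g. "Diabetes"
    confidence: float     # placeholder; a trained model would supply a probability
    snippet: str          # the part of FREE_TEXT that supports the prediction


# Hypothetical lookup built from the ICD_CODE / ICD_DESCRIPTION columns.
ICD_DESCRIPTIONS = {"E11.9": "Diabetes"}


def predict(free_text: str,
            descriptions: dict[str, str] = ICD_DESCRIPTIONS,
            window: int = 20) -> list[Prediction]:
    """Naive stand-in for a model: flag a code when its description
    appears in the note, and return the surrounding text as evidence."""
    predictions = []
    for code, description in descriptions.items():
        match = re.search(re.escape(description), free_text, flags=re.IGNORECASE)
        if match is None:
            continue
        # Keep `window` characters of context on each side of the match.
        start = max(0, match.start() - window)
        end = min(len(free_text), match.end() + window)
        predictions.append(
            Prediction(code, description, 1.0, free_text[start:end].strip())
        )
    return predictions


if __name__ == "__main__":
    for p in predict("Patient has diabetes."):
        print(p.icd_code, p.confidence, repr(p.snippet))
```

A real solution would replace the body of `predict` with a call to a trained multi-label text classifier; the point here is only the record layout (code, score, snippet) that a reviewer could check against the original note.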

Is Language AI the best route to take? For reference, my data is in Autonomous Database.
