598 Reproducibility Project

Citation of the original paper:

Oleynik M, Kugic A, Kasáč Z, Kreuzthaler M. Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification. J Am Med Inform Assoc. 2019 Nov 1;26(11):1247-1254. doi: 10.1093/jamia/ocz149. PMID: 31512729; PMCID: PMC6798565.

Original paper's repo:

https://github.com/bst-mug/n2c2

Code Dependencies

  • JDK8+
  • python3 (to run official evaluation scripts)
  • make (to compile fastText)
  • gcc/clang (to compile fastText)

Steps for using this code repo:

    1. Obtain the original input dataset. One way is to submit a data access request at https://portal.dbmi.hms.harvard.edu/projects/n2c2-nlp/.
    2. Once you have the input dataset, place the data under the folders /data/train and /data/test separately: 70% of the data is for training and 30% for testing.
    3. Run the program SentenceDumper.java to generate the sentences.txt file.
    4. Run the program VocabularyDumper.java to generate the vocab.txt file.
    5. Install the fastText program on your working machine. You can follow this guide: https://fasttext.cc/docs/en/supervised-tutorial.html
    6. Copy the sentences.txt file into the scripts folder, then run the script train_embeddings.sh, which generates the n2c2-fasttext model.
    7. Copy the vocab.txt file into the scripts folder. Download BioWordVec_PubMed_MIMICIII_d200.bin from https://github.com/ncbi-nlp/BioSentVec. Run the script print_pre_trained_vectors.sh to generate the pre-trained embeddings and print_self_trained_vectors.sh to generate the self-trained embeddings. Then copy both embedding files into the same folder as the class files compiled by Java (see the vector-reader sketch after this list).
    8. To start the main program, navigate to ClassifierRunner.java and run its main method. In my case, I disabled the two classifiers that depend on the pre-trained embeddings, because my laptop has only 8 GB of memory and could not load BioWordVec_PubMed_MIMICIII_d200.bin to generate them.
    9. The program first loads all the classifiers, then parses the training and test datasets into a list of patient objects. For each classifier it applies two validators: the official n2c2 metrics and a basic accuracy/false-positive/false-negative metric. Finally, it writes the results under the stats folder as files with the suffixes -basic.csv and -official.csv (see the runner sketch after this list).
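The embedding files produced in step 7 follow fastText's plain-text vector output: one line per word, the token first, then its vector components, all space-separated. Below is a minimal reader sketch for that format; the class name and file name are hypothetical stand-ins, not the repo's actual code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal reader for fastText's text vector format: one line per word,
// token first, then the vector components, all space-separated.
// The file name below is a placeholder for the embeddings from step 7.
public class VectorReader {

    public static Map<String, float[]> read(String path) throws IOException {
        Map<String, float[]> vectors = new HashMap<>();
        List<String> lines = Files.readAllLines(Paths.get(path));
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length < 2) continue; // skip blank or malformed lines
            float[] vec = new float[parts.length - 1];
            for (int i = 1; i < parts.length; i++) {
                vec[i - 1] = Float.parseFloat(parts[i]);
            }
            vectors.put(parts[0], vec);
        }
        return vectors;
    }

    public static void main(String[] args) throws IOException {
        Map<String, float[]> emb = read("self_trained_vectors.vec");
        System.out.println("loaded " + emb.size() + " word vectors");
    }
}
```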
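For orientation, here is a sketch (Java 8) of the train/evaluate/report loop described in step 9. Classifier, the file layout, and the metric helpers are hypothetical stand-ins; the repo's real ClassifierRunner is the authoritative entry point.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of ClassifierRunner's flow: load classifiers, train,
// validate on the test split, write per-classifier stats. Patient records
// are reduced to plain strings here; the real repo uses richer classes.
public class RunnerSketch {

    interface Classifier {
        String getName();
        void train(List<String> trainRecords);
        String predict(String record); // e.g. "met" / "not met"
    }

    public static void main(String[] args) throws IOException {
        // Parse the train/test splits placed under /data (step 2).
        List<String> train = Files.readAllLines(Paths.get("data/train/records.txt"));
        List<String> test  = Files.readAllLines(Paths.get("data/test/records.txt"));
        List<String> gold  = Files.readAllLines(Paths.get("data/test/labels.txt"));
        Files.createDirectories(Paths.get("stats"));

        for (Classifier clf : loadClassifiers()) {
            clf.train(train);

            // Basic validator: accuracy plus false-positive/false-negative counts.
            int fp = 0, fn = 0, correct = 0;
            for (int i = 0; i < test.size(); i++) {
                String pred = clf.predict(test.get(i));
                if (pred.equals(gold.get(i))) correct++;
                else if (pred.equals("met")) fp++;  // predicted met, truth not met
                else fn++;                          // predicted not met, truth met
            }
            double accuracy = (double) correct / test.size();

            // Write per-classifier results under stats/ with the step-9 suffix.
            try (PrintWriter out = new PrintWriter(Files.newBufferedWriter(
                    Paths.get("stats", clf.getName() + "-basic.csv")))) {
                out.printf("accuracy,fp,fn%n%.4f,%d,%d%n", accuracy, fp, fn);
            }
        }
    }

    static List<Classifier> loadClassifiers() {
        // The real ClassifierRunner registers Baseline, RBC, SVM, LR and LSTM here.
        return Collections.emptyList();
    }
}
```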

Tables of results:

I uploaded all outputs under the original_output folder, including results for the Baseline, RBC, SVM, LR, and LSTM models.

Overall F1 score per criterion on the test set, compared with the baseline, a majority classifier:

| Criterion       | Baseline | RBC    | SVM    | SELF-LR | SELF-LSTM |
|-----------------|----------|--------|--------|---------|-----------|
| Abdominal       | 0.3944   | 0.872  | 0.6028 | 0.5959  | 0.5411    |
| Advanced-cad    | 0.3435   | 0.7902 | 0.7281 | 0.7133  | 0.4538    |
| Alcohol-abuse   | 0.4911   | 0.4881 | 0.4911 | 0.4911  | 0.4911    |
| Asp-for-mi      | 0.4416   | 0.7095 | 0.6063 | 0.5962  | 0.4305    |
| Creatinine      | 0.4189   | 0.8071 | 0.6532 | 0.7073  | 0.5855    |
| Dietsupp-2mos   | 0.3385   | 0.9185 | 0.5814 | 0.6038  | 0.4267    |
| Drug-abuse      | 0.4911   | 0.691  | 0.4911 | 0.4911  | 0.6546    |
| English         | 0.4591   | 0.8644 | 0.4591 | 0.4591  | 0.4557    |
| Hba1c           | 0.3723   | 0.9382 | 0.6267 | 0.5393  | 0.5216    |
| Keto-1yr        | 0.5      | 0.5    | 0.5    | 0.5     | 0.5       |
| Major-diabetes  | 0.3333   | 0.8369 | 0.7555 | 0.7643  | 0.4407    |
| Makes-decisions | 0.4911   | 0.4911 | 0.4911 | 0.4911  | 0.4911    |
| Mi-6mos         | 0.4756   | 0.8752 | 0.6815 | 0.4756  | 0.4691    |
| Overall (micro) | 0.7608   | 0.91   | 0.8035 | 0.8031  | 0.73      |
| Overall (macro) | 0.427    | 0.7525 | 0.5899 | 0.5714  | 0.497     |
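The two Overall rows aggregate the 13 criteria differently. The standard definitions are shown below; the official n2c2 evaluation script remains the authoritative computation.

```latex
% Macro-averaged F1: unweighted mean of the per-criterion scores,
% so every criterion counts equally regardless of frequency.
\mathrm{F1}_{\mathrm{macro}} = \frac{1}{|C|} \sum_{c \in C} \mathrm{F1}_c

% Micro-averaged F1: pool TP/FP/FN counts across criteria first,
% so frequently occurring criteria carry more weight.
\mathrm{F1}_{\mathrm{micro}} =
  \frac{2 \sum_{c \in C} \mathrm{TP}_c}
       {2 \sum_{c \in C} \mathrm{TP}_c + \sum_{c \in C} \mathrm{FP}_c + \sum_{c \in C} \mathrm{FN}_c}
```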

Overall accuracy per criterion on the test set, compared with the baseline, a majority classifier:

| Criterion       | Baseline | RBC       | SVM      | SELF-LR   | SELF-LSTM |
|-----------------|----------|-----------|----------|-----------|-----------|
| Abdominal       | 0.651162 | 0.883720  | 0.651162 | 0.662790  | 0.569767  |
| Advanced-cad    | 0.523255 | 0.790697  | 0.732558 | 0.720930  | 0.616279  |
| Alcohol-abuse   | 0.523255 | 0.953488  | 0.965116 | 0.965116  | 0.965116  |
| Asp-for-mi      | 0.790697 | 0.860465  | 0.755813 | 0.767441  | 0.767441  |
| Creatinine      | 0.720930 | 0.837209  | 0.720930 | 0.755813  | 0.709302  |
| Dietsupp-2mos   | 0.511627 | 0.918604  | 0.581395 | 0.604651  | 0.441860  |
| Drug-abuse      | 0.965116 | 0.965116  | 0.965116 | 0.965116  | 0.9302325 |
| English         | 0.848837 | 0.941860  | 0.848837 | 0.848837  | 0.837209  |
| Hba1c           | 0.593023 | 0.9418604 | 0.651162 | 0.58139   | 0.511627  |
| Keto-1yr        | 1.0      | 1.0       | 1.0      | 1.0       | 1.0       |
| Major-diabetes  | 0.965116 | 0.837209  | 0.755813 | 0.7674418 | 0.523255  |
| Makes-decisions | 0.906976 | 0.965116  | 0.965116 | 0.965116  | 0.965116  |
| Mi-6mos         | 0.906976 | 0.965116  | 0.930232 | 0.767441  | 0.965116  |
| Overall (micro) | 0.764758 | 0.912343  | 0.809481 | 0.808586  | 0.7495527 |
| Overall (macro) | 0.764758 | 0.91234   | 0.809481 | 0.808586  | -         |