The emergence of deep learning algorithms in natural language processing has boosted the development of intelligent medical information systems. Firstly, this dissertation explores effective text encoding for clinical text. We propose a dilated convolutional attention network that uses dilated convolutions to capture complex medical patterns in long clinical notes, exponentially increasing the receptive field with the dilation size. Furthermore, we propose embedding injection and gated information propagation in the medical note encoding module for better representation learning of lengthy clinical text. To capture the interaction between notes and codes, we explicitly model their underlying dependency and utilize textual descriptions of medical codes as external knowledge. We also adopt contextualized graph embeddings to learn contextual information and causal relationships between text mentions, such as drugs taken and adverse reactions.
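To make the receptive-field claim concrete, here is a minimal NumPy-free sketch of how stacked dilated convolutions grow coverage. The function name and the doubling dilation schedule are illustrative assumptions, not the dissertation's actual configuration:

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in tokens) of stacked 1-D dilated convolutions.

    Each layer with dilation d adds (kernel_size - 1) * d tokens of
    context on top of the single token seen by the input position.
    """
    return 1 + (kernel_size - 1) * sum(dilations)

# Doubling the dilation at each layer makes the receptive field grow
# exponentially with depth while the parameter count grows only linearly.
dilations = [1, 2, 4, 8]            # hypothetical 4-layer schedule
rf = receptive_field(3, dilations)  # kernel size 3
print(rf)  # 31 tokens covered by a single top-layer feature
```

With four more layers (dilations up to 128), the same formula gives a receptive field of 511 tokens, which is why dilation suits long clinical notes better than stacking ordinary convolutions.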

We also conduct an empirical analysis of the effectiveness of transfer learning via language model pretraining for clinical text encoding and medical code prediction. We develop a hierarchical encoding model that equips pretrained language models with the capacity to encode long clinical notes. We further study the effect of pretraining in different domains and with different strategies. A comprehensive quantitative analysis shows that hierarchical encoding can, to some extent, capture interactions between distant words.
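The hierarchical idea can be sketched in a few lines: split the long note into segments a pretrained model can handle, encode each segment, then pool the segment vectors into a document representation. This is a minimal NumPy sketch with a stand-in encoder (mean of token embeddings); all names and the pooling choice are illustrative assumptions:

```python
import numpy as np

def chunk(tokens, size):
    """Split a long note into fixed-size segments."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]

def encode_segment(segment, emb):
    """Stand-in for a pretrained LM: mean of token embeddings."""
    return np.mean([emb[t] for t in segment], axis=0)

def hierarchical_encode(tokens, emb, size=4):
    """Encode each segment, then pool segment vectors into one
    document vector (a second, higher-level encoder in practice)."""
    seg_vecs = np.stack([encode_segment(s, emb) for s in chunk(tokens, size)])
    return seg_vecs.mean(axis=0)

rng = np.random.default_rng(0)
emb = {t: rng.normal(size=8) for t in range(100)}  # toy vocabulary
note = list(range(10))                             # a "long" note: 3 segments
doc_vec = hierarchical_encode(note, emb)
print(doc_vec.shape)  # (8,)
```

In the real model the per-segment encoder is the pretrained language model and the pooling step is itself learned, which is what allows distant segments to interact.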

Then, this dissertation investigates the multitask learning paradigm and its applications to healthcare. Multitask learning, motivated by how humans draw on previous tasks when learning a new one, makes full use of the information contained in each task and shares information between related tasks through common parameters. We adopt multitask learning for medical code prediction and demonstrate the benefits of leveraging multiple coding schemes. We design a recalibrated aggregation module that generates clinical document features with better quality and less noise in the shared modules of multitask networks.
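The parameter-sharing structure described above can be sketched as a shared encoder feeding one classifier head per coding scheme. This is a minimal NumPy sketch; the weight shapes, label counts, and scheme names (ICD-9, CPT) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared parameters: one encoder reused by every coding scheme,
# so gradients from all tasks shape the same representation.
W_shared = rng.normal(size=(16, 8))

# Task-specific heads for two hypothetical coding schemes.
heads = {"icd9": rng.normal(size=(8, 50)),
         "cpt":  rng.normal(size=(8, 20))}

def predict(x, task):
    """Shared representation, then a task-specific classifier."""
    h = np.tanh(x @ W_shared)   # representation shared across tasks
    return h @ heads[task]      # head for the chosen coding scheme

x = rng.normal(size=16)         # one document feature vector
print(predict(x, "icd9").shape) # (50,)
print(predict(x, "cpt").shape)  # (20,)
```

In the dissertation's setting, the recalibrated aggregation module sits in the shared part of such a network, cleaning up the document features before the task heads consume them.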

Finally, we consider the task context to improve multitask learning for healthcare. We propose to use a domain-adaptive pretrained model and hypernetwork-guided multitask heads to learn shared representation modules and task-specific predictors. Specifically, the domain-adaptive pretrained model is pretrained directly in the target domain of clinical applications. Task embeddings, serving as task context, are used to generate task-specific parameters with hypernetworks. Experiments show that the proposed hypernetwork-guided multitask learning method achieves better predictive performance, and that semantic task information can improve the generalizability of the task-conditioned multitask model.
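The hypernetwork mechanism can be illustrated in a few lines: instead of storing a separate head per task, a small network maps each task embedding to the head's weights. This is a minimal NumPy sketch; the dimensions, the linear hypernetwork, and the task names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
d_task, d_hid, n_labels = 4, 8, 5

# Hypernetwork parameters: a single map from task embedding to head
# weights, shared by all tasks (here a plain linear map for brevity).
H = rng.normal(size=(d_task, d_hid * n_labels))

# Task embeddings act as the "task context" conditioning the heads.
task_emb = {"mortality": rng.normal(size=d_task),
            "length_of_stay": rng.normal(size=d_task)}

def task_head(task):
    """Generate task-specific classifier weights from the task embedding."""
    return (task_emb[task] @ H).reshape(d_hid, n_labels)

def predict(h, task):
    # h: shared representation from the (domain-adaptive) encoder
    return h @ task_head(task)

h = rng.normal(size=d_hid)
print(predict(h, "mortality").shape)  # (5,)
```

Because similar task embeddings yield similar generated weights, semantically meaningful task context gives related tasks related predictors, which is one intuition for the generalizability result reported above.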
