Some of the most common machine learning pipelines involve manipulating tabular data. The current state-of-the-art solution for tabular modeling is TabTransformer, introduced by Amazon in 2020. It incorporates a Transformer block to capture relationships between categorical features and uses a standard multilayer perceptron to output its final logits. We propose modifications that outperform it on binary classification tasks across three benchmark datasets, with AUROC gains of more than 1%. We process categorical embeddings with an attention mechanism and then concatenate them with the continuous values before feeding them through multiple layers of gated MLP, a neural network architecture originally introduced for language tasks. We also evaluate the importance of specific hyperparameters during training.
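As a rough illustration of the pipeline described above, the following is a minimal PyTorch sketch, not the authors' implementation: the module names (`AttentiveTabularNet`, `GatedMLPBlock`), the dimensions, and the particular gating variant are illustrative assumptions.

```python
import torch
import torch.nn as nn


class GatedMLPBlock(nn.Module):
    """Simplified gated-MLP block (assumed variant): the expanded
    representation is split in two and one half gates the other."""
    def __init__(self, dim, hidden):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.expand = nn.Linear(dim, 2 * hidden)
        self.out = nn.Linear(hidden, dim)

    def forward(self, x):
        u, v = self.expand(self.norm(x)).chunk(2, dim=-1)
        return x + self.out(u * torch.sigmoid(v))  # residual, gated


class AttentiveTabularNet(nn.Module):
    """Attention over categorical embeddings, concatenation with
    continuous features, then a stack of gated-MLP blocks."""
    def __init__(self, cardinalities, n_continuous, d_embed=32,
                 n_heads=4, n_blocks=4, hidden=128):
        super().__init__()
        # one embedding table per categorical column
        self.embeds = nn.ModuleList(
            nn.Embedding(c, d_embed) for c in cardinalities)
        # attention over the categorical embeddings (one layer shown)
        self.attn = nn.TransformerEncoderLayer(
            d_model=d_embed, nhead=n_heads, batch_first=True)
        dim = len(cardinalities) * d_embed + n_continuous
        self.blocks = nn.Sequential(
            *[GatedMLPBlock(dim, hidden) for _ in range(n_blocks)])
        self.head = nn.Linear(dim, 1)  # binary-classification logit

    def forward(self, x_cat, x_cont):
        # (batch, n_cat, d_embed): treat per-column embeddings as tokens
        tokens = torch.stack(
            [emb(x_cat[:, i]) for i, emb in enumerate(self.embeds)], dim=1)
        tokens = self.attn(tokens)
        # flatten attended embeddings and append continuous features
        x = torch.cat([tokens.flatten(1), x_cont], dim=-1)
        return self.head(self.blocks(x)).squeeze(-1)


# toy usage: 3 categorical columns, 5 continuous features, batch of 8
model = AttentiveTabularNet(cardinalities=[12, 7, 4], n_continuous=5)
logits = model(torch.randint(0, 4, (8, 3)), torch.randn(8, 5))
```

Note the design choice reflected here: continuous features bypass the attention stage and only enter at the gated-MLP stack, matching the described ordering of attention over categorical embeddings followed by concatenation.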
Radostin Lozanov Cholakov (16 years)

| Stand   | 20           |
| Project | Computing-01 |
| Country | Bulgaria     |