Retrieval & Interaction Machine for Tabular Data Prediction
Jiarui Qin 秦佳锐 ¹, Weinan Zhang 张伟楠 ¹, Rong Su ², Zhirong Liu ², Weiwen Liu ², Ruiming Tang 唐睿明 ², Xiuqiang He 何秀强 ², Yong Yu 俞勇 ¹
Prediction over tabular data is an essential task in many data science applications, such as recommender systems, online advertising, and medical treatment. Tabular data is structured into rows and columns, with each row a data sample and each column a feature attribute. Both the columns and the rows of a table carry useful patterns that could improve prediction performance. However, most existing models focus on cross-column patterns and overlook cross-row patterns, because they process each sample independently.
In this work, we propose a general learning framework named Retrieval & Interaction Machine (RIM) that exploits both cross-row and cross-column patterns in tabular data. Specifically, RIM first leverages search engine techniques to efficiently retrieve rows of the table that are useful for predicting the label of the target row. It then uses feature interaction networks to capture the cross-column patterns among the target row and the retrieved rows, and makes the final label prediction from them.
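The retrieve-then-interact flow described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say which ranking function is used, so the sketch assumes an IDF-weighted count of shared feature values as a stand-in for the search engine step. It also replaces the feature interaction network with a hand-written placeholder that builds a few cross-row features. All names here (`build_index`, `retrieve`, `aggregate_features`) are hypothetical.

```python
import math
from collections import Counter


def build_index(rows):
    """Inverted index mapping (column, value) to the ids of rows containing it."""
    index = {}
    for rid, row in enumerate(rows):
        for col, val in row.items():
            index.setdefault((col, val), set()).add(rid)
    return index


def retrieve(target, rows, index, k=2):
    """Return ids of the k stored rows sharing the most (rare) feature values
    with the target row. Assumption: IDF-weighted overlap as the ranking score."""
    n = len(rows)
    scores = Counter()
    for col, val in target.items():
        matches = index.get((col, val), set())
        if not matches:
            continue
        idf = math.log(1 + n / len(matches))  # rarer values count more
        for rid in matches:
            scores[rid] += idf
    # Sort by descending score, breaking ties by row id for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [rid for rid, _ in ranked[:k]]


def aggregate_features(target, rows, labels, retrieved):
    """Placeholder for the interaction step: summarize the retrieved rows
    relative to the target. A real model would learn these interactions."""
    feats = {}
    if not retrieved:
        return feats
    # Cross-row signal: average label of the retrieved rows.
    feats["retrieved_label_mean"] = sum(labels[r] for r in retrieved) / len(retrieved)
    # Cross-column signal: per-column agreement between target and retrieved rows.
    for col, val in target.items():
        hits = sum(1 for r in retrieved if rows[r].get(col) == val)
        feats["match_" + col] = hits / len(retrieved)
    return feats


# Toy table: each row is a labeled historical sample.
rows = [
    {"user": "u1", "item": "i1", "city": "NY"},
    {"user": "u1", "item": "i2", "city": "NY"},
    {"user": "u2", "item": "i1", "city": "LA"},
    {"user": "u3", "item": "i3", "city": "LA"},
]
labels = [1, 0, 1, 0]
target = {"user": "u1", "item": "i1", "city": "LA"}

index = build_index(rows)
neighbors = retrieve(target, rows, index, k=2)
features = aggregate_features(target, rows, labels, neighbors)
print(neighbors, features)
```

The resulting feature dictionary would be passed, together with the target row's own features, to a downstream predictor. In this toy setup the target row is not part of the table. When the target is drawn from the same table, it would need to be excluded from the retrieval pool so that its own label does not leak into the features.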
We conduct extensive experiments on 11 datasets covering three important tasks: CTR prediction (classification), top-n recommendation (ranking), and rating prediction (regression). Experimental results show that RIM achieves significant improvements over state-of-the-art models and various baselines, demonstrating its superiority and efficacy.