“英特尔创新大师杯”深度学习挑战赛 赛道3:CCKS2021中文NLP地址相关性任务

Overview

ccks2021-track3

CCKS2021中文NLP地址相关性任务-赛道三-冠军方案

团队:我的加菲鱼- wodejiafeiyu

初赛第二/复赛第一/决赛第一

前言

19年开始,陆陆续续参加了一些比赛,拿到过一些top,比较懒一直都没分享过,这次比较幸运又拿了top1,打算分享下

分类的任务用这个框架基本都能top10,之前个人参加的《全球人工智能大赛-赛道一》和这个也差不多,只是有些具体任务的trick不同

比赛主页

https://tianchi.aliyun.com/competition/entrance/531901/introduction

环境

  • torch==1.6.0
  • transformers=3.0.2

预训练模型

- nezha-base 
- nezha-wwm
- macbert
  • 下载预训练模型,放到文件夹user_data/model_param/pretrain_model_param/下,文件夹和模型名字一一对应

全流程脚本

sh run.sh

核心思路

  • 预训练mlm中的mask策略使用ngram-mask,相比原始的动态mask,提升了预训练难度
  • 标签也包含语义信息,预训练部分融入标签信息,提升预训练效果
  • 混合精度预训练,损失一部分精度,提升整体的训练速度,实际测试结果精度损失不大,速度提升明显
  • 对抗训练,已经是一个比较常用的trick了
  • 后12层加上embedding的cls动态加权平均
  • multi-sample dropout
  • 推理的时候,阈值搜索
  • 三折交叉验证,每折使用不同的随机数种子使用dynamic pad
  • 加权平均,模型融合

总结

模块 提分点分析 提升
ngram-mask 相比于单个字的遮蔽,ngram-mask加大了预训练任务的难度,从而提升效果 2.6个千分点
融入标签信息预训练 标签也含有语义信息,模型学习的更多 1个千分点
后12层加上embedding的cls动态加权平均+multi-sample dropout 加权平均能增强向量的语义表征能力从而提升效果。multi-sample dropout能加速训练,增强泛化能力 3.3个千分点
对抗训练fgm 生成对抗样本,对抗样本的训练,增加模型泛化能力 2.2个千分点
多分类阈值搜索 缩放系数使得f1最优 0.7个千分点
模型融合 提升模型的鲁棒性和泛化能力 1.5个千分点
动态pad和预训练混合精度训练和每折使用不同的随机数种子 提升模型训练速度,不同的随机数种子加大了模型的差异,提升融合的效果
Owner
shaochenjie
shaochenjie
Official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering" (https://arxiv.org/abs/2103.06332).

Hurdles to Progress in Long-form Question Answering This repository contains the official scripts and datasets accompanying our NAACL 2021 paper, "Hur

Kalpesh Krishna 41 Nov 08, 2022
Machine learning library for fast and efficient Gaussian mixture models

This repository contains code which implements the Stochastic Gaussian Mixture Model (S-GMM) for event-based datasets Dependencies CMake Premake4 Blaz

Omar Oubari 1 Dec 19, 2022
TensorFlow for Raspberry Pi

TensorFlow on Raspberry Pi It's officially supported! As of TensorFlow 1.9, Python wheels for TensorFlow are being officially supported. As such, this

Sam Abrahams 2.2k Dec 16, 2022
The Unsupervised Reinforcement Learning Benchmark (URLB)

The Unsupervised Reinforcement Learning Benchmark (URLB) URLB provides a set of leading algorithms for unsupervised reinforcement learning where agent

259 Dec 26, 2022
Official implementation of "A Unified Objective for Novel Class Discovery", ICCV2021 (Oral)

A Unified Objective for Novel Class Discovery This is the official repository for the paper: A Unified Objective for Novel Class Discovery Enrico Fini

Enrico Fini 118 Dec 26, 2022
Real-time pose estimation accelerated with NVIDIA TensorRT

trt_pose Want to detect hand poses? Check out the new trt_pose_hand project for real-time hand pose and gesture recognition! trt_pose is aimed at enab

NVIDIA AI IOT 803 Jan 06, 2023
A Genetic Programming platform for Python with TensorFlow for wicked-fast CPU and GPU support.

Karoo GP Karoo GP is an evolutionary algorithm, a genetic programming application suite written in Python which supports both symbolic regression and

Kai Staats 149 Jan 09, 2023
U-2-Net: U Square Net - Modified for paired image training of style transfer

U2-Net: U Square Net Modified for paired image training of style transfer This is an unofficial repo making use of the code which was made available b

Doron Adler 43 Oct 03, 2022
SkipGNN: Predicting Molecular Interactions with Skip-Graph Networks (Scientific Reports)

SkipGNN: Predicting Molecular Interactions with Skip-Graph Networks Molecular interaction networks are powerful resources for the discovery. While dee

Kexin Huang 49 Oct 15, 2022
Convert Table data to approximate values with GUI

Table_Editor Convert Table data to approximate values with GUIs... usage - Import methods for extension Tables. Imported method supposed to have only

CLJ 1 Jan 10, 2022
xitorch: differentiable scientific computing library

xitorch is a PyTorch-based library of differentiable functions and functionals that can be widely used in scientific computing applications as well as deep learning.

24 Apr 15, 2021
Official Code for VideoLT: Large-scale Long-tailed Video Recognition (ICCV 2021)

Pytorch Code for VideoLT [Website][Paper] Updates [10/29/2021] Features uploaded to Google Drive, for access please send us an e-mail: zhangxing18 at

Skye 26 Sep 18, 2022
[ICCV-2021] An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation

An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation (ICCV 2021) Introduction This is an official pytorch implemen

rongchangxie 42 Jan 04, 2023
Python implementation of "Multi-Instance Pose Networks: Rethinking Top-Down Pose Estimation"

MIPNet: Multi-Instance Pose Networks This repository is the official pytorch python implementation of "Multi-Instance Pose Networks: Rethinking Top-Do

Rawal Khirodkar 57 Dec 12, 2022
deep_image_prior_extension

Code for "Is Deep Image Prior in Need of a Good Education?" Project page: https://jleuschn.github.io/docs.educated_deep_image_prior/. Supplementary Ma

riccardo barbano 7 Jan 09, 2022
Code for "Typilus: Neural Type Hints" PLDI 2020

Typilus A deep learning algorithm for predicting types in Python. Please find a preprint here. This repository contains its implementation (src/) and

47 Nov 08, 2022
The open source code of SA-UNet: Spatial Attention U-Net for Retinal Vessel Segmentation.

SA-UNet: Spatial Attention U-Net for Retinal Vessel Segmentation(ICPR 2020) Overview This code is for the paper: Spatial Attention U-Net for Retinal V

Changlu Guo 151 Dec 28, 2022
Language models are open knowledge graphs ( non official implementation )

language-models-are-knowledge-graphs-pytorch Language models are open knowledge graphs ( work in progress ) A non official reimplementation of Languag

theblackcat102 132 Dec 18, 2022
NAS-Bench-x11 and the Power of Learning Curves

NAS-Bench-x11 NAS-Bench-x11 and the Power of Learning Curves Shen Yan, Colin White, Yash Savani, Frank Hutter. NeurIPS 2021. Surrogate NAS benchmarks

AutoML-Freiburg-Hannover 13 Nov 18, 2022
DrNAS: Dirichlet Neural Architecture Search

This paper proposes a novel differentiable architecture search method by formulating it into a distribution learning problem. We treat the continuously relaxed architecture mixing weight as random va

Xiangning Chen 37 Jan 03, 2023