A curated list of classic artificial intelligence papers

minseok0809/classic-ai-paper

Classic AI Paper


Binary arithmetic
Godefroy-Guillaume Leibnitz. Explication de l’arithmétique binaire, qui se sert des seuls caractères O et I avec des remarques sur son utilité et sur ce qu’elle donne le sens des anciennes figures chinoises de Fohy (1703)


Boolean Algebra
George Boole. The Laws of Thought (1854)


Entropy
Ludwig Boltzmann. On the Relationship between the Second Fundamental Theorem of the Mechanical Theory of Heat and Probability Calculations Regarding the Conditions for Thermal Equilibrium (1877)
Claude E. Shannon. A Mathematical Theory of Communication (1948)


Anomaly Detection
K. Pearson. On lines and planes of closest fit to systems of points in space (Philosophical Magazine 1901)


Logic Gate
Claude E. Shannon. A Symbolic Analysis of Relay and Switching Circuits (1937)


McCulloch & Pitts Model
Warren McCulloch and Walter Pitts. A Logical Calculus of the Ideas Immanent in Nervous Activity (1943)


Gradient Descent (GD)
C. Lemarechal. Cauchy and the Gradient Method. Doc Math Extra, pp. 251-254. (2012)


Von Neumann Architecture
J. von Neumann. First Draft of a Report on the EDVAC (1945)


Turing Machine
A. M. Turing. Intelligent Machinery (1948)


Turing Test
A. M. Turing. Computing Machinery and Intelligence (1950)


Artificial Intelligence
John McCarthy, Marvin L. Minsky, Nathaniel Rochester, and Claude E. Shannon. A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence (1955)


Perceptron
Frank Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain (Psychological Review 1958)


LISP (List Processing)
By John McCarthy (1958)


Alpha-Beta Pruning
Arthur L. Samuel. Some studies in machine learning using the game of checkers (1959)


Decision Tree
Morgan, J.N. & Sonquist, J.A. Problems in the analysis of survey data, and a proposal. (1963)


Iris
R. A. Fisher. The Use of Multiple Measurements in Taxonomic Problems (1936)


Automata
J. von Neumann, A. W. Burks. Theory of self-reproducing automata. (1966)
Bingbin Liu et al. Transformers Learn Shortcuts to Automata. (ICLR 2023)


K-Nearest Neighbors (K-NN)
T. M. Cover and P. E. Hart. Nearest Neighbor Pattern Classification (1967)


Symbolic AI
Allen Newell, J. C. Shaw, Herbert A. Simon. Empirical explorations of the logic theory machine: a case study in heuristic (1957)
Allen Newell and Herbert A. Simon. Human problem solving (1972)
Allen Newell and Herbert A. Simon. Computer science as empirical inquiry: symbols and search (1976)


Rescorla–Wagner Model
Rescorla, R.A. & Wagner, A.R. A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement (1972)


Emergent Ability
P. W. Anderson. More Is Different (1972)
Rylan Schaeffer et al. Are Emergent Abilities of Large Language Models a Mirage? (2023)


Eligibility Traces
A. Klopf. Brain Function and Adaptive Systems: A Heterostatic Theory (1972)
Satinder P. Singh & Richard S. Sutton. Reinforcement learning with replacing eligibility traces (1996)


Beam Search
B. T. Lowerre. The harpy speech recognition system. Carnegie Mellon University. (1976)
Peng Si Ow et al. Filtered beam search in scheduling. (1986)


Bayesian Optimization
J Mockus et al. The application of Bayesian methods for seeking the extremum (1978)
Jasper Snoek et al. Practical Bayesian Optimization of Machine Learning Algorithms (NeurIPS 2012)


Outliers Detection
D. Hawkins. Identification of Outliers (1980)


Temporal Difference Learning
Richard S. Sutton, Andrew G. Barto. Toward a modern theory of adaptive networks (Psychological Review 1981)


Shallow Learning(Least Squares)
Stephen M. Stigler. Gauss and the Invention of Least Squares. (1981)


Neuroscience
David Hubel, Torsten Wiesel. Receptive fields of single neurons in the cat’s striate cortex (1959)
Lawrence Roberts. Machine perception of three-dimensional solids (1963)
David Marr. Vision: A computational investigation into the human representation and processing of visual information (1982)


Neocognitron
Kunihiko Fukushima. Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position (1980)


K-means Clustering
Stuart P. Lloyd. Least squares quantization in PCM (1982)


Hopfield Network
J J Hopfield. Neural networks and physical systems with emergent collective computational abilities (1982)


Boltzmann Machines
Geoffrey E. Hinton et al. A Learning Algorithm for Boltzmann Machines (1985)


Distributed Representations
Geoffrey E. Hinton et al. Distributed representations (1986)


Backpropagation
H. J. Kelley. Gradient Theory of Optimal Flight Paths. ARS Journal, Vol. 30, No. 10, pp. 947-954. (1960)
David E. Rumelhart et al. Learning representations by back-propagating errors (1986)


Katz's back-off model
Katz, S. M. Estimation of probabilities from sparse data for the language model component of a speech recognizer (1987)


Hidden Markov Models
Rabiner, L. R. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition (Proceedings of the IEEE 1989)


Dyslexia
Geoffrey E. Hinton et al. Lesioning an attractor network: Investigations of acquired dyslexia (1991)


Mixture of Experts
Robert A. Jacobs et al. Adaptive Mixtures of Local Experts (MIT Press 1991)
Noam Shazeer et al. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer (ICLR 2017)
Albert Q. Jiang et al. Mixtral of Experts (2024)


Object Recognition
David Lowe. Object Recognition from Local Scale-Invariant Features. (1999)


Singular Value Decomposition
G. W. Stewart. Early History of the Singular Value Decomposition (1993)


Penn Treebank
Mitchell P. Marcus et al. Building a Large Annotated Corpus of English: The Penn Treebank (1993)


Word Co-occurrence Probabilities
Ido Dagan et al. Similarity-Based Estimation of Word Cooccurrence Probabilities (ACL 1994)


Maximum Entropy
Adwait Ratnaparkhi. A Maximum Entropy Model for Part-Of-Speech Tagging (1996)


Complementary priors
Geoffrey E. Hinton et al. A fast learning algorithm for deep belief nets (2006)


Kneser-Ney Smoothing
Reinhard Kneser and Hermann Ney. Improved backing-off for M-gram language modeling (ICASSP 1995)


BM25
Stephen Robertson et al. Okapi at TREC-3. In Overview of the Third Text REtrieval Conference (TREC-3), pages 109–126. (1995)


SVM(Support Vector Machine)
Corinna Cortes, Vladimir Vapnik. Support-vector networks (1995)


Statistical Machine Learning
Vladimir Vapnik. The Nature of Statistical Learning Theory (1995)


NER(Named-Entity Recognition)
Lance Ramshaw, Mitch Marcus. Text Chunking using Transformation-Based Learning (VLC-WS 1995)


TF-IDF
Thorsten Joachims. A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization (1996)


LeNet
Yann LeCun et al. Gradient-Based Learning Applied to Document Recognition (IEEE 1998)


MNIST
LeCun et al. Gradient-based learning applied to document recognition (IEEE 1998)


MEMM
McCallum et al. Maximum Entropy Markov Models for Information Extraction and Segmentation (ICML 2000)


CRFs
J. Lafferty et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data (ICML 2001)


DBSCAN
Martin Ester et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise (KDD 1996)


Adaboost
Yoav Freund, Robert E. Schapire. Experiments with a New Boosting Algorithm (1996)


Graph Neural Network
Alessandro Sperduti et al. Supervised neural networks for the classification of structures (1997)


DNN
Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner. Gradient-Based Learning Applied to Document Recognition (1998)


RNN
Rumelhart, David E; Hinton, Geoffrey E, and Williams, Ronald J. Learning internal representations by error propagation (Sept. 1985)
Jordan, Michael I. Serial order: a parallel distributed processing approach (1986)


DENDRAL
Edward A. Feigenbaum, Bruce G. Buchanan. DENDRAL and Meta-DENDRAL roots of knowledge systems and expert system applications (1993)


LSTM
S. Hochreiter and J. Schmidhuber. Long Short-Term Memory. (1995)


EQP(Equation Prover)
William McCune. Solution of the Robbins Problem. (1997)


Random Forest
Leo Breiman. Random Forests. Machine Learning, Volume 45, pages 5–32. (2001)


Deep Blue
M. Campbell et al. Deep Blue. (2002)


CoNLL-2003
Erik F. Tjong Kim Sang et al. Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition (NAACL 2003)


DRIVE (Digital Retinal Images for Vessel Extraction)
Joes Staal et al. Ridge-based vessel segmentation in color images of the retina (IEEE 2004)


Feature
David G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints (IJCV 2004)
Navneet Dalal, Bill Triggs. Histograms of Oriented Gradients for Human Detection (CVPR 2005)


Reconstruction
Noah Snavely, Steven M. Seitz, Richard Szeliski. Photo Tourism: Exploring Photo Collections in 3D (ACM 2006)


Connectionist Temporal Classification (CTC)
Alex Graves et al. Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks (ICML 2006)


Deep Belief Network (DBN)
Geoffrey E. Hinton et al. A fast learning algorithm for deep belief nets (2006)


Autoencoder
G. E. Hinton, R. R. Salakhutdinov. Reducing the dimensionality of data with neural networks (2006)


SLAM
Davison et al. MonoSLAM: Real-Time Single Camera SLAM (TPAMI 2007)


Knowledge Graph
Fabian M. Suchanek et al. YAGO: A Core of Semantic Knowledge Unifying WordNet and Wikipedia (WWW 2007)


t-SNE
Laurens van der Maaten et al. Visualizing Data using t-SNE (JMLR 2008)


Denoising Autoencoder
Pascal Vincent et al. Extracting and Composing Robust Features with Denoising Autoencoders (ICML 2008)


The Four-Color Theorem
Georges Gonthier. Formal Proof—The Four-Color Theorem (2008)


IEMOCAP (The Interactive Emotional Dyadic Motion Capture (IEMOCAP) Database)
Carlos Busso et al. IEMOCAP: interactive emotional dyadic motion capture database (2008)


Deformable Part Model
Pedro Felzenszwalb, David McAllester, Deva Ramanan. A discriminatively trained, multiscale, deformable part model. (2008)


Pubmed
Prithviraj Sen et al. Collective Classification in Network Data (AAAI 2008)


Relation Extraction
Mintz et al. Distant supervision for relation extraction without labeled data (ACL-IJCNLP 2009)


ImageNet
Jia Deng et al. ImageNet: A Large-Scale Hierarchical Image Database (CVPR 2009)


Domain Adaptation
Shai Ben-David et al. A theory of learning from different domains (Mach Learn 2010)


ReLU
Vinod Nair and Geoffrey E. Hinton. Rectified Linear Units Improve Restricted Boltzmann Machines (ICML 2010)


PASCAL VOC
Mark Everingham et al. The PASCAL Visual Object Classes (VOC) Challenge (IJCV 2010)


Graphical Models
Sebastian Nowozin and Christoph H. Lampert. Structured Learning and Prediction in Computer Vision (2011)


CUB-200-2011 (Caltech-UCSD Birds-200-2011)
Wah et al. The Caltech-UCSD Birds-200-2011 Dataset (2011)


HMDB51
Hildegard Kuehne et al. HMDB: A large video database for human motion recognition (IEEE 2011)


SVHN (Street View House Numbers)
Netzer et al. Reading digits in natural images with unsupervised feature learning (2011)


Scikit-learn
Fabian Pedregosa et al. Scikit-learn: Machine Learning in Python (2011)
Lars Buitinck et al. API design for machine learning software: experiences from the scikit-learn project (2013)


NumPy
Stefan Van Der Walt et al. The NumPy array: a structure for efficient numerical computation (2011)
Charles R. Harris et al. Array Programming with NumPy (2020)


MuJoCo
Emanuel Todorov et al. MuJoCo: A physics engine for model-based control (IEEE/RSJ IROS 2012)


CIFAR
Alex Krizhevsky. Learning Multiple Layers of Features from Tiny Images (2009)


NYUv2 (NYU-Depth V2)
Nathan Silberman et al. Indoor Segmentation and Support Inference from RGBD Images (LNIP 2012)


KITTI-360
Yiyi Liao et al. KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D (TPAMI 2022)


UCF101 (UCF101 Human Actions dataset)
Soomro et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012)


KITTI
Andreas Geiger et al. Are we ready for autonomous driving? The KITTI vision benchmark suite (IEEE 2012)


LIDC-IDRI
Armato et al. The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans (2011)


Random Search
J. Bergstra and Y. Bengio. Random search for hyper-parameter optimization (2012)


CNN (AlexNet)
A. Krizhevsky, I. Sutskever, G. E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks (2012)
Matthew D. Zeiler and Rob Fergus. Visualizing and Understanding Convolutional Networks (ECCV 2014)


VAE
Diederik P Kingma, Max Welling. Auto-Encoding Variational Bayes (2013)


SST (Stanford Sentiment Treebank)
Socher et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank (EMNLP 2013)


Human3.6M
Ionescu et al. Human3.6m: Large scale datasets and predictive methods for 3D human sensing in natural environments (IEEE 2013)


ConvGNN
Joan Bruna et al. Spectral Networks and Locally Connected Networks on Graphs (2013)


R-CNN
Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation (2013)


Word2Vec
T. Mikolov et al. Efficient estimation of word representations in vector space (2013)


Anomaly Detection
Charu C Aggarwal. An introduction to outlier analysis (2013)


Dropout
N. Srivastava et al. Dropout: A simple way to prevent neural networks from overfitting (2014)


Word Representation
Omer Levy et al. Neural Word Embedding as Implicit Matrix Factorization (2014)


Adam
D. Kingma and J. Ba. Adam: A method for stochastic optimization (2014)


COCO (Microsoft Common Objects in Context)
Tsung-Yi Lin et al. Microsoft COCO: Common Objects in Context (ECCV 2014)


Caffe
Yangqing Jia et al. Caffe: Convolutional Architecture for Fast Feature Embedding (ACM 2014)


GRU(Gated Recurrent Unit)
Kyunghyun Cho et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation (EMNLP 2014)


PASCAL3D+
Yu Xiang et al. Beyond PASCAL: A benchmark for 3D object detection in the wild (IEEE 2014)


DeCAF
Jeff Donahue et al. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition (ICML 2014)


GAN
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio. Generative adversarial nets (2014)


FCN
Jonathan Long, Evan Shelhamer, Trevor Darrell. Fully Convolutional Networks for Semantic Segmentation. (2014)


DeepFace
Y. Taigman et al. DeepFace: Closing the gap to human-level performance in face verification (2014)


Seq2Seq
I. Sutskever et al. Sequence to sequence learning with neural networks. (2014)


DQN (Deep Q-Network)
Volodymyr Mnih et al. Playing Atari with Deep Reinforcement Learning (NIPS Deep Learning Workshop 2013)
Volodymyr Mnih et al. Human-level control through deep reinforcement learning (Nature 2015)


Robotics: OpenAI Gym
Matthias Plappert et al. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research (2018)


GloVe
Jeffrey Pennington et al. GloVe: Global Vectors for Word Representation (EMNLP 2014)


Text Classification with CNN
Yoon Kim. Convolutional Neural Networks for Sentence Classification (EMNLP 2014)


CAM(Class-Activation Map)
Maxime Oquab et al. Is Object Localization for Free? - Weakly-Supervised Learning With Convolutional Neural Networks (CVPR 2015)


Unsupervised Domain Adaptation
Yaroslav Ganin et al. Unsupervised Domain Adaptation by Backpropagation (2015)


ResNet
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Deep Residual Learning for Image Recognition. (2015)


Batch Normalization
S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. (2015)


YOLO
Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi. You Only Look Once: Unified, Real-Time Object Detection (2015)


ArcFace
Jiankang Deng, Jia Guo, Jing Yang, Niannan Xue, Irene Kotsia, and Stefanos Zafeiriou. ArcFace: Additive Angular Margin Loss for Deep Face Recognition (CVPR 2019)


SUN RGB-D
Song et al. SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite (IEEE 2015)


MovieLens
F. Maxwell Harper et al. The MovieLens Datasets: History and Context (ACM TiiS 2015)


ModelNet
Wu et al. 3D ShapeNets: A Deep Representation for Volumetric Shapes (CVPR 2015)


LibriSpeech
Vassil Panayotov et al. Librispeech: An ASR corpus based on public domain audio books (IEEE 2015)


SNLI (Stanford Natural Language Inference)
Bowman et al. A large annotated corpus for learning natural language inference (EMNLP 2015)


Visual Question Answering (VQA)
Agrawal et al. VQA: Visual Question Answering (ICCV 2015)


ShapeNet
Chang et al. ShapeNet: An Information-Rich 3D Model Repository (2015)


Model Compression
Cristian Buciluă et al. Model Compression (ACM SIGKDD 2006)
G. Hinton, O. Vinyals, J. Dean. Distilling the Knowledge in a Neural Network. (2015)


CelebA (CelebFaces Attributes Dataset)
Liu et al. Deep Learning Face Attributes in the Wild (IEEE 2015)


ActivityNet
Heilbron et al. ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding (IEEE 2015)


Deep learning
Yann LeCun, Yoshua Bengio, Geoffrey Hinton. Deep learning (Nature 2015)


TRPO
Schulman, John et al. Trust Region Policy Optimization. (2015)


A3C (Asynchronous Advantage Actor Critic)
Volodymyr Mnih et al. Asynchronous Methods for Deep Reinforcement Learning (ICML 2016)


MCTS
Silver, D et al. Mastering the game of Go with deep neural networks and tree search (Nature 2016)


DeepLab
Liang-Chieh Chen et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs (TPAMI 2016)


R-FCN
Jifeng Dai et al. R-FCN: Object Detection via Region-based Fully Convolutional Networks (2016)


LIME(Local Interpretable Model-agnostic Explanations)
Marco Tulio Ribeiro et al. "Why Should I Trust You?": Explaining the Predictions of Any Classifier (NAACL 2016)


Subword Model
Rico Sennrich et al. Neural Machine Translation of Rare Words with Subword Units (ACL 2016)


Monte Carlo Dropout
Yarin Gal et al. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning (ICML 2016)


Graph Autoencoders
Thomas N. Kipf et al. Variational Graph Auto-Encoders (NeurIPS 2016)


Document Classification
Z Yang et al. Hierarchical Attention Networks for Document Classification (NAACL 2016)


Visual Intelligence
Brenden M. Lake et al. Building Machines That Learn and Think Like People (NeurIPS 2016)


mini-ImageNet
Vinyals et al. Matching Networks for One Shot Learning (NeurIPS 2016)


IoU Loss
Jiahui Yu et al. UnitBox: An Advanced Object Detection Network (ACM MM 2016)


XGBoost
Tianqi Chen et al. XGBoost: A Scalable Tree Boosting System (KDD 2016)


DAVIS (Densely Annotated VIdeo Segmentation)
Perazzi et al. A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation (IEEE 2016)


S3DIS (Stanford 3D Indoor Scene Dataset (S3DIS))
Armeni et al. 3D Semantic Parsing of Large-Scale Indoor Spaces (IEEE 2016)


Universal Dependencies
Nivre et al. Universal Dependencies v1: A Multilingual Treebank Collection (LREC 2016)


CheXpert
Irvin et al. CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison (AAAI 2019)


VCTK (CSTR VCTK Corpus)
Veaux et al. CSTR VCTK corpus: English multi-speaker corpus for CSTR voice cloning toolkit (2016)


MIMIC-III (The Medical Information Mart for Intensive Care III)
Johnson et al. MIMIC-III, a freely accessible critical care database (2016)


SQuAD (Stanford Question Answering Dataset)
Rajpurkar et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text (EMNLP 2016)


Cityscapes
Cordts et al. The Cityscapes Dataset for Semantic Urban Scene Understanding (CVPR 2016)


MS MARCO (Microsoft Machine Reading Comprehension Dataset)
Bajaj et al. MS MARCO: A Human Generated MAchine Reading COmprehension Dataset (EMNLP 2016)


TensorFlow
Martín Abadi et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems (2016)


OpenAI Gym
Brockman et al. OpenAI Gym (2016)


Image Style Transfer
Leon A. Gatys et al. Image Style Transfer Using Convolutional Neural Networks (CVPR 2016)


Deep speech 2
Amodei, D et al. Deep speech 2: End-to-end speech recognition in english and mandarin. (ICML 2016)


Continual Learning
James Kirkpatrick et al. Overcoming catastrophic forgetting in neural networks (2016)


Random Erasing
Sachin Ravi et al. Optimization as a Model For Few-Shot Learning (2016)
Zhun Zhong et al. Random Erasing Data Augmentation (2017)


RetinaNet
Tsung-Yi Lin et al. Focal loss for dense object detection (2017)


Mask R-CNN
Kaiming He et al. Mask R-CNN (2017)


PPO
Schulman, John, et al. Proximal policy optimization algorithms (2017)


NAS(Neural Architecture Search)
Barret Zoph et al. Neural Architecture Search with Reinforcement Learning (ICLR 2017)


SHAP(Shapley Additive Explanations)
Scott Lundberg et al. A Unified Approach to Interpreting Model Predictions (NeurIPS 2017)


Graph Convolutional Networks(GCN)
Thomas N. Kipf et al. Semi-Supervised Classification with Graph Convolutional Networks (ICLR 2017)


Image Restoration
Dmitry Ulyanov et al. Deep Image Prior (2017)


Open-Domain Question Answering
Danqi Chen et al. Reading Wikipedia to Answer Open-Domain Questions. (ACL 2017)
Karpukhin et al. Dense Passage Retrieval for Open-Domain Question Answering. (EMNLP 2020)


GraphSAGE
William L. Hamilton et al. Inductive Representation Learning on Large Graphs (NeurIPS 2017)


Loss Function
Dong Yu et al. Permutation Invariant Training of Deep Models for Speaker-Independent Multi-talker Speech Separation (IEEE 2017)


Places
Zhou et al. Places: A 10 Million Image Database for Scene Recognition (IEEE 2017)


Meta Learning
Chelsea Finn et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks (ICML 2017)


Weight & Activation Quantizer
Shuchang Zhou et al. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients (IEEE 2017)


OpenNMT
Guillaume Klein et al. OpenNMT: Open-Source Toolkit for Neural Machine Translation (ACL 2017)


CARLA (Car Learning to Act)
Dosovitskiy et al. CARLA: An Open Urban Driving Simulator (2017)


MobileNet
Andrew G. Howard et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications (2017)


VoxCeleb1
Nagrani et al. VoxCeleb: a large-scale speaker identification dataset (Interspeech 2017)


Kinetics (Kinetics Human Action Video Dataset)
Kay et al. The Kinetics Human Action Video Dataset (2017)


ScanNet
Dai et al. ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes (CVPR 2017)


AudioSet
Jort F. Gemmeke et al. Audio Set: An ontology and human-labeled dataset for audio events (2017)


Fashion-MNIST
Xiao et al. Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms (2017)


Visual Genome
Krishna et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations (2017)


AlphaGo Zero
Silver, D et al. Mastering the game of Go without human knowledge (2017)


Text Style Transfer
Tianxiao Shen et al. Style Transfer from Non-Parallel Text by Cross-Alignment (NeurIPS 2017)


Transformer
A. Vaswani et al. Attention is all you need (2017)


BERT
J. Devlin et al. BERT: Pre-training of deep bidirectional transformers for language understanding (2018)


GPT
Alec Radford et al. Improving Language Understanding by Generative Pre-Training (2018)


GPT-2
Alec Radford et al. Language Models are Unsupervised Multitask Learners (2019)


RoBERTa
Yinhan Liu et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach (2019)


CornerNet
Hei Law et al. CornerNet: Detecting Objects as Paired Keypoints (ECCV 2018)


AutoEncoder-based Recommendation System
Dawen Liang et al. Variational Autoencoders for Collaborative Filtering. (WWW 2018)


VoxCeleb2
Chung et al. VoxCeleb2: Deep Speaker Recognition (ISCA 2018)


GLUE
Wang et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding (EMNLP 2018)


GAT
Petar Veličković et al. Graph Attention Networks (ICLR 2018)


fastMRI
Zbontar et al. fastMRI: An Open Dataset and Benchmarks for Accelerated MRI (2018)


Speech Commands
Pete Warden. Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition (2018)


MultiNLI (Multi-Genre Natural Language Inference)
Williams et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference (NAACL 2018)


Session-based Recommendation System
Wang-Cheng Kang and Julian McAuley. Self-Attentive Sequential Recommendation (ICDM 2018)


BLAS (Basic Linear Algebra Subprograms)
C. Nugteren. CLBlast: A tuned OpenCL BLAS library (2018)


Low Distortion & Good Perceptual Quality
Yochai Blau et al. The Perception-Distortion Tradeoff (CVPR 2018)


Albumentations
Alexander Buslaev et al. Albumentations: fast and flexible image augmentations (2018)


AlphaStar
Google DeepMind. AlphaStar: Mastering the real-time strategy game StarCraft II (2019)


EfficientNet
Mingxing Tan et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (ICML 2019)


Backdoor Attack
Tianyu Gu et al. BadNets: Evaluating Backdooring Attacks on Deep Neural Networks (NeurIPS 2019)


SentencePiece
Taku Kudo et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing (EMNLP 2018)


SpecAugment
Daniel S. Park et al. SpecAugment: A simple data augmentation method for automatic speech recognition (Interspeech 2019)


EDA
Jason Wei et al. EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks (EMNLP-IJCNLP 2019)


Label Smoothing
Rafael Müller et al. When Does Label Smoothing Help? (NeurIPS 2019)


GIoU Loss
Hamid Rezatofighi et al. Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression (CVPR 2019)


AutoAugment
Ekin D. Cubuk et al. AutoAugment: Learning Augmentation Policies from Data (CVPR 2019)


SciPy
Pauli Virtanen et al. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python (2019)


Natural Questions
Kwiatkowski et al. Natural Questions: a Benchmark for Question Answering Research (TACL 2019)


CoLA (Corpus of Linguistic Acceptability)
Warstadt et al. Neural Network Acceptability Judgments (TACL 2019)


Data Augmentation
Jason Wei et al. EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks (EMNLP-IJCNLP 2019)


SuperGLUE
Wang et al. SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems (NeurIPS 2019)


Hugging Face
Thomas Wolf et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing (EMNLP 2019)


GPU
Jeff Johnson et al. Billion-scale similarity search with GPUs (IEEE 2019)


FFHQ (Flickr-Faces-HQ)
Karras et al. A Style-Based Generator Architecture for Generative Adversarial Networks (CVPR 2019)


T5
Colin Raffel et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (The Journal of Machine Learning Research 2019)


MoCo
Kaiming He et al. Momentum Contrast for Unsupervised Visual Representation Learning (2019)


Robotics: Vision and Touch Representation
Michelle A. Lee et al. Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks (ICRA 2019)


wav2vec
S. Schneider et al. wav2vec: Unsupervised pre-training for speech recognition (Interspeech 2019)
Alexei Baevski et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations (NeurIPS 2020)


GPT-3, Prompt Tuning
Brown et al. Language Models are Few-Shot Learners (NeurIPS 2020)


UMAP(Uniform Manifold Approximation and Projection)
Leland McInnes et al. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction (2020)


DETR
Nicolas Carion et al. End-to-End Object Detection with Transformers (2020)


SimCLR
Ting Chen et al. A Simple Framework for Contrastive Learning of Visual Representations (PMLR 2020)
Ting Chen et al. Big self-supervised models are strong semi-supervised learners. (2020)


UDA(Unsupervised Data Augmentation)
Qizhe Xie et al. Unsupervised Data Augmentation for Consistency Training (NeurIPS 2020)


MLIR (Multi Level Intermediate Representation)
C. Lattner et al. MLIR: A compiler infrastructure for the end of Moore’s law (2020)


GNN-based Recommendation System
Xiangnan He et al. LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation (SIGIR 2020)


OGB (Open Graph Benchmark)
Hu et al. Open Graph Benchmark: Datasets for Machine Learning on Graphs (NeurIPS 2020)


nuScenes
Caesar et al. nuScenes: A multimodal dataset for autonomous driving (CVPR 2020)


CORD-19
Wang et al. CORD-19: The COVID-19 Open Research Dataset (ACL 2020)


NeRF(Neural Radiance Fields)
Mildenhall et al. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (ECCV 2020)


Retrieval-Augmented Generation
Patrick Lewis et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (NeurIPS 2020)


Vision Transformer(ViT)
Alexey Dosovitskiy et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (2020)


Reinforcement Learning from Human Feedback (RLHF)
Nisan Stiennon et al. Learning to summarize from human feedback (NeurIPS 2020)
Long Ouyang et al. Training language models to follow instructions with human feedback (2022)


BART
Mike Lewis et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (ACL 2020)


Prefix Tuning
Xiang Lisa Li and Percy Liang. Prefix-tuning: Optimizing continuous prompts for generation. (ACL 2021)


Generative Artificial Intelligence(GAI)
R Bommasani et al. On the opportunities and risks of foundation models (2021)


Swin Transformer
Ze Liu et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows (2021)


Reinforcement Learning
David Silver et al. Reward is enough (Artificial Intelligence 2021)


Text to Image Generation
Aditya Ramesh et al. Zero-Shot Text-to-Image Generation (ICML 2021)


ROUGE
Chin-Yew Lin. ROUGE: A Package for Automatic Evaluation of Summaries (2004)


Stable Diffusion
Rombach et al. High-Resolution Image Synthesis with Latent Diffusion Models (2021)


AlphaFold
Richard Evans et al. Protein complex prediction with AlphaFold-Multimer. (2021)
John Jumper et al. Highly accurate protein structure prediction with AlphaFold (Nature 2021)


IMDb Movie Reviews
Andrew L. Maas et al. Learning Word Vectors for Sentiment Analysis (ACL 2011)


PyTorch
Adam Paszke et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library (NeurIPS 2019)


Stochastic Parrot
Emily M. Bender et al. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜 (2021)


OPT
Susan Zhang et al. OPT: Open Pre-trained Transformer Language Models (2022)


InstructGPT
Long Ouyang et al. Training language models to follow instructions with human feedback (2022)


PaLM
Aakanksha Chowdhery et al. PaLM: Scaling Language Modeling with Pathways (2022)


Chain-of-Thought Prompting
Jason Wei et al. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (2022)


Non-language Task
Tuan Dinh et al. LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks (NeurIPS 2022)


3D Generator
Eric R. Chan et al. EG3D: Efficient Geometry-aware 3D Generative Adversarial Networks (CVPR 2022)


Joint Embedding Predictive Architecture(JEPA)
Yann LeCun. A Path Towards Autonomous Machine Intelligence (2022)


GPT-4
Baolin Peng et al. Instruction Tuning with GPT-4 (EMNLP 2023)


LLaMA
Touvron et al. LLaMA: Open and Efficient Foundation Language Models (2023)


Alpaca, Instruction Tuning
Rohan Taori et al. Alpaca: A Strong, Replicable Instruction-Following Model (2023)


Large Language Model (LLM)
Tyna Eloundou et al. GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models (2023)
Brandon C. Roy et al. Predicting the birth of a spoken word (2015)
Tom McCoy et al. Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference (ACL 2019)
Qihuang Zhong et al. Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT (2023)
Lukas Berglund et al. The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A" (2023)


Quantization
Hongyu Wang et al. BitNet: Scaling 1-bit Transformers for Large Language Models (2023)
Shuming Ma et al. The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits (2024)


DPO
Rafael Rafailov et al. Direct Preference Optimization: Your Language Model is Secretly a Reward Model (NeurIPS 2023)


Semantic Planner
Ma et al. Eureka: Human-Level Reward Design via Coding Large Language Models (ICLR 2024)


AlphaGeometry
Trieu H. Trinh et al. Solving olympiad geometry without human demonstrations (2024)


Habsburg AI
Ilia Shumailov et al. The Curse of Recursion: Training on Generated Data Makes Models Forget (2023)
Ilia Shumailov et al. AI models collapse when trained on recursively generated data (Nature 2024)


Machine Unlearning
Weijia Shi et al. Detecting Pretraining Data from Large Language Models (ICLR 2024)
Martin Pawelczyk et al. In-Context Unlearning: Language Models as Few-Shot Unlearners (ICML 2024)
Michael Duan et al. Do Membership Inference Attacks Work on Large Language Models? (COLM 2024)


The Nobel Prize in Physics 2024
John J. Hopfield, Geoffrey E. Hinton


The Nobel Prize in Chemistry 2024
David Baker, Demis Hassabis, John M. Jumper




Reference

Annotated History of Modern AI and Deep Learning (Juergen Schmidhuber)
Classical Paper List on Machine Learning and Natural Language Processing (Zhiyuan Liu)
Award-winning classic papers in ML and NLP (Desh Raj)
Computer Vision: 10 Papers to Start (Chenxi Liu)
Awesome - Most Cited Deep Learning Papers (Terryum)
Papers With Code Machine Learning Datasets
Artificial Intelligence Lectures Learned through Anecdotal History and Comics (야사와 만화로 배우는 인공지능 강의)
The Nobel Prize in Physics 2024
The Nobel Prize in Chemistry 2024