CSci 8363 Fall 2025 -- Ordered List of Papers

full list of papers: https://www-users.cse.umn.edu/~boley/8363-25f/PaperList.htm
class canvas site
index.html (supplementary material)

Graphs and Random Walks

09/15 Monte Mahlum________________________
Learning from Labeled and Unlabeled Data on a Directed Graph
by Dengyong Zhou & Jiayuan Huang & Bernhard Schölkopf
in ICML '05:
https://www.semanticscholar.org/paper/Learning-from-labeled-and-unlabeled-data-on-a-graph-Zhou-Huang/df95ae968cb0b722143f6000fa0dc7ce21cc35e2
https://pure.mpg.de/rest/items/item_1791381_2/component/file_3175263/content
09/17 Adam Imdieke_______________________
Efficient Estimation of Word Representations in Vector Space
by Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean
https://arxiv.org/abs/1301.3781
09/17 Adam Imdieke_______________________
Attention Is All You Need
by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin
https://arxiv.org/abs/1706.03762

Machine Learning

09/24 (postponed from 09/22): Archisman Bandyopadhyay ________
Attention is a smoothed cubic spline
by Zehua Lai, Lek-Heng Lim, Yucong Liu
https://arxiv.org/abs/2408.09624
09/29 (postponed from 09/24): Brian Chung ________
Dataset Distillation
by Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba, Alexei A. Efros
https://arxiv.org/abs/1811.10959
10/01 (postponed from 09/29) Aadit Munjal _______________________
ImageNet Classification with Deep Convolutional Neural Networks
by Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton
in NIPS 2012.
https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks
10/01 (postponed from 09/29) Aadit Munjal _______________________
Convolutional Networks and Applications in Vision
by Yann LeCun & Koray Kavukcuoglu & Clement Farabet
in Proc. International Symposium on Circuits and Systems (ISCAS'10) (IEEE), 2010
https://yann.lecun.com/exdb/publis/pdf/lecun-iscas-10.pdf
https://ieeexplore.ieee.org/document/5537907 (might require UofM login)
10/08 Monte Mahlum________________________
The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable
by beren, Sid Black (2022)
https://www.lesswrong.com/posts/mkbGjzxD8d8XqKHzA/the-singular-value-decompositions-of-transformer-weight
10/13 Adam Imdieke_______________________
Propagative Distance Optimization for Constrained Inverse Kinematics
by Yu Chen, Yilin Cai, Jinyun Xu, Zhongqiang Ren, Guanya Shi, Howie Choset
https://arxiv.org/abs/2406.11572
10/15 Archisman Bandyopadhyay ___________
Hodge Laplacians on Graphs
by Lek-Heng Lim
https://arxiv.org/abs/1507.05379
10/20 Brian Chung _______________________
Deep Neural Networks as Gaussian Processes
by Jaehoon Lee, Yasaman Bahri, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein
https://arxiv.org/abs/1711.00165
10/22 Aadit Munjal ______________________
A ConvNet for the 2020s
by Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie
https://arxiv.org/abs/2201.03545
10/27 Monte Mahlum________________________
Sparse Autoencoders Find Highly Interpretable Features in Language Models
by Hoagy Cunningham, Aidan Ewart, Logan Riggs, Robert Huben, Lee Sharkey
https://arxiv.org/abs/2309.08600
10/29 Adam Imdieke_______________________
Denoising Diffusion Probabilistic Models
by Jonathan Ho, Ajay Jain, Pieter Abbeel
https://arxiv.org/abs/2006.11239
11/03 Archisman Bandyopadhyay _____
Generalization in Deep Learning
by Kenji Kawaguchi, Leslie Pack Kaelbling, Yoshua Bengio
https://arxiv.org/abs/1710.05468
11/05 Brian Chung _______________________
Denoising Diffusion Implicit Models
by Jiaming Song, Chenlin Meng, Stefano Ermon
https://arxiv.org/abs/2010.02502
11/10 Aadit Munjal ______________________
LoRA: Low-Rank Adaptation of Large Language Models
by Edward Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen
https://arxiv.org/abs/2106.09685
11/12 no class___________________________
11/17 Monte Mahlum_______________________
graph neural networks with transformers
Do Transformers Really Perform Bad for Graph Representation?
by Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng, Guolin Ke, Di He, Yanming Shen, Tie-Yan Liu
https://arxiv.org/abs/2106.05234
ALSO: Hierarchical graph transformer with contrastive learning for protein function prediction
by Zhonghui Gu, Xiao Luo, Jiaxiao Chen, Minghua Deng, Luhua Lai
Bioinformatics, Volume 39, Issue 7, July 2023, btad410.
https://academic.oup.com/bioinformatics/article/39/7/btad410/7208864
11/19 Adam Imdieke_______________________
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
by Albert Gu, Tri Dao
https://arxiv.org/abs/2312.00752
11/24 Archisman Bandyopadhyay____________
Discovering faster matrix multiplication algorithms with reinforcement learning
by Alhussein Fawzi, Matej Balog, Aja Huang, Thomas Hubert, Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Francisco J. R. Ruiz, Julian Schrittwieser, Grzegorz Swirszcz, David Silver, Demis Hassabis & Pushmeet Kohli
in Nature volume 610, pages 47–53 (2022)
https://www.nature.com/articles/s41586-022-05172-4