Monday, 19 December 2016

Machine learning four years after the turning point

In May 2012 I wrote a note titled "Machine learning at its turning point", arguing that in the new wave of machine learning we no longer need to worry about having a convex loss; rather, we can be happy with non-convex ones. At the time I did not know about AlexNet and its record-breaking result on the ImageNet benchmark, published 7 months later at NIPS'12.

AlexNet was truly a turning point for machine learning. It declared the victory of deep neural nets over the incumbents, which were combinations of clever manual feature engineering with variants of SVMs or random forests. AlexNet is remarkable in many ways: Dropout, rectified linear units, end-to-end training on massive data with GPUs, data augmentation, and a carefully designed convolutional net.
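Two of those ingredients are simple enough to sketch in a few lines. Below is a toy numpy illustration (my own minimal sketch, not AlexNet's actual implementation) of the rectified linear unit and of "inverted" dropout, the variant that rescales the surviving activations during training so nothing changes at test time:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Rectified linear unit: max(0, x), applied element-wise.
    return np.maximum(0.0, x)

def dropout(x, p=0.5, training=True):
    # Inverted dropout: zero each activation with probability p during
    # training, scaling the survivors by 1/(1-p) so that the expected
    # activation matches the (untouched) test-time behaviour.
    if not training:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

h = relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0]))
print(h)            # negatives clipped to zero
print(dropout(h))   # about half the units zeroed, the rest doubled
```

At test time `dropout(h, training=False)` is the identity, which is exactly why the 1/(1-p) scaling is done during training rather than at inference.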

That was also the year Yann LeCun posted his complaints about the computer vision community, though he quickly retracted his boycott in the aftershock of AlexNet.

Recently, an interesting observation has been floating around: in machine learning, we ask what we can do for neural networks, while in applied domains, we ask what neural networks can do for X. And the list of Xs keeps growing, from cognitive to non-cognitive domains. Andrew Ng made an interesting point: any task that humans can do well, mapping A to B in less than a second, is ripe for machine automation.

This year also marks the 10th anniversary of Deep Belief Nets, the model that announced the beginning of the current wave of neural nets. Earlier this year, AlphaGo of DeepMind defeated one of the best Go champions 4 to 1, officially ending human superiority in this ancient game. AlphaGo combines convolutional nets, which read board positions and evaluate moves, with Monte Carlo tree search.

Many things have changed since 2012. It is clear that supervised learning works, without pre-training, if we have sufficient labels. Unsupervised learning, after an initial burst with Boltzmann machines and autoencoders, failed to deliver. There are interesting new developments, however, with the Variational Autoencoder (VAE) and Generative Adversarial Nets (GANs), both introduced in 2014. At this point, GANs are the best technique for generating faithful images; Yann LeCun considers them one of the best ideas in recent years.

The machine learning community has gone through mini-cycles of 10-15 years: neural networks, graphical models, kernel methods, statistical relational learning and, currently, deep learning. So what is next for deep learning? If we take 2006 as the beginning of the current deep learning wave, then it is already 10 years old, enough for a mini-cycle. But if we take 2012 as the true landmark, then we have 6 more years to count.

Like other methodologies, deep learning will eventually morph into something else within 5 years; we may call it by other names. With programming becoming reasonably effortless and with powerful CPUs/GPUs designed specifically for deep learning, the low-hanging fruit will soon be picked.

Practice-wise, just as feature engineering was the unsung hero of machine learning before 2012, architecture engineering is at the core of deep learning today.

It is also time for the hardcore topics: data efficiency, statistics, geometry, information theory, Bayesian methods and other "serious" subjects. As with any major progress in science and engineering, nothing really happens overnight. At this point, deep learning is already being mixed with graphical models, planning, inference, symbolic reasoning, memory, execution and Bayesian methods, among other things. Taken together, something remarkable will happen, just as I noted about Conditional Random Fields years ago: it is the combination of incremental innovations that pushes the boundary of a field to a critical point. This concurs with the idea of emergent intelligence, where human intelligence is really the emergent product of many small advances over apes.

For a more comprehensive review, see my recent tutorials at AI'16 on the topic. Some incremental innovations produced at PRaDA (Deakin University) are listed below.

Work by us:
  • Multilevel Anomaly Detection for Mixed Data, K. Do, T. Tran, S. Venkatesh, arXiv preprint arXiv:1610.06249.
  • A deep learning model for estimating story points, M. Choetkiertikul, H.K. Dam, T. Tran, T. Pham, A. Ghose, T. Menzies, arXiv preprint arXiv:1609.00489.
  • Deepr: A Convolutional Net for Medical Records, Phuoc Nguyen, Truyen Tran, Nilmini Wickramasinghe, Svetha Venkatesh, to appear in IEEE Journal of Biomedical and Health Informatics.
  • Column Networks for Collective Classification, T. Pham, T. Tran, D. Phung, S. Venkatesh, AAAI'17.
  • DeepSoft: A vision for a deep model of software, Hoa Khanh Dam, Truyen Tran, John Grundy, Aditya Ghose, FSE VaR 2016.
  • Faster Training of Very Deep Networks Via p-Norm Gates, Trang Pham, Truyen Tran, Dinh Phung, Svetha Venkatesh, ICPR'16.
  • Hierarchical semi-Markov conditional random fields for deep recursive sequential data, Truyen Tran, Dinh Phung, Hung Bui, Svetha Venkatesh, to appear in Artificial Intelligence.
  • DeepCare: A Deep Dynamic Memory Model for Predictive Medicine, Trang Pham, Truyen Tran, Dinh Phung, Svetha Venkatesh, PAKDD'16, Auckland, NZ, April 2016.
  • Neural Choice by Elimination via Highway Networks, Truyen Tran, Dinh Phung, Svetha Venkatesh, PAKDD workshop on Biologically Inspired Techniques for Data Mining (BDM'16), April 19-22 2016, Auckland, NZ.
  • Tensor-variate Restricted Boltzmann Machines, Tu D. Nguyen, Truyen Tran, D. Phung, S. Venkatesh, AAAI 2015.
  • Thurstonian Boltzmann machines: Learning from multiple inequalities, Truyen Tran, D. Phung, S. Venkatesh, Proc. of 30th International Conference on Machine Learning (ICML'13), Atlanta, USA, June 2013.

