

7. RESULTS

7.4 Use cases for different frameworks

During this research we had to configure the tested software for each of these frameworks, and even though most of them were run through the benchmark application, we still tried each of them individually to some extent. From this experience we created a set of use cases, and in the following we evaluate and recommend which frameworks would work in each. The use cases consider the experience of the user, the time available for completing the task, the available hardware and the possible monetary budget.

Use case 1 is defined as a very simple one. There is a machine learning model that has been created from scratch, and the user just wants to compare it to one that is automatically built by an AutoML framework. In this use case there is no money to spend, and the user has a basic PC, is not too busy and has a day or two to get the task done. The user also has basic programming skills and very basic data science skills.

In this sort of case we would recommend using H2O, Autokeras, Autosklearn or Gama.

All of them are used in a very similar way, and tasks of this type can be done with four to ten lines of code. H2O is the best option if the user has skills in the R language and wants to use exactly that, because it is the only strictly R solution here. Gama is also easy to use, but if the user is very strict on performance it tends to take a little more time than the other options. Choosing between Autokeras and Autosklearn comes down to user preference, as they are very similar in terms of usage. Autokeras has somewhat stricter requirements on the input data, and it did not seem to handle missing values automatically, but if the user has the necessary data science skills or in general prefers the Keras stack to the scikit stack, it is just as good an option. It is good to remember that the Keras stack does not support so-called classic machine learning algorithms and uses neural networks instead. Unlike plain Keras, Autokeras provides more pre- and post-processing utilities for data if those are needed, and in automated machine learning those tasks are automated as much as possible.
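As a sketch of the use case 1 workflow, the comparison can be written as a small harness. The helper below uses scikit-learn only: the hand-built model is a plain RandomForestClassifier, and a second forest stands in for the AutoML model so that the sketch runs as-is. Any AutoML estimator that follows the scikit-learn fit/predict interface (for example Autosklearn's AutoSklearnClassifier) could be passed in its place. The function name and the toy dataset are our own illustration, not part of any framework.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def compare_models(hand_built, automl, X, y, seed=0):
    """Fit both estimators on the same split and return their test accuracies."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=seed)
    scores = {}
    for name, model in (("hand_built", hand_built), ("automl", automl)):
        model.fit(X_tr, y_tr)
        scores[name] = accuracy_score(y_te, model.predict(X_te))
    return scores

X, y = load_breast_cancer(return_X_y=True)
# A second random forest is used as a placeholder for the AutoML estimator;
# any scikit-learn-compatible AutoML model could be dropped in here instead.
scores = compare_models(RandomForestClassifier(random_state=0),
                        RandomForestClassifier(n_estimators=200, random_state=1),
                        X, y)
print(scores)
```

The whole comparison stays within the four to ten lines mentioned above once the helper is in place, which matches the low effort this use case assumes.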

A second use case that could be thought of as solvable with automated machine learning is a more difficult task, such as detecting facial hair from images. Let us say that the user is a data scientist with good programming skills in Python and maybe in R as well. He also has a good PC and a couple of days to run the task, as it involves quite a big dataset of images.

In the second use case defined above it helps that the user has programming skills, which means we do not need to take ease of use into account. In any case, Autokeras and Autosklearn are easy to recommend here; Autokeras in particular has good capabilities specifically in image recognition. The same applies here as in the first case regarding the Keras and scikit stacks: it is mainly a matter of preference. H2O is also a good choice if the user wants to use R instead of Python. AutoGluon is another framework that could be used here, as it is at its best with text, image and tabular data; if the task fell outside of these three, one could consider excluding it. Since the results of all of the frameworks were quite similar, the decision comes down to whether some syntax is more to the user's liking. The frameworks mentioned were in our view the most consistent performers as well as the most comfortable to use, so we would recommend them.

The third use case would be someone with no programming skills trying to do a simple machine learning task, such as predicting the price changes of apples. Of the tested frameworks, Ludwig comes closest to this. With Ludwig it is possible to run machine learning tasks using only YAML files, which means no programming is needed; however, the user still needs some understanding of how to create YAML files, so some computer knowledge is required. Other than that, outside this research Google and Azure offer AutoML capabilities in their cloud services, so those and Ludwig would be the candidates. Of course, machine learning in cloud services tends to cost money, so if there is no budget Ludwig would be the answer.
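As an illustration, a minimal Ludwig configuration for such a tabular prediction task could look like the following sketch. The feature and file names are hypothetical, and the exact type names (for example `number` for numeric features in recent Ludwig versions) should be checked against the Ludwig documentation.

```yaml
# config.yaml - hypothetical apple price example
input_features:
  - name: harvest_month
    type: category
  - name: supply_tons
    type: number
output_features:
  - name: price_change
    type: number
```

With a dataset file such as apples.csv, training is then started from the command line with `ludwig train --dataset apples.csv --config config.yaml`, without writing any Python code.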

One thing we also assessed was what kind of data preprocessing is needed before the frameworks can be run. Mainly this means what kind of input is required and whether the tool handles missing values or detects outliers. This turned out to be somewhat redundant, because most of the frameworks required the same basic input for the training and test data.
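As a sketch of the kind of basic input preparation most frameworks expected, the snippet below fills missing numeric values with the column mean before handing the data to a framework. The column names and values are hypothetical; frameworks such as Autosklearn handle this step internally, while Autokeras, as noted above, seemed to expect it to be done beforehand.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical tabular data with missing values.
df = pd.DataFrame({
    "weight": [150.0, np.nan, 170.0, 160.0],
    "diameter": [7.1, 6.8, np.nan, 7.0],
})

# Mean imputation: the minimal cleaning step some frameworks expect up front.
imputer = SimpleImputer(strategy="mean")
clean = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
print(clean.isna().sum().sum())  # 0: no missing values remain
```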

Some basic assessment of the viability of the frameworks is given here, with notes. H2OAutoML is the best choice when the user only knows the R programming language, since it is the only framework that supports R; it runs on a simple PC and requires only basic knowledge of data science and programming. Ludwig is on par with the others, with slightly worse performance, but it is easy to use and reports model characteristics.

Because of the performance shown, and mostly because of the lack of ongoing updates and development, AutoWeka was deemed not viable in a modern use case. MLPlan also performed worse than the other frameworks, but because it still produced better results than AutoWeka and can be configured to use scikit-learn instead, as we had it configured in the tests, it can still be considered an emergency option if there is an absolute need to use AutoML from Java.

In conclusion, the Open AutoML Benchmark is a good tool and it is improving all the time. Hopefully the development of the benchmark continues and more datasets and frameworks are added to it. We also hope that this research helps the ongoing research in the field of AutoML, although further research is still needed.

8. CONCLUSIONS

In this research we set out to assess the current state of AutoML and to find out whether it could be used instead of traditional machine learning. We also wanted to find out how efficient and accurate it is when used without manual optimization, because that should be the whole point: machine learning tools should be available to people who have not studied machine learning or data science and are not experts in parameter optimization. To achieve this, we compared the results of multiple AutoML frameworks to one of the most popular, yet simple, traditional machine learning methods, the random forest predictor. We also tested whether each framework was at all applicable by comparing it against a constant predictor. In almost all cases the AutoML frameworks could beat the constant predictor. The random forest's results could also be at least matched in almost every case, but not always beaten.
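The baseline check described above can be reproduced with scikit-learn: a DummyClassifier with the `most_frequent` strategy plays the role of the constant predictor, and a RandomForestClassifier that of the traditional reference model. The dataset here is a stand-in; the benchmark itself used multiple OpenML datasets.

```python
from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Constant predictor: always outputs the most frequent class of the training fold.
constant = DummyClassifier(strategy="most_frequent")
forest = RandomForestClassifier(random_state=0)

const_acc = cross_val_score(constant, X, y, cv=5).mean()
forest_acc = cross_val_score(forest, X, y, cv=5).mean()

# An AutoML framework is considered applicable only if it clears the constant
# baseline; matching the forest is the harder target.
print(f"constant: {const_acc:.3f}  forest: {forest_acc:.3f}")
```

On a balanced dataset like this the constant baseline sits near chance level, which is why beating it is a necessary but weak requirement.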

Another big research topic was to compare the different AutoML frameworks with each other, to find out whether there are clear differences between the most used frameworks. We measured them mostly on their accuracy, testing them on multiple datasets with varying types of data. We also evaluated their ease of use and created use cases that show what kinds of users and tasks specific frameworks are suited to. We observed that Autosklearn and Autokeras are a little ahead of the other frameworks, because both performed well in all kinds of tasks even though they were not the best in each category. They were also easy to use and did not need much effort to get working. Some of the frameworks had clearly specialized in specific tasks and could outperform Autosklearn and Autokeras in those, but overall these two delivered consistently good results.

Our research suggests that there are multiple use cases where AutoML can be a perfectly fine alternative, but for more demanding tasks it is not quite accurate enough to replace machine learning models created by an expert. For someone just looking to experiment and do some simple prediction on a linear dataset, AutoML is quite good and most likely simple to use, depending of course on which framework is used. In general, you can get good performance from AutoML with little effort when the task is simple enough and you do not ask too much of it.

The AutoML frameworks themselves seem to be developing rapidly, and during our time researching the topic most of the tested frameworks had multiple new releases in their GitHub repositories. We believe this indicates that the AutoML community is very active, which could lead to progress in the field. There are genuine opportunities for AutoML to become widely used if the frameworks become a little more accurate on more difficult tasks. The most progress is needed specifically in hyperparameter optimization, which is the most important part of complex data science tasks.

AutoML is not yet ready to replace data scientists and data engineers but with enough time it could very well be possible.

The downside to AutoML becoming more widely used is that the "black box" of machine learning in general will get even more abstract, and the techniques and science used to create, for example, neural networks would become rarer knowledge. This would in turn elevate the special knowledge of machine learning experts and data scientists, because the techniques still need to be developed further even though this knowledge would no longer be needed for their usage.

One topic that would require further research is cloud services and their AutoML offerings. At least Google, Amazon Web Services and Microsoft Azure provide tools for AutoML. From our understanding their usage should be very simple and available for all sorts of data stored or processed using each provider's services. Unfortunately, we had to cut this part out, mainly because cloud services cost money and require a different kind of skill set which we would have had to learn in order to test them properly. The consensus in the community seems to be that these cloud services have the most to gain from the development of AutoML, because it would mean they could provide more accurate machine learning to their customers without a hard learning curve. Either way, more research is needed on this topic to confirm these predictions.
