Data is the new oil, and data scientist is the job of the future. We have all heard this, yet companies still struggle to hire for the position. So how can we start working with machine learning when we don’t have enough experience?
The Machine Learning Challenge
Some machine learning use cases sound quite simple, such as image classification, price prediction, or anomaly detection. Even for these, we usually need an expert data scientist with knowledge of neural networks who can help us improve and tune the results. Among other tasks, they have to:
- Preprocess and clean the data
- Select and construct appropriate features
- Select an appropriate model family
- Optimize model hyperparameters
- Postprocess machine learning models
- Critically analyze the results obtained
Depending on the business problem, it can take hundreds of experiments to reach a solution. And that is for an experienced data scientist. Imagine the effort for a non-expert!
What is AutoML?
The idea of AutoML is to use machine learning to improve our neural networks. It takes charge of automatically training and tuning models within a time frame defined by the user. All of this complexity is hidden behind a simple API or framework that lets us create a machine learning model with just a few lines of code.
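To make the idea of "tuning within a time frame" concrete, here is a minimal sketch of a time-budgeted random search. It is not how Auto-Keras searches internally; `toy_score`, the candidate values, and the `search` helper are all hypothetical stand-ins, and a real system would train a model at each step instead of evaluating a formula.

```python
import random
import time


def toy_score(learning_rate, hidden_units):
    """Hypothetical stand-in for 'train a model and return its accuracy'."""
    # Peaks at learning_rate=0.01 and hidden_units=128; purely illustrative.
    lr_penalty = abs(learning_rate - 0.01) * 10
    size_penalty = abs(hidden_units - 128) / 1000
    return max(0.0, 1.0 - lr_penalty - size_penalty)


def search(time_limit_seconds, seed=0):
    """Try random configurations until the time budget runs out."""
    rng = random.Random(seed)
    deadline = time.monotonic() + time_limit_seconds
    best_config, best_score = None, -1.0
    trials = 0
    # Always run at least one trial, then keep going until the deadline.
    while trials == 0 or time.monotonic() < deadline:
        config = {
            "learning_rate": rng.choice([0.1, 0.03, 0.01, 0.003, 0.001]),
            "hidden_units": rng.choice([32, 64, 128, 256, 512]),
        }
        score = toy_score(**config)
        trials += 1
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score, trials


best_config, best_score, trials = search(time_limit_seconds=0.05)
print(f"tried {trials} configurations, best: {best_config} ({best_score:.3f})")
```

The user only chooses the budget. The loop decides how many configurations fit into it and keeps the best one seen so far.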
AutoML brings two main benefits to the enterprise:
- Faster time to market -- your data scientist can focus on the problem and not on the parameters and repetitive tasks
- Machine learning for everybody -- now you don’t need to be an expert to start working
This sounds great! There are some options from Google, but these services cost around $20 per hour! Why not check out the open source community?
Run Auto-Keras on Oracle Cloud
Auto-Keras is an open source Python tool built on top of Keras, developed by the DATA Lab at Texas A&M University. Auto-Keras automatically searches for the right architecture and hyperparameters for your deep learning models. It is easy to install and easy to run, and it comes with plenty of examples and a growing community.
Let’s install it and use it with Oracle Cloud!
Why Oracle Cloud? There are many reasons: whether you want to test, develop, or run a production-ready environment, there is a solution for you. Oracle also provides GPU machines for your production environment.
Oracle offers the Free Tier, where you have Autonomous Databases and Compute resources for an unlimited time! Yes, you read correctly, unlimited time! Get your free account here, and let’s start working.
In this example, we are going to use the MNIST database of handwritten digits. Each image comes with a label for the digit it shows, and the goal is to recognize the digits. We can compare Keras with Auto-Keras: with plain Keras, we need 71 lines of code (you can check it here).
Let’s run it with Auto-Keras. We are going to use the free tier for running this example with a simple compute machine.
Be aware that Auto-Keras is still in pre-release; hopefully version 1.0 will arrive very soon.
First, we need to install Python 3.6, as it is a requirement.
$ yum install python36
Installing Auto-Keras is very easy with pip; just run the following:
$ pip3 install autokeras
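Before moving on, it can help to confirm that both installation steps took effect. The following is a minimal sketch using only the standard library; the `environment_ready` helper is hypothetical, and it checks for exactly Python 3.6 only because that is the requirement stated above, so adjust it if your Auto-Keras release supports other versions.

```python
import importlib.util
import sys


def environment_ready(package="autokeras", required=(3, 6)):
    """Return (python_ok, package_ok) for the current interpreter."""
    # Compare only major.minor, e.g. (3, 6), against the required version.
    python_ok = sys.version_info[:2] == required
    # find_spec returns None when a top-level package cannot be found.
    package_ok = importlib.util.find_spec(package) is not None
    return python_ok, package_ok


python_ok, package_ok = environment_ready()
print("Running Python 3.6:", python_ok)
print("autokeras importable:", package_ok)
```

Run it with the same interpreter that pip3 installed into (for example `python3`), otherwise the package check may look in a different environment.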
Now that everything is ready, this simple code runs the example:
from tensorflow.keras.datasets import mnist
from autokeras.image.image_supervised import ImageClassifier

# Load MNIST and add a trailing channel axis: (n, 28, 28) -> (n, 28, 28, 1)
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(x_train.shape + (1,))
x_test = x_test.reshape(x_test.shape + (1,))

# Search for a good architecture for up to one hour (time_limit is in seconds)
clf = ImageClassifier(verbose=True, augment=False)
clf.fit(x_train, y_train, time_limit=1 * 60 * 60)

# Retrain the best model found, then print the test accuracy as a percentage
clf.final_fit(x_train, y_train, x_test, y_test, retrain=True)
y = clf.evaluate(x_test, y_test)
print(y * 100)
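The two reshape calls and the time limit in the code above are easy to misread, so here is a small sketch of what they do. It assumes NumPy is installed and uses a zero-filled array as a stand-in for a few MNIST images, so it runs without downloading the dataset.

```python
import numpy as np

# Stand-in for a small batch of MNIST images: 5 grayscale images of 28x28 pixels.
images = np.zeros((5, 28, 28), dtype=np.uint8)

# Append a channel axis of size 1, the same way the reshape calls do.
images_with_channel = images.reshape(images.shape + (1,))

print(images.shape)               # (5, 28, 28)
print(images_with_channel.shape)  # (5, 28, 28, 1)

# The search budget is written as hours * minutes * seconds.
time_limit_seconds = 1 * 60 * 60
print(time_limit_seconds)         # 3600
```

The pixel data is unchanged. Only the shape gains an explicit single channel, which is the layout image models typically expect for grayscale input.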
While it runs, we can watch its progress and the results as it compares and tunes the models:
Saving model.
+--------------------------------------------------------------------------+
| Model ID | Loss | Metric Value |
+--------------------------------------------------------------------------+
| 0 | 0.14653808772563934 | 0.9875999999999999 |
+--------------------------------------------------------------------------+
+----------------------------------------------+
| Training model 1 |
+----------------------------------------------+
Epoch-1, Current Metric - 0: 13%|███▋ | 60/465 [00:55<06:12, 1.09 batch/s]
Epoch-2, Current Metric - 0.98: 75%|██████████████████▊ | 350/465 [05:49<01:57, 1.03s/ batch]
Epoch-4, Current Metric - 0.992: 52%|████████████▍ | 240/465 [04:08<03:58, 1.06s/ batch]
Finally, we got an accuracy of 98.65%. And that’s all! Let us know in the comments if you want to know more about this!
To learn more about AI and machine learning, visit the Oracle AI page.