Monday, 2 April 2018

Deep Learning capable machine using Amazon Web Services

These are instructions for a complete newbie to set up a cloud computing machine that you can use to explore deep learning.  The instructions are for someone who is using Windows.

Amazon Web Services (AWS) provide cloud computing i.e. a server you can run deep learning algorithms on without the hassle of actually having a machine or having to maintain it.

AWS seems to be a standard way people perform deep learning so I thought I'd give it a go.

Deep Learning AMI

AWS give you your choice of server to run, but you need to choose what OS to run on it and what applications it should have installed.  Amazon have a number of combinations of these already set up, installed and ready to roll.  They are loaded as an 'image' and are known as 'Amazon Machine Images' or AMIs.  There are specific Deep Learning AMIs (DLAMIs).

Looking online, the DLAMIs seem to change names quite rapidly.  I couldn't even find the one the Amazon documentation recommended.

Generally people seem to use Ubuntu, and according to the Amazon documentation you should avoid the 'base' version, so I think the one they want everyone to use is the 'Deep Learning AMI (Ubuntu)', which was at Version 6.0 when I set this up.

The Deep Learning documentation includes details on setting everything up and getting started.

Launch Instance

AWS is obviously set up to run loads of parallel jobs based on very similar setups.  Each one of these is called an 'instance'.  I just want to run one instance running the DLAMI.  The stages are:
  1. Create an AWS account
  2. Set up instance
  3. Set up a pair of keys (private/public)
  4. Launch instance
I used these detailed instructions to help me follow these steps.  However, note that the AMI described no longer exists.  The instructions for setting up the keys are good (great actually), but the instructions for setting up Jupyter didn't work for me (more details in the next section).

1. Create AWS account
You need to create an AWS account (giving your credit card details).  

2. Set up instance
I was hoping to be able to run this all under the AWS 'free tier', but it seems that I need 75 GB of storage to run the DLAMI and you only get 30 GB under the 'free tier', so I am expecting to pay $4.95 for the extra storage if I run the instance for a whole month (current pricing).
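
The $4.95 figure is simple arithmetic on the storage that isn't covered by the free tier.  Here is a minimal sketch of it.  Note that the rate of $0.11 per GB per month is an assumption, back-calculated from the figure above; the real rate depends on your region and volume type, so check the current pricing.

```python
# Rough monthly cost of the storage beyond the free tier allowance.
# ASSUMPTION: 0.11 USD per GB-month, back-calculated from the 4.95 USD figure;
# the real rate depends on region and volume type.
DLAMI_STORAGE_GB = 75
FREE_TIER_STORAGE_GB = 30
ASSUMED_USD_PER_GB_MONTH = 0.11


def extra_storage_cost(required_gb, free_gb, usd_per_gb_month):
    """Return the monthly cost in USD of storage above the free allowance."""
    extra_gb = max(0, required_gb - free_gb)
    return round(extra_gb * usd_per_gb_month, 2)


monthly_cost = extra_storage_cost(
    DLAMI_STORAGE_GB, FREE_TIER_STORAGE_GB, ASSUMED_USD_PER_GB_MONTH
)
extra_gb = DLAMI_STORAGE_GB - FREE_TIER_STORAGE_GB
print(f"Extra storage: {extra_gb} GB, about ${monthly_cost:.2f} per month")
```

This only covers storage; it says nothing about what you pay if you later move to a bigger instance type.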

Follow the AWS instructions or a user's instructions, but this is a summary:
  • https://console.aws.amazon.com/ec2
  • Launch instance
  • Select 'Deep Learning AMI (Ubuntu) Version 6.0' (actual AMI to use will inevitably change with time, see above)
  • Select 'Free tier eligible' instance type to start off with (cheapest while you're getting to grips with everything)
  • Review and Launch
  • Edit Security Groups
  • Add rule / Custom TCP Rule / TCP / 8888 / Anywhere [this allows you to access the DLAMI instance remotely from your machine; 'Anywhere' works, but you should really change it to 'My IP', although I haven't tested that]
  • Review and Launch
  • Launch (prompts you for the key pair)
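
To make the difference between 'Anywhere' and 'My IP' concrete, here is a small sketch that builds the port 8888 rule as plain data.  It sends nothing to AWS.  The dictionary layout follows the shape that boto3's `authorize_security_group_ingress` accepts in its `IpPermissions` argument; the function name and the example address are hypothetical, chosen just for illustration.

```python
import ipaddress

JUPYTER_PORT = 8888


def jupyter_ingress_rule(my_ip=None, port=JUPYTER_PORT):
    """Build one inbound TCP rule; my_ip=None corresponds to 'Anywhere'."""
    if my_ip is None:
        cidr = "0.0.0.0/0"  # every IPv4 address, i.e. 'Anywhere'
    else:
        # Validate the address, then restrict the rule to that single host.
        cidr = f"{ipaddress.IPv4Address(my_ip)}/32"
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": cidr}],
    }


anywhere_rule = jupyter_ingress_rule()
my_ip_rule = jupyter_ingress_rule("203.0.113.7")  # documentation-only example address
print(anywhere_rule)
print(my_ip_rule)
```

The only difference between the two rules is the CIDR range: `0.0.0.0/0` lets any machine reach port 8888, while a `/32` range limits it to one address.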
3. Set up a pair of keys (private/public) and 4. Launch instance
You need to create a pair of keys to securely access the instance remotely.  If you're repeating this stage and you've set up a pair already you can use them.  Otherwise,
  • Create a new key pair
  • Give the pair a name; the private key file gets a .pem suffix when it downloads
  • 'Launch Instances' actually launches the instance
Note: you are paying for the service from the moment you launch until you terminate.

Connecting to Instance

You now have a DLAMI instance running on AWS, but you face the challenge of connecting to it so you can use it.

Everyone seems to use Jupyter Notebook as a sort of IDE for running deep learning applications, so that's what I'm going to do.  Once set up, you access it through a browser.

The setup stages are:
  1. Download PuTTY
  2. Create PuTTY readable version of the private key
  3. Connect to DLAMI Instance using PuTTY and the private key
  4. Start Jupyter notebook on AWS instance
  5. Create a 'tunnel' in PuTTY to allow us to connect to the Jupyter server from the Windows machine
  6. Access Jupyter Notebook in a browser
For steps 1 to 3, follow the AWS instructions for connecting with PuTTY.  However, only follow the 'Converting Your Private Key Using PuTTYgen' and 'Starting a PuTTY Session' sections of that page.  (I didn't need to follow the 'transfer files' instructions, but they may prove useful later.)

4. Start Jupyter notebook on AWS instance
Now that the AWS instance can be accessed via PuTTY, Jupyter Notebook can be started by typing this command into the PuTTY terminal:
   jupyter notebook --no-browser --port=8888
It takes a minute or so and eventually returns the prompt to you, but unnervingly displays 'Killed'.

5. Creating a 'tunnel' on PuTTY to allow us to connect to the Jupyter server from the Windows machine
We want to access Jupyter Notebook via the Windows browser.  When we set up the instance we allowed TCP access on port 8888.  Now we need to use PuTTY (running on Windows) to allow the Windows browser to access that port.

Follow the AWS instructions to set up the SSH tunnel.  This is a summary:
  • Right click on the PuTTY title bar | Change Settings ...
  • Category | Connection | SSH | Tunnels
    • Source port: 8888
    • Destination: localhost:8888
    • Add
    • Apply
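
PuTTY also accepts the key and the tunnel as command-line options (`-i` and `-L`), so the connection and tunnel settings above can be sketched as a single command.  This is only a sketch under assumptions: the host name and key path below are hypothetical placeholders, and `ubuntu` is assumed to be the login name for the Ubuntu DLAMI.

```python
def putty_tunnel_command(host, key_path, user="ubuntu", port=8888):
    """Return the argument list for a PuTTY session with a local port forward.

    -i  points at the private key converted with PuTTYgen (.ppk)
    -L  forwards <port> on the Windows machine to localhost:<port> on the instance
    """
    return [
        "putty.exe",
        "-ssh",
        f"{user}@{host}",
        "-i",
        key_path,
        "-L",
        f"{port}:localhost:{port}",
    ]


# Hypothetical values; use your instance's public DNS name and your own .ppk file.
example_command = putty_tunnel_command(
    "ec2-203-0-113-7.compute-1.amazonaws.com", r"C:\keys\dlami.ppk"
)
print(" ".join(example_command))
```

The 'localhost' in the forward is resolved on the instance, which is why the destination in the Tunnels panel is localhost:8888 and not the instance's public address.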
6. Access Jupyter Notebook in Browser
  • https://localhost:8888
  • Waive security checks
  • Note that you can't access it using http; you must use https
  • Enter password 'machinelearningisfun'
    • I think this should work, but I set up a different password before I got Jupyter going, so I can't confirm it.
  • You're ready to roll!
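
If the browser can't connect, it helps to know whether anything is listening on the local end of the tunnel at all.  Here is a minimal check using only the Python standard library, run on the Windows machine; the function name is made up for this sketch.  It only tests that the port accepts a TCP connection, not that Jupyter is healthy behind it.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


print("Tunnel listening on localhost:8888:", port_is_open("localhost", 8888))
```

If this prints False, the problem is with the PuTTY tunnel (or PuTTY isn't running) and not with the browser.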

Don't forget to terminate your instance, otherwise you will pay Amazon quite a bit.
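
Terminating can be done from the EC2 console, but it can also be scripted.  The sketch below shows the shape of the call with boto3's `terminate_instances`, using a fake client so it runs without an AWS account.  The helper name, the fake client and the instance ID are all hypothetical; with real credentials you would pass `boto3.client("ec2")` in place of the fake.

```python
def terminate_instance(ec2_client, instance_id):
    """Ask EC2 to terminate one instance and return its new state name."""
    response = ec2_client.terminate_instances(InstanceIds=[instance_id])
    return response["TerminatingInstances"][0]["CurrentState"]["Name"]


class FakeEC2Client:
    """Stand-in for boto3.client('ec2') so the sketch runs offline."""

    def __init__(self):
        self.terminated = []

    def terminate_instances(self, InstanceIds):
        # Mimics the layout of the real response for a running instance.
        self.terminated.extend(InstanceIds)
        return {
            "TerminatingInstances": [
                {
                    "InstanceId": instance_id,
                    "CurrentState": {"Code": 32, "Name": "shutting-down"},
                    "PreviousState": {"Code": 16, "Name": "running"},
                }
                for instance_id in InstanceIds
            ]
        }


fake_client = FakeEC2Client()
new_state = terminate_instance(fake_client, "i-0123456789abcdef0")  # hypothetical ID
print("Instance state:", new_state)
```

Remember that terminating (unlike stopping) deletes the instance, so copy off any notebooks you want to keep first.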

