I recently started my deep learning journey with the fastai course to improve my skills. If you are a beginner looking to delve into deep learning, you should check out this course.
Here is the link to the notebook below.
Let's begin.
!pwd
/home/studio-lab-user/pranith/fastai_part1/Lesson1/Lesson
Check which directory you are in.
!pip install -Uqq fastai  # install or upgrade to the latest version of fastai
Step 1: Download images of birds and non-birds
from fastcore.all import *
import time
from duckduckgo_search import ddg_images
def search_images(term, max_images=30):
    print(f"Searching for '{term}'")
    return L(ddg_images(term, max_results=max_images)).itemgot('image')
# If you get a JSON error, just try running it again (it may take a couple of tries).
urls = search_images('bird photos',max_images=1)
urls[0]
Searching for 'bird photos'
'2.bp.blogspot.com/-LZ4VixDdVoE/Tq0ZhPycLsI/..
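The comment above suggests simply re-running the search when it fails with a JSON error. One way to automate that is a small retry helper. The sketch below is only an illustration: `retry` and `flaky_search` are hypothetical names, and the flaky function stands in for the real search call so the example runs without network access.

```python
import time

def retry(fn, attempts=3, delay=1.0, exceptions=(Exception,)):
    """Call fn() until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except exceptions:
            if attempt == attempts:
                raise  # out of tries: surface the last error
            time.sleep(delay)  # brief pause before the next try

# Hypothetical stand-in for a flaky search call: fails twice, then succeeds.
calls = {"n": 0}

def flaky_search():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("simulated JSON error")
    return ["https://example.com/bird.jpg"]

urls_demo = retry(flaky_search, attempts=5, delay=0)
print(urls_demo, "after", calls["n"], "calls")
```

In the notebook, the same helper could wrap the real call, for example `retry(lambda: search_images('bird photos', max_images=1))`.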
You can download the image at that URL and take a look at it:
from fastdownload import download_url
dest = 'bird.jpg'
download_url(urls[0], dest, show_progress=False)
from fastai.vision.all import *
im = Image.open(dest)
im.to_thumb(256,256)
Now let's do the same with "forest photos":
download_url(search_images('forest photos', max_images=1)[0], 'forest.jpg', show_progress=False)
Image.open('forest.jpg').to_thumb(256,256)
Searching for 'forest photos'
We seem to be getting reasonable results, so let's grab some examples of "bird" and "forest" photos, and save them in separate folders (I'm also trying to capture various lighting conditions):
searches = 'forest','bird'
path = Path('bird_or_not')
from time import sleep
for o in searches:
    dest = (path/o)
    dest.mkdir(exist_ok=True, parents=True)
    download_images(dest, urls=search_images(f'{o} photo'))
    sleep(10)  # Pause between searches to avoid over-loading server
    download_images(dest, urls=search_images(f'{o} sun photo'))
    sleep(10)
    download_images(dest, urls=search_images(f'{o} shade photo'))
    sleep(10)
    resize_images(path/o, max_size=400, dest=path/o)
Searching for 'forest photo'
Searching for 'forest sun photo'
Searching for 'forest shade photo'
Searching for 'bird photo'
Searching for 'bird sun photo'
Searching for 'bird shade photo'
Step 2: Train our model
Some photos might not download correctly which could cause our model training to fail, so we'll remove them:
failed = verify_images(get_image_files(path))
failed.map(Path.unlink)
len(failed)
1
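`verify_images` flags files that cannot be opened as images, which is why one file was removed here. To give a feel for what a "bad download" can look like, here is a much cruder check that only inspects the first bytes of each file. This is not how fastai does it; the helper names and the fake files are made up for illustration.

```python
from pathlib import Path
import tempfile

# Leading bytes ("magic numbers") of two common image formats.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}

def looks_like_image(file_path):
    """Return True if the file starts with a known JPEG or PNG signature."""
    try:
        with open(file_path, "rb") as f:
            head = f.read(8)
    except OSError:
        return False
    return any(head.startswith(sig) for sig in SIGNATURES.values())

def find_suspect_files(folder):
    """List files under `folder` that do not look like JPEG or PNG images."""
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and not looks_like_image(p)
    )

# Build a tiny fake dataset: one file with a PNG header, one that is not an image.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "bird" / "ok.png"
    bad = Path(tmp) / "bird" / "broken.jpg"
    good.parent.mkdir(parents=True)
    good.write_bytes(SIGNATURES["png"] + b"rest of file")
    bad.write_bytes(b"<html>404 not found</html>")
    suspects = [p.name for p in find_suspect_files(tmp)]

print(suspects)
```

A signature check can miss truncated or corrupted images, so for real training data the `verify_images` call used above is the safer choice.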
To train a model, we'll need DataLoaders, which is an object that contains a training set (the images used to create a model) and a validation set (the images used to check the accuracy of a model -- not used during training). In fastai we can create that easily using a DataBlock, and view sample images from it:
dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=[Resize(192, method='squish')]
).dataloaders(path, bs=32)
dls.show_batch(max_n=6)
Here is what each of the DataBlock parameters means:
blocks=(ImageBlock, CategoryBlock),
The inputs to our model are images, and the outputs are categories (in this case, "bird" or "forest").
get_items=get_image_files,
To find all the inputs to our model, run the get_image_files function (which returns a list of all image files in a path).
splitter=RandomSplitter(valid_pct=0.2, seed=42),
Split the data into training and validation sets randomly, using 20% of the data for the validation set.
get_y=parent_label,
The label (y value) of each file is the name of its parent folder (i.e. the name of the folder it's in, which will be bird or forest).
item_tfms=[Resize(192, method='squish')]
Before training, resize each image to 192x192 pixels by "squishing" it (as opposed to cropping it).
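To make the `splitter` and `get_y` ideas concrete, here is a plain-Python sketch of a seeded random split and of labelling a file by its parent folder. It is an illustration only, not fastai's implementation: `random_split` and `parent_folder_label` are hypothetical helpers, the file names are invented, and the exact indices chosen will differ from what `RandomSplitter` picks.

```python
import random
from pathlib import Path

def parent_folder_label(file_path):
    """Label a file with the name of the folder that contains it."""
    return Path(file_path).parent.name

def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle indices with a fixed seed and return (train_idxs, valid_idxs)."""
    idxs = list(range(len(items)))
    random.Random(seed).shuffle(idxs)  # same seed gives the same split
    cut = int(valid_pct * len(items))
    return idxs[cut:], idxs[:cut]

# Invented file list: five "bird" images and five "forest" images.
files = [f"bird_or_not/bird/{i}.jpg" for i in range(5)]
files += [f"bird_or_not/forest/{i}.jpg" for i in range(5)]

train_idxs, valid_idxs = random_split(files)
labels = [parent_folder_label(f) for f in files]

print(len(train_idxs), len(valid_idxs))  # 8 training items, 2 validation items
print(labels[0], labels[-1])             # bird forest
```

Fixing the seed matters because it keeps the validation set identical between runs, so changes in the error rate reflect changes to the model rather than a different split.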
Now we're ready to train our model. The fastest widely used computer vision model is resnet18. You can train this in a few minutes, even on a CPU! (On a GPU, it generally takes under 10 seconds...)
fastai comes with a helpful fine_tune() method which automatically uses best practices for fine tuning a pre-trained model, so we'll use that.
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(3)
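The `error_rate` metric passed to `vision_learner` reports the fraction of validation images the model gets wrong. A plain-Python sketch of that calculation, using made-up predictions (`error_rate_sketch` is a hypothetical name, not the fastai function):

```python
def error_rate_sketch(predicted, actual):
    """Fraction of predictions that do not match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one example")
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)

# Made-up example: the third image is misclassified.
predicted = ["bird", "forest", "bird", "forest", "bird"]
actual = ["bird", "forest", "forest", "forest", "bird"]

rate = error_rate_sketch(predicted, actual)
print(f"{rate:.2f}")  # 1 wrong out of 5 -> 0.20
```

An error rate of 0.20 is the same as 80% accuracy, so lower is better in the table that `fine_tune` prints.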
Step 3: Use our model (and build your own!)
Let's see what our model thinks about that bird we downloaded at the start:
is_bird,_,probs = learn.predict(PILImage.create('bird.jpg'))
print(f"This is a: {is_bird}.")
print(f"Probability it's a bird: {probs[0]:.4f}")
This is a: bird.
Probability it's a bird: 1.0000
Good job, resnet18. :)
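Why `probs[0]` for the bird probability? As far as I understand, fastai sorts the class names, so "bird" comes before "forest" and gets index 0 (you can check this in the notebook with `dls.vocab`). The sketch below mimics that lookup in plain Python; the probability values are made up and the variable names are my own.

```python
# Labels as they might be read from the folder names, in arbitrary order.
labels_seen = ["forest", "bird", "forest", "bird"]

# Sorting the unique labels puts "bird" at index 0 and "forest" at index 1.
vocab = sorted(set(labels_seen))

# Made-up model output: one probability per class, in vocab order.
probs_demo = [0.97, 0.03]

by_class = dict(zip(vocab, probs_demo))
best = max(by_class, key=by_class.get)

print(vocab)                   # ['bird', 'forest']
print(best, by_class["bird"])  # bird 0.97
```

If you train on your own categories, look the index up from the vocabulary instead of hard-coding 0, since the position depends on how the class names sort.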
So, as you see, in the space of a few years, creating computer vision classification models has gone from "so hard it's a joke" to "trivially easy and free"!
It's not just in computer vision. Thanks to deep learning, computers can now do many things which seemed impossible just a few years ago, including creating amazing artworks, and explaining jokes. It's moving so fast that even experts in the field have trouble predicting how it's going to impact society in the coming years.
One thing is clear -- it's important that we all do our best to understand this technology, because otherwise we'll get left behind!