Car classifier
While reading fastbook chapter 2, I built a bear classifier. Since it was easy and fun to make, I am building another one with cars.
Instead of Bing, I am using the DuckDuckGo search engine.
from fastbook import *
# Grab some sample image URLs from DuckDuckGo to check that search works
urls = search_images_ddg('Toyota car', max_images=100)
len(urls), urls[0]
# Download the first result and preview a thumbnail of it
Path('images').mkdir(exist_ok=True)
download_url(urls[0], 'images/car.jpg')
im = Image.open('images/car.jpg')
im.thumbnail((256,256))
im
Here are the types of cars I am trying to classify. I put in the M4 Sherman tank and the go kart for fun; they are very distinct from the others and easy to classify. It will be harder to distinguish among the Toyota Camry, Kia Forte, and Tesla Model X. The biggest challenge will be telling the Toyota Camry and Kia Forte apart, as they look more similar to each other than to the rest.
car_types = 'toyota camry', 'kia forte', 'go kart', 'tesla model x', 'm4 sherman'
path = Path('cars')
After training a model with 100 images per category, it confused the Toyota Camry and Kia Forte a lot. Even after cleaning the data and using deeper architectures such as resnet34 and resnet50, it did not perform much better. Therefore, I am gathering 200 images for just the Toyota Camry and Kia Forte to see what happens.
def download_more_images(title, data_types, path, num_imgs=100, more_imgs=None, more_num_imgs=200):
    if not path.exists():
        path.mkdir()
    for o in data_types:
        dest = (path/o)
        dest.mkdir(exist_ok=True)
        # Categories listed in more_imgs get more_num_imgs images; the rest get num_imgs
        if more_imgs is not None and o in more_imgs:
            max_imgs = more_num_imgs
        else:
            max_imgs = num_imgs
        results = search_images_ddg(f'{o} {title}', max_images=max_imgs)
        download_images(dest, urls=results)
download_more_images('car', car_types, path, more_imgs=['toyota camry', 'kia forte'], more_num_imgs=200)
fns = get_image_files(path)
fns
We clean up images that cannot be opened.
failed = verify_images(fns)
failed
failed.map(Path.unlink);
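Since we requested a different number of images per category and some downloads failed, it is worth a quick sanity check of how many images each class ended up with. This is just a small sketch that reuses the path and car_types defined above:

# Count the remaining images in each category folder
for o in car_types:
    print(o, len(get_image_files(path/o)))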
First, we create a template for our data, which is a DataBlock. Here we specify what kind of data we have and what we predict (images and categories), how to get the items (get_image_files), how big the validation set is (20%, with a fixed random seed), how to label the data (the name of the parent directory), and how to transform each item (resize to 128 pixels).
cars = DataBlock(
blocks=(ImageBlock, CategoryBlock),
get_items=get_image_files,
splitter=RandomSplitter(valid_pct=0.2, seed=42),
get_y=parent_label,
item_tfms=Resize(128))
Now that we have a DataBlock, we feed in the data to build DataLoaders.
dls = cars.dataloaders(path)
We can check the images with show_batch. Oops! There is a problem: the default Resize crops each image to the target size, so we lose some details. One easy fix is to squish the images into our desired size.
dls.valid.show_batch(max_n=10, nrows=2)
Squishing fits all the detail into each image, but it distorts the proportions, so some images no longer reflect how the cars actually look.
cars = cars.new(item_tfms=Resize(128, ResizeMethod.Squish))
dls = cars.dataloaders(path)
dls.valid.show_batch(max_n=4, nrows=1)
Padding the borders with zeros keeps the images looking as they actually are, but the empty space is a waste of computation.
cars = cars.new(item_tfms=Resize(128, ResizeMethod.Pad, pad_mode='zeros'))
dls = cars.dataloaders(path)
dls.valid.show_batch(max_n=4, nrows=1)
Alternatively, we can randomly crop the images, so in each epoch the model sees a different partial view of each vehicle.
cars = cars.new(item_tfms=RandomResizedCrop(128, min_scale=0.3))
dls = cars.dataloaders(path)
dls.train.show_batch(max_n=4, nrows=1, unique=True)
Building on that, we can apply data augmentation, which shows the model each image from different angles, perspectives, and lighting, closer to how cars actually appear in the real world. With this technique, we can use relatively little data and still get a great result.
cars = cars.new(item_tfms=Resize(128), batch_tfms=aug_transforms(mult=2))
dls = cars.dataloaders(path)
dls.train.show_batch(max_n=8, nrows=2, unique=True)
For the final model, we combine RandomResizedCrop at a larger size (224 pixels) with the standard augmentations, and fine-tune a pretrained resnet101 on the result.
cars = cars.new(
    item_tfms=RandomResizedCrop(224, min_scale=0.5),
    batch_tfms=aug_transforms())
dls = cars.dataloaders(path)
learn = cnn_learner(dls, resnet101, metrics=error_rate)
learn.fine_tune(5)
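To see whether the extra Camry and Forte images actually helped, one option is to inspect the trained model with fastai's interpretation tools, for example a confusion matrix and the highest-loss images, and optionally re-run the image cleaner from fastbook. A minimal sketch:

# Inspect where the model is still confused
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
interp.plot_top_losses(5, nrows=1)

# Optionally review and relabel or delete suspicious images, as in fastbook chapter 2
cleaner = ImageClassifierCleaner(learn)
cleaner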