In an earlier post, I discussed how regular expressions can be used. Now, I will show you how to use them in your own C++ program.
The regex Library
The regular expression library was added in C++11, so your compiler should support it by now. We can start with some basic boilerplate: including the regex header (along with some other headers that will make our lives easier) and the standard IO library.
In C++, regular expressions must be compiled before they are used. When I say I am going to be passing a regex string or a regex to a function, I am actually talking about this compiled expression.
It is actually very easy to compile regexes. Below, we compile the regex <html>.+</html> and assign the compiled expression to a variable called re.
Then, we will do the same thing, but naming the expression reg. I am doing this twice because I want to show the two methods you can use: assigning by value, or using the regex class’s constructor.
Determining if a Regular Expression Matches an Entire String
The regex_match function will determine whether an entire string is matched by a certain regex. For example, if we pass the regex hi to it and match it with the string hi, the function will return true, as the regular expression provided matches the entire target string of hi.
However, if we kept the regex the same but changed the target string to shi, the function would return false because while shi contains hi, the regex hi does not match the entirety of shi.
Let’s look at the example below.
C++
#include <iostream>
#include <string>
#include <vector>
#include <regex>
using namespace std;

int main() {
    string reStr;
    cout << "Enter a regular expression to use the regex_match function on:\n> ";
    cin >> reStr;

    string target;
    cout << "Enter a target string to use the regex_match function on:\n> ";
    cin >> target;

    regex reCompiled = regex(reStr); // Compiling our regex

    // Actual matching process
    if (regex_match(target, reCompiled)) {
        cout << "\nRegex Matched Entirely!\n";
        return 0;
    } else {
        cout << "\nRegex Did Not Match Entirely!\n";
    }
    return -1;
}
A Quick Note: Capturing Groups
Capturing groups in regexes are denoted by parentheses and are often returned as lists. To make things simpler, let’s use the regex (sub)(mar)(ine). Here we can see that sub, mar, and ine each have their own capturing group.
Now if we were to use this on the text submarinesubmarine, the regex would match on both submarines separately, so we would get two matches.
Let’s take a closer look at the matches.
These matches would each end up having three submatches due to these capturing groups. Visualized as a hierarchy, each match contains the submatches sub, mar, and ine.
In C++, matching with capturing groups is represented as a list of matches, where each match holds the full match at index 0 followed by one entry per capturing group. For example, if we wanted to get match one, capturing group one, of a list of matches (you will learn about the smatch type in the next section), we would use the code below:
C++
string m1c1 = matches[0][1].str(); // [0] selects the first match; [1] selects its first capturing group ([0][0] would be the full match)
A Quick Note: The smatch Type
The smatch type stores the results of a single regex match as a list of strings. It is sort of like a vector<string>: index 0 holds the entire match, and indices 1 and up hold the capturing groups (if any). A list of matches is then effectively a vector of smatch objects.
Determining if a Regular Expression Matches any Substrings
Remember how above, I said that the regex_match function only tells you whether the entire string is matched by a regex? Well, if we want to include substrings, it can get a little more complicated (this is coming from someone with a Python background, where we are pampered with the re library).
For this part of the guide, we will be using the regex_search function, which will tell you if the regex matches any substring of the target.
The regex_search function typically takes three to four arguments. Let’s look at the first method of calling it.
For this method, the function takes three parameters and outputs a Boolean. The parameters are below.
Target (std::string) – This is the string you want to match the regex against
Match Results (std::smatch) – This is the variable of type smatch that will store match results. We will not be using it in this example
Regex (std::basic_regex) – This is the compiled regex that the target is being matched against
The function will return true if any substring of the target string matches the regex, and false otherwise.
C++
#include <iostream>
#include <string>
#include <regex>
using namespace std;

int main() {
    string s = "this variable is called s";
    smatch m;
    regex e = regex("s");
    if (regex_search(s, m, e) /* Will return true */) {
        cout << "Matched! (but not always the entire string)" << endl;
    }
    return 0;
}
We can also call regex_search using another method, whose parameters are listed below. In this method, we are not only telling the user whether a match was found, but also storing the match results.
String Begin and String End – Tells the function to only search the substring in between the string beginning and string ending
Match Results – This is the smatch that will store match results
Regex – The compiled regex that will be used to match against the target string
The function will return true using the same conditions I stated in the previous method, but here what we care about is the fact that the match results are being stored.
The code below will print the first match of the regex, check if there are any matches other than the one it returned, and print the first capturing group. It will keep doing this until there are no other matches. I highly recommend you read the comments in the code below for a better understanding of what it’s doing.
C++
#include <iostream>
#include <string>
#include <regex>
using namespace std;

int main() {
    string target = "submarine submarine submarine";
    regex re = regex("(sub)(mar)(ine)");
    smatch m;

    string::const_iterator searchFrom = target.cbegin();

    // Begin iterating
    while (regex_search(searchFrom, target.cend(), m, re)) {
        // We don't want to keep returning the same match every time, so the
        // line below will exclude this match from future iterations
        searchFrom = m.suffix().first;

        // It is important to know that m[0] would return the entire match
        // ("submarine"), so m[1] will return the first capturing group ("sub")
        cout << "We have got ourselves a match! \"" << m[1].str() /* First capturing group of match */ << "\"\n";
    }
}
Regular Expression Find and Replace
The regex_replace function will find and replace all sequences that match the regex.
In the example below, we are telling it to replace every word with “and”
We are also giving it three parameters.
Target – The text that will be replaced accordingly
Regex – The compiled regular expression that will be used on the target
Replace With – The text that each match of the regex will be replaced with in the target
C++
#include <iostream>
#include <string>
#include <regex>
using namespace std;

int main() {
    regex re("([^ ]+)"); // Matches every word
    cout << "ORIGINAL: this is text\n";
    cout << regex_replace("this is text", re, "and"); // Prints "and and and"
    return 0;
}
You can also use formatters to incorporate exactly what was matched, using the table below.

Formatter: $number (where “number” is replaced by any positive number less than 100)
Example: $2
Explanation: Replaced at runtime with the numberth capturing group of the match (numbering starts from 1, such that $1 will get the first capturing group, not $0)
Example: Replacing regex matches of “(sub)(.+)” with “2nd CG: $2” using a target string of “submarine” will yield a result of “2nd CG: marine”

Formatter: $&
Example: $&
Explanation: Replaced with a copy of the entire match, regardless of capturing groups
Example: Replacing regex matches of “(sub)(.+)” with “String: $&” using the same target string above will result in “String: submarine”

Formatter: $`
Example: $`
Explanation: Replaced at runtime with whatever came before the match
Example: When we have a regex of “sub” with target string “a submarine goes underwater”, “$`” will get replaced with “a “

Formatter: $'
Example: $'
Explanation: Replaced at runtime with whatever came after the match
Example: When we have a regex of “sub” with target string “a submarine goes underwater”, “$'” will get replaced with “marine goes underwater”

Formatter: $$
Example: $$
Explanation: Not exactly a formatter; it’s more of an escape sequence, used when you don’t want the literal character “$” to be mistaken for a formatter. When you want to literally insert “$” in the replacement text, type “$$”
For example, the code below will put the letters “t” and “e” in parentheses.
C++
// regex_replace example
#include <iostream>
#include <string>
#include <regex>
using namespace std;

int main() {
    regex re("([te])"); // Matches either "t" or "e"
    cout << "ORIGINAL: thetechmaker.com\n";
    cout << regex_replace("thetechmaker.com", re, "($&)"); // Prints "(t)h(e)(t)(e)chmak(e)r.com"
    return 0;
}
In this guide, I will show you how to use the BeautifulSoup library to make a simple program that notifies you when a product on an online site drops in price.
The program runs in the background, scraping static e-commerce sites of your choice and notifying you when a product drops in price.
Prerequisites
This guide assumes that you have Python installed, pip added to your system’s PATH, along with a basic understanding of Python and HTML.
Installing Required Components
First, let’s install BeautifulSoup and Requests. The Requests library retrieves our data, and the BeautifulSoup library actually analyzes it.
We can install those two required components by running the command below:
BAT (Batchfile)
pip install beautifulsoup4 requests
Note that depending on what your system’s setup is, you might need to use pip3 instead of pip.
Grabbing Our Sample: Price
In this step, we will be telling BeautifulSoup what exactly to scrape. In this case, it’s the price. But we need to tell BeautifulSoup where the price is on the website.
To do this, navigate to the product you want to scrape. For this guide, I will be scraping an AV channel receiver I found on Amazon.
Then, use your browser’s DevTools and navigate to the price. However, make sure that you have a very “unique” element selected. This is an element that shows the product’s price but is also very specifically identified within the HTML document. Ideally, choose an element with an id attribute, as there cannot be two elements with the same HTML ID. Try to get as much “uniqueness” as you can because this will make the parsing easier.
The elements I have selected above are not the most “unique” but are the closest we can get as they have lots of classes that I can safely assume not many other elements have all of.
We also want to ensure that our web scraper stays as consistent as possible with website changes.
If you don’t have an element that is completely “unique”, then I suggest using the Console tab and the JavaScript DOM to see how many other elements share those attributes. Here, I am trying to see whether the element I selected is “unique” enough to be selected by its class.
In this case, there is only one other element that I need to worry about, which I think is good enough.
Basic Scraping: Setup
This section will detail the fundamentals of web scraping only. We will add more features as this guide goes on, building upon the code we will write now.
First, we need to import the libraries we will be using.
Python
import requests as rqfrom bs4 import BeautifulSoup
Then, we need to retrieve the content from our product. I will be using this AV receiver as an example.
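The retrieval step can be sketched with the Requests get method (the URL is the AV receiver product page used later in this guide; swap in your own):

Python

```python
import requests as rq

# Product page to track -- replace with the product you want to scrape
url = "https://www.amazon.com/Denon-AVR-X1700H-Channel-Receiver-Built/dp/B09HFN8T64/"
request = rq.get(url)
```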
If the content you want to scrape is locked behind a login screen, chances are you need to provide basic HTTP authentication to the site. Luckily, the Requests library has support for this. If you need authentication, add the auth parameter to the get method above, and make it a tuple that follows the format of ('username','password').
For example, if Amazon required us to use HTTP basic authentication, we would declare our request variable like the one below:
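A sketch of that call, with placeholder credentials (Amazon does not actually use basic auth, so this is purely illustrative):

Python

```python
import requests as rq

# Hypothetical basic-auth request; auth takes a (username, password) tuple
request = rq.get(
    "https://www.amazon.com/Denon-AVR-X1700H-Channel-Receiver-Built/dp/B09HFN8T64/",
    auth=("replaceWithUsername", "replaceWithPwd"),
)
```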
If that authentication type does not work, then the site may be using HTTP Digest authentication.
To authenticate with Digest, you will need to import HTTPDigestAuth from the requests.auth submodule. Then it’s as simple as passing that object into the auth parameter.
Python
from requests.auth import HTTPDigestAuth

request = rq.get(
    "https://www.amazon.com/Denon-AVR-X1700H-Channel-Receiver-Built/dp/B09HFN8T64/",
    auth=HTTPDigestAuth("replaceWithUsername", "replaceWithPwd")
)
If the content you want to scrape requires a login other than basic HTTP authentication or Digest authentication, consult this guide for other types of authentications.
Amazon does not require any authentication, so our code will work providing none.
Now, we need to create a BeautifulSoup object and pass in our website’s response to the object.
When you use the Requests library to print a response to the console, you generally want to use request.text. Here, though, we are handing the response straight to BeautifulSoup, so we don’t need to worry about decoding it into printable text; it is considered better practice to pass the raw bytes with request.content.
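Creating the parser can be sketched as follows; in the real program the bytes come from request.content, while here a stand-in snippet keeps the example self-contained:

Python

```python
from bs4 import BeautifulSoup

# Stand-in for request.content -- the raw bytes of the HTTP response
html_bytes = b"<html><body><div id='pricevalue'><p>$19.99</p></div></body></html>"
parser = BeautifulSoup(html_bytes, "html.parser")
print(parser.find(id="pricevalue").p.text)  # prints "$19.99"
```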
Basic Scraping: Searching Elements
Now we can get to the fun part! We will find the price element using our sample we got earlier.
I will cover two of the most common scenarios, one where you need to find the price based on its element’s ID – the simplest, or one where you need to find the price based on class names and sub-elements – a little more complicated but not too difficult, assuming you have a “unique” enough element.
If we wanted to refer to an element based on its ID with BeautifulSoup, you would use the find method. For example, if we wanted to store the element with the ID of pricevalue within a variable called priceElement, we would invoke find() with the argument of id set to the value "pricevalue".
Python
priceElement = parser.find(id="pricevalue")
We can even print our element to the console!
Python
print(priceElement.prettify())
Expected Output (may vary)
<div id="pricevalue">
 <p>
  $19.99
 </p>
</div>
The function prettify is used to reformat (“pretty-print”) the output. It is used when you want to be able to visualize the data, as it results in better-looking output to the console.
Now we get to the tougher part – making references to element(s) based on one or more class names. This is the method you will need to use for most major e-commerce sites like Amazon or eBay.
This time, we will be using the find_all function. It is used in situations where it is theoretically possible to get multiple results, like when we are matching on classes, as the function returns a list of elements rather than a single one. If you are not sure which to use, know that you can use find_all even when the query you give it only returns one result; you will just get a one-item list.
The code below will return any elements with the classes of priceToPay or big-text.
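A sketch of that call (the class names priceToPay and big-text are hypothetical, and the snippet HTML stands in for a real page):

Python

```python
from bs4 import BeautifulSoup

html = "<p class='priceToPay'>$10</p><span class='big-text'>Sale</span><i class='other'>x</i>"
parser = BeautifulSoup(html, "html.parser")

# Passing a list matches elements carrying ANY of the listed classes
elements = parser.find_all(class_=["priceToPay", "big-text"])
print([e.text for e in elements])  # prints "['$10', 'Sale']"
```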
The select function is just like that of the find function except instead of directly specifying attributes using its function parameters, you simply pass in a CSS selector and get a list of matching element(s) back.
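A sketch of select in action (the class names price-value and main-color are hypothetical, and the snippet HTML stands in for a real page):

Python

```python
from bs4 import BeautifulSoup

html = ("<span class='price-value main-color'>$10</span>"
        "<span class='price-value'>$20</span>")
parser = BeautifulSoup(html, "html.parser")

# CSS selector: only elements carrying BOTH classes match
elements = parser.select(".price-value.main-color")
print([e.text for e in elements])  # prints "['$10']"
```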
The code above selects all elements with the class of both price-value and main-color. Although many use the find or find_all functions, I prefer select as I am already familiar with CSS selectors.
If we would like to filter by element type (not usually a good idea on its own when finding elements), we just call find_all with a single positional argument: the element’s type. So, parser.find_all("p") will return a list of every single paragraph (“p“) element.
An element type is one of the broadest filters you can pass into the find_all function, so this only becomes useful when you combine it with another narrower filter, such as an id or class.
Python
parser.find_all("h1",id="title")
That would return all h1 elements with an ID of title. But since each element needs to have its own unique ID, we could just use the find function instead. Let’s do something more realistic.
Python
parser.find_all("h1",class_="bigText")
This code would return all h1 elements that had a class of bigText.
Below is a review of what we know so far, plus some other, rarer methods of element finding.
Python
"""Never recommended, but returns a list of ALL the elements that have type 'p'"""
typeMatch = parser.find_all("p")

"""Finds the element with the ID of 'priceValue' using a CSS selector"""
idSelMatch = parser.select("#priceValue")

"""Finds the element with the ID of 'priceValue', except with the BeautifulSoup-native find function and not with a CSS selector"""
idMatch = parser.find(id="priceValue")  # Same as above

"""Extremely rare, but returns a list of elements with an ID of 'priceValue' OR 'price'"""
orIdMatch = parser.find_all(id=["priceValue", "price"])

"""Returns a list of elements that have the class 'price' OR 'dollarsToPay'. I do not know of a CSS selector that does the same"""
orClassMatch = parser.find_all(class_=['price', 'dollarsToPay'])

"""Returns a list of elements that have the class 'price' AND 'dollarsToPay'. I do not know of a find_all argument that does the same"""
andClassMatch = parser.select(".price.dollarsToPay")

"""Returns the element that has a class of 'v' INSIDE an element of class 't'. This can also be done with ID attributes. Because .find(...) only returns one element, it only returns the first instance of that class name. The three lines below return the same element, except 'inMatch3' returns a one-element list"""
inMatch = parser.find(class_="t").find(class_="v")  # Most basic way to do it
inMatch2 = parser.find_all(class_="t")[0].find_all(class_="v")[0]  # Indexing with [0] unwraps the one-element lists
inMatch3 = parser.find_all(class_="t")[0].find_all(class_="v")  # Returns a one-element list
Now that we know how to search elements, we can finally implement this in our price drop notifier!
Let’s see if our request is successful. We will be printing out the entire file to check.
Python
print(parser.find("html").prettify())
And we are not.
Hmmm. We have to bypass Amazon’s CAPTCHA somehow, so let’s try adding headers that mimic a normal browser!
I will be adding headers to rq.get(). Make sure to replace my AV channel receiver link with the product you want to scrape.
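A sketch of what that looks like; the header values here are an assumption, mimicking a typical desktop browser:

Python

```python
import requests as rq

# Hypothetical browser-mimicking headers -- real browsers send many more
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
}
request = rq.get(
    "https://www.amazon.com/Denon-AVR-X1700H-Channel-Receiver-Built/dp/B09HFN8T64/",
    headers=headers,
)
```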
Nope. Still nothing. Well, time for plan B, ditching requests completely and using selenium.
Basic Scraping: Implementation of Selenium
Firstly, it is important to know that Selenium has its own methods for finding elements in an HTML document, but for the sake of this guide, we will just be passing the source code of the website to our parser.
Think of Selenium as a browser running in the background with some selection abilities. Instead of sending the requests to the website by crafting our own headers, we can use Selenium to spin up an invisible browser that crafts the headers for us. We should no longer get a CAPTCHA screen because Amazon shouldn’t be suspicious that a robot is browsing the page – we are technically using a legitimate browser, but with parsing capabilities.
Installation of Selenium can be done with the command below. We will also be installing win10toast so you get a proper toast notification whenever a price drop is detected.
BAT (Batchfile)
pip install selenium
pip install win10toast
If you are looking to uninstall Requests because you don’t need it anymore, think twice: Selenium depends on Requests anyway.
Now, clear your entire Python file because we are going to need to do a short and quick rewrite of our code to use Selenium.
Like always, we will start by importing the required modules. Make sure you replace chrome with the name of a browser you have installed on your system, preferably the most resource efficient one.
Python
from selenium import webdriver
from bs4 import BeautifulSoup
from selenium.webdriver.chrome.options import Options  # Imports the module we will use to change the settings for our browser
import time  # This is what we will use to set delays so we don't use too many system resources
from win10toast import ToastNotifier  # This is what we will use to notify if a price drop occurs

notifier = ToastNotifier()  # Assign our notifier class to a variable
Then, we will need to set some preferences for the browser we are about to start. Let’s start by declaring an Options class and using it to make the browser invisible or run it in “headless” mode. While the arguments below are for specific browsers, I would just execute them all because I have not tested each argument individually.
Python
browserOptions = Options()
browserOptions.headless = True  # Makes Firefox run headless
browserOptions.add_argument("--headless=new")  # Makes newer versions of Chrome run headless
browserOptions.add_argument("--headless")  # Makes older versions of Chrome run headless
browserOptions.add_argument("--log-level=3")  # Only log fatal errors
Now, we will initiate the browser in the background. Again, make sure you replace Chrome with whichever browser you want to use for this project.
Then, we can use what we already know about BeautifulSoup to grab the price of our element. Remember to replace the code below with one tailored to your sample.
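A sketch using the selector from my Amazon sample; a stand-in snippet takes the place of browser.page_source so the example is self-contained:

Python

```python
from bs4 import BeautifulSoup

# Stand-in for browser.page_source, shaped like my Amazon price sample
page_source = (
    "<span class='a-price aok-align-center reinventPricePriceToPayMargin priceToPay'>"
    "<span class='a-offscreen'>$399.00</span></span>"
)
parser = BeautifulSoup(page_source, "html.parser")
price = parser.select(".a-price.aok-align-center.reinventPricePriceToPayMargin.priceToPay")[0] \
              .find_all(class_="a-offscreen")[0].text
print(price)  # prints "$399.00"
```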
Next, let’s strip the $ symbol from the price and convert it into a floating-point decimal.
Python
price = float(price.strip("$"))
Then, we can set a variable to compare with the current price.
Python
previousPrice = price
Now, we loop infinitely to see whether the price changed.
Python
while True:
Insert a new line and then indent the code we will write from this point forward.
Now, every two minutes (120 seconds), we refresh the page and compare the price we just got to our previous price.
Python (place each line indented inside while loop)
    browser.refresh()  # Refreshes the browser

    # Now that we may have a new price, we have to redefine our parser and price variables to adapt to the new page code
    parser = BeautifulSoup(browser.page_source, "html.parser")
    price = parser.select(".a-price.aok-align-center.reinventPricePriceToPayMargin.priceToPay")[0].find_all(class_="a-offscreen")[0].text
    price = float(price.strip("$"))

    # Next, we compare the two prices. If they differ, we alert the user and update our price threshold. We will also be looking for price increases.
    if (price < previousPrice):
        print(f"Price DECREASED from ${previousPrice} to ${price}!")
        notifier.show_toast("Price Drop!", f"The price decreased from ${previousPrice} to ${price}!")
    elif (price > previousPrice):
        print(f"Price INCREASED from ${previousPrice} to ${price}!")
        notifier.show_toast(":(", f"The price increased from ${previousPrice} to ${price} :(")

    # Now, we can tell the user we refreshed
    print(f"Refreshed! Previous Price: ${previousPrice}, and new price ${price}")
    previousPrice = price

    # And then we wait for two minutes
    time.sleep(120)
And just like that, you are finished! I hope this project was useful to you!
This guide is part two of a previous guide I made, called The Simple Guide to AI and Machine Learning With Python. It shows how you can improve the accuracy of the model you made in that guide, so I am going to assume you have already completed the previous guide before following this one.
In the previous guide, we learned how you can use dense neural networks to make a program that recognizes handwriting. Well, that neural network was not exactly very accurate, as it had a tendency to get numbers wrong unless it was specifically modified for those numbers. As you probably know by now, you would probably want the neural network to recognize any number you give it without having to optimize the network for every single number that comes to it.
Convolutional neural networks were made to solve this problem. Rather than training on the overall image, convolutional neural networks recognize tiny features in the image and learn them. For example, rather than focusing on the entire image of a hand-drawn three, the network will learn that a three has two curves stacked vertically, which will help it recognize any other three in the future, no matter how it is drawn and without the network being optimized for the number three.
Step One: Initial Setup
For this step, we can just use the code that we used in the previous tutorial to prepare the MNIST dataset.
Python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras import backend as K
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# helper functions
def show_min_max(array, i):
    random_image = array[i]
    print("min and max value in image: ", random_image.min(), random_image.max())

def plot_image(array, i, labels):
    plt.imshow(np.squeeze(array[i]))
    plt.title(" Digit " + str(labels[i]))
    plt.xticks([])
    plt.yticks([])
    plt.show()

def predict_image(model, x):
    x = x.astype('float32')
    x = x / 255.0
    x = np.expand_dims(x, axis=0)
    image_predict = model.predict(x, verbose=0)
    print("Predicted Label: ", np.argmax(image_predict))
    plt.imshow(np.squeeze(x))
    plt.xticks([])
    plt.yticks([])
    plt.show()
    return image_predict

img_rows, img_cols = 28, 28
num_classes = 10

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
(train_images_backup, train_labels_backup), (test_images_backup, test_labels_backup) = mnist.load_data()

print(train_images.shape)
print(test_images.shape)

train_images = train_images.reshape(train_images.shape[0], img_rows, img_cols, 1)
test_images = test_images.reshape(test_images.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

train_images = train_images.astype('float32')
test_images = test_images.astype('float32')
train_images /= 255
test_images /= 255

train_labels = keras.utils.to_categorical(train_labels, num_classes)
test_labels = keras.utils.to_categorical(test_labels, num_classes)

print(train_images[1232].shape)
Expected Output
(60000, 28, 28)
(10000, 28, 28)
(28, 28, 1)
Now that we have already put in the initial setup of our code, we can jump straight to creating our network.
Creating Our Network
Similar to what we did with the densely connected network, we are still going to have epochs, the number of times the network goes through the entire training set.
With that explanation out of the way, we can define our model.
Python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D, Dropout

epochs = 10
model = Sequential()
Now, let’s start adding the layers of our neural network.
Explaining Convolutional Layers
With our previous network, we added three dense (fully connected) layers. With our new network that uses convolutional neural networks, the layers work differently.
Convolutional layers consist of groups of neurons called filters that move across the image and activate based on the pixels they read. Those groups will then learn how to recognize features in the data.
You can adjust the number and size of the filters in your neural network, and we will change these to our liking. Bigger filters observe larger parts of the image at once, while smaller filters gather finer details about the image. More filters mean the neural network can recognize a wider range of image features.
There are many advantages to having layers and filters work this way. For one thing, smaller filters can be more computationally efficient, since they only examine a small part of the image at once. Furthermore, because filters move across the entire image, the neural network is not affected by feature displacement (when the same feature appears in different spots of different images). Filters focus on a small area of the image at a time, so they are not distracted by the other parts of the image.
We will be using multiple convolutional layers to complete our new-and-improved handwriting recognition software.
Implementing Convolutional Layers
Keras makes it easy to create the convolutional layers we will use in our model. We will use the Conv2D function to create the first layer of our neural network.
In the case below, we will have 32 filters, a kernel size of (3,3), an input shape – which we saved to the input_shape variable when we ran the setup code at the beginning – of (28,28,1), and an activation function of ReLU. I go more in-depth into what ReLU is in my previous guide.
The Conv2D function creates 2D convolutional layers, meaning that they scan across flat data, like images.
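Putting that together: 32 filters, a 3×3 kernel, ReLU activation, and the input shape saved earlier. This sketch is self-contained, so it re-declares the Sequential model and input_shape; in the guide's code only the model.add line is needed:

Python

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D

input_shape = (28, 28, 1)  # As saved by the setup code
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
```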
Explaining Pooling Layers
When you use convolutional layers, things can get quite computationally intensive, which is where pooling layers come in. Increasing the number of neurons increases the amount of computation required. Pooling layers are essentially filters that move in specified strides across the image, simplifying each filter’s contents into a single value. Depending on the size and stride of the filter, this shrinks the output image.
For this scenario, we will use a 2×2 pool with a stride of 2. This halves the image’s row and column count, simplifying the data without too much loss of specificity.
Python
model.add(MaxPooling2D(pool_size=(2,2)))
Most networks have at least one set of alternating convolutional and pooling layers.
More Convolutional Layers
Convolutional layers are designed to examine the low-level features of an image. If we add more, we may be able to start working with higher-level features.
We define this layer the same way we defined the previous one, but now with 64 filters instead of 32. We also do not specify the input shape, as it is inferred from the previous layer.
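In the guide's code that is a single model.add line; shown here as a self-contained sketch with the first layer included so the shapes line up:

Python

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D

model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))  # First layer, as before
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))  # 64 filters; input shape inferred
```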
Dropout layers are layers that take a percentage of all input neurons and deactivate them randomly. This forces the other neurons to adapt to the task. When larger, more complicated networks lack a dropout layer, the network risks depending too heavily on a single set of neurons rather than having all neurons learn. This is called overfitting and can hurt your network’s performance on new data.
Below, our dropout layer will deactivate 30% of input neurons at random (a rate of 0.3).
Python
model.add(Dropout(rate=0.3))
Dense and Flatten Layers
After all the convolutional and pooling layers, we will need a layer to help make our final decision. This will be a regular, fully connected dense layer. Before we connect this layer, we will need to flatten the image’s filters.
We can start by flattening the image using the Keras Flatten layer.
Python
model.add(Flatten())
Now, we can add a dense layer with ReLU activation and 32 neurons.
Python
model.add(Dense(units=32,activation='relu'))
Output Layers
Similar to the fully connected neural network we made in the previous guide, we will need a layer to shrink the previous dense layer down to just the number of classes. Also similar to before, the final output is decided by using the class with the highest weight.
Below, we will add a dense layer to be our output layer. The number of neurons should be 10 because there are ten possible output classes, and the activation should use Softmax.
Python
model.add(Dense(units=10,activation='softmax'))
Model Summary
Now, we can print out our model summary:
Python
model.summary()
Expected Output (Lines Providing no Useful Data are Blurred)
Now we will compile the network. The loss and metric will be the same ones we used in the previous guide: categorical cross-entropy and accuracy, respectively. However, we will use RMSProp (Root Mean Squared Propagation) as our training algorithm. RMSProp is one of many training algorithms Keras can use to teach the network how to actually improve, adjusting the weights to make the loss as small as possible.
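The compile call can be sketched as below; the small placeholder model only exists to make the sketch self-contained, since in the guide `model` is the convolutional network built above:

Python

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Placeholder model; in the guide, `model` is the CNN built in the previous sections
model = Sequential()
model.add(Dense(10, activation='softmax', input_shape=(784,)))

# Categorical cross-entropy loss, RMSProp optimizer, accuracy metric -- as described above
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
```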
The fit function is the one that actually does the training.
Now we can look at the parameters of the training function.
train_images and train_labels state the data that this neural network model will be trained on. The images are the pieces of data given to the network, and the network tries to predict the appropriate label.
batch_size puts the network's training data into batches. We can always change it later, but for now we have set it to 64.
epochs defines the number of epochs (complete passes over the training data) the network should run.
validation_data defines the data the model tests itself on.
We have turned shuffle on so that Keras shuffles the training data after every epoch and doesn't rely on the order of the data to train.
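As a rough sanity check on these parameters, here is the arithmetic behind those batches (the 60,000-image training set and the epoch count of 10 are assumptions carried over from elsewhere in this series):

```python
import math

train_size = 60000   # training images (assumed, as in the MNIST guides)
batch_size = 64
epochs = 10

# One weight update happens per batch; the last batch may be partial.
steps_per_epoch = math.ceil(train_size / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)  # 938 batches per epoch, 9380 updates total
```

Larger batches mean fewer (but more stable) updates per epoch; smaller batches mean more frequent, noisier updates.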
Now, we have to test the model on data it hasn’t seen yet. To do this, we will use the evaluate function. Loss and accuracy are percentages returned in decimal format.
In this guide, you will learn how to create an AI that recognizes handwriting with Python using Dense neural networks and the MNIST dataset. This guide will use TensorFlow to train your AI, and basic knowledge of linear algebra used in AI is strongly recommended. You can refer to this guide to understand the linear algebra used in AI. In the next part, we upgrade the neural network’s accuracy using convolutional neural networks.
Prerequisites
To do this, you will first need to install Python and add Pip to the .bashrc file on Linux, or to the Environment Variables on Windows or Mac. Then, run the command below to install the required libraries:
Note: If installing TensorFlow does not work, you can run pip install tensorflow instead. This will function like normal, but it will not be able to utilize your GPU.
Writing The Code
In a new Python file, we will first import the dataset and import the libraries needed:
Python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras import backend as K
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
We then define some functions that will help us visualize the data better later on in the code. I will not go over how they work, but they are not a necessity, just there to help us visualize the data better:
In the MNIST Data set (the dataset that we will be using), there are 60,000 training images and 10,000 test images. Each image is 28 x 28 pixels. There are 10 possible outputs (or to be more technical, output classes), and there is one color channel, meaning that each image is stored as a 28 x 28 grid of numbers between 0 and 255. It also means that each image is monochrome.
We can use this data to set some variables:
Python
img_rows = 28     # Rows in each image
img_cols = 28     # Columns in each image
num_classes = 10  # Output classes
Now, we will load the train images and labels and load in another set of images and labels used for evaluating the model’s performance after we train it (these are called test images/labels).
What Are Images and Labels?
Images and labels can also be called data and labels. The data is the context the computer is given, while the labels are the correct answers it should predict from that data. Most of the time, the model tries to predict the labels based on the data it is given.
The next step is not required, and we don't make use of it in the rest of the code, but it is recommended, especially if you are using a Python notebook: create a duplicate, untouched version of the train and test data as a backup:
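A minimal sketch of that backup step, assuming hypothetical variable names and a tiny stand-in array in place of the real MNIST data:

```python
import numpy as np

# Tiny stand-in for the real MNIST arrays (hypothetical shape).
train_images = np.arange(8, dtype=np.uint8).reshape(2, 2, 2)

# Keep an untouched copy; later edits to train_images won't affect it.
train_images_backup = train_images.copy()
train_images[0, 0, 0] = 255
print(train_images_backup[0, 0, 0])  # still 0
```

Note that a plain assignment (`backup = train_images`) would not work here, because both names would point at the same array; `.copy()` makes an independent one.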
Now, we test to see if we loaded the data correctly:
Python
print((train_images.shape, test_images.shape))
Expected Output
((60000, 28, 28), (10000, 28, 28))
Why Are They Those Shapes?
The images are 28×28, which explains the last two dimensions of the shape. Because the data is stored as a long matrix of pixel values (not yet readable to our neural network, by the way; we will fix this later), there are no further dimensions. And as mentioned earlier, there are 60,000 training images and 10,000 testing images, which explains the first dimension of each tensor.
The whole purpose of this tutorial is to get you comfortable with machine learning, which is why I am going to let you in on the fact that data can be formatted one way or another, and it is up to you to understand how to get your datasets to work with your model.
Because the MNIST dataset is made for this purpose, it is already ready-to-use and little to no reshaping or reformatting has to go into this.
However, you might come across data you need to use for your model that is not that well formatted or ready for your machine learning model or scenario.
It is important to develop this skill, as in your machine learning career, you are going to have to deal with different types of data.
Now, let's do the only reshaping we really need: reshaping the data to fit our neural network's input layer by converting it from a long matrix of pixel values into readable images. We do this by adding the number of color channels as a dimension, and because the images are monochrome, we only need to add a single channel dimension.
What is a Shape in Neural Networks?
A shape is the size of the linear algebra object you want to represent in code. I provide an extremely simple explanation of this here.
What is a Neural Network?
A neural network is a type of AI computers use to think and learn like a human. The type of neural network that we will be using today, sequential, models the human brain, consisting of layers of neurons that pass computed data to the next layer, which passes its computed data to the next layer, and so on, until the data finally passes through the output layer, which narrows the possible results down to however many output classes (the desired number of possible outcomes) you want. This whole layer cycle begins at the input layer, which takes the input shape and passes it through to the rest of the layers.
Python
train_images = train_images.reshape(train_images.shape[0], img_rows, img_cols, 1)
test_images = test_images.reshape(test_images.shape[0], img_rows, img_cols, 1)
# Adding print statements to see the new shapes.
print((train_images.shape, test_images.shape))
Expected Output
((60000, 28, 28, 1), (10000, 28, 28, 1))
Now, we define the input shape, to be used when we define settings for the model.
What is an Input Shape?
An input shape defines the only shape that the input layer is capable of taking into the neural network.
We will begin data cleaning now, or making the data easier to process by the model.
First, let’s plot the digit 5 as represented in the MNIST dataset:
Python
plot_image(train_images, 100, train_labels)
This should output the following plot:
Now, let’s see what the numbers representing pixel intensity look like inside the image:
Python
out = ""
for i in range(28):
    for j in range(28):
        f = int(train_images[100][i][j][0])
        s = "{:3d}".format(f)
        out += str(s) + " "
    print(out)
    out = ""
Expected Output (Lines Providing no Useful Data are Blurred)
In order to help us visualize the data to another degree, let’s run the function below to show what the minimum and maximum values of the data are (the largest and smallest value in the data):
Python
show_min_max(train_images, 100)
Expected Output
0 255
Now we can start the actual data cleaning. As you saw above, the data in the image is represented as an integer between zero and 255. While the network could learn on this data, let’s make it easier for the network by representing these values as a floating point number between zero and one. This keeps the numbers small for the neural network.
First things first, let's convert the data to a floating-point number:
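That conversion is typically done with NumPy's astype; a small self-contained sketch (the tiny array here is a stand-in for the real image data):

```python
import numpy as np

# Stand-in pixel values, stored as 8-bit integers like the raw MNIST data.
train_images = np.array([[0, 128], [255, 64]], dtype=np.uint8)

# Convert every pixel to a 32-bit floating-point number.
train_images = train_images.astype('float32')
print(train_images.dtype)  # float32
```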
Now that the data can be stored as a floating point number, we need to normalize the data all the way down to 0 to 1, not 0 to 255. We can achieve this by using some division:
Python
train_images /= 255
test_images /= 255
Now we can see if any changes were made to the image:
Python
plot_image(train_images, 100, train_labels)
The code above should output:
As you can see, no changes were made to the image. Now we will run the code below to check whether the data was actually normalized:
Python
out = ""
for i in range(28):
    for j in range(28):
        f = train_images[100][i][j][0]
        s = "{:0.1f}".format(f)
        out += str(s) + " "
    print(out)
    out = ""
Expected Output (Lines Providing no Useful Data are Blurred)
As you can see, the image is not affected, but the data is easier for the neural network to deal with.
If we don't want to sift through all those numbers but still want to check that we cleaned the data correctly, we can look at the minimum and maximum values of the data:
Python
print("The min and max are: ")
show_min_max(train_images, 100)
Expected Output (Lines Providing no Useful Data are Blurred)
The min and max are: 0.0 1.0
We could start building the model now, but there is a problem we need to address. MNIST's labels are simply the digits 0 to 9 because, well, the entire dataset is just handwritten digits 0 to 9. However, due to the nature of neural networks, they inherently treat numeric labels as ordered (i.e. assuming 1 is more similar to 2 than to 7, because mathematically 1 is closer to 2, even though a handwritten 7 can look more like a 1), which is wrong for classification. To fix this, we convert the labels to a categorical format, one that Keras won't treat as ordered, making it view each digit independently:
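Keras ships a utility for this (to_categorical), but to see why the ordering problem disappears, here is a pure-Python sketch of what the conversion does to a single label:

```python
def one_hot(label, num_classes):
    """Convert an integer label into a one-hot vector:
    all zeros except a 1 at the label's index."""
    vec = [0] * num_classes
    vec[label] = 1
    return vec

print(one_hot(3, 10))  # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```

In this representation, every digit is equally "far" from every other digit, so the network can no longer infer a numeric ordering between classes.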
Each complete pass over the entire training dataset is called an epoch. Generally speaking, more epochs yield more accurate results but take longer to train. Finding the balance between reasonable training time and good results is important when developing an AI model.
For now, we are just going to be training the model with ten epochs, but this number can be adjusted as you wish.
Python
epochs = 10
In this tutorial, we will be making a sequential model. In the future, you may need to make other types of models.
Defining our model:
Python
model = Sequential()
Now, we need to add the first layer (also called the input layer, as it takes input):
Python
model.add(Flatten(input_shape=(28, 28, 1)))
That layer is a Flatten layer. It converts the data into one long, one-dimensional list of numbers in a way the neural network can understand; we prepared the data for this earlier. Because the layer does not know what shape the data comes in, we have to specify it in the input_shape parameter.
Now, we can add the layers needed.
We will add a Dense layer below, which will perform predictions on the data. We can configure a lot here, and in the future as a machine learning engineer, you will need to learn what the proper configurations for your scenario are. For now, we are going to use the activation function ReLU and put 16 neurons in this layer.
What is ReLU?
ReLU is an activation function that stands for Rectified Linear Unit. It is a simple nonlinear function: it returns its input when the input is positive, and 0 otherwise. For example, if a negative number is passed through it, it will return 0.
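Here is a minimal pure-Python sketch of ReLU, just to make that behavior concrete:

```python
def relu(x):
    """Rectified Linear Unit: pass positive inputs through, zero the rest."""
    return x if x > 0 else 0

print(relu(-2.5), relu(3.0))  # 0 3.0
```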
Python
model.add(Dense(units=16, activation='relu'))
Finally, we will add the output layer. Its job, as the name implies, is to shrink the number of possible outputs down to the number of output classes specified. Each output from this layer represents the AI's guess at how likely that class is to be correct (in computer vision terms, this is known as the confidence).
We will make sure the network shrinks this down to ten output classes (as the possible outputs are the digits zero to nine) by putting ten neurons in the layer, one per digit, each outputting the network's confidence in that digit, and by using the Softmax activation function.
What is Softmax?
Softmax is an activation function that distributes the outputs such that they all sum to one. We use it as the activation function for the final layer because our neural network outputs something that can be interpreted as a probability distribution.
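A pure-Python sketch of softmax (an illustration, not the Keras implementation):

```python
import math

def softmax(xs):
    """Exponentiate each input and normalize so the outputs sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(sum(probs))  # 1.0 (within floating-point error)
```

Note that the largest input always gets the largest probability, so taking the class with the highest softmax output is the same as taking the class with the highest raw score.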
Python
model.add(Dense(units=10, activation='softmax'))
Now, we can see an overview of what our model looks like:
Python
model.summary()
Expected Output (Lines Providing no Useful Data are Blurred)
As you saw above, our model is sequential, has three layers, and already has 12,730 parameters to train. This means that the network is going to adjust 12,730 numbers as it trains. This should be enough to correctly identify a hand-drawn number.
Now, we have to compile the network and provide data to TensorFlow such that it compiles in the way that we want it to.
What do All the Arguments Mean?
The Optimizer is an algorithm that, as you probably guessed from the name, optimizes some value. Optimizing a value can mean either making it as big as possible or as small as possible. In a neural network, we want to optimize the loss (or how many times the neural network got the data wrong) by making it as small as possible. The optimizer is the function that does all this math behind the scenes. There are many functions for this, each with their own strengths or weaknesses. We will use Adam, a popular one for image recognition as it is fast and lightweight.
The Loss is the difference between a model's prediction and the actual label. There are many ways to calculate this, which is why it is important to choose the right one. The loss function you need varies based on what your neural network's output should look like. For now, we will use Categorical Cross Entropy.
The Metrics are additional statistics TensorFlow lets the developer display during training, for convenience and to better visualize progress, supplementing the metrics already shown. Accuracy, or what percent of input images the model guessed correctly, is one metric that can be visualized during training. It is similar to loss, but is calculated in a separate way, so accuracy and loss won't necessarily add up to 100% or be direct inverses of each other.
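To make the loss concrete, here is a pure-Python sketch of categorical cross-entropy for a single example (an illustration, not the TensorFlow implementation):

```python
import math

def categorical_cross_entropy(y_true, y_pred):
    """Loss for one example: -sum over classes of true * log(predicted).
    With a one-hot label, this reduces to -log of the probability the
    model assigned to the correct class."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

y = [0, 0, 1]  # one-hot label: the correct class is index 2
print(categorical_cross_entropy(y, [0.1, 0.1, 0.8]))  # small loss: confident and right
print(categorical_cross_entropy(y, [0.5, 0.3, 0.2]))  # larger loss: mostly wrong
```

The optimizer's whole job is to nudge the weights so this number shrinks across the training set.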
Once our model is compiled, we can fit the model to the training data that we prepared. We will use the actual training data to train the model in a way that lets it recognize numbers.
train_images is the dataset of inputs given to the model, while train_labels acts like the answer key, letting us check whether the network's guesses are correct. epochs is the number of epochs to run, set to the variable we defined earlier.
Notice how, as the epochs progress, the loss goes down and the accuracy goes up. This is what we want!
However, knowing the labels to all the data basically makes those metrics useless – after all, you are just giving the model an answer – so we need to evaluate the model to see how well it could really do. We can achieve this by evaluating the model on test data – data the model has never seen before.
The <model>.evaluate function takes the testing data and evaluates the trained model, producing a set of metrics (also called scores) that show how well the model really does on unseen data.
Although the function is taking the test labels, the function never shows this data to the neural network, only using it to grade the neural network on how well it did.
As you saw above, both the loss and accuracy seem to be pretty low. This is because both are stored as percentages in decimal form. This means that, for the output above, the loss is 16.57% and the accuracy is 95.28%. That is pretty good.
Using Our Model
First, download this image to the same folder as the Python file, and name it test.jpg.
Now, run the code below to predict our image using <model>.predict:
It probably got the answer wrong. This is because the model is used to inverted images, meaning light handwriting on dark paper. To fix this, we simply need to invert the image colors:
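The inversion is just subtracting each pixel from the maximum value; a small NumPy sketch (the array is a stand-in for the real image, and for the normalized 0-to-1 data you would subtract from 1.0 instead):

```python
import numpy as np

image = np.array([[0, 200], [255, 55]], dtype=np.uint8)  # stand-in pixels
inverted = 255 - image  # dark pixels become light and vice versa
print(inverted)  # [[255  55] [  0 200]]
```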
It probably got the answer correct. You have successfully built a neural network!
Exporting The Model
To do this, simply run the code below (which saves it to a file called my_model.h5):
Python
model.save('my_model.h5')
Now if you ever want to refer to it again in another file, simply load in the sequential model:
Python
model = keras.models.load_model("my_model.h5", compile=False)
Flaws in Our Code
There are flaws in our model. Firstly, if you tried evaluating it on multiple images, you may have noticed that it was not very accurate. This is because the model only performs well on images that look like the ones it was optimized for.
Because all of the training images were white on black, it has to do a lot of guessing when it gets confused on an image that is black on white.
We can fix this with convolutional neural networks.
A convolutional network recognizes the small parts and details of an image, will be much more accurate, and will handle more general data better.
Follow along for the next part, where I teach you how to optimize this with convolutional neural networks.
Before we start this project, you need to know a little bit about GPS. Satellites broadcast signals that let the module work out its distance from each satellite. Once four or more satellites are connected and providing this data, the receiver can use the distances to figure out the exact location of the user.
The receiver will then present this data in the form of NMEA sentences, a communication standard developed by the National Marine Electronics Association. We will be using TinyGPS++ to parse these.
Using the GPS with Arduino
For this part, you don’t need the LCD. This will show you how to log the GPS output to the serial monitor. We will then parse this data using TinyGPS++.
Preparations
First, open Arduino IDE and you will be greeted with a blank sketch.
Press Ctrl + Shift + I to bring up the Library Manager, type “TinyGPSPlus”, and install the top result.
Code
Now that we are all prepared, let's start writing the code. First, we include the library that lets the Arduino communicate with the GPS over a serial connection.
#include <SoftwareSerial.h>
Next, we include the library that parses NMEA sentences.
#include <TinyGPSPlus.h>
Now, declare the communication between the Arduino and the GPS and then the parser.
SoftwareSerial ss(4,3);
TinyGPSPlus gps;
After that, we go inside the void setup() function and we initiate the communication between the computer and the Arduino and the GPS and Arduino.
Serial.begin(9600);
ss.begin(9600);
Next, we go into void loop() and specify that whatever is below this line of code should only run while the Arduino is receiving data from the GPS.
while (ss.available() > 0)
Then, we pass each incoming byte to gps.encode(), which feeds the raw NMEA characters into TinyGPS++ for parsing.
gps.encode(ss.read());
Then, we create an if block so the serial monitor only displays our data when the GPS has produced a new, valid location.
if (gps.location.isUpdated()) {
}
Now, inside the if block, we can access all the data and print it to the serial monitor.
Serial.print("Latitude= ");
Serial.print(gps.location.lat(), 6); // 6 for 6 decimal places
Serial.print(" Longitude= ");
Serial.println(gps.location.lng(), 6); // println ends the line, so each reading prints on its own row
Your full code should look like this:
#include <SoftwareSerial.h>
#include <TinyGPSPlus.h>
SoftwareSerial ss(4,3);
TinyGPSPlus gps;
void setup() {
// put your setup code here, to run once:
Serial.begin(9600);
ss.begin(9600);
}
void loop() {
// put your main code here, to run repeatedly:
while (ss.available() > 0)
gps.encode(ss.read());
if (gps.location.isUpdated()) {
Serial.print("Latitude= ");
Serial.print(gps.location.lat(), 6); // 6 for 6 decimal places
Serial.print(" Longitude= ");
Serial.println(gps.location.lng(), 6); // println ends the line, so each reading prints on its own row
}
}
Wiring
The wiring is shown below:
GPS RX > Digital 4 on Arduino
GPS TX > Digital 3 on Arduino
GPS VCC > Power 3.3V on Arduino
GPS GND > Power GND on Arduino
Uploading
Now, with the Arduino IDE open and the code ready, press Ctrl + U on your keyboard. The code will upload and start outputting to the serial monitor, which you can access by pressing Ctrl + Shift + M or by going to the top toolbar and clicking Tools > Serial Monitor. The GPS will take a couple of minutes to get its location. You may want to stick the antenna outside for this, as it will take much longer to get a fix indoors.
Soon, you will be able to view the data coming in.